Databricks

Interact with the Databricks API

Overview

This node interacts with the Databricks API and supports operations on files stored in Unity Catalog volumes. The List Directory operation retrieves the list of files in a volume, which is identified by its catalog and schema.

Use this node to programmatically explore or manage files stored in Databricks Unity Catalog volumes, for example:

  • Automating file inventory or audits.
  • Integrating file metadata retrieval into data pipelines.
  • Building workflows that depend on dynamic file listings from Databricks storage.

Properties

Name               Meaning
Catalog            Select a Unity Catalog to access files from.
Schema             Select a schema from the chosen catalog.
Volume             Select a volume from the chosen catalog and schema.
Additional Fields  Optional parameters:
                   - Page Size: Number of files to return per page (pagination control).
                   - Page Token: Token for the next page of results (used for pagination).
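
As a rough illustration of how these properties could map onto a request, the sketch below builds a Unity Catalog volume path of the form /Volumes/<catalog>/<schema>/<volume> and a query string from the optional fields. The function names and the query parameter names `page_size` and `page_token` are assumptions made for this example; the node's internal implementation may differ.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def volume_path(catalog: str, schema: str, volume: str) -> str:
    """Return the volume path /Volumes/<catalog>/<schema>/<volume>."""
    parts = [catalog, schema, volume]
    if not all(parts):
        raise ValueError("catalog, schema and volume are all required")
    # Percent-encode each segment so unusual characters cannot break the path.
    return "/Volumes/" + "/".join(quote(p, safe="") for p in parts)


def list_query(page_size: Optional[int] = None,
               page_token: Optional[str] = None) -> str:
    """Build the optional pagination query string (assumed parameter names)."""
    params = {}
    if page_size is not None:
        params["page_size"] = page_size
    if page_token:
        params["page_token"] = page_token
    return urlencode(params)
```

For example, `volume_path("main", "sales", "raw")` yields a path for a hypothetical volume named `raw`, and `list_query(page_size=100)` yields the matching query string. Leaving both optional fields unset produces an empty query.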

Output

The output JSON contains the list of files retrieved from the specified volume. Each item in the output corresponds to a file's metadata as returned by the Databricks API for Unity Catalog volumes.

The node does not output binary data for this operation; it only returns JSON metadata about files.
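
A downstream step might summarize the returned metadata, for instance for a file inventory. The sketch below assumes each item is a plain dictionary with `is_directory` and `file_size` keys. These field names are an assumption based on how Databricks typically describes directory entries, so check them against the node's actual output before relying on them.

```python
from typing import Dict, Iterable


def summarize_entries(items: Iterable[dict]) -> Dict[str, int]:
    """Count files and directories and total the file sizes.

    Assumes (hypothetically) that each item has `is_directory` and
    `file_size` keys; missing keys are treated as a file of size 0.
    """
    items = list(items)
    files = [i for i in items if not i.get("is_directory", False)]
    return {
        "file_count": len(files),
        "directory_count": len(items) - len(files),
        "total_bytes": sum(i.get("file_size") or 0 for i in files),
    }


# Made-up sample data, for illustration only.
sample = [
    {"path": "/Volumes/main/sales/raw/a.csv", "is_directory": False, "file_size": 120},
    {"path": "/Volumes/main/sales/raw/b.csv", "is_directory": False, "file_size": 80},
    {"path": "/Volumes/main/sales/raw/archive", "is_directory": True},
]
summary = summarize_entries(sample)
```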

Dependencies

  • Requires a Databricks API token credential with permission to access the relevant Unity Catalog resources.
  • The node makes HTTP requests to the Databricks REST API endpoints under /api/2.1/unity-catalog/ and /api/2.0/fs/files/Volumes/.
  • The user must configure the Databricks host URL and provide a valid bearer token for authentication.
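
To make the credential requirements concrete, here is a minimal sketch of normalizing a host URL and building bearer-token headers. The helper names and the example host are hypothetical. Only the use of a host URL and a bearer token comes from the description above.

```python
from typing import Dict


def normalize_host(host: str) -> str:
    """Trim whitespace and trailing slashes, and default to https://."""
    host = host.strip().rstrip("/")
    if not host:
        raise ValueError("host URL is required")
    if not host.startswith(("http://", "https://")):
        host = "https://" + host
    return host


def auth_headers(token: str) -> Dict[str, str]:
    """Return HTTP headers carrying the bearer token."""
    if not token:
        raise ValueError("API token is required")
    return {
        "Authorization": f"Bearer {token}",
        "Accept": "application/json",
    }


# Hypothetical workspace host and placeholder token.
base_url = normalize_host("example.cloud.databricks.com/")
headers = auth_headers("my-token")
```

Normalizing the host up front helps avoid a common failure mode: a trailing slash or a missing scheme producing a malformed request URL.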

Troubleshooting

  • API Errors: If the API returns errors (e.g., 401 Unauthorized, 403 Forbidden), verify that the API token has sufficient permissions and that the host URL is correct.
  • Pagination Issues: When retrieving large directories, pass the token for the next page in the Page Token property of the following request, and repeat until no further token is returned.
  • Invalid Catalog/Schema/Volume: Selecting non-existent or unauthorized catalogs, schemas, or volumes will cause errors. Confirm these values exist and are accessible.
  • Network Errors: Network connectivity issues or incorrect host URLs can cause request failures. Check network settings and endpoint correctness.
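
The pagination advice above can be sketched as a loop that keeps requesting pages until no token remains. The `fetch_page` callable is a hypothetical stand-in for whatever performs the request, and the response keys `contents` and `next_page_token` are assumptions for this example rather than a confirmed description of the node's output.

```python
from typing import Callable, Iterator, Optional


def iter_all_entries(fetch_page: Callable[[Optional[str]], dict],
                     max_pages: int = 1000) -> Iterator[dict]:
    """Yield every entry across pages.

    `fetch_page(token)` must return a dict with an optional `contents`
    list and an optional `next_page_token` (assumed key names).
    """
    token = None
    for _ in range(max_pages):
        page = fetch_page(token)
        yield from page.get("contents", [])
        token = page.get("next_page_token")
        if not token:
            return  # no further pages
    # Guard against a server that never stops returning tokens.
    raise RuntimeError("pagination did not finish within max_pages")


# Fake two-page response used only to exercise the loop.
fake_pages = {
    None: {"contents": [{"name": "a.csv"}], "next_page_token": "t1"},
    "t1": {"contents": [{"name": "b.csv"}]},
}
names = [e["name"] for e in iter_all_entries(lambda t: fake_pages[t])]
```

The `max_pages` cap is a defensive choice for this sketch: it turns a runaway pagination loop into an explicit error instead of an endless run.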
