Databricks

Interact with Databricks API

Overview

This node uploads files to a Databricks Unity Catalog volume through the Databricks REST API, so files in Unity Catalog's volume storage can be managed from within a workflow.

Typical use cases include:

  • Automating the upload of data files or configuration files into specific volumes in Unity Catalog.
  • Integrating file uploads as part of larger ETL or data processing workflows.
  • Managing files programmatically without manual intervention through the Databricks API.

For example, you can upload a CSV file containing new customer data directly into a designated catalog/schema/volume path, making it immediately available for downstream analytics or processing.

Properties

| Name | Meaning |
| --- | --- |
| Catalog | The Unity Catalog catalog that contains the target volume. Options are loaded dynamically from the Databricks API and list the available catalogs. |
| Schema | A schema within the chosen catalog. Options depend on the selected catalog and are fetched dynamically. |
| Volume | A volume within the chosen catalog and schema. Options depend on the selected catalog and schema and are fetched dynamically. |
| File Name (path) | The name and relative path of the file to upload, e.g., "myfile.txt" or "folder/myfile.txt". |
| Binary Property | The name of the binary property in the input data that contains the file content to upload. Defaults to "data". |
| Content Type | The MIME type of the file being uploaded. Options: application/octet-stream (binary), text/plain, application/json, application/xml, and image/jpeg. Defaults to application/octet-stream. |
| Additional Fields - Overwrite | Boolean flag indicating whether to overwrite an existing file at the target location. Defaults to false. |
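
To make the relationship between these properties concrete, the following is a minimal Python sketch of how they could combine into a target path. The `UploadParams` class and its field names are hypothetical and used only for illustration; the defaults and the list of content types are taken from the table above. This is not the node's actual implementation.

```python
from dataclasses import dataclass

# Content types listed in the Properties table.
ALLOWED_CONTENT_TYPES = {
    "application/octet-stream",
    "text/plain",
    "application/json",
    "application/xml",
    "image/jpeg",
}


@dataclass
class UploadParams:
    """Hypothetical container mirroring the node's properties."""

    catalog: str
    schema: str
    volume: str
    file_name: str                                  # e.g. "folder/myfile.txt"
    binary_property: str = "data"                   # documented default
    content_type: str = "application/octet-stream"  # documented default
    overwrite: bool = False                         # documented default

    def __post_init__(self) -> None:
        # Reject values the property table does not allow.
        if self.content_type not in ALLOWED_CONTENT_TYPES:
            raise ValueError(f"Unsupported content type: {self.content_type}")
        if not self.file_name.strip("/"):
            raise ValueError("File name must not be empty")

    def volume_path(self) -> str:
        """Join catalog, schema, volume, and file name into one volume path."""
        relative = self.file_name.strip("/")
        return f"/Volumes/{self.catalog}/{self.schema}/{self.volume}/{relative}"
```

For instance, `UploadParams("main", "sales", "landing", "folder/myfile.txt").volume_path()` yields `/Volumes/main/sales/landing/folder/myfile.txt`, where the catalog, schema, and volume names are placeholders.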

Output

The node outputs a JSON object per item processed with the following structure:

{
  "success": true,
  "message": "File uploaded successfully to <file path>"
}
  • success: A boolean indicating whether the upload was successful.
  • message: A descriptive message confirming the upload and specifying the file path.

No binary output is produced by this operation.

Dependencies

  • Requires an API authentication token credential for Databricks with appropriate permissions to access Unity Catalog and perform file uploads.
  • The node uses the Databricks REST API endpoint /api/2.0/fs/files/Volumes/{catalog}/{schema}/{volume}/{path} for uploading files.
  • The host URL and token must be configured in the node credentials.
  • Input data must contain the binary file data under the specified binary property.
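
As a rough illustration of the request these dependencies imply, the sketch below builds (but does not send) an upload request using only the Python standard library. The endpoint path comes from the list above. The `PUT` method, the `overwrite` query parameter, and the `Bearer` authorization scheme reflect the public Databricks Files API as understood here and are assumptions about what the node sends; `build_upload_request` is a hypothetical helper name.

```python
import urllib.request
from urllib.parse import quote, urlencode


def build_upload_request(
    host: str,
    token: str,
    volume_path: str,
    data: bytes,
    content_type: str = "application/octet-stream",
    overwrite: bool = False,
) -> urllib.request.Request:
    """Build a request for /api/2.0/fs/files/Volumes/{catalog}/{schema}/{volume}/{path}.

    volume_path is expected to start with "/Volumes/".
    """
    # quote() leaves "/" intact by default, so only unsafe characters are escaped.
    url = f"{host.rstrip('/')}/api/2.0/fs/files{quote(volume_path)}"
    # Assumption: the overwrite flag is passed as a query parameter.
    url += "?" + urlencode({"overwrite": str(overwrite).lower()})
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",  # assumption: uploads use PUT
        headers={
            "Authorization": f"Bearer {token}",  # token from the credentials
            "Content-Type": content_type,
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the upload. The host and token correspond to the values configured in the node credentials.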

Troubleshooting

  • Common Issues:

    • Incorrect catalog, schema, or volume selection may cause API errors due to invalid paths.
    • Missing or incorrect binary property name will result in failure to retrieve file data.
    • Insufficient permissions or invalid API token will cause authorization errors.
    • Uploading a file that already exists at the target path without setting the overwrite flag may cause the API to reject the request.
  • Error Messages:

    • API Error: <status> <statusText>: Indicates an error response from the Databricks API. Check the status code and message for details. Common causes include permission issues or invalid parameters.
    • Network Error: No response received from server: Indicates connectivity problems or misconfigured host URL.
    • Unsupported Genie operation: Does not apply to file uploads. If it appears, the node parameters are likely misconfigured.
  • Resolutions:

    • Verify catalog, schema, and volume names are correct and accessible.
    • Ensure the binary property name matches the input data.
    • Confirm API token validity and required permissions.
    • Use the overwrite option if replacing existing files is intended.
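
The triage above can be sketched as a small helper that maps an error message to a suggested resolution. The message prefixes come from the Error Messages list. The mapping from specific HTTP status codes to causes is an assumption based on conventional HTTP semantics, not something this document or the node guarantees, and `suggest_resolution` is a hypothetical name.

```python
import re

# Matches the documented "API Error: <status> <statusText>" format.
_API_ERROR = re.compile(r"^API Error: (\d{3})\b")

# Assumed status-to-hint mapping (conventional HTTP meanings).
HINTS = {
    401: "Confirm the API token is valid.",
    403: "Confirm the token has the required Unity Catalog permissions.",
    404: "Verify the catalog, schema, and volume names.",
    409: "Enable the overwrite option if replacing the file is intended.",
}

DEFAULT_API_HINT = "Check the status code and message for details."


def suggest_resolution(message: str) -> str:
    """Return a troubleshooting hint for a node error message."""
    if message.startswith("Network Error"):
        return "Check connectivity and the host URL in the node credentials."
    match = _API_ERROR.match(message)
    if match:
        status = int(match.group(1))
        return HINTS.get(status, DEFAULT_API_HINT)
    return "Unrecognized error; review the node parameters."
```

For example, `suggest_resolution("API Error: 403 Forbidden")` returns the permissions hint, while any status code not in the table falls back to the generic advice.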
