Databricks

Interact with Databricks API

Overview

The "Get File Info" operation of the Databricks node retrieves metadata about a single file stored in a Databricks workspace, for example in a Unity Catalog volume. Use it to verify that a file exists, to check properties such as size or modification date, or to gather information before processing or moving files programmatically.

Practical examples include:

  • Checking if a file exists at a given path before attempting to read or process it.
  • Retrieving file metadata to log or audit file usage.
  • Validating file details in automated workflows that manage data lakes or shared storage within Databricks.

Properties

Name | Meaning
Path | The full path of the file whose metadata you want to retrieve. Example: /Volumes/my-catalog/my-schema/my-volume/directory/file.txt
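
The example path follows the Unity Catalog volume layout, /Volumes/<catalog>/<schema>/<volume>/<path inside the volume>. As a minimal sketch, assuming that layout applies to the paths you pass to the node, the following Python helper splits a path into its parts and rejects obviously malformed input before the node is called. The names `VolumePath` and `parse_volume_path` are hypothetical and not part of the node.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class VolumePath:
    catalog: str
    schema: str
    volume: str
    relative: str  # path inside the volume, "" for the volume root


def parse_volume_path(path: str) -> VolumePath:
    """Split '/Volumes/<catalog>/<schema>/<volume>/...' into its parts.

    Assumption: the path uses the volume layout from the Path example.
    Raises ValueError when the path is relative or has too few segments.
    """
    if not path.startswith("/"):
        raise ValueError(f"path must be absolute: {path!r}")
    parts = PurePosixPath(path).parts  # ('/', 'Volumes', 'my-catalog', ...)
    if len(parts) < 5 or parts[1] != "Volumes":
        raise ValueError(
            f"expected /Volumes/<catalog>/<schema>/<volume>/...: {path!r}"
        )
    _, _, catalog, schema, volume, *rest = parts
    return VolumePath(catalog, schema, volume, "/".join(rest))
```

Running such a check in a preceding step can surface a typo in the path before any request is made.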

Output

The output is a single JSON object containing the file's metadata as returned by the Databricks API. Typical attributes are the file size, creation and modification timestamps, and file type. Whether permissions or owner information appear depends on the API response.
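
Downstream steps usually want typed values rather than raw strings. The sketch below assumes the metadata arrives with HTTP-header-style keys (`content-length`, `content-type`, `last-modified`). Those key names are an assumption, and the node's actual output fields may differ, so adjust them to what you see in the output panel. `FileInfo` and `normalize_file_info` are hypothetical names.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Optional, TypedDict


class FileInfo(TypedDict):
    size_bytes: Optional[int]
    content_type: Optional[str]
    modified: Optional[datetime]


def normalize_file_info(raw: dict) -> FileInfo:
    """Convert header-style metadata into typed values; absent fields become None."""
    # Lower-case the keys so 'Content-Length' and 'content-length' are treated alike.
    lowered = {str(key).lower(): value for key, value in raw.items()}
    size = lowered.get("content-length")
    modified = lowered.get("last-modified")
    return {
        "size_bytes": int(size) if size is not None else None,
        "content_type": lowered.get("content-type"),
        # HTTP dates look like 'Wed, 21 Oct 2015 07:28:00 GMT'.
        "modified": parsedate_to_datetime(modified) if modified else None,
    }
```

Missing fields map to None instead of raising, so the same helper works when the response carries only some of the attributes.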

This operation is concerned with metadata only. It is not meant to return the file's content or other binary attachments.

Dependencies

  • Requires an active connection to a Databricks workspace.
  • Needs an API authentication token credential configured in n8n to authorize requests.
  • The base URL for the Databricks API must be set correctly in the node credentials.
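
To show how the base URL, token, and path fit together, here is a minimal Python sketch that builds, but does not send, a metadata request. It assumes the Databricks Files API endpoint `/api/2.0/fs/files` queried with an HTTP HEAD request and a bearer token. Whether the node uses this exact endpoint internally is not confirmed by this page, and the host and token values in the usage note are placeholders. `build_file_info_request` is a hypothetical helper.

```python
from urllib.parse import quote
from urllib.request import Request


def build_file_info_request(base_url: str, token: str, file_path: str) -> Request:
    """Build (but do not send) a HEAD request for a file's metadata.

    Assumption: the endpoint is <base_url>/api/2.0/fs/files<file_path>.
    """
    if not file_path.startswith("/"):
        raise ValueError("file_path must be absolute")
    # Percent-encode special characters in the path but keep the slashes.
    url = base_url.rstrip("/") + "/api/2.0/fs/files" + quote(file_path, safe="/")
    return Request(url, method="HEAD", headers={"Authorization": f"Bearer {token}"})
```

For example, `build_file_info_request("https://example.cloud.databricks.com", "dapi-EXAMPLE", "/Volumes/my-catalog/my-schema/my-volume/directory/file.txt")` returns a request object that `urllib.request.urlopen` could send. A trailing slash on the base URL is tolerated, which avoids a doubled slash in the final URL.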

Troubleshooting

  • File Not Found Errors: If the specified path does not exist, the node will likely return an error indicating the file was not found. Verify the path string carefully, including case sensitivity and correct directory structure.
  • Authentication Failures: Ensure the API token credential is valid and has sufficient permissions to access the file metadata.
  • Invalid Path Format: Use an absolute path in the same form as the example given for the Path property. A malformed path may cause the request to fail.
  • API Rate Limits or Network Issues: Temporary failures might occur due to network problems or API rate limiting; retrying after some time can help.
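
The retry advice in the last item can be sketched as a small exponential-backoff helper in Python. This is an illustration under stated assumptions, not the node's built-in behavior: `retry_with_backoff` is a hypothetical name, and the exception types treated as temporary are a choice you would adapt to your client. File-not-found and authentication errors are deliberately not retried, because waiting does not fix them.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation(), retrying only on the given exception types.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... between attempts
    and re-raises the last error once the attempts are exhausted.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter exists so the waiting can be replaced in tests. In normal use the default `time.sleep` applies.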
