Databricks

Interact with Databricks API

Overview

This node queries a vector search index on the Databricks platform. Given a query vector, it retrieves the most similar vectors from a specified index, optionally filtering results by a minimum relevance score or a custom filter expression. This is useful for semantic search, recommendation systems, and other applications that need similarity search over high-dimensional data.

For example, a user can input a vector representing a text embedding and retrieve the most semantically similar documents stored in the index.

Properties

Name                 Meaning
Index Name           The name of the vector search index to query.
Query Vector         The vector used to search for similar vectors in the index.
Number of Results    Maximum number of similar vectors to return (default is 10).
Score Threshold      Minimum relevance score for returned results; must be between 0 and 1.
Filter Expression    Optional SQL-like expression to filter the results further based on metadata or attributes.
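
As a rough illustration of how these properties fit together, the sketch below validates them and assembles a request body. It is not the node's actual implementation: the function name, the body keys (query_vector, num_results, score_threshold, filter), and the assumption that the Query Vector arrives as a JSON array string are all hypothetical.

```python
import json
from typing import Any, Dict, Optional


def build_query_body(
    query_vector: str,
    num_results: int = 10,
    score_threshold: Optional[float] = None,
    filter_expression: Optional[str] = None,
) -> Dict[str, Any]:
    """Validate the node properties and build a request body (hypothetical keys)."""
    # Assumption: the Query Vector property is a JSON array of numbers in a string.
    vector = json.loads(query_vector)
    if not isinstance(vector, list) or not vector:
        raise ValueError("query vector must be a non-empty JSON array")
    if not all(isinstance(x, (int, float)) and not isinstance(x, bool) for x in vector):
        raise ValueError("query vector must contain only numbers")
    if num_results < 1:
        raise ValueError("number of results must be at least 1")

    body: Dict[str, Any] = {
        "query_vector": [float(x) for x in vector],
        "num_results": num_results,
    }
    # Optional properties are only included when the user sets them.
    if score_threshold is not None:
        if not 0.0 <= score_threshold <= 1.0:
            raise ValueError("score threshold must be between 0 and 1")
        body["score_threshold"] = score_threshold
    if filter_expression:
        body["filter"] = filter_expression
    return body
```

Checking the values before sending the request surfaces formatting and range problems locally instead of as API errors.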

Output

The output contains a JSON object with the query results from the vector search index. Each result typically includes the matched vector's data along with its relevance score. This operation returns JSON only; it does not produce binary data.
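
A downstream step might post-process the results along these lines. This is a sketch only: the output shape used here (a "results" list whose items carry a "score" field) is a hypothetical stand-in, since the exact JSON structure is not specified above.

```python
from typing import Any, Dict, List


def top_matches(output: Dict[str, Any], min_score: float = 0.0) -> List[Dict[str, Any]]:
    """Keep results at or above min_score, highest score first.

    Assumes a hypothetical shape: {"results": [{"score": float, ...}, ...]}.
    """
    results = output.get("results", [])
    kept = [r for r in results if r.get("score", 0.0) >= min_score]
    return sorted(kept, key=lambda r: r.get("score", 0.0), reverse=True)


sample_output = {
    "results": [
        {"id": "doc-1", "score": 0.42},
        {"id": "doc-2", "score": 0.91},
        {"id": "doc-3", "score": 0.77},
    ]
}
best = top_matches(sample_output, min_score=0.5)
```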

Dependencies

  • Requires an API authentication token credential configured in n8n to access the Databricks API.
  • The node interacts with the Databricks Vector Search API endpoint.
  • Proper configuration of the Databricks host URL and token is necessary.
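
To show how the host URL and token come together, here is a minimal sketch that prepares (but does not send) an authenticated request using only the standard library. The endpoint path and the Bearer authorization scheme are assumptions to check against the Databricks documentation, and build_request is a hypothetical helper, not part of the node.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict


def build_request(
    host: str, token: str, index_name: str, body: Dict[str, Any]
) -> urllib.request.Request:
    """Prepare a POST request for a vector search query without sending it."""
    # Assumed endpoint path; confirm against the Vector Search API reference.
    path = "/api/2.0/vector-search/indexes/{}/query".format(
        urllib.parse.quote(index_name, safe="")
    )
    return urllib.request.Request(
        url=host.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + token,  # assumed token scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it; a trailing slash on the host is stripped so the URL does not end up with a double slash.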

Troubleshooting

  • Common Issues:

    • Invalid or missing API credentials will cause authentication failures.
    • Providing an incorrectly formatted query vector may lead to errors or empty results.
    • Using a non-existent index name will result in API errors indicating the index was not found.
    • Score thresholds outside the 0 to 1 range may be rejected by the API.
  • Error Messages:

    • "API Error: <status> <statusText>" indicates issues returned by the Databricks API, such as unauthorized access or invalid parameters.
    • "Network Error: No response received from server" suggests connectivity problems or incorrect host configuration.
  • Resolutions:

    • Verify API credentials and permissions.
    • Ensure the query vector is passed as a string in the format the API expects.
    • Confirm the index name exists and is accessible.
    • Adjust score threshold values to be within valid bounds.
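
The error messages and resolutions above can be paired up in a small helper, sketched below. The message formats come from this page; the mapping from specific HTTP status codes to resolutions follows conventional HTTP semantics and is an assumption, as is the describe_failure name.

```python
from typing import Dict, Optional, Tuple

NETWORK_ERROR = "Network Error: No response received from server"

# Assumed mapping from HTTP status codes to the resolutions listed above.
HINTS: Dict[int, str] = {
    400: "Check the query vector format and keep the score threshold within 0 to 1.",
    401: "Verify API credentials and permissions.",
    403: "Verify API credentials and permissions.",
    404: "Confirm the index name exists and is accessible.",
}


def describe_failure(
    status: Optional[int] = None, status_text: str = ""
) -> Tuple[str, str]:
    """Return (message, hint); status is None when no response was received."""
    if status is None:
        return NETWORK_ERROR, "Check connectivity and the Databricks host URL."
    message = "API Error: {} {}".format(status, status_text)
    return message, HINTS.get(status, "Inspect the API response for details.")
```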
