Dataiku DSS

Use the Dataiku DSS API

Actions: 364

Overview

This node integrates with the Dataiku DSS API and supports operations on many Dataiku DSS resources. For the Dataset resource, the Get Schema operation retrieves the schema definition of a specified dataset within a project. Use it to inspect a dataset's structure (columns, types, and metadata) programmatically, directly from Dataiku DSS.

Common scenarios include:

  • Automating data pipeline validations by fetching dataset schemas.
  • Dynamically adapting workflows based on dataset structures.
  • Integrating dataset schema information into documentation or monitoring systems.
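
As a rough illustration of the first scenario, the check below compares a fetched schema against the columns a pipeline expects. It is a minimal Python sketch and not part of the node: the function name `find_schema_mismatches` and the `expected` mapping are hypothetical, and the schema is assumed to contain a `columns` list whose entries have `name` and `type` keys.

```python
def find_schema_mismatches(schema: dict, expected: dict) -> list:
    """Return human-readable problems; an empty list means the schema matches.

    `schema` is assumed to look like {"columns": [{"name": ..., "type": ...}]}.
    `expected` maps column names to the type the pipeline relies on.
    """
    actual = {col["name"]: col.get("type") for col in schema.get("columns", [])}
    problems = []
    for name, expected_type in expected.items():
        if name not in actual:
            problems.append(f"missing column: {name}")
        elif actual[name] != expected_type:
            problems.append(
                f"type mismatch for {name}: expected {expected_type}, got {actual[name]}"
            )
    return problems
```

A workflow could stop early or raise an alert whenever the returned list is not empty.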

Example: You have a dataset named "sales_data" in project "PROJ123". Using this node, you can fetch its schema to understand the column names and types before processing or transforming the data further.

Properties

Name          Meaning
Project Key   The unique key identifier of the Dataiku DSS project containing the dataset.
Dataset Name  The name of the dataset whose schema you want to retrieve.

Both properties are required; together they identify the dataset whose schema is fetched.
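
To show how the two properties might map onto an HTTP request, here is a minimal Python sketch that builds the URL and headers without sending anything. Everything in it is an assumption rather than a description of the node's code: the path `/public/api/projects/{projectKey}/datasets/{datasetName}/schema` and the use of HTTP Basic authentication with the API key as the username are taken from general knowledge of the Dataiku public REST API, and `build_schema_request` is a hypothetical helper name.

```python
import base64
from urllib.parse import quote


def build_schema_request(base_url: str, api_key: str,
                         project_key: str, dataset_name: str):
    """Return (url, headers) for a Get Schema call; nothing is sent here."""
    # Both inputs are required, mirroring the node's required properties.
    if not project_key:
        raise ValueError("Project Key is required")
    if not dataset_name:
        raise ValueError("Dataset Name is required")

    # Assumed endpoint layout; percent-encode each path segment.
    url = (
        f"{base_url.rstrip('/')}/public/api/projects/"
        f"{quote(project_key, safe='')}/datasets/"
        f"{quote(dataset_name, safe='')}/schema"
    )

    # Assumed auth scheme: API key as Basic-auth username, empty password.
    token = base64.b64encode(f"{api_key}:".encode()).decode()
    headers = {"Authorization": f"Basic {token}", "Accept": "application/json"}
    return url, headers
```

Keeping request construction separate from the network call makes the required-field checks easy to test on their own.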

Output

The output JSON contains the schema information of the specified dataset as returned by the Dataiku DSS API. This typically includes details such as:

  • Columns and their data types.
  • Metadata about each column (e.g., description, format).
  • Possibly partitioning or other structural information depending on the dataset type.

The Get Schema operation returns JSON only. Binary data is returned only by operations that download files, which does not apply here.

Example output snippet (simplified):

{
  "columns": [
    {
      "name": "customer_id",
      "type": "string",
      "description": "Unique customer identifier"
    },
    {
      "name": "purchase_date",
      "type": "date",
      "description": "Date of purchase"
    }
  ],
  "partitioning": null,
  "metadata": { ... }
}
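
A downstream step might turn this output into one line per column, for example to feed documentation. The Python sketch below assumes the simplified shape shown above (a `columns` list with `name`, `type`, and an optional `description`); the helper name `describe_columns` is hypothetical.

```python
def describe_columns(schema: dict) -> list:
    """Build one summary line per column from a schema dictionary."""
    lines = []
    for col in schema.get("columns", []):
        # Fall back to "unknown" if a column entry carries no type.
        line = f"{col['name']} ({col.get('type', 'unknown')})"
        description = col.get("description")
        if description:
            line += f": {description}"
        lines.append(line)
    return lines
```

Columns without a description are listed with name and type only, so the helper tolerates schemas that omit that field.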

Dependencies

  • Requires an active connection to a Dataiku DSS instance.
  • Requires valid API credentials (an API key) for authentication with the Dataiku DSS API.
  • The node uses HTTP requests to communicate with the Dataiku DSS REST API endpoints.
  • No additional external libraries beyond those bundled with n8n are required.

Troubleshooting

  • Missing Credentials Error: If the node throws an error about missing credentials, ensure that you have configured the API key credential for Dataiku DSS correctly in n8n.
  • Project Key or Dataset Name Required: Errors indicating missing "Project Key" or "Dataset Name" mean these inputs were not provided. Make sure to fill these fields.
  • API Request Failures: Network issues, incorrect server URL, or invalid API keys can cause request failures. Verify connectivity and credentials.
  • Unexpected Response Format: If the response cannot be parsed as JSON, check if the API endpoint has changed or if there is an issue with the Dataiku DSS server.
  • Permission Denied: Ensure the API key has sufficient permissions to access the specified project and dataset.
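
The checks above can be sketched as a small triage function for a raw HTTP response. This is an illustrative Python sketch only: the mapping from status codes to causes follows common HTTP conventions and is an assumption, since the exact codes returned by a given Dataiku DSS instance are not stated here, and `diagnose_response` is a hypothetical name.

```python
import json


def diagnose_response(status_code: int, body: str) -> str:
    """Map a raw HTTP response to a likely cause from the list above."""
    if status_code == 401:
        return "invalid or missing API key"
    if status_code == 403:
        return "API key lacks permission for this project or dataset"
    if status_code == 404:
        return "project key, dataset name, or server URL is wrong"
    if status_code >= 400:
        return f"request failed with HTTP {status_code}"
    try:
        json.loads(body)
    except json.JSONDecodeError:
        # Covers HTML error pages or truncated bodies.
        return "response is not valid JSON"
    return "ok"
```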

This summary focuses on the Dataset resource with the Get Schema operation, describing how the node constructs the API request, required inputs, and expected outputs based on static analysis of the source code and provided property definitions.
