Dataiku DSS

Use the Dataiku DSS API

Actions: 364

Overview

This node integrates with the Dataiku DSS API, letting users work with Dataiku DSS resources and operations programmatically. For the Machine Learning - Lab resource, the Get Subpopulation Analysis of Trained Model operation retrieves the subpopulation analyses computed for a trained machine learning model within a project.

This functionality is useful in scenarios where data scientists or ML engineers want to analyze how different subpopulations (segments) of data behave under a trained model, helping to understand model fairness, performance variations, or biases across groups.

Practical example:
You have trained a classification model on customer data and want to examine its performance across different demographic groups (e.g., age ranges, regions). Using this node operation, you can fetch detailed subpopulation analysis results from Dataiku DSS to inform further model improvements or reporting.

Properties

Name           Meaning
Project Key    The unique identifier of the Dataiku DSS project containing the trained model.
Analysis ID    The identifier of the specific analysis context within the Machine Learning Lab.
ML Task ID     The identifier of the machine learning task associated with the trained model.
Model Full ID  The full identifier of the trained model for which the subpopulation analysis is requested.

These properties are required to specify the exact trained model and analysis context from which to retrieve subpopulation analysis data.
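As a rough illustration of how the four properties identify a single trained model, the sketch below assembles them into one request URL. The function name `build_subpopulation_url`, the example host, and in particular the path layout are assumptions made for illustration; the node's actual request path is not stated here, so check it against the Dataiku DSS REST API reference before relying on it.

```python
from urllib.parse import quote


def build_subpopulation_url(base_url, project_key, analysis_id, ml_task_id, model_full_id):
    """Combine the four required properties into one request URL (hypothetical helper)."""
    fields = {
        "Project Key": project_key,
        "Analysis ID": analysis_id,
        "ML Task ID": ml_task_id,
        "Model Full ID": model_full_id,
    }
    # All four properties are required; report every missing one at once.
    missing = [name for name, value in fields.items() if not value or not value.strip()]
    if missing:
        raise ValueError("Missing required parameter(s): " + ", ".join(missing))

    # Percent-encode each identifier so it is safe as a single path segment.
    p, a, t, m = (quote(value.strip(), safe="") for value in fields.values())

    # ASSUMPTION: this path layout is illustrative, not confirmed by this page.
    return (
        f"{base_url.rstrip('/')}/public/api/projects/{p}/models/lab/{a}/{t}"
        f"/models/{m}/subpopulation-analyses"
    )
```

Calling the helper with a blank value for any property raises a `ValueError` that names the missing field, which mirrors the required-parameter validation described under Troubleshooting.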

Output

The node returns the subpopulation analyses computed for the specified trained model as JSON. This typically includes detailed metrics and insights about model behavior on various subgroups of the dataset.

No binary data output is expected for this operation.

Dependencies

  • Requires an active connection to a Dataiku DSS instance.
  • Requires valid API credentials (an API key credential) for authentication against the Dataiku DSS API.
  • The node expects the Dataiku DSS server URL and user API key to be configured in the credentials.
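To make the credential requirement concrete, here is a minimal sketch of turning a user API key into request headers. It assumes the Dataiku DSS public API accepts the key as the username of an HTTP Basic credential with an empty password, which is how the official Python client authenticates; `build_auth_headers` is a hypothetical helper, not part of the node.

```python
import base64


def build_auth_headers(api_key):
    """Return HTTP headers carrying the API key (hypothetical helper)."""
    if not api_key:
        raise ValueError("Missing credentials: no Dataiku DSS API key configured")

    # ASSUMPTION: the key is sent as the Basic-auth username with an empty password.
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    return {
        "Authorization": f"Basic {token}",
        "Accept": "application/json",
    }
```

Keeping the key in the node's credentials rather than in individual parameters means it is never written into the workflow definition itself.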

Troubleshooting

  • Missing Credentials Error: If the API credentials are not provided or are invalid, the node throws an error indicating missing credentials.
  • Required Parameter Missing: The node validates that all required parameters (Project Key, Analysis ID, ML Task ID, Model Full ID) are provided; otherwise, it throws descriptive errors.
  • API Request Failures: Network issues, incorrect URLs, or permission problems may cause API request failures. The node surfaces these errors with messages prefixed by "Error calling Dataiku DSS API".
  • Parsing Errors: If the API response is not valid JSON when expected, the node attempts to handle it gracefully but may return raw text or an error.

To resolve common issues:

  • Ensure all required input fields are correctly filled.
  • Verify API credentials and permissions.
  • Check network connectivity to the Dataiku DSS server.
  • Confirm that the specified project, analysis, ML task, and model IDs exist and are accessible.
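The failure modes listed above can be sketched as a small response handler. This is an illustration under stated assumptions rather than the node's actual implementation: `DataikuApiError` and `handle_response` are hypothetical names, and only the "Error calling Dataiku DSS API" prefix and the raw-text fallback come from the behavior described in this section.

```python
import json

API_ERROR_PREFIX = "Error calling Dataiku DSS API"


class DataikuApiError(Exception):
    """Hypothetical error type for failed API requests."""


def handle_response(status, body):
    """Return parsed JSON, or the raw text when the body is not valid JSON."""
    if status >= 400:
        # Surface HTTP failures (for example permission problems) with the documented prefix.
        raise DataikuApiError(f"{API_ERROR_PREFIX}: HTTP {status}: {body[:200]}")
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        # Fallback: hand back the raw text instead of failing the whole run.
        return body
```

A 401 or 403 status usually points at the API key or its permissions, while a 404 more often means one of the project, analysis, ML task, or model identifiers does not exist.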
