Dataiku DSS

Use the Dataiku DSS API

Overview

The "Synchronize Hive Metastore" operation of the Dataset resource synchronizes the Hive metastore table associated with a specified dataset: it creates or updates the Hive table so that its schema matches the dataset's schema in Dataiku DSS.

Use this operation when the Hive metastore must reflect the current state of a dataset, especially after schema changes or dataset updates. It keeps Dataiku datasets and Hive tables consistent, which matters for downstream processes that rely on Hive metadata.

Practical example:
If you have a dataset in Dataiku DSS that you use for analytics and you want to query it via Hive or Spark SQL, running this synchronization ensures that the Hive metastore has the correct table definition matching your dataset schema.
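For readers who want to see what such a call might look like outside the node, here is a minimal sketch in Python. It only builds the request and does not send it. The endpoint path and the Basic-auth scheme (API key as username, empty password) are assumptions based on the DSS public REST API and should be checked against the API reference for your DSS version. The function name `build_sync_request` and the example host are hypothetical.

```python
import base64
import urllib.parse
import urllib.request


def build_sync_request(base_url: str, api_key: str,
                       project_key: str, dataset_name: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the Hive metastore sync."""
    # Assumed endpoint path; verify it against your DSS version's API reference.
    path = "/public/api/projects/{}/datasets/{}/actions/synchronizeHiveMetastore".format(
        urllib.parse.quote(project_key, safe=""),
        urllib.parse.quote(dataset_name, safe=""),
    )
    # Assumption: the API key is sent as the HTTP Basic username, empty password.
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        method="POST",
        headers={"Authorization": f"Basic {token}"},
    )


if __name__ == "__main__":
    # Hypothetical host, key, project, and dataset used only for illustration.
    req = build_sync_request("https://dss.example.com", "my-api-key", "SALES", "orders")
    print(req.get_method(), req.full_url)
    # Sending is left to the caller, e.g. urllib.request.urlopen(req).
```

Keeping request construction separate from sending makes it easy to inspect the URL and headers before anything touches the DSS instance.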

Properties

Name          Meaning
Project Key   The unique key identifying the project containing the dataset.
Dataset Name  The name of the dataset whose Hive metastore table should be synchronized.

Output

The output of this operation is the JSON response returned by the Dataiku DSS API after the synchronization. It may include confirmation of the synchronization status or other metadata returned by the API.

The node does not output binary data for this operation.
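One way to handle a response body along these lines is sketched below in Python. This is an illustration, not the node's actual implementation. The helper name `parse_api_response`, the `raw` key, and the choice to treat an empty body as an empty object are all assumptions.

```python
import json
from typing import Any


def parse_api_response(body: str) -> Any:
    """Return parsed JSON, or wrap the raw text when the body is not valid JSON."""
    text = body.strip()
    if not text:
        # Assumption: an empty body is treated as an empty result object.
        return {}
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Hypothetical fallback shape: keep the unparsed text under "raw".
        return {"raw": body}
```

For example, `parse_api_response('{"ok": true}')` yields a dictionary, while an HTML error page would come back wrapped under the `raw` key instead of raising.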

Dependencies

  • Requires an active connection to a Dataiku DSS instance.
  • Requires valid API credentials (an API key) for authenticating with the Dataiku DSS API.
  • The node expects the Dataiku DSS server URL and user API key to be configured in the credentials.

Troubleshooting

  • Missing Credentials Error: If the API credentials are missing or invalid, the node throws an error indicating missing credentials.
  • Missing Required Parameters: The node requires both "Project Key" and "Dataset Name" to be set. Omitting either will cause an error specifying the missing parameter.
  • API Request Failures: Network issues, incorrect server URL, or insufficient permissions can cause API request failures. Check connectivity and credential permissions.
  • Unexpected Response Format: If the API returns a non-JSON response or an error message, the node attempts to parse it; failure to parse may result in raw text output or errors.
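The required-parameter check described in the list above could be approximated as follows. This is a hedged Python sketch: the internal keys `projectKey` and `datasetName`, the function name, and the error wording are hypothetical and are not taken from the node's source.

```python
def validate_params(params: dict) -> None:
    """Raise ValueError naming the first required parameter that is missing or blank."""
    # Hypothetical internal keys mapped to the display names used in this page.
    required = {"projectKey": "Project Key", "datasetName": "Dataset Name"}
    for key, label in required.items():
        value = params.get(key)
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f'Missing required parameter: "{label}"')


if __name__ == "__main__":
    validate_params({"projectKey": "SALES", "datasetName": "orders"})  # passes silently
    try:
        validate_params({"projectKey": "SALES"})
    except ValueError as err:
        print(err)
```

Validating before sending avoids a round trip to the server for a request that cannot succeed.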
