Apify

Apify API

Overview

This node retrieves items from Apify datasets through the Apify API. It is useful when you need to fetch structured data stored in a dataset, such as web scraping results or processed data collections. For example, you can specify a dataset ID and retrieve a limited number of items in JSON or CSV format, optionally filtering or transforming them with properties such as Fields, Omit, Unwind, and Flatten.

Use Case Examples

  1. Fetching the first 50 items from a dataset in JSON format for further processing.
  2. Downloading a dataset as a CSV file with a custom delimiter for reporting purposes.
  3. Retrieving only specific fields from dataset items to reduce payload size.
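
The three use cases above can be sketched as plain HTTP requests. The snippet below only builds the request URLs and sends nothing. It assumes the public Apify endpoint `https://api.apify.com/v2/datasets/{datasetId}/items` and query parameter names that mirror the properties listed in this document; `build_items_url` and `my-dataset-id` are hypothetical names used for illustration, and the node itself may assemble its requests differently.

```python
from urllib.parse import quote, urlencode

API_BASE = "https://api.apify.com/v2"  # assumed public Apify API base URL


def build_items_url(dataset_id, **params):
    """Build a 'get dataset items' URL; parameters set to None are dropped."""
    query = {}
    for key, value in params.items():
        if value is None:
            continue
        # Send booleans as lowercase strings rather than Python's True/False.
        if isinstance(value, bool):
            value = "true" if value else "false"
        query[key] = value
    url = f"{API_BASE}/datasets/{quote(dataset_id, safe='~')}/items"
    return f"{url}?{urlencode(query)}" if query else url


# 1. First 50 items as JSON.
json_url = build_items_url("my-dataset-id", format="json", limit=50)
# 2. CSV export with a custom delimiter.
csv_url = build_items_url("my-dataset-id", format="csv", delimiter=";")
# 3. Only selected fields, to reduce payload size.
fields_url = build_items_url("my-dataset-id", fields="title,url")
```

A real request would also need to carry the Apify credentials, which this sketch deliberately leaves out.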

Properties

| Name | Meaning |
| --- | --- |
| Dataset Id | The unique identifier of the dataset, or the username and dataset name combined as username~dataset-name. |
| Format | The format in which the results are returned, such as json, csv, or xml. |
| Clean | If true, returns only non-empty items and skips hidden fields (fields starting with #). |
| Offset | Number of items to skip from the start of the dataset. |
| Limit | Maximum number of items to return from the dataset. |
| Fields | Comma-separated list of fields to include in the output items; all other fields are dropped. |
| Omit | Comma-separated list of fields to exclude from the output items. |
| Unwind | Comma-separated list of fields to unwind, turning arrays or objects into separate records merged with the parent object. |
| Flatten | Comma-separated list of fields whose nested objects are flattened into dot-notation keys. |
| Desc | If true, returns results in reverse order. |
| Attachment | If true, forces the response to be downloaded as a file by setting the Content-Disposition header. |
| Delimiter | Delimiter character for CSV output. |
| Bom | Whether to include the UTF-8 Byte Order Mark in text responses, especially CSV files. |
| Xml Root | Overrides the default root element name for XML output. |
| Xml Row | Overrides the default element name for each item in XML output. |
| Skip Header Row | If true, skips the header row in CSV output. |
| Skip Hidden | If true, skips fields starting with # from the output. |
| Skip Empty | If true, skips empty items from the output. |
| Simplified | If true, applies legacy parameters to emulate simplified results from the legacy Apify Crawler product. |
| Skip Failed Pages | If true, skips items with an errorInfo property, emulating legacy API behavior. |
| Use Custom Body | Whether to use a custom request body instead of the default query parameters. |
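
The Troubleshooting section refers to some of these properties by parameter names such as 'skipEmpty', 'skipHidden', and 'skipFailedPages', which suggests that display names map to camelCase parameters. The sketch below shows that conversion under this assumption; `to_query_param` is a hypothetical helper, and Dataset Id and Use Custom Body are presumably handled by the node itself rather than sent as query parameters.

```python
def to_query_param(display_name):
    """Convert a display name such as 'Skip Header Row' to camelCase."""
    words = display_name.split()
    return words[0].lower() + "".join(word.capitalize() for word in words[1:])


# Display names taken from the Properties table.
names = ["Format", "Xml Root", "Skip Header Row", "Skip Failed Pages"]
params = [to_query_param(name) for name in names]
```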

Output

JSON

  • items - Array of dataset items retrieved from the specified dataset.
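
The shape of each entry in items depends on the transformation properties in use. As an illustration, the sketch below emulates locally what Flatten is described as doing: nested objects under the listed fields become dot-notation keys. It is a local approximation and not the API's implementation; `flatten_fields` is a hypothetical helper, and leaving arrays untouched is an assumption.

```python
def _flatten(prefix, value, out):
    """Recursively copy nested dict values into out under dotted keys."""
    if isinstance(value, dict) and value:
        for key, nested in value.items():
            _flatten(f"{prefix}.{key}", nested, out)
    else:
        # Scalars, lists, and empty dicts are kept as they are.
        out[prefix] = value


def flatten_fields(item, fields):
    """Flatten only the top-level fields named in `fields`."""
    out = {}
    for key, value in item.items():
        if key in fields:
            _flatten(key, value, out)
        else:
            out[key] = value
    return out


item = {"url": "https://example.com", "meta": {"title": "Home", "lang": {"code": "en"}}}
flat = flatten_fields(item, ["meta"])
```

Here `flat` keeps `url` unchanged and replaces `meta` with the keys `meta.title` and `meta.lang.code`.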

Dependencies

  • Requires Apify API credentials (an API token) to authenticate requests.

Troubleshooting

  • Ensure the Dataset Id is correct and accessible with the provided API credentials to avoid authorization errors.
  • If the output is empty, check if 'clean', 'skipEmpty', or 'skipHidden' parameters are filtering out all items.
  • For CSV format, verify the delimiter and BOM settings to ensure proper file formatting.
  • If using 'simplified' or 'skipFailedPages', be aware these emulate legacy behavior and might exclude expected data.
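
To see how the filtering parameters can leave the output empty, the sketch below emulates 'skipHidden' and 'skipEmpty' on a small in-memory list. It is a local approximation for debugging and not the API's implementation; `clean_items` is a hypothetical helper, and it assumes that an item left with no fields after hidden fields are removed counts as empty.

```python
def strip_hidden(item):
    """Drop fields whose names start with '#'."""
    return {key: value for key, value in item.items() if not key.startswith("#")}


def clean_items(items, skip_hidden=True, skip_empty=True):
    """Emulate the combined effect of skipHidden and skipEmpty."""
    result = []
    for item in items:
        if skip_hidden:
            item = strip_hidden(item)
        if skip_empty and not item:
            continue  # nothing left in this item, so it is dropped
        result.append(item)
    return result


# A dataset whose items carry only hidden fields or nothing at all.
raw = [{"#debug": {"requestId": "abc"}}, {}, {"#error": True}]
cleaned = clean_items(raw)
unfiltered = clean_items(raw, skip_hidden=False, skip_empty=False)
```

With both filters on, `cleaned` is empty even though the dataset holds three items; turning them off returns all three, which is a quick way to check whether filtering is the cause.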
