Blab Information Extract
Extract structured data from documents/images using Upstage Information Extraction
Overview
This node integrates with the Upstage Information Extraction API and supports two operations: extracting structured data from a document or image according to a provided JSON schema, and generating a JSON schema from a document or image. Input can be binary data from a previous node or an image URL. The node is useful for automating data extraction from document types such as invoices, bank statements, and forms, and for generating schemas that guide later extraction.
Use Case Examples
- Extract structured data from a scanned invoice image provided as binary data.
- Generate a JSON schema from a sample bank statement image URL to use for future data extraction.
- Extract data from multi-page documents by chunking pages to improve performance.
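To make the invoice use case concrete, here is a minimal sketch of a JSON schema that could be supplied for extraction. The field names (`invoice_number`, `invoice_date`, `total_amount`) are hypothetical and chosen only for illustration; which schema features the API accepts should be checked against the Upstage documentation.

```python
import json

# Hypothetical invoice schema; field names are illustrative assumptions,
# not taken from the node or the Upstage API.
invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {
            "type": "string",
            "description": "Identifier printed on the invoice",
        },
        "invoice_date": {
            "type": "string",
            "description": "Issue date as printed on the invoice",
        },
        "total_amount": {
            "type": "number",
            "description": "Grand total including tax",
        },
    },
}

# Serialize to the JSON text that would be pasted into a schema field.
schema_text = json.dumps(invoice_schema, indent=2)
print(schema_text)
```

Descriptions on each property are optional in JSON Schema, but they give the extraction model a hint about what each field means.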
Properties
| Name | Meaning |
|---|---|
| Input Type | Specifies whether the input is binary data from a previous node or an image URL. |
| Binary Property | Name of the binary property containing the file, used when input type is binary. |
| Image URL | URL of the image to process, used when input type is URL. |
| Model | The model to use for information extraction or schema generation. |
| Guidance (optional) | Text instruction to influence schema generation. |
| Return | Specifies the format of the output: extracted JSON only, schema JSON only, or full API response. |
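The interplay between Input Type, Binary Property, and Image URL can be sketched as follows. This is not the node's actual implementation: the function name `resolve_image_reference`, the `"binary"`/`"url"` option values, the n8n-style binary entry shape (`data` holding base64 text, plus `mimeType`), and the choice to pass binary input as a data URI are all assumptions made for illustration.

```python
import base64
from typing import Optional
from urllib.parse import urlparse


def resolve_image_reference(
    input_type: str,
    item_binary: Optional[dict] = None,
    binary_property: str = "data",
    image_url: str = "",
) -> str:
    """Return a string reference to the input image (hypothetical helper)."""
    if input_type == "binary":
        # Assumed shape: {"<property>": {"data": "<base64>", "mimeType": "..."}}
        entry = (item_binary or {}).get(binary_property)
        if not entry or not entry.get("data"):
            raise ValueError(f"No binary data found in property '{binary_property}'")
        mime = entry.get("mimeType", "application/octet-stream")
        return f"data:{mime};base64,{entry['data']}"
    if input_type == "url":
        url = image_url.strip()
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError("Image URL is required when input type is URL")
        return url
    raise ValueError(f"Unknown input type: {input_type!r}")


# Usage with an in-memory binary entry (placeholder bytes, not a real image).
sample_binary = {
    "data": {"data": base64.b64encode(b"abc").decode("ascii"), "mimeType": "image/png"}
}
print(resolve_image_reference("binary", item_binary=sample_binary))
```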
Output
JSON
- extracted - The extracted structured data as JSON.
- model - The model used for extraction or schema generation.
- usage - API usage information such as token counts.
- full_response - The full response from the Upstage API.
- schema_type - The type of the generated schema (for schema generation operation).
- json_schema - The generated JSON schema object (for schema generation operation).
- raw - Raw schema data as received from the API (for schema generation operation).
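A downstream step often needs to tell which kind of output item it received. The sketch below classifies an item by the field names listed above; the helper `describe_output` is hypothetical, and the sample items contain placeholder values rather than real API output.

```python
from typing import Any


def describe_output(item: dict[str, Any]) -> str:
    """Classify an output item by which documented fields it contains."""
    if "json_schema" in item:
        return "schema_generation"
    if "extracted" in item:
        return "extraction"
    if "full_response" in item:
        return "full_response"
    return "unknown"


# Placeholder items for illustration only; values are not real API output.
extraction_item = {
    "extracted": {"invoice_number": "INV-001"},
    "model": "example-model",
    "usage": {"total_tokens": 0},
}
schema_item = {
    "schema_type": "example-type",
    "json_schema": {"type": "object", "properties": {}},
    "raw": "{}",
}

print(describe_output(extraction_item))
print(describe_output(schema_item))
```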
Dependencies
- Upstage API, accessed over HTTP and authenticated with an API key credential named 'blabApi'.
Troubleshooting
- Error 'No binary data found in property' occurs if the specified binary property does not exist or is empty; ensure the binary property name matches the input data.
- Invalid JSON schema errors occur if the provided JSON schema or full response format JSON is malformed; validate JSON syntax before input.
- Image URL is required when input type is set to URL; ensure a valid URL is provided.
- API request failures may occur due to invalid credentials or network issues; verify API key and connectivity.
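Malformed schema JSON can be caught before it reaches the node. The following is a minimal sketch using Python's standard `json` module; the helper name `check_schema_text` is hypothetical, and the check covers JSON syntax only, not whether the API accepts the schema.

```python
import json


def check_schema_text(text: str) -> dict:
    """Parse schema text and report where the JSON syntax breaks, if it does."""
    try:
        schema = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(
            f"Schema is not valid JSON: {exc.msg} "
            f"(line {exc.lineno}, column {exc.colno})"
        ) from exc
    if not isinstance(schema, dict):
        raise ValueError("Schema must be a JSON object at the top level")
    return schema


# A well-formed schema parses cleanly.
print(check_schema_text('{"type": "object", "properties": {}}'))

# A trailing comma is a common mistake that JSON does not allow.
try:
    check_schema_text('{"type": "object",}')
except ValueError as err:
    print(err)
```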
Links
- Upstage Information Extraction API - Official API documentation for the Upstage Information Extraction service used by this node.