Blab Document Parse

Convert documents into structured HTML/Markdown using Upstage Document Parse

Actions: 4

Overview

This node integrates with the Upstage Document Parse API to convert documents into structured HTML, Markdown, or text formats. It supports synchronous parsing of uploaded files, asynchronous submission and retrieval of parsing requests, and listing of all asynchronous requests. This node is useful for automating document digitization workflows, extracting structured content from PDFs, images, or other document formats, and handling large or multiple documents asynchronously.

Use Case Examples

A user uploads a PDF file to synchronously parse it into HTML for further processing.
A user submits a document asynchronously to the API and later retrieves the parsed result using the request ID.
A user lists all asynchronous document parsing requests to monitor their status.
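The submit-then-retrieve use case can be sketched as a small polling loop. This is a minimal illustration, not the node's implementation: `wait_for_result`, the injected `submit` and `get_result` callables, and the `status` values `"completed"` and `"failed"` are all assumptions made for this example. Only the `request_id` field comes from the output described on this page.

```python
import time
from typing import Any, Callable, Dict


def wait_for_result(
    submit: Callable[[], Dict[str, Any]],
    get_result: Callable[[str], Dict[str, Any]],
    poll_seconds: float = 2.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Submit a document once, then poll by request ID until it finishes."""
    # The submit step is expected to return a dict containing "request_id".
    request_id = submit()["request_id"]
    for _ in range(max_polls):
        result = get_result(request_id)
        # "completed" and "failed" are assumed terminal status values.
        if result.get("status") in ("completed", "failed"):
            return result
        sleep(poll_seconds)
    raise TimeoutError(
        f"request {request_id} did not finish after {max_polls} polls"
    )
```

In a workflow, `submit` and `get_result` would wrap the asynchronous submit and asynchronous get result operations. Passing them in as callables keeps the loop testable without network access.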

Properties

Name	Meaning
Operation	Specifies the type of document parsing operation to perform: synchronous parse, asynchronous submit, asynchronous get result, or asynchronous list requests.
Binary Property	Name of the binary property in the input item that contains the file to be parsed. Required for synchronous and asynchronous submit operations.
Model	Selects the document parsing model to use, such as 'document-parse' (recommended) or 'document-parse-nightly'.
OCR	Determines whether to perform OCR on the document before layout detection. 'Auto' applies OCR only to image documents; 'Force' always performs OCR.
Base64 Encoding Categories	Select categories of layout elements (figure, table, equation, chart) for which cropped base64 images should be returned.
Merge Multipage Tables	Whether to merge tables that span multiple pages into a single table.
Output Formats	Specifies which output formats to include in the response, such as HTML, Markdown, or plain text.
Include Coordinates	Whether to include bounding box coordinates for each layout element in the output.
Chart Recognition	Whether to enable chart recognition. When enabled, charts are converted into tables.
Return	For synchronous parse operation, specifies which part of the response to return: full response, content as HTML, Markdown, text, or elements array.
Request ID	The request ID of a previously submitted asynchronous parsing request, required for retrieving the result.
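As a rough illustration of how the properties above might translate into request fields, the sketch below validates the options and serializes them. The function name `build_parse_fields`, the field names, the lowercase option values, the default values, and the JSON serialization of lists and booleans are all assumptions for this example. Check them against the Upstage API reference before relying on them.

```python
import json
from typing import Dict, Iterable

# Layout categories listed under "Base64 Encoding Categories".
ALLOWED_CATEGORIES = {"figure", "table", "equation", "chart"}


def build_parse_fields(
    model: str = "document-parse",
    ocr: str = "auto",
    base64_categories: Iterable[str] = (),
    merge_multipage_tables: bool = False,
    output_formats: Iterable[str] = ("html",),
    include_coordinates: bool = True,
    chart_recognition: bool = True,
) -> Dict[str, str]:
    """Validate node-style options and return string form fields."""
    if ocr not in ("auto", "force"):
        raise ValueError(f"ocr must be 'auto' or 'force', got {ocr!r}")
    categories = list(base64_categories)
    unknown = [c for c in categories if c not in ALLOWED_CATEGORIES]
    if unknown:
        raise ValueError(f"unknown base64 categories: {unknown}")
    # Field names below are assumed, not confirmed by this page.
    return {
        "model": model,
        "ocr": ocr,
        "base64_encoding": json.dumps(categories),
        "merge_multipage_tables": json.dumps(merge_multipage_tables),
        "output_formats": json.dumps(list(output_formats)),
        "coordinates": json.dumps(include_coordinates),
        "chart_recognition": json.dumps(chart_recognition),
    }
```

Rejecting an unknown OCR mode or category before the request is sent surfaces configuration mistakes as a clear local error rather than an HTTP failure.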

Output

JSON

request_id - The unique identifier of an asynchronous document parsing request.
submitted - Boolean indicating if the asynchronous request was successfully submitted.
html - Parsed document content in HTML format (for synchronous parse with HTML return mode).
markdown - Parsed document content in Markdown format (for synchronous parse with Markdown return mode).
text - Parsed document content in plain text format (for synchronous parse with text return mode).
elements - Array of layout elements extracted from the document (for synchronous parse with elements return mode).
error - Error message if the operation failed and continueOnFail is enabled.
statusCode - HTTP status code of the error if available.
timestamp - Timestamp when the error occurred.
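The Return property decides which of the fields above appear in the output item. A minimal sketch of that selection follows, assuming the full response nests the formatted content under a `content` key and the layout elements under `elements`. The response shape, the mode names, and the helper `select_return` are assumptions for illustration.

```python
from typing import Any, Dict


def select_return(response: Dict[str, Any], mode: str) -> Dict[str, Any]:
    """Pick the part of a synchronous parse response named by `mode`."""
    if mode == "full":
        return response
    if mode in ("html", "markdown", "text"):
        # Assumed shape: {"content": {"html": ..., "markdown": ..., "text": ...}}
        return {mode: response.get("content", {}).get(mode, "")}
    if mode == "elements":
        return {"elements": response.get("elements", [])}
    raise ValueError(f"unknown return mode: {mode!r}")
```

For example, `select_return(response, "markdown")` yields an item with a single `markdown` key, matching the Markdown return mode described above.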

Dependencies

Requires an API key credential for the Upstage Document Parse API (referred to as 'blabApi' in the node).

Troubleshooting

Error 'No binary data found in property "".' indicates the specified binary property does not exist or is empty in the input item. Ensure the correct binary property name is provided and the input item contains binary data.
A missing or invalid request ID causes the asynchronous get result operation to fail. Provide the request ID that was returned when the document was submitted.
HTTP request failures may occur due to network issues or invalid API credentials. Verify the API key and network connectivity.
If the node throws errors and continueOnFail is disabled, the workflow will stop. Enable continueOnFail to handle errors gracefully and continue processing other items.
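The continueOnFail behaviour described above can be sketched as a per-item loop that either re-raises or records an error item with the `error`, `statusCode`, and `timestamp` fields listed under Output. `ParseError`, `require_binary`, `process_items`, and the item layout with a `binary` key are hypothetical names for this example, and interpolating the property name into the message is an assumption about how the node formats it.

```python
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List, Optional


class ParseError(Exception):
    """Error carrying an optional HTTP status code."""

    def __init__(self, message: str, status_code: Optional[int] = None):
        super().__init__(message)
        self.status_code = status_code


def require_binary(item: Dict[str, Any], property_name: str) -> bytes:
    """Return the binary payload or fail like the node's missing-data error."""
    data = item.get("binary", {}).get(property_name)
    if not data:
        raise ParseError(f'No binary data found in property "{property_name}".')
    return data


def process_items(
    items: List[Dict[str, Any]],
    handler: Callable[[Dict[str, Any]], Dict[str, Any]],
    continue_on_fail: bool,
) -> List[Dict[str, Any]]:
    """Run `handler` per item; collect error items when continue_on_fail is set."""
    results: List[Dict[str, Any]] = []
    for item in items:
        try:
            results.append(handler(item))
        except ParseError as exc:
            if not continue_on_fail:
                raise  # the workflow stops at the first failing item
            results.append({
                "error": str(exc),
                "statusCode": exc.status_code,
                "timestamp": datetime.now(timezone.utc).isoformat(),
            })
    return results
```

With `continue_on_fail=True`, one item with missing binary data produces an error item while the remaining items are still processed.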
