
OpenAI GPT-5

Process PDFs and images with OpenAI GPT-5

Overview

This node integrates with the OpenAI GPT-5 API to process and analyze PDF files and images using advanced language models. Users supply a textual prompt describing what the model should do with the uploaded files or images, such as extracting insights, summarizing content, or answering questions based on the file contents.

Common scenarios where this node is beneficial include:

  • Automating document analysis workflows by extracting key information from PDFs.
  • Generating summaries or insights from collections of images or scanned documents.
  • Enhancing data processing pipelines with AI-powered reasoning over mixed media inputs.
  • Combining web search capabilities with file analysis for up-to-date contextual responses.

Practical example:

  • A user uploads multiple PDF reports and images, then prompts the node: "Summarize the main findings and highlight any trends." The node processes the files with GPT-5, optionally performs web searches for current context, and returns a detailed summary with citations.

Properties

  • Prompt: The instruction or question you want GPT-5 to apply to the provided files (PDFs/images). Example: "Analyze these files and provide insights."
  • PDF Files: PDF file IDs to process. Accepts a single ID, an array of IDs, or an expression resolving to either. These files must be uploaded beforehand to obtain their IDs.
  • Images: Image URLs or file IDs to process. Accepts a single URL/ID, an array, or an expression resolving to either.
  • Options: Collection of options controlling the GPT-5 request:
      - Max Tokens: Maximum number of tokens the model may generate in the response. Default: 4096.
      - Model: OpenAI model variant to use. Options include GPT-4.1 variants, GPT-5 variants, and advanced reasoning models such as O3. Default: GPT-5.
      - Quick Response Mode: Boolean flag that optimizes for speed over quality. When enabled, the node uses lower reasoning effort and medium search context, and switches to faster model variants automatically.
      - Reasoning Effort: Level of reasoning effort the model should apply: Low (faster), Medium (balanced), or High (maximum reasoning). Default: Medium.
      - Reasoning Summary: Whether and how to include a reasoning summary in the response: None, Auto (model decides), Concise, or Detailed. Default: None.
      - Temperature: Controls randomness of the output, from 0 (deterministic) to 2 (creative). Default: 0.7.
      - Timeout: Request timeout in seconds (60 to 1800). Default: 600. Note that n8n's global execution timeout may also apply.
  • Web Search: Collection of options to enable and configure web search integration:
      - Enable Web Search: Boolean that allows the model to perform web searches for current information.
      - Allowed Domains: Comma-separated list of domains to restrict web search results to (maximum 20).
      - Include Sources: Whether to include the list of all sources searched in the response. Default: true.
      - Search Context Size: Amount of context retrieved from web searches: Low, Medium (default), or High.
      - User Location: Approximate location details (country, city, region, timezone) used to localize search results.
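
To make the property layout concrete, here is a minimal sketch of how this node might be configured in an n8n workflow export. The internal parameter keys (prompt, pdfFiles, images, options, webSearch) and value spellings are illustrative assumptions derived from the labels above, not confirmed field names.

```typescript
// Hypothetical parameter block for this node inside an n8n workflow export.
// Key names are assumptions based on the property labels above.
const gpt5NodeParameters = {
  prompt: "Summarize the main findings and highlight any trends.",
  // A single file ID, an array, or an n8n expression such as
  // "={{ $json.fileIds }}" that resolves to one of those.
  pdfFiles: ["file-abc123", "file-def456"],
  images: ["https://example.com/chart.png"],
  options: {
    maxTokens: 4096,
    model: "gpt-5",
    quickResponseMode: false,
    reasoningEffort: "medium",   // "low" | "medium" | "high"
    reasoningSummary: "none",    // "none" | "auto" | "concise" | "detailed"
    temperature: 0.7,
    timeout: 600,                // seconds, 60 to 1800
  },
  webSearch: {
    enableWebSearch: true,
    allowedDomains: "reuters.com,bloomberg.com",
    includeSources: true,
    searchContextSize: "medium", // "low" | "medium" | "high"
    userLocation: { country: "US", city: "Seattle", region: "WA", timezone: "America/Los_Angeles" },
  },
};
```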

Output

The node outputs JSON data containing the following fields:

  • text: The main textual response generated by the GPT-5 model based on the prompt and input files.
  • model: The model variant used for the response.
  • usage: Token usage statistics returned by the API.
  • pdfCount: Number of PDF files processed.
  • imageCount: Number of image files processed.
  • timeout: Timeout setting used for the request.
  • quickMode: Whether quick response mode was enabled.
  • webSearch: (optional) Details about the web search performed, including query, status, and domains.
  • citations: (optional) Array of citation objects referencing URLs mentioned in the response text.
  • sources: (optional) List of source documents or URLs included when web search sources are requested.
  • reasoningSummary: (optional) Text summary of the model’s reasoning if requested.
  • fullResponse: The complete raw response object from the OpenAI API for debugging or extended use.

The node does not output binary data. Input PDFs and images are referenced by file ID or URL, and the output contains only the textual analysis results.
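
For downstream mapping, the output item can be thought of as roughly the following TypeScript shape. The top-level field names come from the list above; the nested structures of usage, webSearch, citations, and sources are assumptions, since they mirror whatever the OpenAI API returns.

```typescript
// Illustrative shape of one output item; optional and nested structures are assumptions.
interface Gpt5NodeOutput {
  text: string;            // main textual response
  model: string;           // model variant actually used
  usage: {                 // token usage as reported by the API (shape assumed)
    input_tokens?: number;
    output_tokens?: number;
    total_tokens?: number;
  };
  pdfCount: number;
  imageCount: number;
  timeout: number;         // seconds
  quickMode: boolean;
  webSearch?: { query?: string; status?: string; domains?: string[] };
  citations?: Array<{ url: string; title?: string; start_index?: number; end_index?: number }>;
  sources?: Array<string | { url: string; title?: string }>;
  reasoningSummary?: string;
  fullResponse: unknown;   // raw OpenAI API response, useful for debugging
}
```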

Dependencies

  • Requires an API key credential for authenticating with the OpenAI GPT-5 service.
  • The node makes HTTP POST requests to the OpenAI API endpoint (default: https://api.openai.com/v1/responses).
  • Web search is optional and must be enabled explicitly; once enabled, it can be restricted with allowed domains and localized with user location parameters.
  • For longer executions, environment variable EXECUTIONS_TIMEOUT may need adjustment to avoid premature termination.
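
To illustrate the dependency on the Responses endpoint, here is a minimal sketch of the kind of request involved, assuming the standard OpenAI Responses API payload for mixed text, file, and image input; the node's actual request body may differ.

```typescript
// Minimal sketch of a Responses API call with a prompt, a PDF (by file ID), and an image URL.
// Runs in an ES module context (top-level await) on Node 18+; the node's own payload may differ.
const response = await fetch("https://api.openai.com/v1/responses", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gpt-5",
    max_output_tokens: 4096,
    input: [
      {
        role: "user",
        content: [
          { type: "input_text", text: "Analyze these files and provide insights." },
          { type: "input_file", file_id: "file-abc123" },                 // previously uploaded PDF
          { type: "input_image", image_url: "https://example.com/chart.png" },
        ],
      },
    ],
  }),
});
const data = await response.json();
console.log(data); // the generated text lives inside the response's output array
```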

Troubleshooting

  • Timeout Errors: If the request times out, consider increasing the Timeout property, enabling Quick Response Mode for faster processing, lowering Reasoning Effort, or adjusting n8n's global execution timeout environment variable.
  • Invalid File IDs or URLs: Ensure that PDF file IDs and image URLs/IDs are valid and accessible. Incorrect or missing files will cause errors or empty results.
  • API Authentication Failures: Verify that the API key credential is correctly configured and has necessary permissions.
  • Exceeded Token Limits: If the model returns errors about token limits, reduce the Max Tokens setting or simplify the prompt.
  • Web Search Issues: If web search is enabled but no results appear, check domain restrictions, user location settings, and network connectivity.
  • Error Messages: The node surfaces HTTP error messages from the API. Common messages include timeouts, unauthorized access, or invalid parameters. Review the message details and adjust configuration accordingly.
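
When debugging timeout or error behavior outside of n8n, the usual pattern is a client-side abort tied to the same timeout value as the node. A minimal sketch, assuming a direct call to the same endpoint:

```typescript
// Abort the request after the configured timeout (here 600 s), mirroring the node's Timeout option.
const controller = new AbortController();
const timer = setTimeout(() => controller.abort(), 600 * 1000);

try {
  const res = await fetch("https://api.openai.com/v1/responses", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model: "gpt-5", input: "ping" }),
    signal: controller.signal,
  });
  if (!res.ok) {
    // Surface the API's error message, much as the node does (e.g. 401 unauthorized, 400 invalid parameters).
    console.error(res.status, await res.text());
  } else {
    console.log(await res.json());
  }
} catch (err) {
  // An abort error here corresponds to the node's timeout error.
  console.error("Request aborted or failed:", err);
} finally {
  clearTimeout(timer);
}
```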

Links and References
