pdfvector

PDF Vector node for n8n to parse any PDF, Word, Excel, or image document, extract structured data, and access millions of academic papers.

Package Information

Downloads: 307 weekly / 2,489 monthly

Latest Version: 0.8.4

Author: PDF Vector

Available Nodes

PDF Vector

Convert PDFs, Word, Excel documents, and images to clean markdown, extract structured data with AI, process invoices with specialized parsing, and search millions of academic papers across PubMed, ArXiv, Google Scholar, and more.

Documentation

n8n-nodes-pdfvector

This is an n8n community node. It lets you use PDF Vector in your n8n workflows.

PDF Vector is a powerful document processing and academic research API service. It enables you to parse PDFs and Word documents into clean Markdown, extract structured data, and search across millions of academic publications from multiple databases.

n8n is a fair-code licensed workflow automation platform.

Installation
Operations
Credentials
Compatibility
Usage
Resources
Version history
Development
License

Installation

Follow the installation guide in the n8n community nodes documentation.

Go to Settings > Community Nodes.
Select Install.
Enter n8n-nodes-pdfvector in Enter npm package name.
Agree to the risks of using community nodes.
Select Install.

Operations

Document Resource

Parse Document

Extract content from PDF/Word documents and convert to clean Markdown format.

Parameters:

Document URL: Direct URL to the PDF or Word document
Use LLM:
- auto (default) - System decides if LLM parsing is needed
- never - Basic parsing only (1 credit per page)
- always - Force LLM parsing (2 credits per page)

Supported Formats:

PDF files
Word documents (.doc, .docx)

Credit Usage: 1-2 credits per page depending on LLM usage
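The per-page rates above can be turned into a rough cost estimate before running a workflow. The sketch below is illustrative only: `estimateParseCredits` is a hypothetical helper, not part of the node, and it assumes that `auto` mode costs somewhere between the basic and LLM rates.

```javascript
// Hypothetical helper (not part of the node): estimates Parse Document credits
// from the documented rates of 1 credit/page (basic) and 2 credits/page (LLM).
function estimateParseCredits(pageCount, useLLM = "auto") {
  if (!Number.isInteger(pageCount) || pageCount < 0) {
    throw new RangeError("pageCount must be a non-negative integer");
  }
  switch (useLLM) {
    case "never":
      return { min: pageCount, max: pageCount };
    case "always":
      return { min: 2 * pageCount, max: 2 * pageCount };
    case "auto":
      // The service decides whether LLM parsing is needed,
      // so only a range is known in advance.
      return { min: pageCount, max: 2 * pageCount };
    default:
      throw new Error(`Unknown useLLM mode: ${useLLM}`);
  }
}

console.log(estimateParseCredits(12, "auto")); // { min: 12, max: 24 }
```

The actual cost is reported in the `creditCount` field of the response.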

Ask Document

Ask questions about PDF/Word documents using AI analysis to get intelligent answers.

Parameters:

Document URL: Direct URL to the PDF or Word document
Prompt: Your question about the document (1-2000 characters)

Example Questions:

"What are the key findings in this research paper?"
"Summarize the methodology section"
"What conclusions does the author draw?"
"Extract all statistical results mentioned"

Credit Usage: 3 credits per page
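Because the prompt is limited to 1-2000 characters, it can be worth validating it before the request is sent. The following is a minimal sketch; `validatePrompt` and `estimateAskCredits` are hypothetical helpers, and trimming whitespace before counting is an assumption, since the documentation does not say how the service counts characters.

```javascript
const MAX_PROMPT_LENGTH = 2000; // upper bound from the parameter description

// Hypothetical helper: rejects prompts outside the documented 1-2000 range.
// Assumption: leading/trailing whitespace is trimmed before counting.
function validatePrompt(prompt) {
  if (typeof prompt !== "string") {
    throw new TypeError("prompt must be a string");
  }
  const trimmed = prompt.trim();
  if (trimmed.length < 1 || trimmed.length > MAX_PROMPT_LENGTH) {
    throw new RangeError(
      `prompt must be 1-${MAX_PROMPT_LENGTH} characters, got ${trimmed.length}`
    );
  }
  return trimmed;
}

// Ask Document is billed at 3 credits per page.
function estimateAskCredits(pageCount) {
  return 3 * pageCount;
}

console.log(validatePrompt("  Summarize the methodology section "));
console.log(estimateAskCredits(8)); // 24
```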

Academic Resource

Search Publications

Search for academic publications across multiple databases with intelligent ranking.

Parameters:

Query: Search query string
Providers: Select which academic databases to search (PubMed, Semantic Scholar, Google Scholar, ArXiv, ERIC)
Limit: Maximum results per provider (1-100, default: 50)
Offset: Skip this many results per provider
Year From/To: Filter by publication year range
Fields: Choose which fields to include in the response

Credit Usage: 2 credits per search request
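Limit and Offset together allow paging through more than 100 results per provider. The sketch below builds one parameter set per page. `buildSearchPages` is a hypothetical helper; the parameter names mirror the workflow examples in the Usage section, and the credit figure assumes each page is billed as a separate search request.

```javascript
// Hypothetical helper: splits a desired number of results per provider
// into several search requests using limit/offset.
function buildSearchPages({ query, providers, total, limit = 50 }) {
  if (!Number.isInteger(limit) || limit < 1 || limit > 100) {
    throw new RangeError("limit must be an integer between 1 and 100");
  }
  if (!Number.isInteger(total) || total < 1) {
    throw new RangeError("total must be a positive integer");
  }
  const pages = [];
  for (let offset = 0; offset < total; offset += limit) {
    pages.push({
      resource: "academic",
      operation: "search",
      query,
      providers,
      limit: Math.min(limit, total - offset), // last page may be smaller
      offset,
    });
  }
  return pages;
}

const pages = buildSearchPages({
  query: "CRISPR gene editing",
  providers: ["semantic-scholar", "pubmed"],
  total: 120,
});
console.log(pages.map((p) => [p.offset, p.limit])); // [ [ 0, 50 ], [ 50, 50 ], [ 100, 20 ] ]
console.log(`Estimated credits: ${pages.length * 2}`); // Estimated credits: 6
```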

Fetch Publications

Retrieve specific academic publications by their identifiers with automatic provider detection.

Parameters:

IDs: Comma-separated list of publication IDs (DOI, PubMed ID, ArXiv ID, etc.)
Fields: Choose which fields to include in the response

Supported ID Types:

DOI (e.g., 10.1038/nature12373)
PubMed ID (e.g., 12345678)
ArXiv ID (e.g., 2301.12345)
Semantic Scholar ID (e.g., 85128297772)
ERIC ID (e.g., ED123456)

Credit Usage: 2 credits per fetch request
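To sanity-check identifiers before sending them, a simple pattern match on the formats listed above can help. This is only a rough heuristic written for illustration; it is not how PDF Vector's automatic provider detection works, and `classifyPublicationId` is a hypothetical name. Note that purely numeric IDs are ambiguous here, since both the PubMed and Semantic Scholar examples are numeric.

```javascript
// Hypothetical heuristic based on the example formats listed above.
function classifyPublicationId(id) {
  const value = String(id).trim();
  if (/^10\.\d{4,9}\/\S+$/.test(value)) return "doi";     // e.g. 10.1038/nature12373
  if (/^\d{4}\.\d{4,5}(v\d+)?$/.test(value)) return "arxiv"; // e.g. 2301.12345
  if (/^E[DJ]\d+$/i.test(value)) return "eric";           // e.g. ED123456
  if (/^\d+$/.test(value)) return "numeric";              // PubMed or Semantic Scholar
  return "unknown";
}

for (const id of ["10.1038/nature12373", "2301.12345", "ED123456", "12345678"]) {
  console.log(id, "->", classifyPublicationId(id));
}
```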

Credentials

To use this node, you'll need a PDF Vector API key. Here's how to get one:

Sign up for a PDF Vector account
Navigate to your Dashboard
Generate a new API key (it will start with pdfvector_)
In n8n:
- Go to Credentials → Add Credential
- Select PDF Vector API from the list
- Enter your API key
- Click Save
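Since generated keys start with `pdfvector_`, a quick prefix check can catch copy-paste mistakes before the credential is saved. This is a minimal sketch; `looksLikePdfVectorKey` is a hypothetical helper and only checks the prefix, not whether the key is valid.

```javascript
const KEY_PREFIX = "pdfvector_";

// Hypothetical helper: checks only the documented prefix.
// A key that passes may still be rejected by the API.
function looksLikePdfVectorKey(key) {
  return (
    typeof key === "string" &&
    key.startsWith(KEY_PREFIX) &&
    key.length > KEY_PREFIX.length
  );
}

console.log(looksLikePdfVectorKey("pdfvector_example123")); // true
console.log(looksLikePdfVectorKey("sk-example123"));        // false
```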

Compatibility

n8n version: 0.202.0 or later
Node.js version: 20.15 or later

Usage

Example: Ask Questions About a Document

This workflow shows how to use the Ask operation to get AI-powered answers about a document:

{
  "nodes": [
    {
      "name": "Ask Document",
      "type": "n8n-nodes-pdfvector.pdfVector",
      "position": [250, 300],
      "parameters": {
        "resource": "document",
        "operation": "ask",
        "url": "https://example.com/research-paper.pdf",
        "prompt": "What are the main findings and conclusions of this research?"
      }
    }
  ]
}

The response will include:

markdown: AI-generated answer to your question
pageCount: Number of pages processed
creditCount: Credits consumed
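When several documents are processed in one run, the `pageCount` and `creditCount` fields can be summed to track usage. The sketch below assumes each response carries those two fields as numbers; `summarizeUsage` is a hypothetical helper and the sample values are made up for illustration.

```javascript
// Hypothetical helper: totals pages and credits across several responses.
function summarizeUsage(responses) {
  return responses.reduce(
    (totals, response) => ({
      pages: totals.pages + (response.pageCount ?? 0),
      credits: totals.credits + (response.creditCount ?? 0),
    }),
    { pages: 0, credits: 0 }
  );
}

// Illustrative data, not real API output.
const sampleResponses = [
  { markdown: "Answer A", pageCount: 4, creditCount: 12 },
  { markdown: "Answer B", pageCount: 10, creditCount: 30 },
];
console.log(summarizeUsage(sampleResponses)); // { pages: 14, credits: 42 }
```

In an n8n Code node, the same function could be applied to the `json` payload of each incoming item.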

Example: Parse a PDF and Search Related Papers

This workflow demonstrates how to:

Parse a PDF document to extract its content
Use the extracted content to search for related academic papers

{
  "nodes": [
    {
      "name": "Parse PDF",
      "type": "n8n-nodes-pdfvector.pdfVector",
      "position": [250, 300],
      "parameters": {
        "resource": "document",
        "operation": "parse",
        "url": "https://example.com/paper.pdf",
        "useLLM": "auto"
      }
    },
    {
      "name": "Search Related Papers",
      "type": "n8n-nodes-pdfvector.pdfVector",
      "position": [450, 300],
      "parameters": {
        "resource": "academic",
        "operation": "search",
        "query": "={{ $json.markdown.substring(0, 200) }}",
        "providers": ["semantic-scholar", "pubmed"],
        "limit": 10,
        "offset": 0
      }
    }
  ]
}
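The query expression above takes the first 200 characters of the raw Markdown, which may include heading markers, emphasis characters, or link URLs. A cleaner query could be derived along these lines. `markdownToQuery` is a hypothetical helper, and its stripping rules are a rough approximation rather than a full Markdown parser.

```javascript
// Hypothetical helper: reduces Markdown to plain text and truncates it
// at a word boundary so it can be used as a search query.
function markdownToQuery(markdown, maxLength = 200) {
  const text = markdown
    .replace(/```[\s\S]*?```/g, " ")         // drop fenced code blocks
    .replace(/!\[[^\]]*\]\([^)]*\)/g, " ")   // drop images
    .replace(/\[([^\]]*)\]\([^)]*\)/g, "$1") // keep link text, drop URL
    .replace(/[#*_>`|~-]+/g, " ")            // strip markup characters (also hyphens)
    .replace(/\s+/g, " ")                    // collapse whitespace
    .trim();
  if (text.length <= maxLength) return text;
  const cut = text.slice(0, maxLength);
  const lastSpace = cut.lastIndexOf(" ");
  return lastSpace > 0 ? cut.slice(0, lastSpace) : cut;
}

const sample = "# Deep Learning\n\nSee [the paper](https://example.com/p) on **vision** models";
console.log(markdownToQuery(sample)); // Deep Learning See the paper on vision models
```

Such a function could run in a Code node placed between the two PDF Vector nodes, with the search node reading the cleaned query from the incoming item.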

Example: Batch Fetch Publications

Fetch multiple publications in one request by passing a comma-separated list of identifiers:

{
  "parameters": {
    "resource": "academic",
    "operation": "fetch",
    "ids": "10.1038/nature12373,10.1126/science.1234567,PMC123456"
  }
}
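When the identifiers come from an earlier node as a list, they need to be joined into the comma-separated string that the `ids` parameter expects. A minimal sketch follows; `toIdsParameter` is a hypothetical helper that trims entries, skips empty ones, and removes exact duplicates.

```javascript
// Hypothetical helper: builds the comma-separated "ids" string.
function toIdsParameter(ids) {
  const unique = new Set();
  for (const raw of ids) {
    const id = String(raw).trim();
    if (id) unique.add(id); // a Set keeps first-seen order and drops duplicates
  }
  return [...unique].join(",");
}

console.log(
  toIdsParameter([" 10.1038/nature12373", "12345678", "10.1038/nature12373", ""])
); // 10.1038/nature12373,12345678
```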

Response Handling

All operations return structured JSON responses. Handle errors gracefully:

// In a Code node after PDF Vector (mode: Run Once for Each Item)
if ($json.error) {
  throw new Error($json.error.message);
}

// For academic search - check for partial errors
if ($json.errors && $json.errors.length > 0) {
  console.warn("Some providers failed:", $json.errors);
}

// In this mode the node must return a single item object
return { json: { results: $json.results } };
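The same checks can be written as a plain function, which makes them easier to test outside n8n. This sketch assumes the response shape implied above (an `error` object with a `message`, an `errors` array for partial failures, and a `results` array); `handleSearchResponse` is a hypothetical name and the sample data is illustrative.

```javascript
// Hypothetical helper: separates hard failures from partial provider errors.
function handleSearchResponse(json) {
  if (json.error) {
    throw new Error(json.error.message ?? "Unknown PDF Vector error");
  }
  const warnings = Array.isArray(json.errors) ? json.errors : [];
  const results = Array.isArray(json.results) ? json.results : [];
  return { results, warnings, partial: warnings.length > 0 };
}

// Illustrative data, not real API output.
const outcome = handleSearchResponse({
  results: [{ title: "Example paper" }],
  errors: [{ provider: "pubmed", message: "timeout" }],
});
console.log(outcome.partial);        // true
console.log(outcome.results.length); // 1
```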

Resources

Version history

0.1.0 - Initial release of the PDF Vector node for n8n.

Development

Check out documentation on creating nodes for detailed information on building and developing the node.

Install dependencies:

npm install

Build the node:

npm run build

Link the node to n8n from the node directory:

npm link

In your ~/.n8n/nodes directory, link the node:

npm link n8n-nodes-pdfvector

Run n8n:

n8n start

Once the node is linked, you only need to rebuild it and restart n8n to see your changes.

License

This project is licensed under the MIT License.
