file-metadata

n8n node for extracting metadata from files (PDF, images, ebooks, archives, office docs, audio, video, markdown) with namespace support

Package Information

Released: 11/6/2025
Downloads: 17 weekly / 121 monthly
Latest Version: 1.0.10
Author: rdawgemfl

Documentation

n8n-nodes-file-metadata

n8n node for extracting metadata from files with namespace support for Qdrant filtering.

Features

  • Extract metadata from PDF files
  • Extract EXIF data from images
  • Process ebooks (EPUB)
  • Extract archive information (ZIP)
  • Parse Word documents
  • Read Excel spreadsheets
  • Get audio metadata
  • Extract video information
  • Parse markdown frontmatter
  • NEW: Automatic namespace generation for Qdrant vector store filtering

Namespace Support

This version automatically adds a namespace field to the metadata based on the document title:

  • Extracts title or info.Title from document metadata
  • Sanitizes the title to create a valid namespace (alphanumeric and underscores only)
  • Limits namespace length to 96 characters for Qdrant compatibility
  • Enables filtering with queries like:
{
  "must": [
    {
      "key": "metadata.namespace",
      "match": {
        "value": "Writing_521A_Creative_Writing"
      }
    }
  ]
}
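
The node's source is not shown here, so the following is only a minimal sketch of how the steps above could fit together. The names toNamespace, DocumentMetadata, and the "untitled" fallback are hypothetical, and replacing each run of non-alphanumeric characters with a single underscore is an assumption about the sanitizing rule; only the 96-character limit and the title / info.Title lookup come from the description above.

```typescript
// Hypothetical sketch: not the node's actual implementation.
const MAX_NAMESPACE_LENGTH = 96; // limit stated in the README

interface DocumentMetadata {
  title?: string;
  info?: { Title?: string };
}

function toNamespace(metadata: DocumentMetadata, fallback: string = "untitled"): string {
  // Prefer the top-level title, then fall back to info.Title.
  const rawTitle = metadata.title || metadata.info?.Title || "";

  const sanitized = rawTitle
    .trim()
    // Assumption: every run of non-alphanumeric characters becomes one underscore.
    .replace(/[^A-Za-z0-9]+/g, "_")
    // Enforce the length limit before trimming stray underscores at the edges.
    .slice(0, MAX_NAMESPACE_LENGTH)
    .replace(/^_+|_+$/g, "");

  // Assumption: documents without a usable title get a fixed fallback namespace.
  return sanitized.length > 0 ? sanitized : fallback;
}

console.log(toNamespace({ title: "Writing 521A Creative Writing" }));
```

Under these assumptions, the title used in the filter example maps to Writing_521A_Creative_Writing, matching the value queried above.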

Installation

npm install n8n-nodes-file-metadata

Usage

  1. Add the File Metadata Extractor node to your workflow
  2. Connect it to your document source
  3. Configure the binary property name (default: 'data')
  4. The node will output metadata including the new namespace field
  5. Use the namespace for filtering in Qdrant vector stores

Example Output

{
  "title": "Writing 521A Creative Writing",
  "author": "John Doe",
  "fileType": "PDF",
  "numberOfPages": 25,
  "namespace": "Writing_521A_Creative_Writing",
  "creationDate": "2024-01-15T10:30:00.000Z"
}
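
As a rough illustration of step 5, the namespace from an output item like the one above can be turned into the filter shape shown under Namespace Support. The helper name namespaceFilter and the two interfaces are hypothetical, and the metadata.namespace payload key is taken from the earlier filter example; it may differ depending on how the vector store node stores metadata.

```typescript
// Hypothetical sketch: builds a Qdrant "must" filter from the node's output item.
interface FileMetadataOutput {
  namespace: string;
  [key: string]: unknown; // other extracted fields (title, author, ...)
}

interface QdrantNamespaceFilter {
  must: Array<{ key: string; match: { value: string } }>;
}

function namespaceFilter(
  output: FileMetadataOutput,
  payloadKey: string = "metadata.namespace", // key used in the README's filter example
): QdrantNamespaceFilter {
  return {
    must: [{ key: payloadKey, match: { value: output.namespace } }],
  };
}

const exampleOutput: FileMetadataOutput = {
  title: "Writing 521A Creative Writing",
  author: "John Doe",
  fileType: "PDF",
  numberOfPages: 25,
  namespace: "Writing_521A_Creative_Writing",
  creationDate: "2024-01-15T10:30:00.000Z",
};

console.log(JSON.stringify(namespaceFilter(exampleOutput), null, 2));
```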

License

MIT
