schema-inferrer

An n8n community node for inferring JSON schemas from sample data using quicktype-core.

Package Information

Released: 11/12/2025
Downloads: 11 weekly / 44 monthly
Latest Version: 0.5.0
Author: Max Soukhomlinov

Documentation

n8n-nodes-schema-inferrer

An n8n community node for inferring JSON schemas from sample data using quicktype-core.

Features

  • Create Schema: Generate JSON Schema from one or multiple input JSON data items
    • Automatically merges multiple samples into a unified schema
    • Uses quicktype-core for robust schema inference
  • Generate SQL DDL: Convert JSON schemas to SQL CREATE TABLE statements
    • Supports multiple database types (PostgreSQL, MySQL, MariaDB, SQLite3, MSSQL, Oracle, CockroachDB)
    • Intelligent type mapping from JSON Schema to SQL column types
    • Automatic primary key detection or manual specification
    • Handles nullable/required fields and nested objects/arrays
    • CockroachDB uses the PostgreSQL dialect under the hood for SQL generation
    • Compact outputs to avoid large preview prompts in n8n
  • Prepare for Database: Serialize nested objects/arrays to JSON strings for PostgreSQL JSONB/JSON columns
    • Schema-based field identification
    • Prevents double-stringification
    • Optional pretty printing

Installation

Install the package in your n8n instance:

npm install n8n-nodes-schema-inferrer

Or if you're using n8n's community nodes feature, add it to your package.json:

{
  "dependencies": {
    "n8n-nodes-schema-inferrer": "^0.1.0"
  }
}

Usage

  1. Add the "Schema Inferrer" node to your workflow.
  2. Connect it to a node that outputs JSON data (one or multiple items).
  3. Execute the node to generate the inferred JSON Schema from all input items.

The node will automatically process all input items and merge them into a single unified JSON schema.

Example

Input JSON Items (from previous node)

Item 1:

{
  "id": "123",
  "name": "John Doe",
  "email": "john@example.com",
  "age": 30,
  "active": true
}

Item 2:

{
  "id": "456",
  "name": "Jane Smith",
  "email": "jane@example.com",
  "age": 25
}

Output JSON Schema (single item)

{
  "schema": {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "properties": {
      "id": { "type": "string" },
      "name": { "type": "string" },
      "email": { "type": "string" },
      "age": { "type": "integer" },
      "active": { "type": "boolean" }
    },
    "required": ["id", "name", "email", "age"]
  }
}

Options to control output size

To prevent n8n’s “Display data?” modal when chaining multiple nodes, the node includes:

  • Minimise Output Size (default: on)
    • Trims non-essential parts of the schema for previews.
  • Include Definitions (default: off)
    • When disabled, omits the definitions block to keep results small.
  • Debug (from credentials, default: off)
    • When enabled, debug payloads are size-capped (~10KB) to avoid large items.

These options keep upstream node previews responsive even in longer workflows.
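As a simplified illustration (the output shape below is abbreviated, not the node's exact payload), the main difference is whether the schema item carries a definitions block:

With Include Definitions disabled (the default), an output item is roughly:

{
  "schema": {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "properties": { ... }
  }
}

With Include Definitions enabled, the same item additionally carries a definitions block containing quicktype's named sub-schemas, which can be considerably larger.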

Note: Because the schema merges all input items, properties present in every item are marked as required, while properties missing from some items (like active in the example above) are omitted from the required array.

Override Rules (Create Schema)

Use Override Options to remap inferred JSON Schema field types.

  • Matching precedence: first matching rule wins (top-down).
  • Paths: use dot notation like parent.child.leaf. Arrays are index-agnostic; rules traverse into items automatically.
  • Unions: if a field has multiple types, a rule matching any one of those types applies and collapses the field to the single new type.

Quick input (comma-separated):

user.address.postcode:string->number, string->string

Advanced rules:

  • Mode: Full Path + Type or Type Only
  • Field Path (for Full Path + Type)
  • Current Type and New Type: one of string, number, integer, boolean, object, array, null
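For example (using the user.address.postcode path from the quick-input line above as sample data), a single Full Path + Type rule remaps one nested property and leaves the rest of the schema untouched:

Rule:

user.address.postcode:string->number

Inferred schema fragment before the rule:

"postcode": { "type": "string" }

Schema fragment after the rule:

"postcode": { "type": "number" }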

Naming Options

Control how field names are handled in the generated schema and SQL DDL.

Create Schema Operation

  • Lowercase All Fields (default: off)
    • When enabled, converts all property names to lowercase in the generated schema
    • Applies recursively to nested objects and arrays
    • Updates required arrays to match lowercased field names
    • Useful for ensuring consistent casing when generating SQL DDL later

Generate SQL DDL Operation

  • Lowercase All Fields (default: off)

    • When enabled, converts all property names to lowercase before generating SQL
    • Applies to the input schema if not already lowercased
    • Also lowercases user-provided primary key field names
    • Ensures column names match the schema when lowercasing is enabled
  • Quote Identifiers (default: off)

    • When enabled, quotes table and column names in the generated SQL
    • Preserves original case and special characters
    • Uses database-specific quoting:
      • PostgreSQL/CockroachDB: double quotes "identifier"
      • MySQL/SQLite: backticks `identifier`
      • MSSQL: square brackets [identifier]
      • Oracle: double quotes "identifier"
    • Automatically escapes inner quote characters
    • Recommended when preserving mixed case or using reserved words

Use Cases:

  • Enable "Lowercase All Fields" in both operations to ensure consistent casing from schema creation through DDL generation to data insertion
  • Enable "Quote Identifiers" when you need to preserve original field names with mixed case or special characters
  • Use both together when you want lowercase field names but need to quote them for compatibility with case-sensitive databases
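As a rough sketch of the combined effect (the CustomerId and FullName property names are illustrative, and exact column types and nullability depend on the schema and type mapping), a PostgreSQL table generated from mixed-case properties would come out approximately as:

With Lowercase All Fields and Quote Identifiers enabled:

create table "users" ("customerid" integer, "fullname" varchar(255))

With only Quote Identifiers enabled, the original casing is preserved:

create table "users" ("CustomerId" integer, "FullName" varchar(255))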

Generate SQL DDL

The "Generate SQL DDL" operation converts a JSON schema to SQL CREATE TABLE statements.

Input (from previous Schema Inferrer node or any node with a schema):

{
  "schema": {
    "type": "object",
    "properties": {
      "id": { "type": "integer" },
      "name": { "type": "string" },
      "email": { "type": "string", "format": "email" },
      "age": { "type": "integer" },
      "active": { "type": "boolean" }
    },
    "required": ["id", "name", "email", "age"]
  }
}

Configuration:

  • Database Type: PostgreSQL
  • Table Name: users
  • Auto-detect Primary Key: true (will detect "id" field)
  • Naming Options (optional):
    • Lowercase All Fields: convert all field names to lowercase (default: off)
    • Quote Identifiers: quote table and column names in SQL (default: off)
  • Required Field Options:
    • Override Inferred Required: false (default; preserves inferred required)
    • Required Fields: optional comma-separated names to add as required
  • Override Options (optional):
    • Quick Rules: comma-separated rules to override schema field types before SQL generation
    • Advanced Rules: structured rule builder for more complex overrides
  • Debug (optional): enable via the Schema Inferrer Configuration credentials

Output:

{
  "sql": "create table \"users\" (\"id\" serial primary key, \"name\" varchar(255) not null, \"email\" varchar(255) not null, \"age\" integer not null, \"active\" boolean)",
  "tableName": "users",
  "databaseType": "pg"
}

The generated SQL can then be executed against your database or saved for later use.
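For comparison, and only as a rough sketch (the exact statement comes from Knex's MySQL dialect and may differ in detail), the same schema with Database Type set to MySQL would produce something along the lines of:

create table `users` (`id` int unsigned not null auto_increment primary key, `name` varchar(255) not null, `email` varchar(255) not null, `age` int not null, `active` boolean)

Note the backtick-quoted identifiers and the int ... auto_increment primary key in place of PostgreSQL's serial.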

Override Rules for SQL DDL

Override rules let you modify schema field types before SQL generation, giving you precise control over the final SQL column types.

Quick Rules Syntax:

  • Full path + type: path.to.field:type->newType
  • Type-only: type->newType
  • Multiple rules: comma-separated
  • Wildcards:
    • Contains: *part*->newType
    • Prefix: pre*->newType
    • Suffix: *suf->newType

Examples:

user.address.postcode:string->integer
createdAt:string->string
string->number
id:string->integer, age:string->integer
*created*->date-time, created*->date-time, *Date->date-time

Advanced Rules Builder: provides a structured UI for building rules with:

  • Mode: Full Path + Type or Type Only
  • Field Path: dot-notation path (e.g., user.address.postcode)
  • Current Type: the type to match
  • New Type: the type to replace with

Supported Types:

  • Basic JSON types: string, number, integer, boolean, object, array, null
  • SQL-specific types: uuid, date-time, date, time, json, jsonb, text

Behaviour:

  • First matching rule wins (top-down)
  • Dot-paths follow object nesting (e.g., parent.child.field)
  • Array items are addressed via their container path
  • Type aliases are normalised (e.g., int -> integer, bool -> boolean)
  • Wildcards supported in Quick Rules (contains, prefix, suffix) as above
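As a small worked example (the description field name is illustrative, and the quoting style follows the PostgreSQL output shown earlier), a type override changes the emitted column type directly:

Rule:

description:string->text

PostgreSQL column without the rule:

"description" varchar(255)

PostgreSQL column with the rule:

"description" text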

Nullability Preservation:

  • Preserve Nullability On Type Override (default: true)
    • When enabled, fields that originally allowed null values will maintain their nullability after type overrides
    • For example, if a field has type ["string", "null"] and you override it to uuid, it becomes ["string", "null"] with format uuid
    • This ensures that nullable fields in your schema remain nullable in the generated SQL DDL
    • When disabled, type overrides will strip nullability, potentially making fields NOT NULL if they're in the required array
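In schema terms (externalId is an illustrative field name), the behaviour looks roughly like this:

Before the override:

"externalId": { "type": ["string", "null"] }

After externalId:string->uuid with Preserve Nullability On Type Override enabled:

"externalId": { "type": ["string", "null"], "format": "uuid" }

After the same override with the option disabled:

"externalId": { "type": "string", "format": "uuid" }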

Common Use Cases:

  • Convert string IDs to integers: id:string->integer
  • Force numeric postcodes: address.postcode:string->integer
  • Convert date strings to proper date-time: createdAt:string->date-time
  • Use UUID type for ID fields: id:string->uuid
  • Force long text fields: description:string->text
  • Store structured data as JSON: metadata:object->jsonb
  • Standardise date fields: date->date-time

Supported Database Types

  • PostgreSQL: Uses serial for auto-increment, jsonb for JSON data
  • MySQL/MariaDB: Uses int auto_increment, json for JSON data
  • SQLite3: Uses integer auto-increment, text for JSON data
  • MSSQL: Uses int identity, nvarchar(max) for JSON data
  • Oracle: Uses number, clob for JSON data
  • CockroachDB: PostgreSQL-compatible syntax (generated using Knex pg client)

Type Mapping

JSON Schema Type           | SQL Column Type | Notes
string                     | varchar(255)    | Uses text for long strings
integer                    | integer/serial  | serial for primary keys
number                     | decimal(10,2)   | Configurable precision
boolean                    | boolean         | Database-specific
array/object               | jsonb/json/text | Database-specific JSON support
string (format: uuid)      | uuid            | Native UUID type where supported
string (format: date-time) | timestamp       | Native timestamp type
string (format: email)     | varchar(255)    | Standard string with validation

Prepare for Database

The "Prepare for Database" operation serializes nested objects and arrays to JSON strings for database insertion.

Use Case: When inserting data into PostgreSQL JSONB/JSON columns using n8n's PostgreSQL node with auto-mapping, nested objects need to be converted to JSON strings.

Required Input:

  • Schema (from Create Schema operation)
  • Data items to transform

Example Workflow:

  1. Create Schema → generates schema from sample data
  2. Generate SQL DDL → creates table definition
  3. PostgreSQL Execute → creates table
  4. HTTP Request → fetches data to insert
  5. Prepare for Database → serializes nested fields (references schema from step 1)
  6. PostgreSQL Insert → inserts with auto-mapping

Options:

  • Skip Already Stringified (default: true): Prevents double-stringification of fields that are already JSON strings
  • Pretty Print (default: false): Formats JSON with indentation (2 spaces) for readability
  • Strict Mode (default: false): Throws error on invalid schema instead of passing data through

Example:

Input Data:

{
  "id": 1,
  "name": "Test Customer",
  "storageRegions": [
    {"name": "ap-southeast-2", "storageProvider": 1}
  ]
}

Schema (from Create Schema):

{
  "properties": {
    "id": {"type": "integer"},
    "name": {"type": "string"},
    "storageRegions": {"type": "array"}
  }
}

Output (after Prepare for Database):

{
  "id": 1,
  "name": "Test Customer",
  "storageRegions": "[{\"name\":\"ap-southeast-2\",\"storageProvider\":1}]"
}

Now ready for PostgreSQL insertion with auto-mapping.

Debugging

To surface additional diagnostic info in the node output, create a credential of type Schema Inferrer Configuration and enable "Enable Debug Mode". When enabled:

  • Create Schema: output includes debug with quicktype options and required-field handling summary.
  • Generate SQL DDL: output includes debug with detected PK fields and the effective Knex client used (e.g., pg for CockroachDB).
  • Prepare for Database: logs when no object/array fields are found in schema.
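As a purely illustrative sketch (the key names below are placeholders, not the node's exact output), a Generate SQL DDL item with debug enabled would carry something like:

{
  "sql": "create table ...",
  "tableName": "users",
  "databaseType": "pg",
  "debug": {
    "detectedPrimaryKeys": ["id"],
    "effectiveKnexClient": "pg"
  }
}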

License

Apache-2.0
