# Text Embeddings

## Overview
This node converts input text into numerical embeddings locally using Transformers.js models, without relying on external API calls. It supports several pre-trained embedding models for generating vector representations of text, which are useful for tasks such as semantic search, clustering, recommendation systems, and similarity comparisons.
Typical use cases include:
- Generating embeddings for short texts or documents to enable semantic similarity searches.
- Creating vector representations for downstream machine learning or data analysis workflows.
- Normalizing embeddings to improve the quality of similarity calculations.
- Optionally including metadata about the generated embeddings for auditing or debugging.
For example, you can input a product description and get its embedding vector to find similar products based on semantic content.
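The "find similar products" use case above boils down to comparing embedding vectors, typically with cosine similarity. The helper below is an illustrative sketch, not part of the node; the toy vectors stand in for real 384- or 768-dimensional embeddings:

```javascript
// Cosine similarity between two vectors:
// sim = (a . b) / (|a| * |b|), ranging from -1 (opposite) to 1 (identical direction).
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy 3-dimensional "embeddings"; real vectors have 384 or 768 dimensions.
const productA = [0.2, 0.8, 0.1];
const productB = [0.25, 0.75, 0.05];
console.log(cosineSimilarity(productA, productB)); // close to 1: semantically similar
```

Ranking candidate items by this score against a query embedding is the basic pattern behind semantic product search.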
## Properties
| Name | Meaning |
|---|---|
| Text Input | The raw text string to convert into embeddings. |
| Model | The embedding model to use:<br>- `all-MiniLM-L6-v2` (Recommended): Lightweight, 384 dimensions, fast and efficient.<br>- `all-mpnet-base-v2`: Higher quality, 768 dimensions, slower but more accurate. |
| Output Field | The name of the output JSON field where the embeddings will be stored. |
| Normalize Embeddings | Whether to normalize the resulting embeddings vectors (recommended for similarity calculations). |
| Include Metadata | Whether to add metadata about the embeddings such as model used, vector dimensions, normalization status, input text length, and generation timestamp. |
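Normalization is assumed to be standard L2 normalization, sketched below with an illustrative helper (not the node's actual code). Its payoff: once vectors have unit length, a plain dot product equals cosine similarity, which simplifies downstream comparisons.

```javascript
// L2-normalize a vector: divide each element by the vector's Euclidean norm.
function normalize(vec) {
  const norm = Math.sqrt(vec.reduce((sum, x) => sum + x * x, 0));
  return vec.map((x) => x / norm);
}

const v = normalize([3, 4]); // norm = 5
console.log(v); // [0.6, 0.8]
// A normalized vector has unit length, so its dot product with itself is ~1.
console.log(v.reduce((sum, x, i) => sum + x * v[i], 0)); // approximately 1
```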
## Output
The node outputs an array of items, each containing the original input JSON extended with:
- A new field (name configurable via Output Field) containing the embeddings vector as an array of numbers.
- Optionally, if enabled, a metadata field named `<Output Field>_metadata` with details:
  - `model`: The embedding model used.
  - `dimensions`: Number of elements in the embedding vector.
  - `normalized`: Boolean indicating whether the embeddings were normalized.
  - `text_length`: Length of the input text.
  - `generated_at`: ISO timestamp of when the embeddings were created.
If an error occurs for an item (e.g., empty text), the output for that item includes an error field describing the issue.
The node does not output binary data.
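Assuming the default model, an output field named `embedding`, and metadata enabled, a single output item might look like the sketch below (the input field, vector values, and timestamp are illustrative, and the vector is truncated from 384 numbers):

```json
{
  "productName": "Wireless Mouse",
  "embedding": [0.0123, -0.0456, 0.0789],
  "embedding_metadata": {
    "model": "all-MiniLM-L6-v2",
    "dimensions": 384,
    "normalized": true,
    "text_length": 14,
    "generated_at": "2024-01-15T10:30:00.000Z"
  }
}
```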
## Dependencies
- Uses the `@huggingface/transformers` library's local `feature-extraction` pipeline models.
- No external API calls or internet connection required once models are cached.
- Requires sufficient local resources to load and run the selected transformer model.
- No special n8n credentials or environment variables needed.
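The node presumably goes through the library's `pipeline('feature-extraction', ...)` entry point. The sketch below shows that call pattern; the `Xenova/all-MiniLM-L6-v2` model identifier and the wrapper function are assumptions for illustration, and the node may resolve model names differently:

```javascript
// Sketch of how a feature-extraction call with Transformers.js looks.
// The first call downloads and caches the model; later calls run fully offline.
async function embed(text, model = 'Xenova/all-MiniLM-L6-v2') {
  // Dynamic import so the library is only loaded when embedding is needed.
  const { pipeline } = await import('@huggingface/transformers');
  const extractor = await pipeline('feature-extraction', model);
  // Mean pooling collapses per-token vectors into one sentence vector;
  // normalize: true yields unit-length output.
  const output = await extractor(text, { pooling: 'mean', normalize: true });
  return Array.from(output.data); // e.g. 384 numbers for all-MiniLM-L6-v2
}
```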
## Troubleshooting
Common issues:
- Empty input text: The node throws an error unless "Continue On Fail" is enabled, in which case it marks the item with an error message.
- Model loading failure: If the specified model cannot be loaded (e.g., due to missing files or incompatible environment), the node throws an error indicating failure to load the embedding model.
- Performance: Larger models (like `all-mpnet-base-v2`) require more memory and CPU time; ensure your environment can handle them.
Error messages:
- `"Text is empty"`: Input text was blank or whitespace-only. Provide valid text or enable "Continue On Fail".
- `"Failed to load embedding model: ..."`: Indicates problems loading the chosen model. Verify model availability and environment compatibility.
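The empty-text behavior described above can be sketched as a simple guard; the function name and error-item shape are illustrative, not the node's actual code:

```javascript
// Guard mirroring the "Text is empty" behavior: whitespace-only input
// either throws or, with "Continue On Fail", yields an item with an error field.
function validateText(text, continueOnFail) {
  if (!text || text.trim().length === 0) {
    if (continueOnFail) {
      return { json: { error: 'Text is empty' } };
    }
    throw new Error('Text is empty');
  }
  return null; // valid: proceed to embedding
}

console.log(validateText('   ', true)); // { json: { error: 'Text is empty' } }
```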