# GPT-Tokenizer
Encode or decode BPE tokens and check token limits before working with OpenAI GPT models.
## Overview
The GPT-Tokenizer node's "Slice to Max Token Limit" operation splits an input string into multiple segments, each containing no more than a specified maximum number of tokens (as counted by OpenAI GPT tokenization rules). This is useful when you need to process or send large texts to language models that enforce strict per-request token limits. For example, to summarize a long document with GPT-3 or GPT-4, you can use this node to break the text into manageable chunks before sending them to the model.
Practical scenarios:
- Preprocessing long articles for chunked summarization.
- Splitting user input into safe blocks for chatbots with token constraints.
- Ensuring API requests to OpenAI do not exceed token limits.
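The slicing itself can be reproduced outside the node with the `gpt-tokenizer` library listed under Dependencies. The sketch below is a minimal illustration, assuming the library's `encode`/`decode` exports; it is not the node's actual source.

```typescript
// Minimal sketch of "Slice to Max Token Limit" using the gpt-tokenizer
// library (assumed encode/decode exports); not the node's actual code.
import { encode, decode } from 'gpt-tokenizer';

function sliceToMaxTokens(input: string, maxTokens: number): string[] {
  const tokens = encode(input);            // BPE token ids for the full input
  const slices: string[] = [];
  for (let i = 0; i < tokens.length; i += maxTokens) {
    // Decode each window of at most maxTokens tokens back into text
    slices.push(decode(tokens.slice(i, i + maxTokens)));
  }
  return slices;
}
```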
## Properties
| Name | Meaning |
|---|---|
| Input String | String to process. The text that will be split into token-limited slices. |
| Max Tokens | The max number of tokens to allow in each slice. |
| Destination Key | The key to write the results to. Leave empty to use the default destination key ("slices"). |
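For example, with Max Tokens set to 2048 and Destination Key left empty, an input of roughly 5,000 tokens is written to `slices` as three chunks of at most 2048 tokens each.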
## Output
- The output is an array of strings, each representing a segment of the original input string. Each segment contains at most the specified number of tokens.
- The array is stored in the output JSON under the key specified by Destination Key (or `slices` if left empty).
Example output:
```json
{
  "slices": [
    "First part of the text...",
    "Second part of the text...",
    "...etc."
  ]
}
```
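Downstream nodes can pick up individual chunks with an expression such as `{{ $json.slices[0] }}` (adjust the key if you set a custom Destination Key).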
## Dependencies
- External library: `gpt-tokenizer` (used for encoding and decoding tokens).
- No external API keys or n8n-specific environment variables are required for this operation.
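If you want to check token counts in your own code before calling a model, the same library can be used directly. A minimal sketch, assuming its `encode` export:

```typescript
// Count the tokens in a prompt with gpt-tokenizer's encode export (assumed API).
import { encode } from 'gpt-tokenizer';

const prompt = 'Summarize the following article in three sentences: ...';
const tokenCount = encode(prompt).length;
console.log(`Prompt uses ${tokenCount} tokens`);
```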
## Troubleshooting
Common issues:
- Input String is not a string: Ensure the "Input String" property is filled with valid text.
- Input String field is empty: Provide a non-empty string to process.
- Max Tokens is missing or not greater than zero: Set "Max Tokens" to a positive integer.
Error messages and resolutions:
- "Input String is not a string" — Make sure your input is a text value, not a number or object.
- "Input String field is empty" — Enter some text in the "Input String" field.
- "Provide Max Tokens. (bigger then 0)" — Set "Max Tokens" to a value like 2048 or another positive integer.