WhatsApp Media Processor

Processes WhatsApp media using OpenAI

Overview

The WhatsApp Media Processor node handles and transforms WhatsApp messages and media content using OpenAI services. It supports three operations:

Dividir Mensagem (Split Message): Splits a long text message into smaller parts based on a word limit.
Processar Mídia (Process Media): Processes WhatsApp text, audio, image, and PDF document messages. For example, it can transcribe audio with OpenAI's Whisper model, analyze images with GPT-4 Vision models, or extract text from PDFs.
Agrupar Mensagens Automático (Auto Group Messages): Automatically groups fragmented messages from the same phone number within a specified time window.

Practical Use Cases

Splitting large customer messages into manageable chunks for further processing or sending.
Transcribing voice notes received via WhatsApp into text.
Analyzing images sent by users to extract descriptions or visible text.
Extracting text content from PDF documents shared over WhatsApp.
Aggregating multiple short messages from a user into a single consolidated message.

The node fits automations that need WhatsApp media processing and message management, such as customer support bots, CRM integrations, and content analysis workflows.

Properties

Name	Meaning
Mensagem para Dividir (Message to Split)	The text message that will be split into smaller parts.
Telefone (Phone)	Phone number associated with each part of the split message.
Limite de Palavras (Word Limit)	Maximum number of words allowed per message part when splitting the message.

These properties apply when the Dividir Mensagem (splitMessage) operation is selected.
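
The word-limit split can be pictured with a short sketch. This is not the node's actual source: the function name `splitMessage`, the `MessagePart` type, and the choice to split on whitespace are assumptions made for illustration. The sketch only mirrors the three properties above and the documented phone/message output shape.

```typescript
// Hypothetical type mirroring the documented output item shape.
interface MessagePart {
  phone: string;
  message: string;
}

// Split `text` into parts of at most `wordLimit` words each,
// attaching the same phone number to every part.
function splitMessage(text: string, phone: string, wordLimit: number): MessagePart[] {
  if (!Number.isInteger(wordLimit) || wordLimit < 1) {
    throw new Error("wordLimit must be a positive integer");
  }
  // Assumption: words are separated by any run of whitespace.
  const words = text.trim().split(/\s+/).filter((w) => w.length > 0);
  const parts: MessagePart[] = [];
  for (let i = 0; i < words.length; i += wordLimit) {
    parts.push({ phone, message: words.slice(i, i + wordLimit).join(" ") });
  }
  return parts;
}
```

With a limit of 2, a five-word message would yield three parts, the last holding the single leftover word. An empty message yields no parts.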

Output

The node outputs an array of JSON objects, each representing a processed message part or result. Each output item has the following structure:

{
  "phone": "string",    // Phone number associated with the message
  "message": "string"   // Processed message text or transcription/analysis result
}

For the Dividir Mensagem operation, each output item corresponds to a chunk of the original message split according to the word limit.
For Processar Mídia, the message field contains:
- The original text if it's a text message and text processing is enabled.
- Transcribed text if the message is audio and audio processing is enabled.
- Image description or extracted text if the message is an image and image processing is enabled.
- Extracted text from PDF documents if PDF processing is enabled.
For Agrupar Mensagens Automático, each output item contains the messages received from one phone number within the configured time window, concatenated into a single message.
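
As a rough illustration of the grouping behavior, the sketch below merges messages from the same phone number when each one arrives within the window of the previous one. The `timestamp` field, the `groupMessages` name, the space separator, and the "window restarts on every message" rule are all assumptions. The node may measure the window differently.

```typescript
// Hypothetical input shape; `timestamp` is assumed to be milliseconds since epoch.
interface IncomingMessage {
  phone: string;
  message: string;
  timestamp: number;
}

interface GroupedMessage {
  phone: string;
  message: string;
}

// Merge messages from the same phone when each arrives within `windowMs`
// of the previous message from that phone.
function groupMessages(items: IncomingMessage[], windowMs: number): GroupedMessage[] {
  const sorted = [...items].sort((a, b) => a.timestamp - b.timestamp);
  const groups: GroupedMessage[] = [];
  // Per phone: the currently open group and the time of its latest message.
  const open = new Map<string, { group: GroupedMessage; last: number }>();
  for (const item of sorted) {
    const current = open.get(item.phone);
    if (current && item.timestamp - current.last <= windowMs) {
      current.group.message += " " + item.message;
      current.last = item.timestamp;
    } else {
      const group: GroupedMessage = { phone: item.phone, message: item.message };
      groups.push(group);
      open.set(item.phone, { group, last: item.timestamp });
    }
  }
  return groups;
}
```

Restarting the window on every message keeps a slow but continuous burst together. A fixed window measured from the first message would be an equally reasonable reading of "time window".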

The node does not output binary data directly; media files are downloaded and processed internally but only textual results are emitted.
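
For Processar Mídia, the choice of what to do with an incoming message can be sketched as a small decision function. The toggle names in `ProcessingOptions`, the `decide` function, and the mapping of the `document` message type to the PDF toggle are assumptions for illustration. Only the four supported types and the unsupported-type error text come from this page.

```typescript
// Hypothetical toggle names; the node's actual option names may differ.
interface ProcessingOptions {
  text: boolean;
  audio: boolean;
  image: boolean;
  pdf: boolean;
}

type Decision =
  | { action: "process"; kind: keyof ProcessingOptions }
  | { action: "disabled"; kind: keyof ProcessingOptions };

// Assumed mapping from the incoming message type to a processing toggle.
const KIND_BY_TYPE = new Map<string, keyof ProcessingOptions>([
  ["text", "text"],
  ["audio", "audio"],
  ["image", "image"],
  ["document", "pdf"],
]);

// Decide whether a message should be processed, skipped because its
// toggle is off, or rejected because its type is not supported.
function decide(messageType: string, options: ProcessingOptions): Decision {
  const kind = KIND_BY_TYPE.get(messageType);
  if (kind === undefined) {
    throw new Error(`Tipo de mídia '${messageType}' não suportado`);
  }
  return options[kind] ? { action: "process", kind } : { action: "disabled", kind };
}
```

Keeping the decision separate from the actual transcription, image analysis, or PDF extraction makes the three outcomes (process, disabled, unsupported) easy to test without calling any external service.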

Dependencies

Requires an API key credential for OpenAI to perform audio transcription and image analysis.
Requires credentials for accessing the WhatsApp Business API, including:
- WhatsApp Business ID
- WhatsApp API Key

These credentials must be configured in n8n for the node to function correctly.

Uses external libraries:
- axios for HTTP requests to WhatsApp API and OpenAI endpoints.
- openai SDK for interacting with OpenAI services.
- pdf-parse for extracting text from PDF documents.

Troubleshooting

Common Issues

Invalid or missing WhatsApp credentials: The node requires valid WhatsApp Business ID and API key to fetch media URLs and download media. Missing or incorrect credentials will cause errors.
Expired media URLs: WhatsApp media URLs may expire quickly. If a 404 error occurs during media download, retrying the operation usually resolves the issue.
Unsupported media types: The node only supports text, audio, image, and PDF document types. Other media types will trigger an error.
Disabled processing options: If processing for a specific media type (text, audio, image, PDF) is disabled, the node returns a message indicating that processing is turned off.
OpenAI API errors: Invalid API keys or quota limits on OpenAI may cause failures in transcription or image analysis.
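
The retry advice for expired media URLs can be sketched as a small helper. This is an illustrative pattern, not the node's code: `downloadWithRetry`, `resolveUrl`, and `download` are hypothetical names, and the helper assumes that asking for the media URL again yields a fresh one. The 404 check reads `err.response.status`, which is where axios places the HTTP status on failed requests.

```typescript
// Returns true when an axios-style error carries HTTP status 404.
function isNotFound(err: unknown): boolean {
  const status = (err as { response?: { status?: number } } | null | undefined)?.response?.status;
  return status === 404;
}

// Resolve a media URL and download it, resolving a new URL and retrying
// when the download fails with 404. Other errors are rethrown at once.
async function downloadWithRetry<T>(
  resolveUrl: () => Promise<string>,
  download: (url: string) => Promise<T>,
  maxAttempts = 3,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    const url = await resolveUrl();
    try {
      return await download(url);
    } catch (err) {
      // Give up on non-404 errors or once the attempt budget is spent.
      if (!isNotFound(err) || attempt >= maxAttempts) throw err;
    }
  }
}
```

In a real workflow, `resolveUrl` would wrap the WhatsApp API call that returns the media URL and `download` would wrap the HTTP request that fetches the file. Both are passed in as functions here so the retry logic stays independent of any HTTP client.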

Error Messages and Resolutions

"Formato de webhook inválido": The incoming webhook data does not have the expected structure. Verify the webhook payload format.
"Áudio inválido", "Imagem inválida", "Documento inválido": The media object lacks a valid ID. Ensure the media was properly received.
"URL da mídia não encontrada na resposta": Failed to retrieve media URL from WhatsApp API. Check API permissions and credentials.
"URL da mídia expirada. Por favor, tente novamente.": Media URL expired; retry the request.
"Tipo de mídia '...' não suportado": Unsupported media type encountered. Only text, audio, image, and PDF are supported.
"Erro ao agrupar mensagens: ..." or "Erro ao dividir mensagem: ...": Errors during grouping or splitting operations. Check input parameters and ensure correct usage.
