
WhatsApp Media Processor

Processes WhatsApp media using OpenAI

Overview

The WhatsApp Media Processor node processes incoming WhatsApp messages containing various media types using OpenAI's services. It supports text, audio, image, and PDF document processing by leveraging OpenAI models for transcription, image analysis, and text extraction.

Common scenarios include:

  • Automatically transcribing voice messages into text.
  • Extracting detailed descriptions and visible text from images.
  • Parsing PDF documents to extract their textual content.
  • Grouping or splitting WhatsApp messages for easier workflow management.

Practical examples:

  • A customer support system that receives voice notes and converts them to text for easier handling.
  • An automated moderation tool that analyzes images sent via WhatsApp for inappropriate content or extracts text from screenshots.
  • A document processing pipeline that reads PDFs sent by users and extracts relevant information automatically.

Properties

| Name | Meaning |
|------|---------|
| Processar Texto | Enable processing of text messages. |
| Processar Áudio | Enable processing of audio messages. |
| Modelo para Áudio | OpenAI model used for audio transcription. Options: Whisper-1 |
| Processar Imagem | Enable processing of image messages. |
| Prompt para Análise de Imagem | Custom prompt for image analysis describing what to extract from the image. |
| Modelo para Imagem | OpenAI model used for image analysis. Options: GPT-4 Vision, GPT-4o, GPT-4o-mini, ChatGPT-4o-latest |
| Qualidade da Imagem | Image quality setting for processing. Options: Auto, Baixa (Low), Alta (High) |
| Processar PDF | Enable processing of PDF documents. |
| Timeout (ms) | Maximum wait time in milliseconds for media downloads. |
| Tentativas de Retry | Maximum number of retry attempts on failure. |
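
To make the shape of these settings concrete, the properties above could be modeled roughly as follows. This is only a sketch: the field names, the default values, and the `validateOptions` helper are hypothetical and are not taken from the node's source. The `auto`/`low`/`high` values assume that Qualidade da Imagem maps onto the image detail levels accepted by OpenAI's vision-capable chat models.

```typescript
// Hypothetical field names; the node's internal parameter names are not documented here.
type ImageDetail = "auto" | "low" | "high";

interface MediaProcessorOptions {
  processText: boolean;     // Processar Texto
  processAudio: boolean;    // Processar Áudio
  audioModel: string;       // Modelo para Áudio
  processImage: boolean;    // Processar Imagem
  imagePrompt: string;      // Prompt para Análise de Imagem
  imageModel: string;       // Modelo para Imagem
  imageDetail: ImageDetail; // Qualidade da Imagem
  processPdf: boolean;      // Processar PDF
  timeoutMs: number;        // Timeout (ms)
  maxRetries: number;       // Tentativas de Retry
}

// Illustrative values only; the node's actual defaults may differ.
const exampleOptions: MediaProcessorOptions = {
  processText: true,
  processAudio: true,
  audioModel: "whisper-1",
  processImage: true,
  imagePrompt: "Describe the image and transcribe any visible text.",
  imageModel: "gpt-4o-mini",
  imageDetail: "auto",
  processPdf: true,
  timeoutMs: 30000,
  maxRetries: 3,
};

// Returns a list of human-readable problems; an empty list means the options look sane.
function validateOptions(o: MediaProcessorOptions): string[] {
  const problems: string[] = [];
  if (!Number.isInteger(o.timeoutMs) || o.timeoutMs <= 0) {
    problems.push("timeoutMs must be a positive integer");
  }
  if (!Number.isInteger(o.maxRetries) || o.maxRetries < 0) {
    problems.push("maxRetries must be a non-negative integer");
  }
  if (o.processImage && o.imagePrompt.trim() === "") {
    problems.push("imagePrompt must not be empty when image processing is enabled");
  }
  return problems;
}
```

Checking the settings up front like this surfaces configuration mistakes before any media is downloaded.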

Output

The node outputs an array of JSON objects, each containing:

  • phone: The phone number associated with the message sender.
  • message: The processed content extracted or generated from the media. This can be:
    • Text content directly from text messages.
    • Transcribed text from audio messages.
    • Descriptions and extracted visible text from images.
    • Extracted text from PDF documents.
    • Informative messages if processing is disabled or media type unsupported.

The node does not output binary data. Media files are downloaded internally and processed into textual results.
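
As an illustration of the output shape described above, each item can be thought of as a `phone`/`message` pair. The sketch below is an assumption-laden example rather than the node's implementation: the `MediaKind` type, the `toOutputItem` helper, and the wording of the informative message are all hypothetical.

```typescript
// The four media kinds the node is documented to support.
type MediaKind = "text" | "audio" | "image" | "pdf";

// Shape of one output item, as described in this section.
interface OutputItem {
  phone: string;
  message: string;
}

// Hypothetical helper: returns the processed text, or an informative
// message when processing for that media kind is switched off.
function toOutputItem(
  phone: string,
  kind: MediaKind,
  enabled: Record<MediaKind, boolean>,
  processedText: string,
): OutputItem {
  if (!enabled[kind]) {
    return { phone, message: `Processing of ${kind} messages is disabled.` };
  }
  return { phone, message: processedText };
}

// Example: audio processing is disabled, so the second item carries a notice.
const enabled: Record<MediaKind, boolean> = { text: true, audio: false, image: true, pdf: true };
const items: OutputItem[] = [
  toOutputItem("15551234567", "text", enabled, "Hello, I need help with my order."),
  toOutputItem("15551234567", "audio", enabled, "(transcript would go here)"),
];
```

Downstream nodes can then treat every item uniformly, since `message` is always a string regardless of the original media type.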

Dependencies

  • Requires an API key credential for OpenAI to access transcription and chat completion services.
  • Requires credentials for WhatsApp Business API including:
    • WhatsApp Business ID
    • WhatsApp API Key for media URL retrieval and downloads.
  • Uses external libraries:
    • axios for HTTP requests.
    • pdf-parse for extracting text from PDF documents.
    • OpenAI SDK for accessing AI models.
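
The media URL retrieval and download mentioned above is a two-step flow in the WhatsApp Cloud API: the media ID is first resolved to a short-lived URL, and the file is then fetched from that URL with the same bearer token. The following is a minimal sketch of that flow, not the node's actual code. The `HttpGet` type, the `downloadMedia` function, and the error messages are hypothetical, and the HTTP call is injected so the example stays self-contained. In practice it could wrap an axios request that uses `responseType: "arraybuffer"`.

```typescript
// Minimal HTTP abstraction (hypothetical) so the sketch does not depend on a specific client.
interface HttpResponse {
  status: number;
  body: Uint8Array;
}
type HttpGet = (url: string, headers: Record<string, string>) => Promise<HttpResponse>;

interface DownloadedMedia {
  mimeType: string;
  data: Uint8Array;
}

async function downloadMedia(
  mediaId: string,
  accessToken: string,
  apiVersion: string, // Graph API version, e.g. taken from your own configuration
  httpGet: HttpGet,
): Promise<DownloadedMedia> {
  if (!mediaId) throw new Error("Invalid media: missing media ID");
  const headers = { Authorization: `Bearer ${accessToken}` };

  // Step 1: resolve the media ID to a temporary download URL.
  const metaRes = await httpGet(`https://graph.facebook.com/${apiVersion}/${mediaId}`, headers);
  if (metaRes.status !== 200) {
    throw new Error(`Media URL not found (HTTP ${metaRes.status})`);
  }
  const meta = JSON.parse(new TextDecoder().decode(metaRes.body)) as {
    url?: string;
    mime_type?: string;
  };
  if (!meta.url) throw new Error("Media URL not found in response");

  // Step 2: download the file itself; the URL expires, so do this promptly.
  const fileRes = await httpGet(meta.url, headers);
  if (fileRes.status !== 200) {
    throw new Error(`Media download failed (HTTP ${fileRes.status})`);
  }
  return { mimeType: meta.mime_type ?? "application/octet-stream", data: fileRes.body };
}
```

Because the download URL is temporary, the two requests are kept in one function, so a stale URL is never cached between runs.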

Troubleshooting

  • Invalid webhook format error: Occurs if the incoming WhatsApp webhook payload does not contain the expected message structure. Ensure the webhook data matches WhatsApp's official format.
  • Media invalid errors: Triggered when media IDs are missing or invalid in the message. Verify that media attachments exist and are accessible.
  • Media URL not found / expired: Happens if the media URL cannot be retrieved or has expired. Check WhatsApp Business API credentials and ensure timely processing before URLs expire.
  • Download media errors: Network issues or invalid authorization can cause failures downloading media. Confirm API keys and network connectivity.
  • Unsupported media type error: If a message contains a media type other than text, audio, image, or PDF document, the node will throw an error. Only supported types should be sent.
  • OpenAI API response invalid: If the AI service returns unexpected responses, verify API key validity and usage limits.
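
To help diagnose the invalid webhook format error, it can be useful to check a payload before it reaches the node. The sketch below assumes the WhatsApp Cloud API webhook layout, in which messages are nested under `entry[].changes[].value.messages[]`. The `extractMessages` function and its error messages are hypothetical, and the node's own checks may be stricter or looser.

```typescript
// Minimal view of a webhook message; other fields depend on the message type.
interface WebhookMessage {
  from: string;
  id: string;
  type: string;
  [key: string]: unknown;
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Collects all messages from a webhook payload, or throws if none are present.
function extractMessages(payload: unknown): WebhookMessage[] {
  const entries = isRecord(payload) ? payload["entry"] : undefined;
  if (!Array.isArray(entries)) {
    throw new Error("Invalid webhook format: missing 'entry' array");
  }
  const found: WebhookMessage[] = [];
  for (const entry of entries) {
    const changes = isRecord(entry) ? entry["changes"] : undefined;
    if (!Array.isArray(changes)) continue;
    for (const change of changes) {
      const value = isRecord(change) ? change["value"] : undefined;
      const messages = isRecord(value) ? value["messages"] : undefined;
      if (!Array.isArray(messages)) continue; // e.g. status-only notifications
      for (const m of messages) {
        if (
          isRecord(m) &&
          typeof m["from"] === "string" &&
          typeof m["id"] === "string" &&
          typeof m["type"] === "string"
        ) {
          found.push(m as WebhookMessage);
        }
      }
    }
  }
  if (found.length === 0) {
    throw new Error("Invalid webhook format: no messages found");
  }
  return found;
}
```

Running a captured payload through a check like this shows quickly whether the problem lies in the payload itself or further down the pipeline.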

To resolve most issues:

  • Double-check all required credentials and their permissions.
  • Validate incoming webhook payloads.
  • Monitor API rate limits and quota.
  • Adjust timeout and retry settings as needed.
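
To show how the timeout and retry settings might interact, here is a minimal sketch of a retry loop with a per-attempt timeout and exponential backoff. It is not the node's implementation. The `withTimeout` and `withRetry` helpers are hypothetical, the backoff policy is an assumption, and `maxRetries` is assumed to count the retries made after the first attempt.

```typescript
// Rejects if the promise does not settle within `ms` milliseconds.
// Note: this does not cancel the underlying work, it only stops waiting for it.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
    p.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Runs `task` up to maxRetries + 1 times, doubling the delay between attempts.
async function withRetry<T>(
  task: (attempt: number) => Promise<T>,
  maxRetries: number,
  timeoutMs: number,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await withTimeout(task(attempt), timeoutMs);
    } catch (err) {
      lastError = err;
      if (attempt < maxRetries) {
        await sleep(baseDelayMs * Math.pow(2, attempt));
      }
    }
  }
  throw lastError;
}
```

With a pattern like this, a larger timeout helps with big files on slow connections, while more retries help with transient network or rate-limit failures. Raising both increases the worst-case time a single message can take.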
