VIRA

Intelligent web automation with natural language goals

Overview

VIRA is a node for web automation driven by natural language goals. You describe what you want to achieve on the web, optionally supply a starting URL and additional context, and the node uses the selected large language model (LLM) provider to plan and execute the web interactions needed to reach that goal. Supported providers are OpenAI GPT-4, Anthropic Claude, Google Gemini, and a local LLM via Ollama. The node suits complex web tasks such as data extraction, navigation, and interaction that would otherwise require manual scripting.

Use Case Examples

  1. Extract the main heading text from a website by specifying the goal and starting URL.
  2. Navigate through a web application to gather specific data points based on a natural language description.
  3. Automate repetitive web tasks by describing the desired outcome in plain language, letting VIRA handle the execution.

Properties

Name | Meaning
LLM Provider | Selects the large language model provider to use for generating and executing the web automation plan.
Goal | Natural language description of the task or outcome the user wants VIRA to achieve on the web.
Starting URL | Optional URL where VIRA should begin the web automation; if not provided, VIRA will infer it.
Context | Optional additional context or constraints to help VIRA better understand and achieve the goal.
Browser Options | Settings for the browser environment, including headless mode, operation timeout, and error screenshot capture.
VIRA Options | Options controlling VIRA's behavior such as self-healing broken selectors, retry attempts, verbose logging, and confidence threshold for plan acceptance.
LLM Options | Parameters to configure the LLM's response behavior, including temperature and maximum tokens.
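
To make the properties above more concrete, here is a minimal TypeScript sketch of how they might be grouped. All field names, the provider identifier strings, and the example values are assumptions made for illustration; they are not taken from the node's actual parameter schema.

```typescript
// Hypothetical provider identifiers; the node's real internal values may differ.
type LlmProvider = "openai" | "anthropic" | "google" | "ollama";

const SUPPORTED_PROVIDERS: LlmProvider[] = ["openai", "anthropic", "google", "ollama"];

// Illustrative shape mirroring the properties table; key names are assumed.
interface ViraParams {
  llmProvider: LlmProvider;
  goal: string;
  startingUrl?: string; // optional: VIRA infers it when omitted
  context?: string;     // optional extra constraints
  browserOptions?: { headless?: boolean; timeoutMs?: number; screenshotOnError?: boolean };
  viraOptions?: { selfHealing?: boolean; maxRetries?: number; verbose?: boolean; confidenceThreshold?: number };
  llmOptions?: { temperature?: number; maxTokens?: number };
}

// Type guard that rejects provider names outside the supported list.
function isSupportedProvider(value: string): value is LlmProvider {
  return (SUPPORTED_PROVIDERS as string[]).indexOf(value) !== -1;
}

// Example input for the "extract the main heading" use case (made-up values).
const exampleParams: ViraParams = {
  llmProvider: "openai",
  goal: "Extract the main heading text from the page",
  startingUrl: "https://example.com",
  browserOptions: { headless: true, screenshotOnError: true },
  viraOptions: { selfHealing: true, verbose: false },
};
```

Only the provider and the goal are required in this sketch, which matches the table's description of Starting URL and Context as optional.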

Output

JSON

  • goalAchieved - Boolean indicating if the goal was successfully achieved.
  • status - Current status of the execution (e.g., success, error).
  • data - Data extracted or generated as a result of the automation.
  • plan
    • reasoning - Explanation of the reasoning behind the generated plan.
    • confidence - Confidence score of the generated plan.
    • tasksCount - Number of tasks in the generated plan.
  • execution
    • timeMs - Time taken to execute the plan in milliseconds.
    • retryCount - Number of retry attempts made during execution.
    • resultsCount - Number of results produced by the execution.
  • healingEvents - Events where VIRA self-healed broken selectors during execution.
  • error - Error message if an error occurred during execution.
  • screenshotPath - File path to a screenshot taken on error, if enabled.
  • results - (Optional) Detailed results of the execution, included if verbose logging is enabled.
  • fullPlan - (Optional) Full plan details, included if verbose logging is enabled.
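
The output fields listed above can be sketched as a TypeScript interface, together with a small helper that condenses a result into one log line. The field types, which fields are optional, and the sample values are assumptions; only the field names come from the list above.

```typescript
// Assumed typing of the node's JSON output; only the key names are documented.
interface ViraOutput {
  goalAchieved: boolean;
  status: string; // e.g. "success" or "error"
  data?: unknown;
  plan?: { reasoning: string; confidence: number; tasksCount: number };
  execution?: { timeMs: number; retryCount: number; resultsCount: number };
  healingEvents?: unknown[];
  error?: string;
  screenshotPath?: string;
  results?: unknown[]; // present only with verbose logging
  fullPlan?: unknown;  // present only with verbose logging
}

// Hypothetical helper: builds a compact one-line summary of a result.
function summarizeOutput(out: ViraOutput): string {
  const parts: string[] = [`status=${out.status}`, `goalAchieved=${out.goalAchieved}`];
  if (out.plan) {
    parts.push(`confidence=${out.plan.confidence}`, `tasks=${out.plan.tasksCount}`);
  }
  if (out.execution) {
    parts.push(`timeMs=${out.execution.timeMs}`, `retries=${out.execution.retryCount}`);
  }
  if (out.healingEvents && out.healingEvents.length > 0) {
    parts.push(`healed=${out.healingEvents.length}`);
  }
  if (out.error) {
    parts.push(`error=${out.error}`);
  }
  return parts.join(" ");
}

// Made-up sample result, for illustration only.
const sampleOutput: ViraOutput = {
  goalAchieved: true,
  status: "success",
  data: { heading: "Example Domain" },
  plan: { reasoning: "Open the page and read the first heading.", confidence: 0.9, tasksCount: 3 },
  execution: { timeMs: 1200, retryCount: 0, resultsCount: 3 },
  healingEvents: [],
};
```

A downstream step could branch on goalAchieved first and consult error and screenshotPath only when it is false.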

Dependencies

  • API keys or credentials for the selected LLM provider (OpenAI, Anthropic, Google, or Ollama).

Troubleshooting

  • Unsupported LLM provider error: Ensure the selected LLM provider is one of the supported options (OpenAI, Anthropic, Google, Ollama).
  • API credential errors: Verify that the correct API keys or credentials are provided for the chosen LLM provider.
  • Timeouts or failures in web automation: Adjust browser timeout settings or enable self-healing to improve robustness.
  • Errors during execution: Enable verbose logging to get detailed information for troubleshooting.
  • Hard-to-diagnose failures: Enable error screenshot capture in Browser Options to inspect the page state visually; the file path is reported in screenshotPath.
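
The checklist above can be turned into a rough triage helper, sketched here in TypeScript. The substrings matched against the error message are assumptions, since the exact wording of VIRA's error messages is not documented here; treat this as an illustration of the checklist, not as the node's own logic.

```typescript
// Minimal subset of the output needed for triage; typing is assumed.
interface ViraResultLike {
  goalAchieved: boolean;
  error?: string;
  screenshotPath?: string;
  results?: unknown[]; // present only when verbose logging is enabled
}

// Hypothetical helper: maps a failed result to hints from the checklist.
function troubleshootingHints(out: ViraResultLike): string[] {
  if (out.goalAchieved && !out.error) {
    return []; // nothing to troubleshoot
  }
  const hints: string[] = [];
  const message = (out.error ?? "").toLowerCase();
  // The matched substrings below are guesses at message wording.
  if (message.indexOf("unsupported llm provider") !== -1) {
    hints.push("Select a supported provider: OpenAI, Anthropic, Google, or Ollama.");
  }
  if (message.indexOf("credential") !== -1 || message.indexOf("api key") !== -1) {
    hints.push("Verify the API key or credentials for the chosen provider.");
  }
  if (message.indexOf("timeout") !== -1 || message.indexOf("timed out") !== -1) {
    hints.push("Increase the browser timeout or enable self-healing.");
  }
  if (out.results === undefined) {
    hints.push("Enable verbose logging to get detailed results.");
  }
  if (out.screenshotPath) {
    hints.push(`Inspect the error screenshot at ${out.screenshotPath}.`);
  } else {
    hints.push("Enable screenshot capture on error for visual diagnosis.");
  }
  return hints;
}
```

For a result whose error mentions a timeout and that includes a screenshot path, the helper returns the timeout hint, the verbose-logging hint, and a pointer to the screenshot.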
