Overview
The "Live Web RAG" resource with the "Get RAG URL" operation allows users to retrieve content from a publicly accessible URL using the Bedrijfsdata API. This node is useful for extracting and processing web page data dynamically, enabling workflows that require fetching live web content for further analysis or integration.
Common scenarios include:
- Extracting cleaned HTML or markdown content from a webpage for content analysis.
- Retrieving raw or formatted snippets of text from URLs for knowledge extraction.
- Using localization options to fetch content as if browsing from a specific country or language setting.
Practical example:
- A user wants to gather clean textual content from a news article URL in Dutch, optionally receiving the content in markdown format for easy integration into documentation or reports.
Properties
| Name | Meaning |
|---|---|
| URL | (Required) The URL of the webpage you want to retrieve content from. |
| Localization Options | Optional settings to specify the browsing context: - Country (ISO 3166-1 Alpha-2): Country code to simulate browsing from (e.g., US, NL). - Language (ISO 639-1): Language code for browsing (e.g., en, nl). |
| Output Options | Options to customize the output content: - Add Cleaned HTML: Disable or enable cleaned HTML output. - Add Markdown: No markdown, CommonMark, or cleaned markdown without images/links. - Add Raw Content: Disable or enable raw content output. - Add Raw HTML: Disable or enable raw HTML output. - Max. Snippets Length: Maximum length for snippets; if >0, unique sentences are added to the result. |
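
The properties in the table can be modeled as a small options object, which makes it easier to see which settings are optional. The following is a minimal sketch only: the class name `RagUrlOptions` and every parameter key in `to_params` (`url`, `country`, `language`, `clean_html`, `markdown`, `raw_content`, `raw_html`, `max_snippets_length`) are hypothetical names chosen for illustration, not confirmed Bedrijfsdata API parameters.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RagUrlOptions:
    """Hypothetical container mirroring the node properties listed in the table."""

    url: str                              # required
    country: Optional[str] = None         # ISO 3166-1 alpha-2, e.g. "NL"
    language: Optional[str] = None        # ISO 639-1, e.g. "nl"
    add_cleaned_html: bool = False
    markdown: str = "none"                # "none", "commonmark" or "cleaned" (assumed values)
    add_raw_content: bool = False
    add_raw_html: bool = False
    max_snippets_length: int = 0          # 0 means no snippets are requested

    def to_params(self) -> dict:
        """Return only the settings that differ from their defaults.

        ASSUMPTION: all keys below are illustrative, not the real API names.
        """
        params = {"url": self.url}
        if self.country:
            params["country"] = self.country
        if self.language:
            params["language"] = self.language
        if self.add_cleaned_html:
            params["clean_html"] = 1
        if self.markdown != "none":
            params["markdown"] = self.markdown
        if self.add_raw_content:
            params["raw_content"] = 1
        if self.add_raw_html:
            params["raw_html"] = 1
        if self.max_snippets_length > 0:
            params["max_snippets_length"] = self.max_snippets_length
        return params


options = RagUrlOptions(url="https://example.com/news", country="NL", markdown="commonmark")
print(options.to_params())
```

Leaving unset options out of the result keeps the request minimal and lets the service apply its own defaults.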
Output
The node outputs JSON data containing the retrieved content from the specified URL. Depending on the selected output options, the JSON may include:
- Cleaned HTML version of the page content.
- Markdown-formatted content in different styles.
- Raw content extracted directly from the page.
- Raw HTML source of the page.
- A list of unique text snippets up to the specified maximum length.
This output enables downstream nodes to process or analyze the web content in various formats suitable for different use cases.
The node does not output binary data.
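Because the fields present in the output depend on the selected options, a downstream step may want to fall back from one format to another. The sketch below shows one way to do that. The key names (`markdown`, `clean_html`, `raw_content`, `raw_html`) are assumptions for illustration; check the actual node output for the real field names.

```python
def pick_content(item, preference=("markdown", "clean_html", "raw_content", "raw_html")):
    """Return (key, text) for the first non-empty string field in preference order.

    ASSUMPTION: the field names in `preference` are hypothetical.
    Returns (None, "") when none of the fields holds usable text.
    """
    for key in preference:
        value = item.get(key)
        if isinstance(value, str) and value.strip():
            return key, value
    return None, ""


# Example: markdown was requested but came back empty, so raw content is used.
key, text = pick_content({"markdown": "", "raw_content": "Hello world"})
print(key, len(text))
```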
Dependencies
- Requires an active connection to the Bedrijfsdata API service.
- Needs an API authentication token configured in n8n credentials to authorize requests.
- The base URL for API requests is https://api.bedrijfsdata.nl/v1.2.
- No additional environment variables are required beyond standard API credential setup.
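For debugging outside n8n, it can help to see how a request URL might be assembled from the base URL. This is a rough sketch under stated assumptions: only the base URL comes from this page. The endpoint name `rag_url` and the `api_key` query parameter are hypothetical placeholders; consult the Bedrijfsdata API documentation for the real endpoint and authentication scheme.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.bedrijfsdata.nl/v1.2"  # base URL stated in the dependency list


def build_request_url(endpoint, params, api_key):
    """Join the base URL, an endpoint name, and URL-encoded query parameters.

    ASSUMPTION: passing the key as an "api_key" query parameter is illustrative only.
    """
    query = urlencode({**params, "api_key": api_key})
    return f"{BASE_URL}/{endpoint.lstrip('/')}?{query}"


# "rag_url" is a hypothetical endpoint name, "KEY" a placeholder credential.
print(build_request_url("rag_url", {"url": "https://example.com"}, "KEY"))
```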
Troubleshooting
- Missing or invalid URL: Ensure the URL property is provided and correctly formatted. The node requires a valid, publicly accessible URL.
- API authentication errors: Verify that the API key credential is correctly set up and has sufficient permissions.
- Localization options not applied: If localization parameters do not affect results, confirm that the country and language codes are valid ISO codes.
- Empty or unexpected response: Check if the target URL is accessible and returns content. Some websites may block automated requests or require headers/user-agent adjustments (not configurable here).
- Max snippets length issues: Setting this value too high might lead to large responses; keep it reasonable to avoid performance issues.
If the node throws errors related to API request failures, inspect the error message for details and verify network connectivity and API status.
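Several of the issues above (a malformed URL, invalid localization codes, an unreasonable snippet length) can be caught before the node runs. The following is a minimal pre-flight check, assuming a hypothetical helper named `validate_inputs`. It checks only the shape of the values: a two-letter string passes even if it is not an assigned ISO code.

```python
import re
from urllib.parse import urlparse


def validate_inputs(url, country=None, language=None, max_snippets_length=0):
    """Return a list of problems; an empty list means the inputs look usable."""
    problems = []

    # The URL must be absolute and use http or https.
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("URL must be an absolute http(s) URL")

    # Format check only: two ASCII letters, not a lookup against the ISO lists.
    if country is not None and not re.fullmatch(r"[A-Za-z]{2}", country):
        problems.append("country must be a two-letter ISO 3166-1 alpha-2 code")
    if language is not None and not re.fullmatch(r"[A-Za-z]{2}", language):
        problems.append("language must be a two-letter ISO 639-1 code")

    if max_snippets_length < 0:
        problems.append("max snippets length must be zero or positive")

    return problems


print(validate_inputs("example.com/news", country="NLD"))
```

Running a check like this in a Code node before the request makes failures easier to attribute to input data rather than to the API.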
Links and References
- Bedrijfsdata API Documentation (for detailed API usage and parameters)
- ISO 639-1 Language Codes
- ISO 3166-1 Alpha-2 Country Codes