Overview
This node captures website data as screenshots, PDFs, raw webpage content, or extracted metadata. It is useful for automating the visual documentation of webpages, archiving content, or gathering metadata for analysis.
Common scenarios include:
- Taking automated screenshots of webpages for monitoring visual changes.
- Generating PDF versions of webpages for offline reading or record keeping.
- Extracting raw HTML content or structured metadata from websites for data processing.
- Customizing capture options such as blocking ads, emulating devices, or bypassing bot detection to improve capture quality.
For example, a user can input a URL and request a screenshot with dark mode enabled and ad blocking active, or generate a PDF of a page with specific orientation and delay settings.
Properties
| Name | Meaning |
|---|---|
| URL | The URL of the webpage to capture. |
| Additional Options | Collection of optional settings: |
| - Best Format | Automatically select optimal image format (boolean). |
| - Block Ads | Block advertisements on the page (boolean). |
| - Block Cookie Banners | Automatically dismiss cookie consent banners (boolean). |
| - Block Trackers | Block tracking scripts (boolean). |
| - Bypass Bot Detection | Attempt to bypass bot detection systems (boolean). |
| - Dark Mode | Enable dark mode rendering for captures (boolean). |
| - Emulate Device | Specify a device name to emulate for screenshots (string, e.g., "iPhone X"). |
| - File Name | Custom filename for saved files (string). |
| - Fresh | Force a new capture ignoring any cached results (boolean). |
| - HTTP Authentication | Base64url-encoded `username:password` string for HTTP Basic Auth (string). |
| - Mobile | Emulate a mobile device (boolean). |
| - User Agent | Custom user agent string to use when loading the page (string). |
| - Wait For ID | Element ID to wait for before capturing (string). |
| - Wait For Selector | CSS selector to wait for before capturing (string). |
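
The HTTP Authentication option expects a base64url-encoded `username:password` string. The following is a minimal sketch of how such a value could be produced with the Python standard library. The helper names `encode_basic_auth` and `decode_basic_auth` are hypothetical, and whether the service expects trailing `=` padding is not stated here, so the `strip_padding` flag is an assumption to adjust as needed.

```python
import base64


def encode_basic_auth(username: str, password: str, strip_padding: bool = False) -> str:
    """Return the base64url encoding of "username:password"."""
    raw = f"{username}:{password}".encode("utf-8")
    # urlsafe_b64encode uses "-" and "_" instead of "+" and "/".
    token = base64.urlsafe_b64encode(raw).decode("ascii")
    # Assumption: padding handling is not documented, so it is left configurable.
    return token.rstrip("=") if strip_padding else token


def decode_basic_auth(token: str) -> str:
    """Decode a base64url token back to "username:password" (for checking a value)."""
    padded = token + "=" * (-len(token) % 4)  # restore padding if it was stripped
    return base64.urlsafe_b64decode(padded).decode("utf-8")
```

Decoding the value again before pasting it into the node is a quick way to confirm that the credentials were encoded as intended.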
Output
The output JSON structure varies by operation:
- The content operation outputs a JSON object containing the URL, the operation type, and the extracted webpage content and metadata returned by the external service.
- If the operation is screenshot or pdf and the output option is set to return a URL, the JSON contains the capture URL and parameters describing the capture (format, viewport size, full page, orientation).
- If binary output is requested for screenshots or PDFs, the node returns binary data representing the image or PDF file along with the JSON metadata.
- The `binary` field contains the actual file data (image or PDF) suitable for further processing or saving.
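
Downstream code usually needs to tell a JSON-only result apart from one that carries file data. The sketch below assumes an output item exported as a dictionary with `json` and `binary` keys, where the binary entry holds base64 text under `data` plus `mimeType` and `fileName`. That shape follows n8n's usual in-memory item layout, but the binary property name (`"data"` here) and the function name `extract_capture` are assumptions, not confirmed behavior of this node.

```python
import base64


def extract_capture(item: dict, binary_property: str = "data") -> dict:
    """Summarize one output item as either decoded file bytes or plain JSON."""
    binary = item.get("binary") or {}
    if binary_property in binary:
        entry = binary[binary_property]
        return {
            "kind": "file",
            # Assumption: the file content is stored as standard base64 text.
            "bytes": base64.b64decode(entry["data"]),
            "mime_type": entry.get("mimeType"),
            "file_name": entry.get("fileName"),
        }
    # No binary attachment: pass the JSON payload (capture URL, parameters, ...) through.
    return {"kind": "json", "json": item.get("json", {})}
```

If n8n is configured to keep binary data on disk rather than in memory, the `data` value may be a reference instead of base64 text, and this sketch would not apply as written.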
Dependencies
- Requires an API key credential for an external capture service hosted at `https://cdn.capture.page`.
- The node uses this service's API to perform all capture operations.
- No other external dependencies are required.
- Proper configuration of the API key and secret in n8n credentials is necessary.
Troubleshooting
- Invalid URL error: The node validates URLs before making requests. Ensure the URL is correctly formatted and accessible.
- Unsupported operation error: Only "screenshot", "pdf", "content", and "metadata" operations are supported. Check the operation parameter.
- HTTP authentication issues: If using HTTP Basic Auth, ensure the credentials are base64url encoded correctly.
- Timeouts or failed captures: Network issues or bot detection may cause failures. Use options like "Bypass Bot Detection" or increase delay times.
- Binary data handling: When requesting binary output, ensure subsequent nodes can handle binary data properly.
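
To catch invalid URL errors before they reach the node, URLs can be pre-checked in an upstream step. This is a rough sketch using the Python standard library, not the node's actual validation logic. The function name `is_capturable_url` and the rule that only `http` and `https` URLs with a hostname are accepted are assumptions.

```python
from urllib.parse import urlparse


def is_capturable_url(url: str) -> bool:
    """Return True if the string looks like an absolute http(s) URL with a host."""
    try:
        parsed = urlparse(url.strip())
        hostname = parsed.hostname
    except ValueError:
        # urlparse raises ValueError for some malformed inputs, e.g. broken IPv6 literals.
        return False
    return parsed.scheme in ("http", "https") and bool(hostname)
```

A check like this only confirms the format. It does not confirm that the page is reachable, so timeouts and bot-detection failures still need to be handled separately.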
If the node is set to continue on fail, errors will be returned in the output JSON under an error property instead of stopping execution.
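
When continue on fail is enabled, a following step has to separate failed items from successful ones. A minimal sketch, assuming each item is a dictionary whose `json` payload gains an `error` key on failure as described above (the function name `split_by_error` is hypothetical):

```python
def split_by_error(items: list[dict]) -> tuple[list[dict], list[dict]]:
    """Partition output items into (succeeded, failed) by the presence of json.error."""
    succeeded: list[dict] = []
    failed: list[dict] = []
    for item in items:
        payload = item.get("json", {})
        # An "error" key in the JSON payload marks an item the node could not capture.
        if "error" in payload:
            failed.append(item)
        else:
            succeeded.append(item)
    return succeeded, failed
```

The failed list can then be routed to a retry or notification branch while the successful captures continue through the workflow.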