Capture

Capture website screenshots, generate PDFs, extract content and metadata

Overview

This node captures website data in several forms: screenshots, PDFs, webpage content, and metadata. It is useful for visually documenting webpages, archiving content, or gathering metadata for analysis as part of an automated workflow.

Common scenarios include:

Taking automated screenshots of webpages for monitoring visual changes.
Generating PDF versions of webpages for offline reading or record keeping.
Extracting raw HTML content or structured metadata from websites for data processing.
Customizing capture options such as blocking ads, emulating devices, or bypassing bot detection to improve capture quality.

For example, a user can input a URL and request a screenshot with dark mode enabled and ad blocking active, or generate a PDF of a page with specific orientation and delay settings.

Properties

Name	Meaning
URL	The URL of the webpage to capture.
Additional Options	Collection of optional settings:
- Best Format	Automatically select optimal image format (boolean).
- Block Ads	Block advertisements on the page (boolean).
- Block Cookie Banners	Automatically dismiss cookie consent banners (boolean).
- Block Trackers	Block tracking scripts (boolean).
- Bypass Bot Detection	Attempt to bypass bot detection systems (boolean).
- Dark Mode	Enable dark mode rendering for captures (boolean).
- Emulate Device	Specify a device name to emulate for screenshots (string, e.g., "iPhone X").
- File Name	Custom filename for saved files (string).
- Fresh	Force a new capture ignoring any cached results (boolean).
- HTTP Authentication	Base64url-encoded username:password pair for HTTP Basic Auth (string).
- Mobile	Emulate a mobile device (boolean).
- User Agent	Custom user agent string to use when loading the page (string).
- Wait For ID	Element ID to wait for before capturing (string).
- Wait For Selector	CSS selector to wait for before capturing (string).
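
The HTTP Authentication option expects the username:password pair in base64url form. A minimal Python sketch of that encoding follows; the function name and the strip_padding flag are illustrative, and whether the service expects trailing "=" padding is not stated here, so both variants are shown.

```python
import base64


def encode_basic_auth(username: str, password: str, strip_padding: bool = False) -> str:
    """Return 'username:password' as a base64url string.

    Hypothetical helper: whether the capture service wants the trailing '='
    padding kept or removed is an assumption to verify against its docs.
    """
    raw = f"{username}:{password}".encode("utf-8")
    # urlsafe_b64encode uses '-' and '_' instead of '+' and '/'
    encoded = base64.urlsafe_b64encode(raw).decode("ascii")
    return encoded.rstrip("=") if strip_padding else encoded


if __name__ == "__main__":
    print(encode_basic_auth("user", "pass"))
```

If the padded value is rejected, trying the result of strip_padding=True is a reasonable next step.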

Output

The output JSON structure varies by operation:

The content operation outputs a JSON object containing the URL, the operation type, and the extracted webpage content and metadata returned by the external service.
If the operation is screenshot or pdf and the output option is set to return a URL, the JSON contains the capture URL and the parameters describing the capture (format, viewport size, full page, orientation).
If binary output is requested for a screenshot or PDF, the node returns the file itself (image or PDF) in the binary field alongside the JSON metadata, ready for further processing or saving.
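
As a rough illustration of handling both output shapes downstream, the Python sketch below pulls the file bytes out of an item when binary output is present and returns None otherwise. It assumes the usual n8n item layout (a json part plus an optional binary part whose entries hold base64 data in in-memory mode); the binary property name "data" and the helper name are assumptions, not something this node's documentation confirms.

```python
import base64
from typing import Any, Dict, Optional


def extract_file(item: Dict[str, Any], binary_property: str = "data") -> Optional[bytes]:
    """Return decoded file bytes if the item carries binary output, else None.

    Assumes item["binary"][binary_property]["data"] is a base64 string;
    the property name is a hypothetical default.
    """
    binary = item.get("binary") or {}
    entry = binary.get(binary_property)
    if entry is None:
        # URL or content output: only the JSON part is present
        return None
    return base64.b64decode(entry["data"])
```

Items for which this returns None can be processed through their JSON part alone, for example by reading the capture URL.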

Dependencies

Requires an API credential (key and secret) for the external capture service hosted at https://cdn.capture.page.
The node uses this service's API to perform all capture operations.
No other external dependencies are required.
The API key and secret must be configured in n8n credentials before the node can run.

Troubleshooting

Invalid URL error: The node validates URLs before making requests. Ensure the URL is correctly formatted and accessible.
Unsupported operation error: Only "screenshot", "pdf", "content", and "metadata" operations are supported. Check the operation parameter.
HTTP authentication issues: If using HTTP Basic Auth, ensure the username:password pair is base64url-encoded correctly.
Timeouts or failed captures: Network issues or bot detection may cause failures. Enable "Bypass Bot Detection" or increase the capture delay.
Binary data handling: When requesting binary output, ensure subsequent nodes can handle binary data properly.
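
The URL and operation checks above can be approximated before the node runs, for instance in a preceding step. The Python sketch below is only an approximation: the node's actual validation rules are not documented here, and check_request is a hypothetical helper. The set of operation names comes from the error description above.

```python
from typing import List
from urllib.parse import urlparse

# Operation names as listed in the troubleshooting entry above
SUPPORTED_OPERATIONS = {"screenshot", "pdf", "content", "metadata"}


def check_request(url: str, operation: str) -> List[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    try:
        parsed = urlparse(url)
        url_ok = parsed.scheme in ("http", "https") and bool(parsed.netloc)
    except ValueError:
        # urlparse raises ValueError on some malformed inputs
        url_ok = False
    if not url_ok:
        problems.append(f"Invalid URL: {url!r}")
    if operation not in SUPPORTED_OPERATIONS:
        problems.append(f"Unsupported operation: {operation!r}")
    return problems
```

This only checks the URL's format; whether the page is reachable can only be confirmed by the capture itself.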

If the node is set to continue on fail, errors are returned in the output JSON under an error property instead of stopping execution.
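
The continue-on-fail behavior can be pictured with the following Python sketch. It is a simplified model, not the node's source: run_items and the capture callable are hypothetical stand-ins, and only the error property name is taken from the description above.

```python
from typing import Any, Callable, Dict, List


def run_items(
    items: List[Dict[str, Any]],
    capture: Callable[[Dict[str, Any]], Dict[str, Any]],
    continue_on_fail: bool = False,
) -> List[Dict[str, Any]]:
    """Process each item; on failure either re-raise or record the error."""
    results = []
    for item in items:
        try:
            results.append({"json": capture(item)})
        except Exception as exc:
            if not continue_on_fail:
                # Default behavior: the first failure stops execution
                raise
            # Continue on fail: the error travels in the output JSON
            results.append({"json": {"error": str(exc)}})
    return results
```

Downstream steps can then separate failed items by checking whether the error property is present in each item's JSON.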

Links and References

Capture Page API Documentation
n8n Documentation on Creating Custom Nodes
Base64url Encoding Reference
