OpenGuardrails

AI safety and content moderation with OpenGuardrails

Overview

This node performs AI safety and content moderation through the OpenGuardrails API. It checks user input, AI or system output, and multi-turn conversations for safety issues such as prompt attacks, compliance violations, data leaks, and privacy concerns. Use it wherever content must be moderated before further processing or output, so that a workflow handles harmful or risky content appropriately: for example, moderating user-generated content in chatbots, filtering AI-generated responses, or analyzing conversation history for safety compliance.

Use Case Examples

  1. Moderate user input in a chatbot to prevent harmful or inappropriate content from being processed.
  2. Check AI-generated output before sending it to users to ensure it meets safety and compliance standards.
  3. Analyze multi-turn conversation history to detect and mitigate safety risks in ongoing interactions.

Properties

Name | Meaning
Content | The text content to check for safety issues, used in operations like checkContent, inputModeration, and outputModeration.
Detection Options | Settings to enable or disable specific safety checks such as security threats, compliance issues, and data security, plus an optional user ID for ban policy enforcement.
Action on High Risk | Defines how to handle content flagged as high risk, with options to continue with a warning, stop the workflow, or replace the content with a safe response.
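
The three high-risk actions can be pictured with a minimal Python sketch. It is not the node's implementation: the function name `apply_high_risk_action`, the `ContentBlockedError` class, the option values `"continue"`, `"stop"`, and `"replace"`, and the assumption that `risk_level` arrives as a plain string such as `"high"` are all hypothetical and chosen only for illustration.

```python
class ContentBlockedError(Exception):
    """Hypothetical stand-in for the error raised when the workflow is stopped."""


def apply_high_risk_action(result: dict, content: str, on_high_risk: str = "continue") -> dict:
    """Combine a moderation result with the configured high-risk action.

    `result` is assumed to carry `risk_level` and, optionally, `suggest_answer`.
    `on_high_risk` is one of the assumed values "continue", "stop", "replace".
    """
    risk = str(result.get("risk_level", "")).lower()
    is_high = risk == "high"

    # "Stop Workflow": fail loudly instead of passing the content on.
    if is_high and on_high_risk == "stop":
        raise ContentBlockedError("Content blocked: high-risk content detected")

    # "Replace": swap in the suggested safe answer, if one was returned.
    safe_answer = result.get("suggest_answer")
    replaced = is_high and on_high_risk == "replace" and bool(safe_answer)

    return {
        **result,
        "original_content": content,
        "processed_content": safe_answer if replaced else content,
        "was_replaced": replaced,
        # A warning is set for high or medium risk, as in the Output section.
        "has_warning": risk in ("high", "medium"),
    }
```

With "continue", high-risk content passes through unchanged but carries `has_warning`, so a later step can still decide what to do with it.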

Output

JSON

  • action - The action determined by the moderation result (e.g., reject, replace, continue).
  • risk_level - The risk level detected in the content (e.g., high, medium, low).
  • categories - Categories of detected issues related to the content.
  • suggest_answer - Suggested safe response if content is replaced.
  • hit_keywords - Keywords that triggered the moderation.
  • original_content - The original content that was checked.
  • processed_content - The content after moderation processing, which may be replaced if high risk.
  • was_replaced - Boolean indicating whether the content was replaced with a safe response.
  • has_warning - Boolean indicating whether the content was flagged as high or medium risk.
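
As a rough illustration of how a downstream step might consume these fields, the sketch below types the output item and picks the text to pass on. The field names come from the list above; the list types for `categories` and `hit_keywords`, the helper names `text_to_forward` and `needs_review`, and the sample values are assumptions, not documented behavior.

```python
from typing import List, TypedDict


class ModerationItem(TypedDict, total=False):
    """Assumed shape of one output item; every field is optional here."""
    action: str
    risk_level: str
    categories: List[str]      # assumed to be a list of strings
    suggest_answer: str
    hit_keywords: List[str]    # assumed to be a list of strings
    original_content: str
    processed_content: str
    was_replaced: bool
    has_warning: bool


def text_to_forward(item: ModerationItem) -> str:
    """Prefer the moderated text; fall back to the original if it is absent."""
    return item.get("processed_content") or item.get("original_content", "")


def needs_review(item: ModerationItem) -> bool:
    """Flag items that carry a warning but were passed through unreplaced."""
    return bool(item.get("has_warning")) and not item.get("was_replaced", False)


# Illustrative values only, not real API output.
example: ModerationItem = {
    "risk_level": "medium",
    "original_content": "example text",
    "processed_content": "example text",
    "was_replaced": False,
    "has_warning": True,
}
```

Reading `processed_content` rather than `original_content` downstream means a replaced answer is picked up automatically when the replace action is enabled.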

Dependencies

  • OpenGuardrails API key credential

Troubleshooting

  • If the node throws an error about content being blocked, it means high-risk content was detected and the 'Stop Workflow' action is enabled; consider changing the action or reviewing the content.
  • Ensure the OpenGuardrails API key and URL are correctly configured in credentials to avoid authentication errors.
  • If the node returns unexpected results, verify that the content and detection options are correctly set and that the API service is reachable.
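
Before digging into authentication errors, a quick sanity check of the credential values can rule out simple mistakes. The helper below is a hypothetical sketch, not part of the node: the name `credential_problems` and the specific checks are assumptions, and it only validates the form of the values, not whether the API accepts them.

```python
from typing import List
from urllib.parse import urlparse


def credential_problems(api_key: str, base_url: str) -> List[str]:
    """Return human-readable problems found in the credential values."""
    problems: List[str] = []

    # An empty or whitespace-padded key is a common copy-paste mistake.
    if not api_key or not api_key.strip():
        problems.append("API key is empty")
    elif api_key != api_key.strip():
        problems.append("API key has leading or trailing whitespace")

    # The URL should be absolute, with an http or https scheme and a host.
    parsed = urlparse(base_url.strip())
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("API URL must be an absolute http(s) URL")

    return problems
```

An empty list means the values look well-formed; the API can still reject them, so a remaining authentication error points to the key or URL itself rather than its formatting.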
