Tmux Agent Monitor

Monitor and analyze tmux agent activity and health

Overview

The Tmux Agent Monitor node monitors and analyzes the activity and health of agents running inside tmux sessions. Its operations include listing active sessions, performing health checks, collecting logs, detecting blockers (stuck or blocked agents), and generating activity reports.

The Detect Blockers operation specifically identifies agents that appear stuck or blocked based on inactivity thresholds and error keyword detection in their tmux window outputs. This helps users quickly pinpoint problematic agents that may require intervention, such as restarting or providing input.
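The two signals described above can be sketched roughly as follows. This is a minimal illustration, not the node's actual implementation: the names `check_window` and `BlockerCheck`, the default keyword list, and the case-insensitive substring matching are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class BlockerCheck:
    """Result of inspecting one tmux window (hypothetical structure)."""
    inactive: bool
    found_keywords: List[str]

    @property
    def is_blocker(self) -> bool:
        # A window is flagged if either signal fires.
        return self.inactive or bool(self.found_keywords)


def check_window(
    output: str,
    idle_seconds: float,
    threshold_seconds: float = 300,  # documented default for Inactivity Threshold
    error_keywords: Sequence[str] = ("error", "failed", "exception"),  # assumed list
) -> BlockerCheck:
    """Flag a window that has been idle too long or whose output contains a keyword."""
    lowered = output.lower()
    # Case-insensitive substring match; the real node may match differently.
    found = [kw for kw in error_keywords if kw.lower() in lowered]
    return BlockerCheck(
        inactive=idle_seconds > threshold_seconds,
        found_keywords=found,
    )
```

For example, `check_window("connection FAILED", idle_seconds=10)` reports the keyword `failed` even though the window is not yet idle beyond the threshold.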

Common scenarios:

  • Monitoring automated agents or scripts running in tmux sessions to ensure they are responsive.
  • Detecting when an agent is stuck due to errors or waiting for user input.
  • Proactively identifying issues before they cause workflow failures.
  • Improving operational efficiency by highlighting agents needing attention.

Practical example:
You have multiple long-running automation agents in tmux sessions. Using this node’s Detect Blockers operation, you can automatically scan all sessions or specific ones, detect if any agent has been inactive beyond a threshold or shows error keywords, and receive actionable insights to resolve those blockages.


Properties

| Name | Meaning |
| --- | --- |
| Target Sessions | Comma-separated list of tmux session names to check. Leave empty to check all sessions. |
| Inactivity Threshold | Number of seconds of inactivity after which an agent is considered blocked. Default: 300. |
| Error Keywords | Comma-separated keywords to search for in tmux window output; a match indicates a potential issue. |
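As an illustration of how these three properties might be normalized before a scan, here is a small sketch. The function names and the dictionary keys (`targetSessions`, `inactivityThreshold`, `errorKeywords`) are hypothetical; only the comma-separated format, the "empty means all sessions" rule, and the 300-second default come from the table above.

```python
from typing import List, Optional

DEFAULT_INACTIVITY_THRESHOLD = 300  # seconds, the documented default


def split_csv(value: str) -> List[str]:
    """Split a comma-separated string, dropping whitespace and empty entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


def normalize_params(
    target_sessions: str = "",
    inactivity_threshold: Optional[int] = None,
    error_keywords: str = "",
) -> dict:
    """Turn raw property values into a normalized dict (key names are assumed)."""
    return {
        # An empty list is taken to mean "check all sessions".
        "targetSessions": split_csv(target_sessions),
        "inactivityThreshold": (
            DEFAULT_INACTIVITY_THRESHOLD
            if inactivity_threshold is None
            else int(inactivity_threshold)
        ),
        "errorKeywords": split_csv(error_keywords),
    }
```

Trimming whitespace around each entry guards against a common cause of "session not found" results, where a name such as `" agent-2"` is passed with a leading space.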

Output

The output JSON structure for the Detect Blockers operation includes:

  • success: Boolean indicating if the operation succeeded.
  • blockersFound: Number of detected blockers.
  • blockers: Array of objects describing each blocker with fields:
    • session: The tmux session name.
    • window: The window name within the session.
    • windowIndex: Index of the window.
    • blockerType: Type of blocker detected (error, repetitive, or waiting).
    • foundKeywords: List of error keywords found in the window content.
    • isRepetitive: Boolean indicating if repetitive output pattern was detected.
    • waitingForInput: Boolean indicating if the agent appears to be waiting for user input.
    • context: Last few lines of window output providing context.
    • suggestedAction: Recommended action string based on detected issues.
  • checkedSessions: Number of sessions checked.
  • timestamp: ISO timestamp of when the check was performed.

This output allows downstream nodes or users to programmatically assess which agents need attention and what kind of problems were detected.
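A downstream consumer might group the reported blockers by type, along these lines. The field names match the structure listed above, but the sample payload is invented purely for illustration (session names, context lines, and suggested actions are placeholders), and `group_blockers` is a hypothetical helper.

```python
import json
from collections import defaultdict

# Made-up sample payload that follows the documented field names.
SAMPLE_RESULT = """
{
  "success": true,
  "blockersFound": 2,
  "blockers": [
    {"session": "agent-1", "window": "main", "windowIndex": 0,
     "blockerType": "error", "foundKeywords": ["error"],
     "isRepetitive": false, "waitingForInput": false,
     "context": "placeholder context", "suggestedAction": "placeholder action"},
    {"session": "agent-2", "window": "worker", "windowIndex": 1,
     "blockerType": "waiting", "foundKeywords": [],
     "isRepetitive": false, "waitingForInput": true,
     "context": "placeholder context", "suggestedAction": "placeholder action"}
  ],
  "checkedSessions": 2,
  "timestamp": "2024-01-01T00:00:00.000Z"
}
"""


def group_blockers(result: dict) -> dict:
    """Map each blockerType to a list of 'session:windowIndex' targets."""
    if not result.get("success"):
        raise ValueError("Detect Blockers did not succeed")
    grouped = defaultdict(list)
    for blocker in result.get("blockers", []):
        target = f'{blocker["session"]}:{blocker["windowIndex"]}'
        grouped[blocker["blockerType"]].append(target)
    return dict(grouped)


if __name__ == "__main__":
    print(group_blockers(json.loads(SAMPLE_RESULT)))
```

Grouping by `blockerType` makes it straightforward to route each kind of problem differently, for instance restarting agents with errors while notifying a person about agents waiting for input.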


Dependencies

  • Requires access to a tmux orchestrator API or environment capable of interacting with tmux sessions.
  • Optionally uses credentials containing configuration such as external script directories or project base paths.
  • No external API keys are required, but the node needs permission to query tmux sessions and capture window content.

Troubleshooting

  • Common issues:

    • Failure to connect or retrieve tmux sessions if the orchestrator API or environment is misconfigured.
    • Empty or incomplete session/window data if target sessions do not exist or are misspelled.
    • False negatives if error keywords do not match actual error messages in tmux windows.
    • Performance delays if many sessions/windows are scanned or large amounts of window content are captured.
  • Error messages:

    • "Failed to detect blockers: <message>" indicates an issue during blocker detection, often related to tmux session retrieval or content capture.
    • "Health check failed: <message>" or similar errors in other operations suggest connectivity or permission problems.
  • Resolutions:

    • Verify tmux orchestrator API credentials and connectivity.
    • Confirm target session names are correct or leave empty to scan all.
    • Adjust error keywords and inactivity threshold to better fit your environment.
    • Ensure the node has sufficient execution time and resources for scanning.
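To rule out misspelled session names, one option is to compare the requested names against what tmux itself reports. The sketch below assumes the `tmux` binary is available on the machine running the check and calls it directly rather than going through the orchestrator API; the function names are hypothetical.

```python
import subprocess
from typing import Iterable, List


def list_tmux_sessions() -> List[str]:
    """Return the names of all sessions on the default tmux server.

    Raises subprocess.CalledProcessError if tmux exits with an error,
    which also happens when no tmux server is running.
    """
    completed = subprocess.run(
        ["tmux", "list-sessions", "-F", "#{session_name}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line for line in completed.stdout.splitlines() if line]


def missing_sessions(requested: Iterable[str], existing: Iterable[str]) -> List[str]:
    """Return requested session names that tmux does not report, in input order."""
    known = set(existing)
    return [name for name in requested if name not in known]
```

Calling `missing_sessions(["agent-1", "agnet-2"], list_tmux_sessions())` would surface a typo such as `agnet-2` before the scan returns empty data for it.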
