AI Agent Mode

Wire up an AI agent to control desktop apps. All commands return JSON by default.

The loop

Every AI agent should follow this loop:

1. Snapshot  →  read the UI
2. Decide    →  which element to interact with
3. Act       →  one action at a time
4. Verify    →  re-snapshot or get-value
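
The loop above can be sketched as a small shell wrapper. This is an illustrative sketch, not part of agent-click: the function name `act_and_verify` and the `AGENT_CLICK` override variable are hypothetical, and it assumes the CLI exits non-zero on failure, which this guide does not state. It uses only the snapshot and click invocations shown on this page, and leaves the decide step to the agent.

```shell
# Hypothetical helper: one pass through snapshot -> act -> verify.
# AGENT_CLICK is an assumed override so a different binary (or a stub) can be used.
AGENT_CLICK="${AGENT_CLICK:-agent-click}"

# usage: act_and_verify APP SELECTOR
act_and_verify() {
  app="$1"
  selector="$2"

  # 1. Snapshot: read the UI (interactive elements, compact output).
  "$AGENT_CLICK" snapshot -a "$app" -i -c || return 1

  # 2. Decide: done by the agent; here the chosen selector is passed in.

  # 3. Act: one action at a time.
  "$AGENT_CLICK" click "$selector" -a "$app" || return 1

  # 4. Verify: re-snapshot so the agent sees the new state.
  "$AGENT_CLICK" snapshot -a "$app" -i -c
}
```

For instance, `act_and_verify Calculator 'id="Seven"'` would print a snapshot, click the key, and print a fresh snapshot for the agent to inspect.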

Example: agent automates Calculator

# 1. Snapshot
agent-click snapshot -a Calculator -i -c

# 2. Decide
# → read the output: @e2 is "All Clear", @e5 is "7", etc.

# 3. Act (one click at a time)
agent-click click 'id="AllClear"' -a Calculator
agent-click click 'id="Seven"' -a Calculator
agent-click click 'id="Multiply"' -a Calculator
agent-click click 'id="Eight"' -a Calculator
agent-click click 'id="Equals"' -a Calculator

# 4. Verify
agent-click text -a Calculator | grep -A1 "Edit field"
# → 56
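
The repeated clicks in this example could be wrapped in a loop. The following is a sketch under assumptions: `click_ids` and the `AGENT_CLICK` override are hypothetical names, and stopping on the first failure relies on the CLI exiting non-zero when a click fails, which this guide does not confirm.

```shell
# Assumed override so a different binary (or a stub) can be used.
AGENT_CLICK="${AGENT_CLICK:-agent-click}"

# Hypothetical helper: click a sequence of elements by id, one at a time.
# usage: click_ids APP ID [ID...]
click_ids() {
  app="$1"
  shift
  for id in "$@"; do
    # Builds the same selector form used in the example, e.g. id="Seven".
    "$AGENT_CLICK" click "id=\"$id\"" -a "$app" || return 1
  done
}
```

With this in place, the act step for 7 × 8 would be `click_ids Calculator AllClear Seven Multiply Eight Equals`.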

Integration with Claude/GPT

Add to your system prompt or CLAUDE.md:

You have access to `agent-click` for computer use. See SKILL.md for the full guide.

Core workflow: snapshot → identify refs → act → re-snapshot to verify.

Key commands:

- agent-click snapshot -a App -i -c # see interactive elements
- agent-click click @e5 # click by ref
- agent-click type "text" -s @e3 # type into element
- agent-click key cmd+k -a Slack # press key combo
- agent-click text -a App # read all text
- agent-click get-value @e5 # read element value

JSON output

All commands return JSON by default:

{ "success": true, "message": "pressed \"7\" at (453, 354)" }

Errors:

{
  "error": true,
  "type": "element_not_found",
  "message": "no element found matching selector chain"
}
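
A calling script can branch on the error shape above. The helpers below are a rough sketch with hypothetical names (`is_agent_error`, `agent_error_type`); they pattern-match the text rather than parse JSON, so a real JSON parser such as jq would be more robust where it is available.

```shell
# Hypothetical helper: exit 0 if the JSON on stdin contains "error": true.
# Pattern match only; this is not a JSON parser.
is_agent_error() {
  grep -Eq '"error"[[:space:]]*:[[:space:]]*true'
}

# Hypothetical helper: print the value of "type" from an error object on stdin.
# Assumes the type is a plain string with no escaped quotes.
agent_error_type() {
  sed -n 's/.*"type"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}
```

An agent wrapper might save a command's output to a variable, pipe it to `is_agent_error`, and re-snapshot when `agent_error_type` prints `element_not_found`.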

Elements:

{
  "role": "button",
  "name": "Submit",
  "value": null,
  "position": { "x": 450, "y": 320 },
  "enabled": true
}

Tips for agents

- Always snapshot before acting: you can't interact with what you can't see.
- Re-snapshot after every action, because refs go stale.
- Use the -i -c flags on every snapshot; they reduce output by about 10x.
- Use get-value rather than text to check state; it is faster and targeted.
- Use id= selectors when available; they are the most stable.
- Add a 2-3 second sleep after page navigations.
- Use --expect on clicks for built-in verification.
- For Electron apps, CDP is automatic; no extra flags are needed.
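
Instead of a fixed sleep, a script could poll get-value until the expected state appears. This is a sketch only: `wait_for_value` and the `AGENT_CLICK` override are hypothetical, and because the exact shape of get-value output is not shown here, the helper looks for the expected string anywhere in the raw output. Since refs go stale, take a fresh snapshot before reusing a ref from an earlier step.

```shell
# Assumed override so a different binary (or a stub) can be used.
AGENT_CLICK="${AGENT_CLICK:-agent-click}"

# Hypothetical helper: poll get-value until its output contains EXPECTED.
# usage: wait_for_value REF EXPECTED [TRIES] [DELAY_SECONDS]
wait_for_value() {
  ref="$1"
  expected="$2"
  tries="${3:-5}"
  delay="${4:-2}"

  i=0
  while [ "$i" -lt "$tries" ]; do
    # grep -F treats EXPECTED as a literal string, not a regex.
    if "$AGENT_CLICK" get-value "$ref" | grep -qF -- "$expected"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_for_value @e5 56` would retry up to five times, two seconds apart, and return non-zero if the value never shows up.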
