AI Agent Mode
Wire up an AI agent to control desktop apps. JSON output by default.
The loop
Every AI agent should follow this loop:
1. Snapshot → read the UI
2. Decide → which element to interact with
3. Act → one action at a time
4. Verify → re-snapshot or get-value
Example: agent automates Calculator
# 1. Snapshot agent-click snapshot -a Calculator -i -c # → reads output, finds @e2 is "All Clear", @e5 is "7", etc. # 2. Act agent-click click 'id="AllClear"' -a Calculator agent-click click 'id="Seven"' -a Calculator agent-click click 'id="Multiply"' -a Calculator agent-click click 'id="Eight"' -a Calculator agent-click click 'id="Equals"' -a Calculator # 3. Verify agent-click text -a Calculator | grep -A1 "Edit field" # → 56
Integration with Claude/GPT
Add to your system prompt or CLAUDE.md:
You have access to `agent-click` for Computer use. See SKILL.md for the full guide. Core workflow: snapshot → identify refs → act → re-snapshot to verify. Key commands: - agent-click snapshot -a App -i -c # see interactive elements - agent-click click @e5 # click by ref - agent-click type "text" -s @e3 # type into element - agent-click key cmd+k -a Slack # press key combo - agent-click text -a App # read all text - agent-click get-value @e5 # read element value
JSON output
All commands return JSON by default:
{ "success": true, "message": "pressed \"7\" at (453, 354)" }Errors:
{
"error": true,
"type": "element_not_found",
"message": "no element found matching selector chain"
}Elements:
{
"role": "button",
"name": "Submit",
"value": null,
"position": { "x": 450, "y": 320 },
"enabled": true
}Tips for agents
- Always snapshot before acting — you can't interact with what you can't see
- Re-snapshot after every action — refs go stale
- Use
-i -cflags on every snapshot — reduces output by 10x - Use
get-valueto check state, nottext(faster, targeted) - Use
id=selectors when available — most stable - Add
sleep 2-3after page navigations - Use
--expecton clicks for built-in verification - For Electron apps, CDP is automatic — no extra flags needed