Automating a Godot game with Claude Code via Parsec
by Fireal Software · ~7 min read
This post walks through a non-obvious use case for eyehands: giving Claude Code the ability to play a Godot game running on a different Windows machine, accessed over Parsec remote desktop. It’s a good test case because it hits two pain points that break most Python automation libraries:
- Parsec captures the pointer. Any mouse input that goes through the legacy `mouse_event` API silently disappears inside the Parsec client window; the pointer lock never sees it.
- Godot renders its HUD on a canvas, so there is no UI Automation tree. OCR is the only text-based path to on-screen elements.
Most tools choke on one of these. Some choke on both. eyehands handles both.
The setup
- Host machine (gaming PC): Windows 11, running the Godot game I want Claude to play. Parsec host mode enabled.
- Client machine (my laptop): Windows 11, running Parsec client, connected to the host. Claude Code running on this laptop. eyehands server running on the client — the laptop is where Claude will “click”.
The key insight: Parsec sends my local input through to the host, but only if my local input goes through the Raw Input pipeline. PyAutoGUI’s mouse_event path doesn’t — it injects into the legacy message queue, which Parsec doesn’t read from. eyehands’ SendInput-based path does, and Parsec passes it through to the host machine seamlessly.
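The difference is visible at the Win32 level. Here is a minimal sketch (not eyehands' actual code) of what a SendInput-based relative move looks like via ctypes. The struct layout is standard Win32; the union inside `INPUT` is collapsed to its mouse member for brevity, and the actual `SendInput` call only works on Windows:

```python
import ctypes
from ctypes import wintypes

MOUSEEVENTF_MOVE = 0x0001
INPUT_MOUSE = 0
ULONG_PTR = ctypes.c_size_t  # pointer-sized integer


class MOUSEINPUT(ctypes.Structure):
    _fields_ = [
        ("dx", wintypes.LONG),
        ("dy", wintypes.LONG),
        ("mouseData", wintypes.DWORD),
        ("dwFlags", wintypes.DWORD),
        ("time", wintypes.DWORD),
        ("dwExtraInfo", ULONG_PTR),
    ]


class INPUT(ctypes.Structure):
    # The real INPUT struct holds a union of mouse/keyboard/hardware members;
    # only the mouse member is shown here.
    _fields_ = [
        ("type", wintypes.DWORD),
        ("mi", MOUSEINPUT),
    ]


def build_relative_move(dx: int, dy: int) -> INPUT:
    """Build an INPUT struct describing a relative mouse move."""
    return INPUT(
        type=INPUT_MOUSE,
        mi=MOUSEINPUT(dx=dx, dy=dy, mouseData=0,
                      dwFlags=MOUSEEVENTF_MOVE, time=0, dwExtraInfo=0),
    )


def send_move(dx: int, dy: int) -> None:
    """Windows only: inject the move via the Raw Input pipeline."""
    inp = build_relative_move(dx, dy)
    # SendInput feeds Raw Input, which Parsec reads. The legacy
    # user32.mouse_event() call injects into the message queue instead,
    # which is exactly the path Parsec never sees.
    ctypes.windll.user32.SendInput(1, ctypes.byref(inp), ctypes.sizeof(INPUT))
```

The only difference between "works through Parsec" and "silently dropped" is which of those two user32 entry points the automation library calls.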
Step 1: start eyehands on the laptop
```
pip install eyehands
eyehands --install-skill
eyehands
```
Server on http://127.0.0.1:7331. Claude Code picks up the installed SKILL.md automatically.
Step 2: launch Parsec and the game
Parsec client full-screen, connected to the host. Godot game running in the host’s foreground. Claude Code running in a terminal window on top of Parsec (or in a separate monitor).
Step 3: prompt Claude
I told Claude:
“The Godot game ‘ChipBattle’ is running in the Parsec window. Use eyehands to play it. The goal is to collect 10 coins and then exit back to the menu. Start by taking a screenshot to see the current state.”
What Claude did
1. **Screenshot the Parsec window** via `/screenshot`. The returned JPEG was the full 1920×1080 frame, showing the Godot game's current state: a character on a tiled floor, three coins visible, and a HUD reading "Coins: 0" in the top-left.
2. **OCR the coin counter** via `/find?text=Coins`. eyehands returned coordinates for the "Coins: 0" text, so Claude knew where the HUD was and could track progress.
3. **Move toward the first coin.** Claude sent POST `/smooth_move` with a relative delta and step count. Parsec captured the mouse move, sent it to the host, and the Godot character turned toward the coin. (Because Parsec and Godot both use Raw Input, the smooth movement translated cleanly to in-game character rotation.)
4. **Click to collect.** POST `/click` sent a left mouse button event. The character swung its sword, hit the coin, and the HUD updated to "Coins: 1".
5. **Verify the state changed.** Claude sent GET `/latest` with an If-None-Match header carrying the previous frame's hash. When the HUD changed, a 200 came back; Claude re-OCRed the counter, confirmed "Coins: 1", and moved on.
6. **Repeat until "Coins: 10".** After each collection, Claude polled `/latest` with If-None-Match until the HUD showed a new number.
7. **Press Escape to exit.** POST `/key` with `{"vk": 27}`. Parsec passed the Escape through to the host, and Godot paused and showed the menu.
8. **Find "Main Menu" via OCR.** `/find?text=Main Menu` returned coordinates, and Claude clicked them via `/click_at`.
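The "find text, then click its center" step is easy to script directly against the server. The endpoint paths below come from this post, but the JSON shape of the `/find` response and the `/click_at` body are assumptions for illustration:

```python
import json
from urllib import parse, request

BASE = "http://127.0.0.1:7331"


def find_url(text: str) -> str:
    """URL for the /find OCR endpoint, with the query text properly encoded."""
    return f"{BASE}/find?" + parse.urlencode({"text": text})


def center_of(match: dict) -> tuple:
    """Click point for one assumed /find match: {"left","top","width","height"}."""
    return (match["left"] + match["width"] // 2,
            match["top"] + match["height"] // 2)


def click_menu_entry(label: str) -> None:
    """Locate `label` via OCR and click its center (requires a running server)."""
    with request.urlopen(find_url(label)) as resp:
        matches = json.loads(resp.read())
    x, y = center_of(matches[0])
    body = json.dumps({"x": x, "y": y}).encode()
    req = request.Request(f"{BASE}/click_at", data=body,
                          headers={"Content-Type": "application/json"})
    request.urlopen(req)
```

Note the URL encoding: a literal space in `?text=Main Menu` is tolerated by some servers but `text=Main+Menu` is the safe form.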
Total run time: about 4 minutes. Total token cost: ~6,000 input + ~2,000 output. About $0.12 on Sonnet.
What would have failed without eyehands
PyAutoGUI: the smooth mouse moves would have been ignored by Parsec. The character would never have turned to face the coins.
AutoHotkey with default Send: same problem — legacy input pipeline, Parsec doesn’t see it. AHK has a SendInput mode, but you have to know to use it.
Naive screenshot-only Claude: ~15–20 screenshots per coin collected = ~150 screenshots per run × 1500 tokens each = ~225,000 image tokens. Probably $3+ per run instead of $0.12.
The non-obvious details
Frame-hash polling during movement. Between sending a mouse move and seeing the game react, Claude polled /latest with If-None-Match every 100ms. When the game frame changed (character moved, coin disappeared, HUD updated), the poll returned 200 and Claude could react. While the game was idle (waiting for input), polls returned 304 and cost nothing.
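The polling pattern itself is only a few lines. This sketch abstracts the HTTP call behind a `fetch` callable so the logic is clear and testable; in practice `fetch` would be something like `requests.get(f"{base}/latest", headers={"If-None-Match": etag})`:

```python
import time


def poll_until_changed(fetch, etag, timeout_s=5.0, interval_s=0.1):
    """Poll a conditional GET until the frame hash changes.

    `fetch(etag)` models GET /latest with an If-None-Match header and must
    return (status, new_etag): 304 while the frame is unchanged, 200 with a
    fresh hash once it changes. Returns the new etag, or None on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status, new_etag = fetch(etag)
        if status == 200:
            return new_etag  # frame changed: caller re-screenshots / re-OCRs
        time.sleep(interval_s)  # 304: nothing changed, keep waiting cheaply
    return None
```

The 304 branch is what keeps idle waits nearly free: no image bytes move and no image tokens are spent until the frame actually changes.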
OCR caching on unchanged HUDs. When Claude called /find?text=Coins twice in quick succession without the HUD changing, the second call returned the cached result instantly. EasyOCR’s 3-second cold-start load is paid once; subsequent OCR is effectively free.
Smooth moves for analog input. Godot characters using mouse-based aim expect smooth analog input, not a teleport. POST /smooth_move with {"dx": 200, "dy": 0, "steps": 20, "delay_ms": 5} produces a 100ms smooth rotation. A single-shot POST /move would have registered as a 200-pixel teleport and probably been filtered by the game.
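Under a payload like that, the server presumably walks the delta in small increments with a delay between each. Here is a sketch (an assumption about the implementation, not eyehands' actual code) of how `{"dx": 200, "dy": 0, "steps": 20}` might decompose into per-step deltas, using cumulative rounding so the total stays exact even when the steps don't divide evenly:

```python
def smooth_steps(dx: int, dy: int, steps: int) -> list:
    """Split one relative move into per-step deltas that sum exactly to (dx, dy)."""
    out, sent_x, sent_y = [], 0, 0
    for i in range(1, steps + 1):
        # Target cumulative position after step i; emit the difference from
        # what has already been sent, so rounding error never accumulates.
        tx, ty = dx * i // steps, dy * i // steps
        out.append((tx - sent_x, ty - sent_y))
        sent_x, sent_y = tx, ty
    return out
```

For the payload in the post this yields twenty (10, 0) deltas, one every 5 ms, which a mouse-aim game reads as a smooth analog sweep rather than a single 200-pixel jump.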
Where it got weird
OCR-on-game-HUD can be noisy. Godot’s default bitmap fonts aren’t the easiest for EasyOCR to read reliably. I had a few false positives where the OCR returned coordinates for text that wasn’t actually “Coins”. The fix was to add bounding box constraints to /find — restricting OCR to the top-left 300×100 pixel region where the HUD actually is.
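I won't assume the exact shape of `/find`'s region parameter here, but the same constraint can always be applied client-side by filtering the returned matches. The match box shape (`left`/`top`/`width`/`height`) is an assumption:

```python
def within_region(match: dict, region: tuple) -> bool:
    """True if an OCR match box lies entirely inside region = (left, top, width, height)."""
    rl, rt, rw, rh = region
    return (match["left"] >= rl and match["top"] >= rt
            and match["left"] + match["width"] <= rl + rw
            and match["top"] + match["height"] <= rt + rh)


HUD_REGION = (0, 0, 300, 100)  # top-left 300x100 px where the HUD actually is


def filter_hud_matches(matches: list) -> list:
    """Discard OCR false positives that fall outside the HUD region."""
    return [m for m in matches if within_region(m, HUD_REGION)]
```

Server-side cropping is still preferable when available, since restricting the OCR input region also makes the OCR pass itself faster and less noisy.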
DirectInput vs RawInput in games. Some games use DirectInput specifically (older titles, some emulators). eyehands’ SendInput path works for RawInput and the legacy message queue but not for DirectInput directly. In practice Parsec converts all host-side input to Raw Input before the game sees it, so this only matters if you’re running Claude on the same machine as the game.
The bigger point
The real story here isn’t “Claude can play games now”. It’s that the same automation pattern works across a broad class of apps that most tools can’t touch: games, remote desktops, full-screen browsers, DirectX overlays, anything that lives inside a pointer-lock capture.
If you’ve wanted to automate something like that and been blocked by PyAutoGUI silently doing nothing, eyehands is probably the fix.
Install
```
pip install eyehands
eyehands --install-skill
eyehands
```
Links
- eyehands repo: https://github.com/shameindemgg/eyehands
- Parsec: https://parsec.app/
- Godot: https://godotengine.org/
*ChipBattle is a game I wrote myself for this test. If you want a repro, the source is in `examples/godot_hud_reader.py` in the eyehands repo — it reads any text-based HUD and returns coordinates, not just ChipBattle specifically.*
Give Claude eyes and hands on Windows
eyehands is a local HTTP server for screen capture, mouse control, and keyboard input. Open source with a Pro tier.