Give your AI eyes, hands, and a real iPhone.
Tell your AI to send a message, test a login flow, or explore an app — it sees the screen, taps what it needs, and figures the rest out. An MCP server for macOS iPhone Mirroring, compatible with any MCP client.
Every interaction follows the same loop — observe, reason, act.
describe_screen returns every text element on the iPhone screen with exact tap coordinates.
Your LLM reads the screen, decides what to do next, and picks the right tool.
tap, type_text, swipe — the AI executes the action, then loops back to observe.
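For intuition, a describe_screen result can be pictured like this (a hypothetical sketch only — the field names and shape are illustrative, not the server's actual schema):

```json
{
  "elements": [
    { "text": "Sign In", "x": 196, "y": 642 },
    { "text": "Forgot password?", "x": 196, "y": 690 }
  ]
}
```

The model picks an element by its text, then passes its coordinates to tap.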
$ brew tap jfarcand/tap && brew install mirroir-mcp

Or via npx:

$ npx -y mirroir-mcp install

Or via shell script:

$ /bin/bash -c "$(curl -fsSL https://mirroir.dev/get-mirroir.sh)"

Once installed, paste prompts like the ones below into Claude Code, ChatGPT, Cursor, or any MCP client.
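Most MCP clients register a server with a stdio entry in their config file. A minimal sketch (the "mirroir" key name is arbitrary, the "mcpServers" shape is the convention used by Claude Desktop and similar clients, and the exact command depends on your install method — this assumes the Homebrew build puts a mirroir-mcp binary on PATH that starts in stdio server mode):

```json
{
  "mcpServers": {
    "mirroir": {
      "command": "mirroir-mcp"
    }
  }
}
```

Check your client's documentation for where this config file lives.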
"Open Calendar, create a Dentist event next Tuesday at 2pm"
The AI launches Calendar, taps "+", fills in title, date, and time, then saves. Handles confirmation dialogs automatically.
"Open Messages, find Alice, and send 'running 10 min late'"
The AI opens Messages, scrolls to find Alice's conversation, taps the text field, types the message, and hits Send.
"Test the login screen with test@example.com / password123"
The AI opens the app, taps Email, types the address, taps Password, types it, taps Sign In, and screenshots the result.
"Start recording, open Settings, scroll to General > About, stop recording"
The AI starts a video capture, navigates through Settings menus, then stops recording and returns the file path.
When you need deterministic, repeatable testing, mirroir provides a full pipeline. Point it at any app — it autonomously discovers every reachable screen using BFS graph traversal (screens are nodes, taps are edges), then outputs a bundle of ready-to-run SKILL.md files. Edit them, test them from the CLI, diagnose failures with --agent.
A single generate_skill(action: "explore") call runs autonomous BFS traversal — exploring each screen breadth-first, replaying paths to reach child screens, building a navigation graph of the entire app.
generate_skill(action: "explore", app: "Settings")

Edit the generated skill or author one from scratch — numbered steps, no coordinates.

$ vim login.md

The AI reads the skill via get_skill, executes each step with MCP tools, and auto-compiles coordinates at the end.

get_skill → record_step → save_compiled

Replay with zero OCR — pure input injection. A 10-step skill drops from 5+ seconds of OCR to under a second.

$ mirroir test login

When a step fails, --agent compares expected vs. actual OCR and tells you the root cause and fix.

$ mirroir test --agent login

Agent diagnosis runs in two tiers: deterministic OCR analysis first (free, no API key), then optionally an AI model for richer analysis. Supports Anthropic, OpenAI, local Ollama, and CLI agents.
When you find yourself repeating the same agent workflow, capture it as a skill. Skills are SKILL.md files — numbered steps the AI follows, adapting to layout changes and unexpected dialogs.
Steps like Tap "Email" use OCR, not coordinates.
Share them on the community repository.
The mirroir-skills repository is an open collection of ready-made SKILL.md files — login flows, cross-app workflows, settings automation, and more.
${VAR} placeholders resolve from environment variables, so the same skill works across accounts and devices.
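A skill in this style might look like the following (a hypothetical example: the app name, step phrasing, and the TEST_EMAIL / TEST_PASSWORD variables are all illustrative, not taken from the real repository):

```markdown
# Login flow

1. Launch the app "MyApp"
2. Tap "Email"
3. Type "${TEST_EMAIL}"
4. Tap "Password"
5. Type "${TEST_PASSWORD}"
6. Tap "Sign In"
7. Verify "Welcome" is visible
```

Because steps name on-screen text rather than coordinates, the same file keeps working when the layout shifts.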
Install the full library as a Claude Code plugin*:
$ claude plugin add jfarcand/mirroir-skills

* Also supported by GitHub Copilot.
32 tools, exposed through a single MCP server.
tap: Tap at screen coordinates
double_tap: Double-tap for zoom or text selection
long_press: Hold for context menus
swipe: Quick flick between two points
drag: Slow drag for sliders and icons
type_text: Type text via virtual keyboard
press_key: Send special keys with modifiers
shake: Shake gesture for undo or dev menus
screenshot: Capture screen as PNG
describe_screen: OCR with tap coordinates
start_recording: Begin video recording
stop_recording: End recording, get file path
get_orientation: Portrait or landscape
status: Connection and device readiness
check_health: Full setup diagnostic
calibrate_component: Test UI component definitions against live screen
list_targets: List configured automation targets
launch_app: Open app by name via Spotlight
open_url: Open URL in Safari
press_home: Return to home screen
press_app_switcher: Show recent apps
spotlight: Open Spotlight search
scroll_to: Scroll until element visible via OCR
reset_app: Force-quit app via App Switcher
set_network: Toggle airplane, Wi-Fi, cellular
measure: Time screen transitions
generate_skill: Autonomous BFS exploration → navigation graph → SKILL.md bundle
list_skills: List available skills
get_skill: Read skill with env substitution + compilation status
record_step: Record a compiled step during execution
save_compiled: Save compiled .json for zero-OCR replay
switch_target: Switch active automation target

Any MCP client that supports stdio transport — plug into your editor or build your own agent.
Giving an AI access to your phone demands defense in depth. mirroir-mcp is fail-closed at every layer.
Without a config file, only read-only tools (screenshot, describe_screen) are exposed. Mutating tools are hidden from the MCP client entirely — it never sees them.
blockedApps in permissions.json prevents the AI from interacting with sensitive apps like Wallet or Banking — even if mutating tools are allowed.
Runs as a regular user process using the macOS CGEvent API. No daemons, no kernel extensions, no root privileges — just Accessibility permissions.
{
"allow": ["tap", "swipe", "type_text", "press_key", "launch_app"],
"deny": [],
"blockedApps": ["Wallet", "Banking"]
}
Drop this in ~/.mirroir-mcp/permissions.json to control exactly which tools your AI agent can use.

Close iPhone Mirroring to kill all input instantly.
Is this safe?

Without a config file, only read-only tools are exposed. Mutating tools require explicit opt-in. Use blockedApps in permissions.json to deny access to sensitive apps. Closing iPhone Mirroring kills all input immediately.
Why does iPhone Mirroring need to stay frontmost?

macOS routes HID input to the frontmost app. The server must activate iPhone Mirroring before each input. Put it in a separate macOS Space to keep your workspace undisturbed.
Does it work with any iPhone app?

Yes. It operates at the screen level through iPhone Mirroring — no source code, SDK, or jailbreak required. If you can see it on screen, the AI can interact with it.
Does it need root access or a kernel extension?

No. All input (touch and keyboard) is delivered via the macOS CGEvent API, which only requires Accessibility permissions. No kernel extensions, no root privileges, no helper daemons.
Can it automate macOS apps too?

Experimental. You can add targets in .mirroir-mcp/targets.json pointing at macOS windows. The same tools work, but with limitations — both the MCP client and target window must be in the same macOS Space, and iPhone-specific tools (press_home, app_switcher, spotlight) don't apply. See the README for details.
Can I restrict which tools the AI can use?

Yes. Drop a permissions.json with allow and deny lists. Tools not in the allow list are hidden from the MCP client entirely.