I've just been discovering this pattern too, and it's made a huge difference. Trying to get Claude to remote-control an app for testing through the other available means was miserable and unreliable.

I got it to build an MCP server into the app that accepts commands, so Claude can interact with the app as if it were a user, including sending keypresses and grabbing screenshots. The difference was immediate and really beneficial.
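To make the idea concrete, here's a rough sketch of the kind of user-level command surface I mean. It deliberately does not use a real MCP SDK: `ToyApp`, `handle_command`, and the JSON command names (`press_key`, `screenshot`) are all made up for illustration, and the "screenshot" is just a text rendering instead of image bytes. A real setup would wrap something like this in actual MCP tools.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ToyApp:
    """Hypothetical stand-in for the real application under test."""
    text: str = ""
    log: list = field(default_factory=list)

    def press_key(self, key: str) -> None:
        # Only user-level input is modeled: single characters, Backspace, Enter.
        self.log.append(key)
        if key == "Backspace":
            self.text = self.text[:-1]
        elif key == "Enter":
            self.text += "\n"
        elif len(key) == 1:
            self.text += key

    def screenshot(self) -> str:
        # A real app would return encoded image bytes; this is a text rendering.
        return f"[screen]\n{self.text}"


def handle_command(app: ToyApp, raw: str) -> str:
    """Dispatch one JSON command (hypothetical format) and return a JSON reply."""
    try:
        msg = json.loads(raw)
    except json.JSONDecodeError:
        return json.dumps({"ok": False, "error": "invalid JSON"})
    if not isinstance(msg, dict):
        return json.dumps({"ok": False, "error": "expected a JSON object"})

    command = msg.get("command")
    if command == "press_key":
        key = msg.get("key")
        if not isinstance(key, str) or not key:
            return json.dumps({"ok": False, "error": "missing key"})
        app.press_key(key)
        return json.dumps({"ok": True})
    if command == "screenshot":
        return json.dumps({"ok": True, "image": app.screenshot()})

    # Anything else (script evaluation, direct state access) is deliberately
    # not exposed, so the agent can only do what a user could do.
    return json.dumps({"ok": False, "error": f"unknown command: {command}"})
```

The agent sends something like `{"command": "press_key", "key": "h"}` and then asks for a `screenshot` to see what actually happened. Keeping the command list this small is the point: there is no back door into the app's internals.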

Visual issues were previously one of the things it tended to struggle with.

▲ behehebd 13 hours ago | parent [-]

How does it compare to my go-to: a test suite that uses Playwright?

> Claude implement plan.md until all unit and browser tests pass

	▲	kybernetikos 13 hours ago | parent [-]
		I assume this depends on the app, and it's quite possible that your approach is best in some cases. In my case I started with something somewhat like Playwright, and Claude had a habit of interacting with the app more directly than a user could, so it missed problems a user would run into. Forcing it to interact by pressing keys rather than delving into the DOM or executing arbitrary JavaScript helped. In particular, I wanted to be able to chat with it as it tried things interactively. This is more to help with manual or exploratory testing than classic automated testing. My current app is a desktop app, so Playwright isn't as applicable.