eterps 19 hours ago

Would love to hear more about this approach.

tptacek 18 hours ago | parent [-]

It's actually really easy in Claude Code. Get a TUI to the point where it renders something, and get Claude to the point where it knows what you want to render (draw it in ASCII like this post proposes, for instance).

Then just prompt Claude to "use tmux to interact with and test the TUI rendering", prompt it through anything it gets hung up on (for instance, you might remind Claude that it can create a tmux pane with fixed size, or that tmux has a capture-pane feature to dump the contents of a view). Claude already knows a bunch about tmux.

Once it gets anything useful done, ask it to "write a subagent definition for a TUI tester that uses tmux to exercise a TUI and test its rendering, layout, and interaction behavior".

Save that subagent definition, and now Claude can do closed-loop visual and interactive testing of its own TUI development.
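As a rough illustration of the tmux loop described above (a fixed-size pane, keystrokes in, a text dump of the screen out), here is a minimal Python sketch. The helper names, the session name, and the 80x24 default size are assumptions made up for this example; only the tmux subcommands and flags (`new-session -d -s -x -y`, `send-keys -t`, `capture-pane -p -t`, `kill-session -t`) are real tmux features.

```python
import subprocess


def new_session_cmd(session, program, cols=80, rows=24):
    """Start `program` detached in a session with a fixed size."""
    return ["tmux", "new-session", "-d", "-s", session,
            "-x", str(cols), "-y", str(rows), program]


def send_keys_cmd(session, *keys):
    """Type keys into the session, e.g. "j", "Enter", "C-c"."""
    return ["tmux", "send-keys", "-t", session, *keys]


def capture_cmd(session):
    """Print the visible pane contents to stdout as plain text."""
    return ["tmux", "capture-pane", "-p", "-t", session]


def kill_session_cmd(session):
    """Tear the session down once the check is finished."""
    return ["tmux", "kill-session", "-t", session]


def run(cmd):
    """Run one tmux command and return whatever it printed."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


def exercise(program, keys, session="tui-test"):
    """Hypothetical driver: launch, press keys, return the final screen.

    A real tester would wait or poll between steps so the TUI has time
    to redraw before the capture.
    """
    run(new_session_cmd(session, program))
    try:
        for key in keys:
            run(send_keys_cmd(session, key))
        return run(capture_cmd(session))
    finally:
        run(kill_session_cmd(session))
```

Calling something like `exercise("htop", ["q"])` would need tmux installed and is only meant to show the shape of the loop; the command builders are kept separate so they can be inspected without running tmux.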

electroly 17 hours ago | parent [-]

Can you explain tmux's contribution here? I'm confused why this process wouldn't work just the same if CC directly executed the program rather than involving tmux. Are you just using tmux to trick the program under test into running its TUI instead of operating in a dumb-stdout mode?

tptacek 17 hours ago | parent [-]

It allows Claude to take screenshots and generate keyboard inputs. It's like TUI Playwright.

mrstackdump 17 hours ago | parent | next [-]

Maybe I'm not understanding it (totally possible!) but could Claude just do that by reading standard out and writing to standard in?

tptacek 16 hours ago | parent | next [-]

I had a really hard time getting anything like that to work (you can't just read stdout and write stdin, because you're driving a terminal in raw mode), but it took like 3 sentences worth of Claude prompt to get Claude to use tmux to do this reliably.

alehlopeh 16 hours ago | parent | next [-]

I tell Claude Code to use an existing tmux session to interact with, e.g., a Rails console, and it uses tmux send-keys and capture-pane for I/O. It gets tripped up if a pager is invoked, but otherwise it works pretty well. Didn’t occur to me to tell it to take screenshots.

tptacek 16 hours ago | parent [-]

`tmux capture-pane`.

mrstackdump 16 hours ago | parent | prev [-]

I would love to see your prompt if you ever post it anywhere.

_sinelaw_ 10 hours ago | parent [-]

For Claude, it's enough to prompt "use tmux to test"; that usually works out of the box. If colors are important, I also add "use -e option with capture-pane to see colors". It just works. I used it regularly with Claude and my TUI. For agents other than Claude, I need a more specific set of instructions ("use send-keys, capture-pane and mouse control via tmux", etc.).

Since I have e2e tests, I only use the agent for two things: guiding it on how to write an e2e test ("use tmux to try the new UI and then write a test"), or evaluating overall usability (fake user testing, before actual user testing): "use tmux to evaluate the feature X and compile a list of usability issues".
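To make the `-e` remark concrete: with `-e`, `capture-pane` keeps the escape sequences for text attributes in its output, so a check can look at colors, but plain-text comparisons then need those sequences removed. The sketch below assumes that workflow; the function names and the regex are illustrative choices for this example, not part of tmux.

```python
import re

# SGR ("select graphic rendition") sequences set color and style: ESC [ ... m
SGR = re.compile(r"\x1b\[[0-9;]*m")


def capture_args(target, colors=False):
    """Build a capture-pane command; -e keeps color/attribute escapes."""
    cmd = ["tmux", "capture-pane", "-p", "-t", target]
    if colors:
        cmd.insert(2, "-e")
    return cmd


def strip_sgr(text):
    """Drop color/style escapes so the text can be compared directly."""
    return SGR.sub("", text)


def has_red_foreground(text):
    """Rough check for the basic red foreground code (31) in a capture."""
    for match in SGR.finditer(text):
        # Slice off the leading ESC [ and the trailing m to get the codes.
        codes = match.group()[2:-1].split(";")
        if "31" in codes:
            return True
    return False
```

One capture taken with `colors=True` can then serve both purposes: `strip_sgr` for asserting on layout and text, and a check like `has_red_foreground` for asserting that an error line really is highlighted.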

rsanheim 16 hours ago | parent | prev [-]

Also, many CLIs act differently when invoked connected to a terminal (TUI/interactive) vs. not. So you’d run into issues where Claude could only test the non-interactive things.
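The terminal-vs-pipe difference is easy to observe. The sketch below (an illustration, assuming a Unix-like system where Python's `pty` module is available) runs the same one-line child program twice: once with stdout captured through a pipe, once attached to a pseudo-terminal. The child only reports what `isatty()` says, which is the check many CLIs use to decide between interactive and plain output.

```python
import os
import pty
import subprocess
import sys

# Child program: report whether its stdout looks like a terminal.
CHILD = "import sys; print(sys.stdout.isatty())"


def isatty_through_pipe():
    """Run the child with stdout captured via a pipe."""
    result = subprocess.run([sys.executable, "-c", CHILD],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()


def isatty_through_pty():
    """Run the child with its stdio attached to a pseudo-terminal."""
    master_fd, slave_fd = pty.openpty()
    try:
        subprocess.run([sys.executable, "-c", CHILD],
                       stdin=slave_fd, stdout=slave_fd, stderr=slave_fd,
                       check=True)
        # The child's output is buffered in the pty; read it from the master.
        output = os.read(master_fd, 1024)
    finally:
        os.close(slave_fd)
        os.close(master_fd)
    return output.decode().strip()
```

Under a pipe the child sees a non-terminal, and under the pty it sees a terminal. tmux gives the program under test a pseudo-terminal in the same way, which is why it takes its interactive code path there.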

alehlopeh 15 hours ago | parent | prev [-]

So by screenshots you mean tmux capture-pane, not actual screenshots. So in essence it is using stdout, just not Claude’s own.

wakawaka28 9 hours ago | parent [-]

"In essence" but terminals do stuff to render stdout that you do not want a LLM to have to replicate, I think. If your TUI does stuff in fullscreen or otherwise with a bunch of control codes, that is simple work for a terminal but potentially intractable for a LLM.