> Then a brick hits you in the face when it dawns on you that all of our tools are dumping crazy amounts of non-relevant context into stdout thereby polluting your context windows.

I've found that letting the agent write its own optimized script for dealing with some things can really help with this. Claude is now forbidden from using `gradlew` directly, and can only use a helper script we made. It clears, recompiles, publishes locally, tests, ... all with a few extra flags. And when a test fails, the stack trace is printed.

Before this, Claude had to do A TON of different calls, all messing up the context. And when tests failed, it started to read gradle's generated HTML/XML files, which damaged the context immensely, since they contain a bunch of inline javascript.

And I've also been implementing this "LLM=true"-like behaviour in most of my applications. When an LLM is using it, logging is less verbose, it's also deduplicated so it doesn't show the same line a hundred times, ...

> He sees something goes wrong, but now he cut off the stacktraces by using tail, so he tries again using a bigger tail. Not satisfied with what he sees HE TRIES AGAIN with a bigger tail, and … you see the problem. It’s like a dog chasing its own tail.

I've had the same issue. Claude was running the 5+ minute test suite MULTIPLE TIMES in succession, just with a different `| grep something` tacked at the end. Now, the scripts I made always logs the entire (simplified) output, and just prints the path to the temporary file. This works so much better.

▲

ViktorEE 4 hours ago | parent | next [-]

The way I've solved this issue with a long running build script is to have a logging scripts which redirects all outputs into a file and can be included with ``` # Redirect all output to a log file (re-execs script with redirection) source "$(dirname "$0")/common/logging.sh" ``` at the start of a script.

Then when the script runs the output is put into a file, and the LLM can search that. Works like a charm.

▲

majewsky 2 hours ago | parent | prev | next [-]

> Claude is now forbidden from using `gradlew` directly, and can only use a helper script we made. It clears, recompiles, publishes locally, tests, ... all with a few extra flags. And when a test fails, the stack trace is printed.

I think my question at this point is what about this is specific to LLMs. Humans should not be forced to wade through reams of garbage output either.

	▲	kimixa 2 hours ago \| parent \| next [-]
		Humans have the ability to ignore and generally not remember things after a short scan, prioritize what's actually important etc. But to an LLM a token is a token. There's attempts at effectively doing something similar with analysis passes of the context - kinda what things like auto-compaction is doing - but I'm sure anyone who has used the current generation of those tools will tell you they're very much imperfect.
	▲	adammarples 2 hours ago \| parent \| prev [-]
		Lots of tools have a --quiet or --output json type option, which is usually helpful

▲

quintu5 4 hours ago | parent | prev | next [-]

This has been my exact experience with agents using gradle and it’s beyond frustrating to watch. I’ve been meaning to set up my own low-noise wrapper script.

This post just inspired me to tackle this once and for all today.

▲

petedoyle 4 hours ago | parent | prev | next [-]

Wow, I'd love to do this. Any tips on how to build this (or how to help an LLM build this), specifically for ./gradlew?

▲

4 hours ago | parent | prev [-]

[deleted]