jcgrillo 5 hours ago

Do you think it could do anything interesting with a highly compressed representation? CLP can apparently achieve a 169x compression ratio:

https://github.com/y-scope/clp

https://www.uber.com/blog/reducing-logging-cost-by-two-order...

buryat 5 hours ago | parent [-]

Interesting approach, thanks for pointing me to it!

Since the classifier would need access to the whole log message, I looked into how search is organized for CLP-compressed logs and found this:

> First, recall that CLP-compressed logs are searchable–a user query will first be directed to dictionary searches, and only matching log messages will be decompressed.

So yeah, it can be combined with a classifier: as matching messages get decompressed, the classifier filters them down to only the log lines that should be interesting.

The toughest part is still figuring out what "interesting" actually means in this context, and without domain knowledge of the logs it would be difficult to capture everything. But I think it's still better than going through all the logs after searching.
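
A rough sketch of that flow, in case it helps: check a per-segment term dictionary first, decompress only the segments that match, then run the classifier over the decompressed lines. Everything here is made up for illustration. `Segment`, `search_then_classify`, and `is_interesting` are hypothetical names, and the zlib-plus-set layout is a toy stand-in, not CLP's actual format or API.

```python
import zlib
from typing import Callable, List, Tuple


class Segment:
    """Toy compressed log segment: a term dictionary plus a zlib blob.

    This is an illustrative stand-in, not CLP's real on-disk format.
    """

    def __init__(self, lines: List[str]) -> None:
        # The dictionary is searchable without touching the compressed payload.
        self.terms = {tok.lower() for line in lines for tok in line.split()}
        self.blob = zlib.compress("\n".join(lines).encode("utf-8"))

    def decompress(self) -> List[str]:
        return zlib.decompress(self.blob).decode("utf-8").split("\n")


def search_then_classify(
    segments: List[Segment],
    term: str,
    is_interesting: Callable[[str], bool],
) -> Tuple[List[str], int]:
    """Return (interesting matching lines, number of segments decompressed)."""
    term = term.lower()
    hits: List[str] = []
    decompressed = 0
    for seg in segments:
        if term not in seg.terms:
            continue  # dictionary miss: skip without decompressing
        decompressed += 1
        for line in seg.decompress():
            # Only lines matching the query ever reach the classifier.
            if term in line.lower().split() and is_interesting(line):
                hits.append(line)
    return hits, decompressed


segments = [
    Segment(["INFO boot ok", "INFO heartbeat"]),
    Segment(["ERROR disk full on sda1", "ERROR retry succeeded", "INFO heartbeat"]),
]
# Hypothetical "classifier": a trivial predicate standing in for a real model.
hits, decompressed = search_then_classify(
    segments, "error", lambda line: "succeeded" not in line
)
```

The `is_interesting` predicate is exactly the hard part mentioned above; the sketch only shows where it would plug in.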

jcgrillo 4 hours ago | parent [-]

I like the idea of SQL as the "common tongue" because, provided the query is reasonably terse, it's easy for a human to verify and reason about, there's shitloads of it in the LLM's training set, and (usually) the database doesn't lie. So you've mitigated some major LLM drawbacks that way.

Another thing SQL has in its favor is the ability, with tools like Trino or DataFusion, to basically turn "everything" into a table.

EDIT: thinking on it some more, though, at what point do you just know off the top of your head the small handful of SQL queries you regularly use and just skip the expensive LLM step altogether? Like... that's the thing that underwhelms me about all the "natural language query" excitement. We already have a very good, natural language for queries: SQL.
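
To make that concrete, here is a minimal sketch of the "logs as a table plus a handful of saved queries" idea. It uses Python's built-in `sqlite3` purely as a stand-in for Trino or DataFusion, and the log format, table schema, `logs_to_table`, and `SAVED_QUERIES` are all assumptions invented for the example.

```python
import sqlite3
from typing import List

# Hypothetical saved queries: the "small handful" you'd otherwise ask an LLM for.
SAVED_QUERIES = {
    "count_by_level": (
        "SELECT level, COUNT(*) AS n FROM logs "
        "GROUP BY level ORDER BY n DESC, level"
    ),
    "errors": "SELECT ts, msg FROM logs WHERE level = 'ERROR' ORDER BY ts",
}


def logs_to_table(lines: List[str]) -> sqlite3.Connection:
    """Load 'timestamp LEVEL message' lines into an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE logs (ts TEXT, level TEXT, msg TEXT)")
    # Assumed format: the first two space-separated fields are ts and level.
    rows = [line.split(" ", 2) for line in lines]
    conn.executemany("INSERT INTO logs VALUES (?, ?, ?)", rows)
    return conn


def run_saved(conn: sqlite3.Connection, name: str) -> list:
    """Run one of the saved queries by name and return all rows."""
    return conn.execute(SAVED_QUERIES[name]).fetchall()


lines = [
    "2024-01-01T00:00:01Z INFO started",
    "2024-01-01T00:00:02Z ERROR disk full",
    "2024-01-01T00:00:03Z ERROR disk full",
    "2024-01-01T00:00:04Z WARN slow request",
]
conn = logs_to_table(lines)
counts = run_saved(conn, "count_by_level")
errors = run_saved(conn, "errors")
```

Each saved query is short enough to read and verify by eye, which is the whole appeal.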

chickensong 3 hours ago | parent [-]

> small handful of SQL queries you regularly use

Give those queries to the LLM and enjoy your sleep while the agent works.

jcgrillo an hour ago | parent [-]

hell yeah, give it the ssh keys too and sleep all the time