jcgrillo 5 hours ago
Do you think it could do anything interesting with a highly compressed representation? CLP can apparently achieve a 169x compression ratio: https://github.com/y-scope/clp https://www.uber.com/blog/reducing-logging-cost-by-two-order...
buryat 5 hours ago
Interesting approach, thanks for pointing me to it! Since the classifier would need access to the whole log message, I looked into how search is organized for CLP compression and found this:

> First, recall that CLP-compressed logs are searchable–a user query will first be directed to dictionary searches, and only matching log messages will be decompressed.

So yes, it can be combined with a classifier: as matching messages get decompressed, the classifier narrows them down to only the log lines that should be interesting. The toughest part is still figuring out what "interesting" actually means in this context, and without domain knowledge of the logs it would be difficult to capture everything. But I think it's still better than going through all the logs after searching.
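A toy Python sketch of that flow, to make it concrete. This is not CLP's API or on-disk format: `ToyStore`, `is_interesting`, the `<n>` placeholder, and the sample log lines are all made up for illustration, and the "classifier" is just a keyword check standing in for a real model. It only shows the shape of the idea: search the template dictionary first, rebuild only the matching messages, then classify those.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ToyStore:
    """Hypothetical stand-in for a dictionary-encoded log store."""
    templates: list = field(default_factory=list)     # unique message templates
    template_ids: dict = field(default_factory=dict)  # template -> index
    rows: list = field(default_factory=list)          # (template id, variables)

    def add(self, message):
        # Split each message into a template and its numeric variables.
        variables = re.findall(r"\d+", message)
        template = re.sub(r"\d+", "<n>", message)
        if template not in self.template_ids:
            self.template_ids[template] = len(self.templates)
            self.templates.append(template)
        self.rows.append((self.template_ids[template], variables))

    def search(self, keyword):
        # Step 1: look only at the (small) template dictionary.
        hits = {i for i, t in enumerate(self.templates) if keyword in t}
        # Step 2: rebuild ("decompress") only the messages whose template hit.
        for tid, variables in self.rows:
            if tid in hits:
                values = iter(variables)
                yield re.sub(r"<n>", lambda _: next(values), self.templates[tid])


def is_interesting(message):
    # Placeholder classifier (assumption): a real one would be a trained model.
    return "failed" in message.lower()


store = ToyStore()
for line in [
    "request 17 to backend 3 ok",
    "request 18 to backend 3 failed after 5000 ms",
    "cache hit for key 42",
    "request 19 to backend 4 ok",
]:
    store.add(line)

# Only messages matching the query are rebuilt and shown to the classifier.
interesting = [m for m in store.search("request") if is_interesting(m)]
print(interesting)
```

The classifier never sees the "cache hit" line here, because its template does not match the query, so the amount of text it has to score shrinks with the search. Note that this toy only searches templates, not variable values.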