bravura 11 hours ago
A rigorous approach to predicting the future of text was proposed by Li et al. (2024), "Evaluating Large Language Models for Generalization and Robustness via Data Compression" (https://ar5iv.labs.arxiv.org/html//2402.00861), and I think that work should get more recognition. They measure compression (perplexity) on future Wikipedia, news articles, code, arXiv papers, and multi-modal data. Data compression is intimately connected with robustness and generalization.
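To make the compression-as-evaluation idea concrete, here is a minimal sketch (not the paper's method): fit a model on "past" text, then measure the average code length in bits per character it assigns to "future" text. The paper uses LLM token log-probs; this toy uses a smoothed character-level unigram model, which is purely an illustrative assumption. Lower bits per character means the future data compresses better under the model, i.e. better generalization to it.

```python
import math
from collections import Counter

def bits_per_char(train_text: str, test_text: str) -> float:
    """Cross-entropy (bits/char) of test_text under a unigram model of train_text.

    This is the average code length an arithmetic coder driven by the
    model would need, so it doubles as a compression-ratio estimate.
    """
    counts = Counter(train_text)
    total = sum(counts.values())
    vocab = set(train_text) | set(test_text)
    # Laplace smoothing so unseen characters still get nonzero probability
    def p(c: str) -> float:
        return (counts[c] + 1) / (total + len(vocab))
    return -sum(math.log2(p(c)) for c in test_text) / len(test_text)

past = "the quick brown fox jumps over the lazy dog " * 50
future_similar = "the slow red fox naps under the busy log"
future_shifted = "ZQ9#@! ZQ9#@! ZQ9#@!"

# In-distribution "future" text should cost fewer bits per character
# than distribution-shifted text -- the gap is the robustness signal.
print(bits_per_char(past, future_similar))
print(bits_per_char(past, future_shifted))
```

Swapping the unigram probabilities for an LLM's per-token log-probs (and normalizing by bytes rather than characters) gives the bits-per-byte numbers typically reported in compression-based LLM evaluations.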