msdz 2 hours ago

They’d probably get the farthest, but they won’t pursue that because they don’t want to end up leaking the original training data. It has already been shown that you can get language models to reproduce long consecutive stretches of their natural-language training data verbatim [1], so it ought to be possible for their internal code, too.
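
For anyone curious what that kind of extraction looks like in practice, here’s a rough sketch of the basic idea (not the paper’s actual method; the model name, libraries, and the candidate snippet are just stand-ins I picked for illustration): prompt the model with a prefix from text you suspect was in the training set, decode greedily, and check whether the continuation comes back verbatim.

    # Minimal sketch of a verbatim-memorization check, in the spirit of [1].
    # Assumptions: Hugging Face transformers installed; "gpt2" is only a placeholder model.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # Hypothetical candidate: a snippet we suspect appears in the training corpus.
    known_snippet = "import numpy as np\n\ndef softmax(x):"
    prefix = known_snippet[:20]
    expected_continuation = known_snippet[20:]

    inputs = tokenizer(prefix, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=32,
        do_sample=False,  # greedy decoding tends to surface memorized continuations
    )
    completion = tokenizer.decode(output_ids[0], skip_special_tokens=True)

    # A verbatim match on the continuation is evidence of memorization.
    if expected_continuation.strip() in completion:
        print("Model regurgitated the known snippet verbatim.")
    else:
        print("No verbatim match for this prefix.")

Scale that up over many prefixes (or sample instead of decoding greedily) and you get a crude picture of how much of the training set a model will hand back.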

[1] https://arxiv.org/abs/2601.02671