Remix clone Hacker News

new | show | ask | jobs Github

	▲	roadside_picnic 5 days ago
		It would be weird if the points in the embedding space where uniformly distributed but they're not. The entire role of the model in general is to project those results on to a subset of the larger space, that "makes sense" for the problem. Ultimately the projection becomes one in which the latent categories we're trying to predict (class, token, etc) become linearly separable.