Remix clone Hacker News

	▲	vharuck 6 hours ago
		People need to be careful about buying into the shorthand lingo with LLMs. They do not learn like we do. At the lowest level, they predict which tokens follow a body of tokens. This lets them emulate knowledge in a very useful way. This is similar to a time series model of user activity: the time series model does not keep tabs on users to see when they are active, and it has not read studies about user behavior; it just reflects a mathematical relationship between points of data. For an LLM and this "vague" domain expertise, even if none of the LLM's training material includes certain nuggets of wisdom, if the material includes enough cases of problems and the solutions offered by domain experts, we should expect the model to find a decent relationship between them. That the LLM has never ingested explicit documentation of the reasoning is irrelevant, because it does not perform reasoning.
	▲	jandrewrogers 5 hours ago
		The domain expertise I'm referring to isn't vague; it literally doesn't exist as training data. There are no cases of problems and solutions to study that are relevant to the state-of-the-art. In some cases this is by intent and design (e.g. trade secrets, national security), and was so long before LLMs arrived on the scene. We even have some infamous "dark" domains in computer science where it is nearly impossible for a human to get to the frontier because the research that underpins much of the state-of-the-art hasn't existed as public literature for decades. If you want to learn it, you either have to know a domain expert willing to help you or reinvent it from first principles.