Remix clone Hacker News


	▲	baq 2 hours ago
		they're tools. treat them as tools. since they're so general, you need to explore if and how you can use them in your domain. guessing 'they're poorly suited' is just that, guessing. in particular:
		> We also found that the models were highly sensitive to seemingly trivial prompt changes
		this is all but obvious to anyone who has seriously looked at deploying these; that's why there are some very successful startups in the evals space.
	▲	rob_c 2 hours ago
		> guessing 'they're poorly suited' is just that, guessing
		I have a really nice bridge to sell you... This "failure" is just a grab at trying to look "cool" and "innovative", I'd bet. Anyone with a modicum of understanding of the tooling (or, hell, experience: they've been around for a few years now, enough for people to build a feeling for this) knows that this is not a task for a pre-trained general LLM.