You don't necessarily, but each token costs money for the AI to spit out, and probably more when that output is fed back in as input later. Delegating to a library makes sense financially.

storus 11 hours ago

With local inference on the pretty decent local models we have nowadays (Qwen-3.5 and better), it's not much of a concern anymore.

	walthamstow 2 hours ago
		Sure, if you've got a £5k laptop
	Bishonen88 7 hours ago
		What percentage of people are using local models for anything serious? I reckon single digits, if even that. And in a corporate work environment, probably close to 0.