theshrike79 8 hours ago

I'm expecting someone to come up with an LLM version of the Coral USB Accelerator: https://www.coral.ai/products/accelerator

Just plug a stick into your USB-C port, or add an M.2 or PCIe board, and you'd get dramatically faster AI inference.

angoragoats 5 hours ago | parent

I think there are drastic differences between computer vision models and LLMs that you’re not considering. LLMs are huge relative to vision models, and require gobs of fast memory. For this reason a little USB dongle isn’t going to cut it.

Put another way, there already exist add-in boards like this, and they’re called GPUs.
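
A rough back-of-envelope sketch of the memory argument (the parameter count, quantization level, and link bandwidths below are assumed round numbers for illustration, not measurements):

    # Why a USB-stick accelerator is bandwidth-starved for LLM decoding.
    # Decode is roughly memory-bound: each generated token streams the
    # full set of weights through the compute units once.

    def weight_gb(params_billion: float, bits_per_weight: int) -> float:
        """Approximate weight size in GB, ignoring KV cache and activations."""
        return params_billion * 1e9 * bits_per_weight / 8 / 1e9

    def tokens_per_second(weights_gb: float, bandwidth_gb_s: float) -> float:
        """Upper bound on decode speed if bandwidth is the only limit."""
        return bandwidth_gb_s / weights_gb

    coral_sram_mb = 8          # Coral Edge TPU has roughly 8 MB of on-chip memory
    usb3_gb_s = 0.6            # ~5 Gbit/s USB 3 link, optimistic
    gpu_mem_gb_s = 1000.0      # modern GPU GDDR/HBM, order of magnitude

    llm_gb = weight_gb(7, 4)   # a 7B model at 4-bit: ~3.5 GB of weights
    print(f"7B @ 4-bit: {llm_gb:.1f} GB of weights")
    print(f"streamed over USB 3: ~{tokens_per_second(llm_gb, usb3_gb_s):.1f} tok/s")
    print(f"from on-board GPU memory: ~{tokens_per_second(llm_gb, gpu_mem_gb_s):.0f} tok/s")

Even a small quantized LLM is about three orders of magnitude larger than the dongle's on-chip memory, so the weights would have to stream over the USB link every token, which is the point above: the add-in board that already pairs lots of fast memory with the compute is a GPU.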

amelius 4 hours ago | parent

GPUs are still software-programmable.

An "LLM chip" does not need that and so can be much more efficient.