Remix.run Logo
macNchz 5 hours ago

This is something I've been thinking about for a while...the current state of things really does feel kind of like the dialup era, wondering what the "broadband" era could look like. Watching tokens stream in is reminiscent of watching a jpeg load a few rows of pixels at a time, and the various different loading and connecting animations that applications implemented before things got fast enough to make them less relevant.

Some of the work in that direction like Cerebras or Taalas have been doing is an interesting glimpse of what might be possible. In the meantime it's a fun thought experiment to wonder about what might be possible if even current state of the art models were available at like, a million tokens per second at a very low cost.

gavmor 3 hours ago | parent [-]

Take a look at https://chatjimmy.ai/ -- it's running against Taalas' "hardcore" silicon model, ie a dedicated, ASIC-like chip.