It's crazily fast. But an 8B model is pretty much useless.
Anyway, VCs will dump money on them, and we'll soon see whether the approach can scale to bigger models.