| ▲ | nl 12 hours ago | |||||||||||||
Taalas is interesting. 16,000 TPS for Llama on a chip. | ||||||||||||||
| ▲ | Nihilartikel 2 hours ago | parent | next [-] | |||||||||||||
Neat! I had been wondering if anyone was trying to implement a model in silico. We're getting closer to having chatty talking toasters every day now! | ||||||||||||||
| ||||||||||||||
| ▲ | micw 9 hours ago | parent | prev | next [-] | |||||||||||||
On a very old model, it's more like 16,000 garbage words/s | ||||||||||||||
| ||||||||||||||
| ▲ | replete 8 hours ago | parent | prev | next [-] | |||||||||||||
It's exciting to see, but look at the die size for only an 8B model | ||||||||||||||
| ▲ | DeathArrow 9 hours ago | parent | prev [-] | |||||||||||||
I wonder how many tokens per second they could get if they put Mercury 2 on a chip. | ||||||||||||||