| ▲ | redwood 3 hours ago | |
Wow impressive. What's the story with this? | ||
| ▲ | jffry 3 hours ago | |
It's a tech demonstrator for a company that turns models into custom silicon for fast inference. In this case, Llama 3.1 8B: https://taalas.com/products/ | ||
| ▲ | hmartin 3 hours ago | |
Taalas hardware implementation of Llama 3.1 8B. They claim 16k tok/s vs. Cerebras at 2k. https://taalas.com/products/ | ||