RedCinnabar 7 hours ago
Call me back when you can run these models on 16GB of RAM and any recent i5/i7. Until then, there’s no point in using these toy models.
guax 5 hours ago
It's so funny, these "toy models" would have been the wet dreams of researchers not 5 years ago. Progress marches without mercy.
giancarlostoro 7 hours ago
You need it to run in about 8 GB so you have extra space for the context window.
jboss10 4 hours ago
They can be run on 32GB with 8GB VRAM. I don't think these will be on 16GB for a while. (35B MoE)
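For a rough sense of why a 35B model is a tight squeeze on 16GB, here is a back-of-the-envelope sketch of the weight footprint. The 4.5 bits per weight (a stand-in for a 4-bit-class quantization) and the 4 GB reserved for the OS and context window are illustrative assumptions, not figures from this thread, and both function names are made up for the example.

```python
def weight_footprint_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (10^9 bytes)."""
    return n_params_billion * bits_per_weight / 8.0


def fits(n_params_billion: float, bits_per_weight: float,
         ram_gb: float, vram_gb: float = 0.0, reserved_gb: float = 4.0) -> bool:
    """True if the weights fit in RAM + VRAM after reserving some headroom.

    reserved_gb is a hypothetical allowance for the OS, the context
    window (KV cache), and other processes.
    """
    budget_gb = ram_gb + vram_gb - reserved_gb
    return weight_footprint_gb(n_params_billion, bits_per_weight) <= budget_gb


if __name__ == "__main__":
    size = weight_footprint_gb(35, 4.5)
    print(f"35B at 4.5 bits/weight: about {size:.1f} GB of weights")
    print("fits in 16 GB RAM:", fits(35, 4.5, ram_gb=16))
    print("fits in 32 GB RAM + 8 GB VRAM:", fits(35, 4.5, ram_gb=32, vram_gb=8))
```

Under those assumptions the weights alone come to a bit under 20 GB, which is over a 16GB machine's budget but comfortable on 32GB plus 8GB of VRAM. More aggressive quantization shrinks the number, at some cost in quality.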
Catloafdev 7 hours ago
Hello, it's the internet calling, today is that day. https://github.com/ikawrakow/ik_llama.cpp Edit: it's gonna be slow if you're not using any VRAM. But it's possible. Software isn't going to speed that up anytime soon, it's just a hardware bandwidth limit.
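The bandwidth limit can be sketched with a common rule of thumb: each generated token has to read every active weight from memory once, so tokens per second is capped at roughly memory bandwidth divided by the bytes read per token. The 50 GB/s bandwidth, the 4.5 bits per weight, and the 3B active parameters for the MoE case are illustrative assumptions, not numbers from this thread or from ik_llama.cpp.

```python
def max_tokens_per_second(active_params_billion: float,
                          bits_per_weight: float,
                          bandwidth_gb_s: float) -> float:
    """Upper bound on decode speed when memory bandwidth is the bottleneck.

    Ignores compute time, cache effects, and KV-cache reads, so real
    throughput will be lower.
    """
    gb_read_per_token = active_params_billion * bits_per_weight / 8.0
    return bandwidth_gb_s / gb_read_per_token


if __name__ == "__main__":
    bandwidth = 50.0  # GB/s, assumed figure for desktop system RAM
    dense = max_tokens_per_second(35, 4.5, bandwidth)  # all 35B weights read per token
    moe = max_tokens_per_second(3, 4.5, bandwidth)     # hypothetical 3B active per token
    print(f"dense 35B ceiling: about {dense:.1f} tokens/s")
    print(f"MoE with 3B active ceiling: about {moe:.1f} tokens/s")
```

This is why an MoE model is more tolerable on CPU than a dense model of the same total size: only the active experts are read per token, so the ceiling is much higher even though the full set of weights still has to fit in memory.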