Rotary GPU: Exploring Local Execution for Large MoE Models Under Limited VRAM (arxiv.org)
28 points by dryarzeg 7 hours ago | 4 comments
martinald 3 hours ago
Why is this a paper? Isn't it just using the --n-cpu-moe option in llama.cpp? What am I missing here?
sandworm101 3 hours ago
Um, doesn't the 4060 laptop card have the ability to share system memory? Wait... My mistake. Google AI says the 4060 mobile can access system memory, but the tech sheets say no.