nickthegreek 10 hours ago

Context is always an issue with local models and consumer hardware.
pdyc 10 hours ago | parent
Correct, but maximum context should scale as some ratio of model size: if the model is x GB, the context should occupy roughly x times some constant of RAM. Assuming the Q4 quantized version is 18 GB, this Mac should be able to support a 64-128k context.
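The "context RAM scales with model size" intuition can be sanity-checked with a quick KV-cache estimate. A minimal sketch, assuming a hypothetical GQA model with fp16 cache (the layer count, KV-head count, and head dimension below are illustrative guesses, not specs of any model mentioned in the thread):

```python
# Sketch: estimating KV-cache RAM per token for a transformer.
# Architecture numbers are assumptions for a hypothetical ~30B GQA model.

def kv_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Bytes of KV cache per token: one K and one V tensor per layer."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem

# Assumed: 48 layers, 8 KV heads (GQA), head_dim 128, fp16 (2-byte) cache.
per_token = kv_bytes_per_token(48, 8, 128)   # 196,608 bytes = 192 KiB/token
ctx_64k_gib = per_token * 65536 / 2**30      # exactly 12.0 GiB for 64k tokens

print(f"{per_token} bytes/token, {ctx_64k_gib:.1f} GiB for a 64k context")
```

Under these assumptions a 64k fp16 cache alone (~12 GiB) is comparable to the 18 GB of Q4 weights, which is why context rather than weights is often the binding constraint on consumer hardware; quantizing the cache to 8-bit would roughly halve that.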