Get the biggest one that will fit in your VRAM.
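For anyone who wants to put rough numbers on "will it fit", here is a minimal back-of-the-envelope sketch. It assumes weight memory is roughly parameter count times bits per weight, plus a flat 20% allowance for KV cache and runtime buffers. That allowance is a guess, not a measured figure, and the function names and candidate tuples are made up for illustration.

```python
def estimate_weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits_in_vram(params_billion: float, bits_per_weight: float,
                 vram_gib: float, overhead_fraction: float = 0.2) -> bool:
    """True if weights plus an assumed overhead allowance fit in vram_gib.

    overhead_fraction is a hypothetical fudge factor for KV cache and
    buffers; real usage depends on context length and the runtime.
    """
    needed = estimate_weight_gib(params_billion, bits_per_weight)
    return needed * (1 + overhead_fraction) <= vram_gib


def biggest_that_fits(candidates, vram_gib: float, overhead_fraction: float = 0.2):
    """Pick the largest candidate (by estimated weight size) that fits.

    candidates: list of (name, params_billion, bits_per_weight) tuples.
    Returns the chosen tuple, or None if nothing fits.
    """
    fitting = [c for c in candidates
               if fits_in_vram(c[1], c[2], vram_gib, overhead_fraction)]
    if not fitting:
        return None
    return max(fitting, key=lambda c: estimate_weight_gib(c[1], c[2]))
```

Treat the result as a first filter only; you still have to load the model and test it on your own workload.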

trebligdivad 4 days ago | parent | next [-]

How do people deal with all the different quantisations? Generally, if I see an Unsloth one I'm happy to try it locally; with random other people's... how do I know what I'm getting?

(If nothing else, Tongyi are currently winning AI with the cutest logo.)

exe34 4 days ago | parent [-]

Personally I've only used them for toying around, but in all cases you have to test them for your use case anyway.

davidsainez 4 days ago | parent | prev [-]

This is the way. I managed to run (super) tiny models on CPU only with this approach.