verdverm 2 hours ago
DGX Spark runs a model of this size (I personally like qwen36moe better than gemma4moe) at speeds fast enough for interactive coding sessions. Algorithmic advances like DiffusionGemma speed up token generation by ~4x (https://deepmind.google/models/gemma/diffusiongemma/)