canyon289 6 days ago

We are constantly evaluating architectures, trying to assess what will work well in the open ecosystem. It's quite a vibrant space, and I'm glad you have one option that works. For this model in particular we evaluated a couple of options before choosing a dense architecture because of its simplicity and finetunability.

For the other Gemma models, some of the smaller sizes should work on your laptop when quantized. Do Gemma 1B and 4B not work for you quantized? They should fit the memory constraints. I use Ollama on low-powered devices with 8 GB of RAM or less, and the models load.
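To make "it should fit" concrete, here's a back-of-envelope sketch of weight memory at different quantization levels. This is illustrative only: it counts weights alone and ignores KV cache, activations, and runtime overhead, so real usage will be somewhat higher.

```python
# Rough weight-memory estimate for a quantized model.
# Illustrative only: ignores KV cache, activations, and runtime overhead.
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9  # decimal GB

# At 4-bit quantization (a common default for local runners):
print(f"1B model:  ~{weight_memory_gb(1, 4):.1f} GB weights")   # ~0.5 GB
print(f"4B model:  ~{weight_memory_gb(4, 4):.1f} GB weights")   # ~2.0 GB
print(f"12B model: ~{weight_memory_gb(12, 4):.1f} GB weights")  # ~6.0 GB, tight on 8 GB
```

So the 1B and 4B models leave plenty of headroom on an 8 GB machine even after overhead.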

For TTS, a colleague at HuggingFace made this bedtime story generator that runs entirely in the browser.

https://huggingface.co/spaces/webml-community/bedtime-story-...

https://www.youtube.com/watch?v=ds95v-Aiu5E&t

Be forewarned though: this is not a good coding model out of the box. It likely could be trained into an autocompletion LLM, but with a 32K context window and its smaller size, it's not going to be refactoring entire codebases the way Jules/Gemini and other larger models can.