dangoodmanUT 12 hours ago

What are some of the real-world applications of small models like this? Is it only on-device inference? In most cases, I'm only seeing models like Sonnet be barely sufficient for the workloads I've done historically. Would love to know where others are finding use for smaller models (like gpt-oss-120B and below, especially smaller models like this one). Maybe some really lightweight, borderline-NLP classification tasks?
fnbr 11 hours ago
(I’m a researcher on the post-training team at Ai2.) 7B models are mostly useful for local use on consumer GPUs. 32B could be used for a lot of applications. There are a lot of companies using fine-tuned Qwen 3 models that might want to switch to Olmo now that we have released a 32B base model.
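For concreteness, a minimal sketch of running a checkpoint like that locally with Hugging Face transformers. The model id is an assumption based on Ai2's naming on the hub, not confirmed by the comment; check huggingface.co/allenai for the actual release:

    # pip install torch transformers accelerate
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumed model id -- verify against the actual Olmo release on the hub.
    model_id = "allenai/OLMo-2-0325-32B"

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",           # shard across whatever GPUs are available
        torch_dtype=torch.bfloat16,  # roughly 64 GB of weights at bf16
    )

    inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=32)
    print(tokenizer.decode(out[0], skip_special_tokens=True))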
| ||||||||||||||||||||||||||||||||||||||
schopra909 12 hours ago
I think you nailed it. For us it’s classifiers that we train for very specific domains. You’d think it would be better to just fine-tune a smaller non-LLM model, but empirically we find that the LLM fine-tunes (like 7B) perform better.
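A hedged sketch of that kind of setup: bolt a classification head onto a 7B base checkpoint and fine-tune it with the standard transformers Trainer. The model id, dataset, and hyperparameters below are placeholders, not schopra909's actual stack, and the claim that this beats a small encoder is theirs, not something this sketch demonstrates:

    # pip install torch transformers datasets accelerate
    import torch
    from datasets import load_dataset
    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              DataCollatorWithPadding, Trainer, TrainingArguments)

    model_id = "Qwen/Qwen2.5-7B"  # placeholder 7B base model
    ds = load_dataset("ag_news")  # placeholder 4-class classification dataset

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    tokenizer.pad_token = tokenizer.pad_token or tokenizer.eos_token

    model = AutoModelForSequenceClassification.from_pretrained(
        model_id, num_labels=4, torch_dtype=torch.bfloat16)
    model.config.pad_token_id = tokenizer.pad_token_id

    ds = ds.map(lambda b: tokenizer(b["text"], truncation=True, max_length=512),
                batched=True)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="clf", per_device_train_batch_size=4,
                               learning_rate=1e-5, num_train_epochs=1, bf16=True),
        train_dataset=ds["train"],
        eval_dataset=ds["test"],
        data_collator=DataCollatorWithPadding(tokenizer),
    )
    trainer.train()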
| ||||||||||||||||||||||||||||||||||||||
thot_experiment 5 hours ago
I have Qwen3-30B-VL (an MoE model) resident in my VRAM at all times now, because it's quicker to use it to answer most basic Google-type questions. Stuff like remembering how to force-kill a WSL instance, which I don't do that often, is now frictionless: I can just ask in the terminal (q is my utility) and it responds with "wsl --terminate <distro-name>" much faster than Googling.

It's also quite good at tool calling. If you give it shell access it'll happily do things like "find me files over 10mb modified in the last day", where remembering the flags and command structure, if you're not doing that action regularly, previously required a Google search or a peek at the manpage.

I also use it to transcribe todo lists and notes and put them in my todo app, as well as for text manipulation. For example, if I have a list of API keys and URLs or whatever that I need to populate into a template, I can just select the relevant part of the template in VSCode, put the relevant data in the context, and say "fill this out", and it does it faster than I could do the select-copy-select-paste loop, even with my hard-won vim knowledge.

TL;DR: it's very fast (90 tok/s) and very low latency, and that means it can perform a lot of mildly complex tasks that have an obvious solution faster than you can.

And FWIW, I don't even think Sonnet 4.5 is very useful. It's a decent model, but it's very common for me to push it into a situation where it will be subtly wrong and waste a lot of my time (of course that's colored by it being slow and costing money).
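The q utility itself isn't shown, but here's a minimal reconstruction of the idea, assuming a local OpenAI-compatible server (llama.cpp's server, Ollama, vLLM, etc.) listening on localhost. The port, path, and model name are all placeholders:

    #!/usr/bin/env python3
    # q -- one-shot terminal question for a locally hosted model.
    # Assumes an OpenAI-compatible chat endpoint on localhost:8080;
    # the model name is whatever your server has loaded (placeholder here).
    import json
    import sys
    import urllib.request

    prompt = " ".join(sys.argv[1:])
    req = urllib.request.Request(
        "http://localhost:8080/v1/chat/completions",
        data=json.dumps({
            "model": "qwen3-30b-vl",  # placeholder model name
            "messages": [
                {"role": "system",
                 "content": "Answer tersely: one command or one sentence."},
                {"role": "user", "content": prompt},
            ],
        }).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])

Usage would look like "q how do i force kill a wsl instance", which should come back with something like wsl --terminate <distro-name>.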