▲ | punitvthakkar 4 days ago
So far I've not run into the kind of use cases that local LLMs can convincingly handle without making me feel like I'm using the first ChatGPT from 2022: they are limited and quite limiting. I am curious about what use cases the community has found that work for them. The example one user has given in this thread, of their local LLM inventing a Sun Tzu interview, is exactly the kind of limitation I'm talking about. How does one use a local LLM to do something actually useful?
▲ | narrator 4 days ago | parent | next [-]
I have tried a lot of different LLMs, and Gemma3:27b on a 48 GB+ MacBook is probably the best for analyzing diaries and personal material you don't want to share with the cloud. The Chinese models are comically bad at life advice. For example, I asked DeepSeek to read my diaries and talk to me about my life goals, and it told me, in a very Confucian manner, what the proper relationships in my life were for my stage of life and station in society. Gemma is much more Western.
▲ | crazygringo 4 days ago | parent | prev | next [-]
I see local LLMs being used mainly for automation as opposed to factual knowledge -- for classification, summarization, search, and things like grammar checking. So they need to be smart about your desired language(s) and all the everyday concepts we use in them (so they can understand the content of documents and messages), but they don't need any of the detailed factual knowledge about human history, programming languages and libraries, health, and everything else. The idea is that you don't prompt the LLM directly; your OS tools make use of it, and applications prompt it as frequently as they fetch URLs.
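A minimal sketch of that pattern, assuming an OpenAI-compatible local server (e.g. Ollama) on localhost; the model name and label set are illustrative, not anyone's actual setup:

    # App-side classification with a small local model: no cloud call,
    # no factual knowledge needed, just language understanding.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")

    def classify(message: str) -> str:
        resp = client.chat.completions.create(
            model="gemma3:4b",  # assumption: any small local instruct model
            messages=[
                {"role": "system",
                 "content": "Classify the message as one of: invoice, "
                            "newsletter, personal, spam. Reply with the label only."},
                {"role": "user", "content": message},
            ],
            temperature=0,
        )
        return resp.choices[0].message.content.strip().lower()

    print(classify("Your March hosting bill of $12.40 is attached."))  # -> invoice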
▲ | dxetech 4 days ago | parent | prev | next [-]
There are situations where internet access is limited, or where there are frequent outages. An outdated LLM might be more useful than none at all. For example: my internet is out due to a severe storm; what safety precautions do I need to take?
▲ | vorticalbox 4 days ago | parent | prev | next [-]
I keep a lot of notes in Obsidian: all my thoughts and feelings, both happy and sad, things I've done, etc. These are deeply personal and I don't want them going to a cloud provider even if they "say" they don't train on my chats. I forget a lot of things, so I feed the notes into ChromaDB and then use an LLM to chat with them. I've started using abliterated models, which have their refusal behaviour removed [0].

The other use case is work. I work with financial data and have created an MCP server that automates some of my job. Running the model locally means I don't have to worry about the information I feed it.

[0] https://github.com/Sumandora/remove-refusals-with-transforme...
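A rough sketch of the retrieval half of a setup like this, assuming ChromaDB with its default embedder and an Ollama server on localhost; the vault path and model name are illustrative:

    # Index an Obsidian vault into ChromaDB, then answer questions
    # against the top matching notes with a local model.
    from pathlib import Path
    import chromadb
    from openai import OpenAI

    db = chromadb.PersistentClient(path="./notes_db")
    notes = db.get_or_create_collection("obsidian")

    for f in Path("~/vault").expanduser().rglob("*.md"):  # hypothetical path
        notes.add(ids=[str(f)], documents=[f.read_text(encoding="utf-8")])

    def ask(question: str) -> str:
        hits = notes.query(query_texts=[question], n_results=5)
        context = "\n---\n".join(hits["documents"][0])
        llm = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")
        resp = llm.chat.completions.create(
            model="gemma3:27b",  # any local instruct model
            messages=[{"role": "user",
                       "content": f"Using only these notes:\n{context}\n\nQ: {question}"}],
        )
        return resp.choices[0].message.content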
▲ | jondwillis 4 days ago | parent | prev | next [-]
I use, or at least try to use, local models while prototyping/developing apps.

First, they control costs during development, which, depending on what you're doing, can get quite expensive for low- or no-budget projects.

Second, they force me to work within more constraints and compose things more carefully. If a local model (albeit one somewhat capable, like gpt-oss or Qwen3) can start to piece together the agentic workflow I am trying to model, chances are it'll start working quite well, quite quickly, when I switch to even a budget cloud model (something like gpt-5-mini). However, dealing with these constraints might not be worth the time if you can stuff all of your documents into the context window of a cloud model and get good results, though it will probably be cheaper and faster on an ongoing basis to have split the task up.
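What makes this workflow cheap to keep is that most local servers expose the same OpenAI-compatible API as the cloud providers, so switching is one base URL. A sketch, with the endpoint and model names as assumptions:

    # Develop against a local server; flip one env var to run the same
    # code against a cloud provider when the workflow starts working.
    import os
    from openai import OpenAI

    if os.environ.get("USE_CLOUD"):
        client = OpenAI()                # reads OPENAI_API_KEY from the env
        model = "gpt-5-mini"             # budget cloud model
    else:
        client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")
        model = "gpt-oss:20b"            # local model via Ollama

    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Summarize this ticket: ..."}],
    )
    print(resp.choices[0].message.content)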
▲ | dragonwriter 4 days ago | parent | prev | next [-]
Well, a lot of what is possible with local models depends on what your local hardware is, but Docling is a pretty good example of a library that can use local models (VLMs rather than regular LLMs) “under the hood” for productive tasks.
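For reference, Docling's basic usage is small; this is roughly its quickstart shape (the file name is just an example):

    # Convert a document to Markdown; Docling runs its layout/VLM
    # models locally under the hood.
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    result = converter.convert("report.pdf")
    print(result.document.export_to_markdown())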
▲ | ivape 4 days ago | parent | prev | next [-]
I use Claude Code in the terminal only, mostly to figure out what to commit and what to write for the commit message. I believe a solid 7-8B model can do this locally. So that's at least one small, highly useful workflow robot I have a use for (and very easy to cook up on your own). I also have a use for terminal command autocompletion, which again, a small model can be great for. Something felt really wrong about sending entire folder contents over to Claude online, so I am absolutely looking to create the toolkit locally. The universe of offline is just getting started, and these big companies are literally telling you "watch out, we save this stuff".
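A sketch of that commit-message robot, assuming an Ollama server on localhost; the model choice and the diff-size cap are arbitrary:

    # Suggest a commit message for whatever is currently staged.
    import subprocess
    from openai import OpenAI

    diff = subprocess.run(["git", "diff", "--staged"],
                          capture_output=True, text=True).stdout
    client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")
    resp = client.chat.completions.create(
        model="qwen2.5-coder:7b",  # assumption: any ~7B local code model
        messages=[
            {"role": "system",
             "content": "Write a one-line conventional commit message for this diff."},
            {"role": "user", "content": diff[:8000]},  # keep the context small
        ],
    )
    print(resp.choices[0].message.content)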
▲ | rukuu001 4 days ago | parent | prev | next [-]
I'm running Gemma3-270M locally (MLX). I've got a Python script that pulls down emails based on a whitelist and summarises them; the 270M model does a good job of this. It runs in a terminal, and it means I barely look at my email during the day.
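The summarising step might look roughly like this with mlx_lm; the model repo id is an assumption, and the email-fetching half is omitted:

    # Summarize one email body with a tiny local model on Apple silicon.
    from mlx_lm import load, generate

    model, tokenizer = load("mlx-community/gemma-3-270m-it-bf16")  # assumed repo id

    def summarize(email_body: str) -> str:
        prompt = f"Summarize this email in two sentences:\n\n{email_body}\n\nSummary:"
        return generate(model, tokenizer, prompt=prompt, max_tokens=120)

    print(summarize("Hi team, the Q3 review has moved to Thursday 2pm..."))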
▲ | luckydata 4 days ago | parent | prev | next [-]
Local models can do embedding very well, which is useful for things like building a screenshot manager.
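A minimal sketch of that idea over OCR'd screenshot text, using sentence-transformers (model choice illustrative); it runs fully offline once the model is downloaded:

    # Embed screenshot text once, then find screenshots by meaning.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")
    screens = {
        "shot_001.png": "Invoice from AWS, total $84.33, due May 1",
        "shot_002.png": "Slack thread about the staging deploy failure",
    }
    corpus = model.encode(list(screens.values()), convert_to_tensor=True)
    query = model.encode("cloud bill", convert_to_tensor=True)
    best = util.cos_sim(query, corpus).argmax().item()
    print(list(screens.keys())[best])  # -> shot_001.png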
▲ | bityard 4 days ago | parent | prev | next [-]
I use a local LLM for lots of little things that I used to use search engines for: defining words, looking up Unicode symbols for copy/paste, reminders on how to do X in bash or Python. Sometimes I use it as a starting point for high-level questions and curiosity, then move to actual human content or larger online models for more detail and/or fact-checking if needed. If your computer is somewhat modern and has a decent amount of RAM to spare, it can probably run one of the smaller-but-still-useful models just fine, even without a GPU.

My reasons:

1) Search engines are actively incentivized not to show useful results. SEO-optimized clickbait articles contain long, fluffy, contentless prose intermixed with ads. The longer they can keep you "searching" for the information instead of "finding" it, the better it is for their bottom line: if you actually find what you're looking for, you close the tab and stop looking at ads; if you don't, you keep scrolling and generate more ad revenue for the advertisers and search engines. It's exactly the same reason online dating sites are futile for most people: every successful match costs them two customers, which is bad for revenue. LLMs (even local ones, in some cases) are quite good at giving direct answers to direct questions, which is 90% of what I used search engines for to begin with. Yes, sometimes they hallucinate. No, it's not usually a big deal if you apply some common sense.

2) Most datacenter-hosted LLMs don't have ads built into them now, but they will. As soon as we get used to "trusting" hosted models because of how good they've become, the model developers and operators will figure out how to turn the model into a sneaky salesman. You'll ask it for the specs on a certain model of Dell laptop and it will pretend it didn't hear you and reply, "You should try HP's latest line of business-class notebooks; they're fast, affordable, and come in 5 fabulous colors to suit your unique personal style!" I want to emphasize that it's not IF this happens, it's WHEN. Local LLMs COULD carry advertising at some point, but it will probably be rare and/or weird, since these smaller models are meant mainly for development and further experimentation. I have faith that some open-weight models will always exist in some form, even if they never rival commercially hosted models in overall quality.

3) I've made peace with the fact that data privacy in the age of Big Tech is a myth, but that doesn't mean I can't minimize my exposure by keeping some of my random musings and queries to myself. Self-hosted AI models will never be as "good" as the ones hosted in datacenters, but they are still plenty useful.

4) I'm still in the early stages of this, but I can develop my own tools around small local models without paying a hosted-model provider and/or becoming their product.

5) I was a huge skeptic about the overall value of AI during all of the initial hype. Then I realized that this stuff isn't some fad that will disappear tomorrow. It will get better. The experience will get more refined. It will get more accurate. It will consume less energy. It will be totally ubiquitous. If you fail to come up to speed on an important new technology or trend, you will be left in the dust by those who do. I understand the skepticism and pushback, but the future moves forward regardless.
▲ | jeffybefffy519 4 days ago | parent | prev | next [-]
Gemma3 is pretty useful on a long-haul flight without internet.
▲ | ActorNightly 3 days ago | parent | prev | next [-]
Smaller models require a lot more direction, a.k.a. system prompt engineering, and sometimes custom wrappers. For example, Gemma models are very eager to generate code even if you tell them not to.
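One shape such a wrapper can take: a strict system prompt plus a post-filter that strips code blocks the model emits anyway. The endpoint and model name are assumptions:

    # Wrapper for an over-eager small model: forbid code in the prompt,
    # then remove any fenced blocks it produces regardless.
    import re
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")
    SYSTEM = ("Answer in plain prose. Do not write code, pseudocode, "
              "or shell commands under any circumstances.")

    def ask_prose_only(question: str) -> str:
        resp = client.chat.completions.create(
            model="gemma3:12b",  # illustrative
            messages=[{"role": "system", "content": SYSTEM},
                      {"role": "user", "content": question}],
        )
        text = resp.choices[0].message.content
        return re.sub(r"```.*?```", "[code removed]", text, flags=re.DOTALL)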
▲ | bigyabai 4 days ago | parent | prev | next [-]
Qwen3 A3B (in my experience) writes code as good as ChatGPT 4o and much better than GPT-OSS.
▲ | mentalgear 4 days ago | parent | prev | next [-]
Something like Rewind or OpenRecall can use local LLMs for on-device semantic search.
▲ | segmondy 4 days ago | parent | prev [-]
The same way you use a cloud LLM.