| ▲ | originalvichy 2 days ago |
| The wait is finally over. One or two iterations, and I’ll be happy to say that language models are more than fulfilling my most common needs when self-hosting. Thanks to the Gemma team! |
|
| ▲ | vunderba 2 days ago | parent | next [-] |
| Strongly agree. Gemma3:27b and Qwen3-vl:30b-a3b are among my favorite local LLMs and handle the vast majority of translation, classification, and categorization work that I throw at them. |
| ▲ | adamtaylor_13 2 days ago | parent | prev | next [-] |
| What sort of tasks are you using self-hosting for? Just curious as I've been watching the scene but not experimenting with self-hosting. |
| ▲ | vunderba 2 days ago | parent | next [-] |
| Not OP, but one example is that recent VL models are more than sufficient for analyzing your local photo albums/images and creating metadata/descriptions/captions to help better organize your library. |
| ▲ | kejaed 2 days ago | parent [-] |
| Any pointers on some local VLMs to start with? |
| ▲ | vunderba 2 days ago | parent | next [-] |
| The easiest way to get started is probably something like Ollama with the `qwen3-vl:8b` 4-bit quantized model [1]. It's a good balance between accuracy and memory, though in my experience it's slower than older model architectures such as Llava. Just be aware that Qwen-VL tends to be a bit verbose [2], and you can't really control that reliably with token limits; it'll just cut off abruptly. You can ask it to be more concise, but it's hit or miss. What I often end up doing (and I admit it's a bit ridiculous) is letting Qwen-VL generate its full detailed output, then passing that to a different LLM to summarize. |
| - [1] https://ollama.com/library/qwen3-vl:8b |
| - [2] https://mordenstar.com/other/vlm-xkcd |
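Not part of the original thread, but the two-pass workflow described above (let the VL model produce its full verbose caption, then hand that to a second model to condense) can be sketched against Ollama's default local HTTP API. The endpoint is Ollama's standard one; the summarizer model name `gemma3:4b` is just an illustrative assumption.

```python
import base64
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint


def build_request(model, prompt, image_path=None):
    """Build a non-streaming payload for Ollama's /api/generate.

    If an image path is given, it is base64-encoded into the `images`
    field, which is how Ollama passes images to vision models.
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    if image_path:
        with open(image_path, "rb") as f:
            payload["images"] = [base64.b64encode(f.read()).decode("ascii")]
    return payload


def generate(payload):
    """POST the payload to a locally running Ollama server; return the text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


# Usage (requires an Ollama server with both models pulled):
#   caption = generate(build_request(
#       "qwen3-vl:8b", "Describe this photo for an album caption.", "photo.jpg"))
#   summary = generate(build_request(
#       "gemma3:4b", "Summarize in one sentence: " + caption))
```

The summarize pass is plain text-to-text, so any small local model will do; only the first pass needs a vision-capable model.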
| ▲ | canyon289 2 days ago | parent | prev [-] |
| You could try Gemma4 :D |
|
| ▲ | ktimespi 2 days ago | parent | prev | next [-] |
| For me: receipt scanning, and tagging documents and parts of speech in my personal notes. It's a lot of manual labour and I'd like to automate it if possible. |
| ▲ | ezst 2 days ago | parent [-] |
| Have you tried paperless-ngx, a tried-and-tested open-source solution that's been filling this niche successfully for years now? |
| ▲ | codethief a day ago | parent [-] |
| They, too, offer LLM integrations these days, presumably for better OCR and classification. |
|
| ▲ | mentalgear 2 days ago | parent | prev | next [-] |
| Adding to the question: any good small open-source model that's highly accurate at reading/extracting tables and/or PDFs with more uncommon layouts? |
| ▲ | BoredPositron 2 days ago | parent | prev | next [-] |
| I use local models for autocomplete in simple coding tasks, CLI autocomplete, formatting, as a Grammarly replacement, translation (it/de/fr -> en), OCR, simple web research, dataset tagging, file sorting, email sorting, and validating configs or creating boilerplate for well-known tools. And much more: basically anything I would have used OpenAI's old mini models for. |
| ▲ | irishcoffee 2 days ago | parent | prev [-] |
| I would personally be much more interested in using LLMs if I didn't need to depend on an internet connection and spend money on tokens. |
| ▲ | kolja005 a day ago | parent | prev | next [-] |
| I would be inclined to agree with this, except that my "most common needs" keep expanding and increasing in difficulty each year. In 2023 and 2024, most of my needs were asking models simple questions and getting a response; they were a drop-in replacement for Stack Overflow, and the best open-source models I can run on my laptop today serve that need. Now that coding agents are a thing, my frame of reference has shifted: a model that can drive a coding agent is now my most common need, and unfortunately today's open models cannot do that reliably. They might, like you said, be able to in a year or two, but by then the cloud models will have a new capability that I will come to regard as a basic necessity for doing software development. All that said, this looks like a great release and I'm looking forward to playing around with it. |
| ▲ | dakolli a day ago | parent | prev [-] |
| "The wait is finally over"... then proceeds to say they actually need to wait two more iterations... Classic LLM user who's fried their brain. Also, y'all have been saying "the wait is over" for 3 years, or that open-source LLMs that compete with foundation models are just months away. It's simply never going to happen, because honestly they wouldn't give those away, and you're living in a fantasy land if you think they're going to give you the ability to out-compete themselves. |