| ▲ | sorenjan 11 hours ago |
| I had no idea they had their own cloud offering, I thought the whole point of Ollama was local models? Why would I pay $20/month to use small inferior models instead of using one of the usual AI companies like OpenAI or even Mistral? I'm not going to make an account to use models on my own computer. |
|
| ▲ | mchiang 11 hours ago | parent | next [-] |
| Fair question. Some of the supported models are large and wouldn't fit on most local devices. This is just the beginning, and with the relationships we've built with model providers, Ollama doesn't need to exclude cloud-hosted frontier models either. We just have to be mindful that Ollama stands with developers and solves their needs. https://ollama.com/cloud |
| |
▲ | sorenjan 11 hours ago | parent | next [-] | | > Some of the supported models are large and wouldn't fit on most local devices. Why would I use those models on your cloud instead of using Google's or Anthropic's models? I'm glad there are open models available and that they get better and better, but if I'm paying money to use a cloud API I might as well use the best commercial models; I think they will remain much better than the open alternatives for quite some time. | | |
▲ | mchiang 10 hours ago | parent | next [-] | | When we started Ollama, we were told open-source (open-weight wasn't a term back then) would always be inferior to closed-source models. That was two years ago (Ollama's birthday is July 18th, 2023). Fast forward to now: open models are quickly catching up, at a significantly lower price point for most uses, and they can be customized for specific tasks instead of staying general purpose. For general-purpose use, the closed models are absolutely still dominating. | | |
▲ | typpilol 10 hours ago | parent [-] | | Yeah, a lot of people don't realize you could spend $2k on a 5090 to run some of the large models, or spend $20 a month for models even a 5090 couldn't run, and not pay for your own electricity, hardware, maintenance, updates, etc. | | |
▲ | oytis 9 hours ago | parent [-] | | $20 a month for a commercial model is price dumping financed by investors. For Ollama, it's hopefully a sustainable price. | | |
▲ | theshrike79 8 minutes ago | parent [-] | | The $20-a-month models definitely aren't sustainable. This is why everyone needs to try every flavour and speedrun building all the tools they need before the infinite money faucets are turned off. At some point companies will start raising prices or moving towards per-token pricing (which is sustainable, but expensive). |
|
|
| |
▲ | ineedasername 10 hours ago | parent | prev [-] | | A person can use Google's Gemma models on Ollama's cloud and possibly pay less, and have more quality control that way (and other types of control, I guess), since there's no need to wonder whether a recent model update or load-balancing throttling affected results. Your use case doesn't generalize. |
| |
▲ | disiplus 10 hours ago | parent | prev [-] | | Hi, to me this sounds like you're going in the direction of OpenRouter. |
|
|
| ▲ | kordlessagain 9 hours ago | parent | prev | next [-] |
| You make an account to use their hosted models AND to have them available via the Ollama API LOCALLY. I'm spending $100 on Claude and $200 on GPT-5, so $20 is NOTHING and totally worth having access to: Qwen3 235B, DeepSeek 3.1 671B (thinking and non-thinking), Llama 3.1 405B, and GPT-OSS 120B. Those are hardly "small inferior models". What is really cool is that you can set Codex up to use Ollama's API and then have it run tools on different models. |
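A rough sketch of what that wiring can look like (not the commenter's exact setup): Ollama serves an OpenAI-compatible API, so any client that speaks that protocol, Codex included, can be pointed at it. The model name and prompt below are illustrative.

```python
# Minimal sketch: point an OpenAI-compatible client at Ollama's local API.
# http://localhost:11434/v1 is Ollama's documented OpenAI-compatible endpoint;
# the model name is illustrative (use whatever you've pulled or enabled).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # local Ollama endpoint
    api_key="ollama",  # required by the client library, ignored locally
)

resp = client.chat.completions.create(
    model="gpt-oss:120b",  # illustrative; any Ollama-served model works
    messages=[{"role": "user", "content": "Summarize this repo's README."}],
)
print(resp.choices[0].message.content)
```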
| |
▲ | mrheosuper 20 minutes ago | parent | next [-] | | If you're on the $100 Claude tier, what makes you think the $20 Ollama tier is enough for you? | | |
▲ | theshrike79 5 minutes ago | parent [-] | | If your workflow is general enough, you can (and should) switch between models. They all have different styles and blind spots. For example, I had Codex + gpt-5-codex (20€ tier) build me a network connectivity monitor for my very specific use case. It worked, but made some really weird choices. I gave it to Claude Code (20€ tier again) and it immediately found a few issues and simplifications. |
| |
| ▲ | n4bz0r 6 hours ago | parent | prev [-] | | Has anyone tried the hosted models? How do they compare to GPT-5? I was thinking about trying ChatGPT Pro, but I seem to have completely missed that they bumped the price from $100 to $200. It was $100 just a while ago, right? Before GPT-5, I assume. |
|
|
| ▲ | ricardobeat 11 hours ago | parent | prev | next [-] |
| For models you can't run locally, like gpt-oss-120b, DeepSeek, or qwen3-coder 480b. And a way for them to monetize the success of Ollama. |
|
| ▲ | dcreater 10 hours ago | parent | prev | next [-] |
| Yeah, it's been a steady pivot to profitable features. Wonderful to see them build a reputation through FOSS and a codebase from free labor, and then cash in. |
| |
| ▲ | kergonath 10 hours ago | parent | next [-] | | As long as the software that runs locally gets maintained (and ideally improved, though if it is not I’ll simply move to something else), I find it difficult to be angry. I am more annoyed by software companies that offer a nerfed "community edition" whose only purpose is to coerce people into buying the commercial version. | | |
▲ | dcreater 10 hours ago | parent [-] | | > software companies that offer a nerfed "community edition" whose only purpose is to coerce people into buying the commercial version. This is the play. It's only a matter of time till they do it. Investors will want their returns. | | |
▲ | Imustaskforhelp 8 hours ago | parent [-] | | Pardon me, but is Ollama a company though? I didn't know that, actually. And are they VC funded? Are they funded by Y Combinator or anything else? I just thought it was a project by someone to write something similar to Docker but for LLMs, and that was its pitch for a really, really long time, I think. | | |
|
| |
| ▲ | all2 10 hours ago | parent | prev | next [-] | | What sort of monetization model would you like to see? What model would you deem acceptable? | | |
▲ | dcreater 9 hours ago | parent | next [-] | | Ollama, the local inference platform, stays completely local, maintained by a non-profit org with dev time contributed by a for-profit company. That company can be VC backed and can build its cloud inference platform, using Ollama as its backend, as a platform to market, etc. But keep it as a separate product (not named Ollama). This is almost exactly how DuckDB/MotherDuck functions, and I think they're doing an excellent job. EDIT: grammar and readability | |
▲ | troyvit 9 hours ago | parent | prev | next [-] | | If I were them I'd go whole-hog on local models and: * Work with somebody like System76 or Framework to create great hardware systems that come with their ecosystem preinstalled. * Build out a PaaS, perhaps in partnership with an existing provider, that makes it easy for anybody to do what Ollama search does. I'm more than half certain I could convince our cash-strapped organization to ditch Elasticsearch for that. * Partner with Home Assistant, get into home automation, and wipe the floor with Echo and its ilk (yeah, basically resurrect Mycroft but add whole-house automation to it). Each of those is half-baked, but it also took me 7 minutes to come up with them, and they seem more in line with what Ollama tries to represent than a pure cloud play using low-power models. |
▲ | Cheer2171 7 hours ago | parent | prev [-] | | Have the Ollama server support auth/API keys (a request that was closed as out of scope) and monetize the way everyone else does, around SSO. |
| |
▲ | Cheer2171 7 hours ago | parent | prev [-] | | What reputation? People who actually know how to develop software or work with LLMs know Ollama is a child's tricycle, and they run the hell away from what is just a buggy shell around other people's inference engines. Ollama is beloved by people who know how to write 5 lines of Python and bash to do API calls, but can't possibly improve the actual app. | |
▲ | dcreater 6 hours ago | parent [-] | | That's what I thought as well: that it was for people like me who aren't professional SWEs, and thus I'm sad to see them go this way. But what I've found is that people are using it for "on-prem"-style deployment. I have no idea if this is common, but I wouldn't be surprised, given the reality of AI startups plus the abundance of Ollama in training datasets leading to a relatively greater vibe-coding success rate. |
|
|
|
| ▲ | zmmmmm 8 hours ago | parent | prev [-] |
| A lot of "local" models are still very large to download and slow to run on regular hardware. I think it's great to have a way to evaluate them cheaply in the cloud before deciding to pull a model down to run locally. At some level, what matters is more the principle that I could run something locally than actually doing it. I don't want to become dependent on technology that someone could take away from me. |