| ▲ | raw_anon_1111 15 hours ago |
| I do all of my “AI” development on top of AWS Bedrock, which hosts every available model except OpenAI's closed-source models, which are exclusive to Microsoft. It's extremely easy to write a library that makes switching between models trivial. I could add OpenAI support; it would be just slightly more complicated because I would need a separate set of API keys, whereas now I can just use my AWS credentials. Latency would also theoretically be worse: when you both host on AWS and use AWS for inference, you stay within the internal network (yes, I know to use VPC endpoints). There is no moat around switching models, despite what Ben says. |
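The abstraction described above can be sketched as a thin dispatch layer over Bedrock's `InvokeModel` API: the call is identical across vendors, and only the JSON request body differs per model family. This is a minimal illustration, not the commenter's actual library; `build_body` and `invoke` are hypothetical helper names, though the body shapes follow Bedrock's published per-family formats.

```python
import json

# Hypothetical helper: build the request body for a given Bedrock
# model family. Body shapes follow Bedrock's documented formats.
def build_body(model_id: str, prompt: str, max_tokens: int = 512) -> dict:
    if model_id.startswith("anthropic."):
        return {
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": max_tokens,
            "messages": [{"role": "user", "content": prompt}],
        }
    if model_id.startswith("amazon.titan"):
        return {
            "inputText": prompt,
            "textGenerationConfig": {"maxTokenCount": max_tokens},
        }
    if model_id.startswith("meta."):
        return {"prompt": prompt, "max_gen_len": max_tokens}
    raise ValueError(f"unknown model family: {model_id}")

def invoke(client, model_id: str, prompt: str) -> str:
    # client is a boto3 "bedrock-runtime" client; the call itself is
    # the same for every vendor, only the body changes.
    resp = client.invoke_model(
        modelId=model_id,
        body=json.dumps(build_body(model_id, prompt)),
    )
    return resp["body"].read().decode()
```

Switching models is then just a different `model_id` string with the same AWS credentials; Bedrock's newer Converse API removes even the per-family body differences.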
|
| ▲ | bambax 13 hours ago | parent | next [-] |
| openrouter.ai does exactly that, and it lets you use models from OpenAI as well. I switch models often using openrouter. But, talk to any (or almost any) non-developer and you'll find they 1/ mostly only use ChatGPT, sometimes only know of ChatGPT and have never heard of any other solution, and 2/ in the rare case they did switch to something else, they don't want to go back, they're gone for good. Each provider has a moat that is its number of daily users; and although it's a little annoying to admit, OpenAI has the biggest moat of them all. |
| |
▲ | raw_anon_1111 13 hours ago | parent | next [-] | | Non-developers using chatbots and being willing to pay is never going to be as big as the enterprise market or Big Tech using AI in the background. I would think that Gemini (the model) will add profit to Google well before OpenAI ever becomes profitable, since Google can leverage it within its own business. Why would I pay for openrouter.ai and add another dependency? If I'm just using Amazon Bedrock-hosted models, I can use the AWS SDK, change the request format slightly based on the model family, and abstract that into my library. | | |
| ▲ | bambax 11 hours ago | parent [-] | | You don't need openrouter if you already have everything set up in your own AWS environment. But if you don't, openrouter is extremely straightforward, just open an account and you're done. |
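The "open an account and you're done" workflow comes from openrouter speaking the standard OpenAI chat-completions format: switching providers is just a different `model` string. A minimal sketch of the request shape using only the standard library (the endpoint and payload follow OpenRouter's documented API; `make_request` is a hypothetical helper, and no real key is shown):

```python
import json
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def make_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    # OpenRouter is OpenAI-compatible, so switching between e.g.
    # "openai/gpt-4o" and "anthropic/claude-3.5-sonnet" changes
    # nothing but the "model" field.
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# To actually send it: urllib.request.urlopen(make_request(...))
```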
| |
| ▲ | redwood 6 hours ago | parent | prev | next [-] | | All google needs to do is bite the bullet on the cost and flip core search to AI and immediately dominate the user count. They can start by focusing first on questions that get asked in Google search. Boom | | |
| ▲ | raw_anon_1111 6 hours ago | parent [-] | | Core search has been using “AI” since they basically deprioritized PageRank. I think the combination of AI overviews and a separate “AI mode” tab is good enough. |
| |
| ▲ | EmiDub 8 hours ago | parent | prev [-] | | How is the number of users a moat when you are losing money on every user? | | |
| ▲ | WalterSear 2 hours ago | parent | next [-] | | Inference is cash positive: it's research that takes up all the money. So, if you can get ahold of enough users, the volume eventually works in your favour. | |
| ▲ | raw_anon_1111 8 hours ago | parent | prev [-] | | A moat involves switching costs for users. It’s not related to profitability |
|
|
|
| ▲ | spruce_tips 12 hours ago | parent | prev | next [-] |
| I agree there is no moat in the mechanics of switching models, i.e. what openrouter does. But it's not as straightforward as everyone says to swap out the model powering a workflow that has been tuned around that model, whether the tuning was purposeful or accidental. It takes time to re-evaluate whether the new model works as well as or better than the old one. That said, I don't believe OpenAI's models consistently produce the best results. |
| |
| ▲ | raw_anon_1111 12 hours ago | parent [-] | | You need a way to test model changes regardless as models in the same family change. Is it really a heavier lift to test different model families than it is to test going from GPT 3.5 to GPT 5 or even as you modify your prompts? | | |
▲ | spruce_tips 11 hours ago | parent [-] | | No, I don't think it's a heavier lift to test different model families. My point was that swapping models, whether to a different model family or to a new version within the same family, isn't straightforward. I'm reluctant both to upgrade model versions AND to swap model families, and that in itself is a type of stickiness that multiple model providers have. Maybe another way of saying the same thing is that there is still a lot of work left to make eval tooling better! | | |
▲ | DenisM 5 hours ago | parent [-] | | Continuous eval is unavoidable even absent model changes. Agents are keeping memories, tools evolve over time, external data changes, new exploits are being deployed, partner agents get upgraded. There's too much entropy in the system. Context babysitting is our future. |
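The continuous-eval loop this subthread describes can be as small as a scored test set run against every candidate on every change. A toy sketch, with stub models standing in for real provider calls (the `models` dict, `cases` list, and `eval_models` name are all placeholders, not any real eval framework):

```python
# Toy eval harness: score each candidate model against a fixed
# test set, so prompt changes and model swaps can be compared.
def eval_models(models, cases):
    # models: name -> callable(prompt) -> answer
    # cases:  list of (prompt, check) where check(answer) -> bool
    scores = {}
    for name, model in models.items():
        passed = sum(1 for prompt, check in cases if check(model(prompt)))
        scores[name] = passed / len(cases)
    return scores

# Stubs standing in for two model families.
models = {
    "model_a": lambda p: p.upper(),
    "model_b": lambda p: p,
}
cases = [
    ("hello", lambda a: a == "HELLO"),
    ("world", lambda a: "world" in a.lower()),
]
print(eval_models(models, cases))  # model_a passes 2/2, model_b 1/2
```

Re-running the same harness after any change (new model version, new family, new prompt, or just drifting external data) is what makes the switch testable at all.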
|
|
|
|
| ▲ | biophysboy 14 hours ago | parent | prev [-] |
| Have you noticed any significant AND consistent differences between them when you switch? I frequently get a better answer from one vs the other, but it feels unpredictable. Your setup seems like a better test of this |
| |
▲ | raw_anon_1111 14 hours ago | parent | next [-] | | For the most part, I don't do chatbots, except for a couple of RAG-based ones. It's more behind-the-scenes stuff: image understanding, categorization, nuanced sentiment analysis, semantic alignment, etc. I've created a framework that lets me test quality in an automated way across prompt changes and models, and I compare cost/speed/quality. The only thing that requires humans to judge quality out of all those is RAG results. | | |
| ▲ | biophysboy 14 hours ago | parent [-] | | So who is the winner using the framework you created? | | |
▲ | raw_anon_1111 14 hours ago | parent [-] | | It depends. Amazon's Nova Lite gave me the best speed-vs-performance tradeoff when I needed really quick real-time inference for categorizing a user's input (think call centers). One of Anthropic's models did the best with image understanding, with Amazon's Nova Pro slightly behind. For my tests, I used a customer's specific set of test data. For RAG I forget, but it's much more subjective; I just gave the customer the ability to configure the model and modify the prompt so they could choose. | | |
| ▲ | biophysboy 14 hours ago | parent [-] | | Your experience matches mine then... I haven't noticed any clear, consistent differences. I'm always looking for second opinions on this (bc I've gotten fairly cynical). Appreciate it |
|
|
| |
▲ | kevstev 13 hours ago | parent | prev [-] | | Check out https://poe.com - it does the same thing. I agree with your assessment though: while you can get better answers from some models than others, it's hard to predict in advance which model will give you the better answer. |
|