bel8 7 hours ago

They know Google has a ton of data to train LLMs on.

Recently I have been asking YouTube's new AI about some videos ("when is Steam metrics mentioned in the video?" for example), which means they also index videos. This is an unthinkable amount of data.

I'm actually impressed at how bad Alphabet is with LLMs, since they invented the thing as we know it AND have all the data to train on, yet OpenAI and Anthropic are eating their pie.

mitchell_h 7 hours ago | parent | next [-]

I use Anthropic's models daily, and sometimes switch to Gemini. Google is losing the marketing front BADLY, but their AI service is surprisingly great. It's far cheaper than Anthropic, for one, and for my kind of research it's just better.

lemax 6 hours ago | parent | next [-]

I'm quite certain that Google's AI services are likely the most used in the world right now by virtue of having the widest distribution. It's in the search box. It's on your Android phone. Just because they aren't the preferred coding or research agent does not mean they are losing - that's a pretty small slice.

dansquizsoft 4 minutes ago | parent [-]

It can be everywhere, but that doesn't mean users are paying or even value it.

enos_feedler 7 hours ago | parent | prev | next [-]

who cares about marketing when you have distribution? Probably a smart move to pump dollars into the product and not the marketing.

jgalt212 3 hours ago | parent [-]

in high margin businesses, customer acquisition is everything.

SOLAR_FIELDS 2 hours ago | parent [-]

If your product becomes commoditized, it’s no longer a high margin business

mandeepj an hour ago | parent [-]

> If your product becomes commoditized

Depends on the product - whether protein bars, salty chips, cellular service, or iPhone or something else. If your product has a flavor, it’s never going to get commoditized. Coke still tastes better than Pepsi.

tempest_ 6 hours ago | parent | prev | next [-]

I have not tried the Gemini CLI in a few months but when I did it was a shit show.

Google makes it very hard to use their shit and it was full of bugs.

Anthropic's current run is based entirely around Claude Code in this space, and the last time I used the gemini-cli it wouldn't give me access to the latest models, and I was paying them for the privilege.

Tuna-Fish 6 hours ago | parent [-]

Google trashed the Gemini CLI client and replaced it with agy (antigravity), which is written in Go and is much nicer.

throwawaycan 20 minutes ago | parent | next [-]

Interesting you say that. Every user I speak to says antigravity cli is missing lots of features and Gemini cli was working quite well. Same for me.

tempest_ 4 hours ago | parent | prev [-]

So they did.

https://github.com/google-gemini/gemini-cli/discussions/2727...

I get the complaints in that thread but I still think it is hilarious. That repo is a gong show of random shit and perhaps one of the best worst examples of "opensource" LLM development.

vasco 7 hours ago | parent | prev | next [-]

Is it? My mom and all her friends use "the intelligence". What is it? Gemini, because it's on their Android phone.

sumoboy 7 hours ago | parent | prev [-]

I think Google is sandbagging a bit here, knowing they have all the data and likely better models hiding. My theory is it's a bit of not disrupting the stock market direction by exposing who's really the boss. If they can do it cheaper, faster, and better, people start asking questions, especially with upcoming IPOs.

mgfist 7 hours ago | parent | next [-]

This makes no sense. Google is beholden to its own shareholders, not the markets at large.

In any case, it's well known that devs in Google have liked Anthropic/OpenAI models for coding more than Gemini, so unless they're hiding their best models from the people within, I think it's just the case that they're behind.

hattmall 6 hours ago | parent | next [-]

It's more that they know they can eventually clone any successes the other companies have and steal their market share. There really is no moat. In a more normal environment they would be buyout candidates but that's a bit too far gone at this point, so you just let them run until they are out of gas and Google can benefit from any advances without fronting the cost.

Even with Anthropic's record-breaking revenue growth I don't see how the pure AI companies can sustain themselves, but the catch-22 is that any obvious pivot proves that. This puts the more traditional tech companies in position to ride the back of the wave until the growth curve tops.

JumpCrisscross 6 hours ago | parent [-]

> they know they can eventually clone any successes the other companies have

Google has gone all in on AI. To the point of challenging their own core product. Apple is waiting and seeing. Google is building and distributing, albeit with terrible marketing.

gizajob 4 hours ago | parent | next [-]

Apple isn’t waiting and seeing on the hardware side, only on implementing AI on the software side, which there doesn’t seem to be much demand for them to do. Apple are well set for on-device LLMs and agents with their Mx Max CPU/GPU, and their wait on the rest is saving them hundreds of billions by not burning all their profitability to the ground building Nvidia-filled datacenters the same as everyone else. That is why Google is now having to hunt for extra money by raising capital like this.

0gs 6 hours ago | parent | prev [-]

search is not their core product though, it's ads. they ain't challenging anything.

lukeschlather 6 hours ago | parent | prev | next [-]

Coding is a pretty small slice of the markets in play. Google's models are driving cars right now. Using coding agents doesn't give much insight into performance in the broader world; I would assume Google is performing better in general even if Claude or Codex is currently outperforming for coding.

s1artibartfast 4 hours ago | parent | prev [-]

Google also owns 15% of Anthropic.

WarmWash 3 hours ago | parent | prev [-]

It's important to remember that the cloud division, rapidly becoming Google's golden goose, does not give one fuck about Gemini and would happily sell out all of Gemini's compute to Anthropic and OAI if given the opportunity.

marcus_holmes 2 hours ago | parent | prev | next [-]

Kodak problem. Kodak invented the digital camera but their revenue came from making photographic film. They were unable to take advantage of their invention because it would cannibalise their revenue. That didn't stop other people and the revenue died anyway.

Google's main revenue is ads based on search. LLMs are a competitor to search. Creating better LLMs will cut into search volumes.

In any large organisation this is extraordinarily difficult to manage - they have to incentivise the new tech that is actively harming the current revenues, while maintaining as much of the old revenues as possible, without creating internal conflict between these two parts of the organisation that will kill it.

Though in fairness to Google they do seem to realise this and are trying to adapt - they're letting the LLM folks mess with search. It'll be interesting to see how this goes.

wrsh07 29 minutes ago | parent [-]

This is a sensible-seeming take at first blush, but it doesn't hold up to any scrutiny (or maybe my scrutiny is faulty - you tell me!)

Sundar and many of his executives have certainly read or heard of The Innovator's Dilemma, and I expect they're all moderately paranoid that it will be their downfall.

Also, that's not it. Google has a great AI app called Gemini where they have at various points hosted the top AI image generation model (certainly for speed, and for a while for accuracy) and have innovated with features like deep research.

They are monetizing their AI conversations more effectively than OpenAI could dream of via ads and chat in Google search.

They are heavily investing in compute and talent.

When they've added LLM results to Google search it has _increased_ engagement and re-engagement.

What part of the competition are they blissfully ignoring?

(I have counter arguments to some of these points, but I would rather hear other people's)

onlyrealcuzzo 4 hours ago | parent | prev | next [-]

I wouldn't be surprised if Google's logs alone are a substantial portion of all data created daily...

monkpit 2 hours ago | parent [-]

Do they even do logging in the traditional sense? Surely they have some bespoke googly solution.

jonwachob91 7 hours ago | parent | prev | next [-]

I've also asked the YouTube AI about when some things are mentioned in videos, and upon verification the AI is just hallucinating.

tekacs 7 hours ago | parent | prev | next [-]

I don't think they 'index' videos, per se. They just point the model at the video's transcript on demand when you ask a question, I believe. Doesn't change any of your conclusions, though. You're absolutely right, they have an absolute ton of data.

f0rgot 4 hours ago | parent | prev | next [-]

Are you sure it’s not using transcripts? That would be equally useful but technologically less impressive.

tayo42 3 hours ago | parent [-]

Turning all of those annoyingly verbose and long YouTube videos into text that could be searched, summarized and referenced easily would be amazing

ecommerceguy 4 hours ago | parent | prev | next [-]

Pretty sure it's only for videos with CC enabled.

CamperBob2 7 hours ago | parent | prev | next [-]

Not only that, but the same webmasters who try to shoo AI crawlers away actively court Google's bots.

cj 7 hours ago | parent [-]

Really? Every business owner I know outside of HN wants to be discoverable by LLMs.

iamacyborg 7 hours ago | parent [-]

Being discoverable is one thing, having your content stolen wholesale is another

losvedir 4 hours ago | parent | next [-]

Most of the economy is not journalists or people who sell "content" online. In most cases I can think of - retailer, restaurant, hotel, plumber, any local small business, they want their content ingested. That means the AI chatbot knows about them and they can be in answers potentially.

Polizeiposaune 7 hours ago | parent | prev | next [-]

And having your content rendered inaccessible to humans by a DDoS attack from overly aggressive webcrawlers that ignore robots.txt is yet another.

7 hours ago | parent | prev [-]
[deleted]
MagicMoonlight 7 hours ago | parent | prev [-]

Everyone mocked them for paying for YouTube for years with no real income. Now it’s the most valuable data source in the world.