repeekad 6 hours ago

I once asked one of the original YouTube infra engineers “will you ever need to delete the long tail of videos no one watches”

They said it didn’t matter, because the sheer volume of new data flowing in was growing so fast that it made the old data just a drop in the bucket.

jl6 3 hours ago | parent | next [-]

One day, it will matter. Not even Google can escape the consequences of infinite growth. Kryder's Law is over. We cannot rely on storage getting cheaper faster than we can fill it, and orgs cannot rely on being able to extract more value from data than it costs to store it. Every other org knows this already. The only difference with Google is that they have used their ad cash generator to postpone their reality check moment.

One day, somebody is going to be tasked with deciding what gets deleted. It won't be pretty. Old and unloved video will fade into JPEG noise as the compression ratio gets progressively cranked, until all that remains is a textual prompt designed to feed an AI model that can regenerate a facsimile of the original.

asah 3 hours ago | parent | next [-]

You can see how Google rolls with how they deleted old Gmail accounts - years of notice, lots of warnings, etc. They finally started deletions recently, and I haven't heard a whimper from anyone (yet).

flux3125 2 hours ago | parent [-]

The problem is that some content creators have already passed away (and others will pass away by then), and their videos will likely be deleted forever.

shevy-java an hour ago | parent | next [-]

That may be, but I assume for videos that had some viewership base, there may be a consideration. E.g. if a video was viewed 20 million times, it may be worth more than one that was viewed only 5 times.

eMPee584 11 minutes ago | parent [-]

I've stumbled upon very valuable content with very low view numbers - the algorithms spiral around spectacularity and provocation, not quality or insight.

zaik 2 hours ago | parent | prev | next [-]

Hopefully the deletion will not affect videos with thousands of views, even if the account is lost.

loloquwowndueo an hour ago | parent [-]

Sweet summer child.

CuriouslyC an hour ago | parent [-]

Goog is 100% not going to delete anything that is driving any advertising at all. The videos are also useful for training AI regardless, so I expect the set of stuff that's deleted will be a VERY small subset. The difference with email is that email can be deduplicated, since it's a broadcast medium, while video is already canonical.

I expect rather than deleting stuff, they'll just crank up the compression on storage of videos that are deemed "low value."

dessimus 2 hours ago | parent | prev [-]

Monuments erode away and memories of those enshrined are lost to time as well; nothing lasts forever.

bentcorner 34 minutes ago | parent | next [-]

    I met a user from an antique land
    Who said: Two squares of a clip of video
    Stand in at the end of the search. Near them,
    Lossily compressed, a profile with a pfp, whose smile,
    And vacant eyes, and shock of content baiting,
    Tell that its creator well those passions read
    Which yet survive, stamped on these unclicked things,
    The hand that mocked them and the heart that fed:
    And on the title these words appear:
    "My name is Ozymandias, Top Youtuber of All Time:
    Look on my works, ye Mighty, and like and subscribe!"
    No other video beside remains. Round the decay
    Of that empty profile, boundless and bare
    The lone and level page stretch far away.

herodoturtle 5 minutes ago | parent | prev [-]

Like tears in rain <3

dyauspitr 37 minutes ago | parent | prev [-]

It depends. At the rough 2 PB of new data they get a day, that’s about 10 sq ft of physical rack space per day. Each data center is like 500,000 sq ft, so at a constant upload rate each data center can hold roughly 137 years of YouTube uploads (500,000 / 10 = 50,000 days). They’re not going to have to restrict uploads anytime soon.
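
As a rough check on the arithmetic in this comment, here is a minimal Python sketch. The 10 sq ft per day and 500,000 sq ft figures are the commenter's estimates; treating the whole floor as usable rack space, the constant-rate model, and the 30% annual growth rate are hypothetical assumptions chosen only for illustration.

```python
# Back-of-the-envelope capacity estimate using the figures quoted in the thread.
SQFT_PER_DAY = 10          # commenter's estimate for ~2 PB/day of uploads
DATACENTER_SQFT = 500_000  # commenter's estimate for one data center
DAYS_PER_YEAR = 365


def years_at_constant_rate(floor_sqft: float, sqft_per_day: float) -> float:
    """Years until the floor is full if the daily footprint never changes."""
    return floor_sqft / sqft_per_day / DAYS_PER_YEAR


def years_until_full(floor_sqft: float, sqft_per_day: float, annual_growth: float) -> int:
    """Whole years until the floor is full if the daily footprint grows
    by `annual_growth` (e.g. 0.3 for 30%) each year. Assumes growth >= 0."""
    used = 0.0
    daily = sqft_per_day
    years = 0
    while used < floor_sqft:
        used += daily * DAYS_PER_YEAR   # space consumed during this year
        daily *= 1 + annual_growth      # next year's uploads are larger
        years += 1
    return years


constant = years_at_constant_rate(DATACENTER_SQFT, SQFT_PER_DAY)
growing = years_until_full(DATACENTER_SQFT, SQFT_PER_DAY, annual_growth=0.3)  # hypothetical rate
print(f"constant rate: about {constant:.0f} years")
print(f"30% annual growth: about {growing} years")
```

The second function shows how sensitive the estimate is to the growth assumption: a century-scale figure at a flat upload rate shrinks considerably once uploads compound year over year.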

MagicMoonlight 7 minutes ago | parent | prev | next [-]

Now that they can harvest it all for AI training, that decision was the cheapest and greatest thing they ever did.

Imagine trying to pay for all that content, nobody on earth would be able or willing to supply it.

arjie 6 hours ago | parent | prev | next [-]

Videos do disappear, though. https://www.reddit.com/r/DataHoarder/comments/1ioz4x1/is_it_...

Searching hn.algolia.com for examples will yield numerous ones.

https://news.ycombinator.com/item?id=23758547

https://bsky.app/profile/sinevibes.bsky.social/post/3lhazuyn...

Kwpolska 5 hours ago | parent [-]

Of course videos disappear for copyright, ToS violations, or when the uploaders remove them. They do not disappear just because nobody watched them.

Gigachad 3 hours ago | parent | next [-]

There’s a whole activity around discovering random 15-year-old videos with almost no views. It’s usually some random home video.

leephillips 41 minutes ago | parent | prev [-]

They also disappear when the government of Pakistan tells Google to erase them: https://lee-phillips.org/youtube/

ntoskrnl_exe 3 hours ago | parent | prev | next [-]

Wouldn't it also be a performance nightmare?

The energy bill for scanning through the terabytes of metadata would be comparable to that of several months of AI training, not to mention the time it would take. Then deleting a few million random 360p videos and putting MrBeast in their place would result in insane fragmentation of the new files.

It might really just be cheaper to keep buying new HDDs.

dev1ycan 3 hours ago | parent | next [-]

This is why they removed searching for older videos (by specific time) and why their search pushes certain algorithmic videos. Other older videos, when found by direct link, are on long-term storage and take a while to start loading.

joecool1029 2 hours ago | parent [-]

I’m pretty sure this is the real reason why they changed old unlisted videos to being marked private: https://blog.youtube/news-and-events/update-youtube-unlisted...

stogot 3 hours ago | parent | prev | next [-]

S3 allows delete and is efficient here. I’m sure Google can figure it out

They allow search by timestamp; I’m sure YouTube can write an algo to find videos with <=1 view.

moffkalast 3 hours ago | parent | prev [-]

Besides with their search deteriorating to the point where a direct video title doesn't result in a match, nobody can see those videos anyway and they don't have to cache them.

sfn42 3 hours ago | parent [-]

It's not just the search deteriorating. The frontend is littered with bugs. If you write a comment and try to highlight and delete part of that comment, it'll often delete the part you didn't highlight. So apparently they implemented their own textfield for some reason and also fucked it up. It's been like that for years.

The youtube shorts thing is buggy as shit; it'll just stop working a lot of the time, just won't load a video. Sometimes you have to go back and forth a few times to get it to load. It'll often desync the comments from the video, so you're seeing comments from a different video. Sometimes the sound from one short plays over the visuals of another.

It only checks for notifications when you open the website from a new tab, so if you want to see if you have any notifications you have to open youtube in a new tab. Refreshing doesn't work.

Seems like all the competent developers have left.

r_lee 2 hours ago | parent [-]

and if you do a hard refresh on the webapp, it literally takes like 10 seconds for the homepage to load

sfn42 2 hours ago | parent [-]

Yeah, one that I forgot to mention is if you pause a youtube short and go to a different tab, the short will unpause in the background, or it might change to an entirely different short and start playing that.

wasmainiac 6 hours ago | parent | prev [-]

I wonder if that still holds true? The volume of videos increases exponentially, especially with AI slop. I wonder if at some point they will have to limit the storage per user, with a paid model if you surpass that limit. Many people who upload many videos make, I guess, some form of income off YouTube, so it wouldn’t be that big of a deal.

weird-eye-issue 6 hours ago | parent | next [-]

What they said only holds true because the growth continues so that the old volume of videos doesn't matter as much since there's so many more new ones each year compared to the previous year. So the question is more about whether or not it will hold true in the long term, not today

raincole 3 hours ago | parent [-]

The framing here is really weird. The volume of videos increasing isn't 'growth.' Videos are inventory for Youtube. They're only good when people (without adblocks!) actually watch them.

weird-eye-issue an hour ago | parent | next [-]

Growth in this context means that there is a larger volume of videos each year. So each year a single video is an exponentially smaller percentage of the total.
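
To make the "drop in the bucket" point concrete, here is a small Python sketch of how the oldest year's uploads shrink as a share of the whole archive when uploads compound. The 1.5x yearly growth factor is a hypothetical value for illustration, not a YouTube figure.

```python
def oldest_year_share(growth_factor: float, years: int) -> float:
    """Fraction of the total archive contributed by the first year's uploads,
    assuming each year's upload volume is `growth_factor` times the previous one."""
    yearly = [growth_factor ** k for k in range(years)]  # year 0 is normalized to 1
    return yearly[0] / sum(yearly)


# Hypothetical 1.5x growth per year: watch the first year's share fall.
for n in (1, 5, 10, 20):
    print(f"after {n:2d} years: {oldest_year_share(1.5, n):.4%} of the archive")
```

Under this model the share falls geometrically, which is why deleting old videos would free little space as long as the growth holds. If growth flattens, the old data stops shrinking in relative terms, which is the long-term question raised upthread.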

amelius 3 hours ago | parent | prev [-]

^ This.

pogue 6 hours ago | parent | prev | next [-]

I assume it's an economics issue. As long as they continue making money off the uploads to a higher extent than it costs for storage, it works out for them.

throw_await 6 hours ago | parent [-]

Do they make a profit nowadays?

rezonant 4 hours ago | parent [-]

Likely yes, with a margin of perhaps 38%

https://news.ycombinator.com/item?id=34268536

ranger_danger 6 hours ago | parent | prev [-]

I wonder if anyone has ever compiled a list of channels with abnormally large numbers of videos? For example this guy has over 14,000:

https://www.youtube.com/@lylehsaxon

HeliumHydride 5 hours ago | parent [-]

There is a channel with 2 million videos: https://www.youtube.com/@RoelVandePaar/videos One with 4 million videos: https://www.youtube.com/@NameLook

buenzlikoder 5 hours ago | parent | next [-]

NameLook puts a whole new meaning to "low effort videos"

wellf 5 hours ago | parent | prev [-]

First one has transcribed stack overflow to YT by the look of it