Terr_ 11 hours ago

> generating a 10-second AI video costs roughly 160 times more than generating an equivalent amount of text

Hold up, "equivalent" how? It can't be based on "cost" of generation, or else it would be a 1x factor, by definition. Perhaps "costs" in this case refers to the unprofitable gap between revenues and expenses?

> Table 2

Weird, so it looks like some person just arbitrarily decided that 1K GPT-4 text tokens "is equivalent to" 10s of Sora 2 video?

That doesn't seem very rigorous.

motbus3 10 hours ago | parent | next [-]

Let me type and think

(I put it in Gemini for English translation) The most expensive tier, 1080p, is 0.70 USD per second. Since Sora 2 runs at 30 FPS, each frame costs roughly 2.3c. While a single 1920x1080 static image is 765 tokens, video models use spacetime compression. Instead of a raw 22,950 tokens per second (765 tokens x 30 frames), a second of 1080p video equates to roughly 10,000 'latent tokens' due to temporal redundancy. Adding 20 tokens per second of audio, we get roughly 10,020 tokens per second of output.

At $0.70 per second for ~10,020 tokens, the cost is approximately $0.00007 per token for Sora 2, so 10 seconds of Sora 2 video would cost $7.00 for roughly 100,200 tokens. In comparison, GPT-5.4-pro at 15 USD per 1M output tokens costs $0.000015 per token, so generating 100,200 tokens of text would cost only $1.50. This puts Sora 2 at roughly 4.7x more expensive than GPT-5.4-pro per token generated.

However, if we ignore video compression and treat every frame as a unique 1080p image (765 tokens each), Sora 2 becomes roughly 30x more expensive in terms of raw computational effort per frame.
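For anyone who wants to re-run the per-token arithmetic, here is a minimal sketch. Every input is a figure quoted in the comment above (price per second, tokens per image, latent and audio token counts, text price per million tokens); none of them is verified pricing, and the variable names are made up for illustration.

```python
# Inputs are the figures quoted in the comment, treated as assumptions.
PRICE_PER_SECOND_USD = 0.70         # quoted price of the 1080p tier
FPS = 30                            # quoted frame rate
TOKENS_PER_IMAGE = 765              # quoted token count for one 1920x1080 image
LATENT_TOKENS_PER_SECOND = 10_000   # quoted estimate after spacetime compression
AUDIO_TOKENS_PER_SECOND = 20        # quoted audio estimate
TEXT_PRICE_PER_MILLION_USD = 15.0   # quoted text output price
CLIP_SECONDS = 10

# Uncompressed token count per second, for reference.
raw_tokens_per_second = TOKENS_PER_IMAGE * FPS

# Compressed video tokens plus audio tokens per second.
tokens_per_second = LATENT_TOKENS_PER_SECOND + AUDIO_TOKENS_PER_SECOND

video_usd_per_token = PRICE_PER_SECOND_USD / tokens_per_second
text_usd_per_token = TEXT_PRICE_PER_MILLION_USD / 1_000_000

# Cost of a 10 s clip versus the same number of text tokens.
clip_tokens = tokens_per_second * CLIP_SECONDS
clip_video_cost = PRICE_PER_SECOND_USD * CLIP_SECONDS
clip_text_cost = clip_tokens * text_usd_per_token

# How many times more a video token costs than a text token.
ratio = video_usd_per_token / text_usd_per_token

print(f"raw tokens/s:       {raw_tokens_per_second}")
print(f"video USD/token:    {video_usd_per_token:.7f}")
print(f"text USD/token:     {text_usd_per_token:.7f}")
print(f"10 s clip, video:   ${clip_video_cost:.2f} for {clip_tokens} tokens")
print(f"same tokens, text:  ${clip_text_cost:.2f}")
print(f"video/text ratio:   {ratio:.2f}x")
```

Changing `LATENT_TOKENS_PER_SECOND` shows how sensitive the ratio is to the assumed compression, which is the least certain input here.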

trillic 11 hours ago | parent | prev | next [-]

It's a well known fact that 1 Picture == 1000 words.

goodmythical 10 hours ago | parent | next [-]

I've often used this in silly pseudo-proofs demonstrating that words have little to no value.

Given that a picture is worth 1000 words, a film (being a string of pictures) at 24fps is 129600 pictures in 90 minutes, and viewing a film might cost $15: a word can be rented for $0.000116 or at a rate of roughly 86 words per penny.

This also tracks well with paperback novels as 70k words would be a little over $8 and 100k words would be just under $12.

That said, I have nothing but the vaguest sense of what an average movie or book costs these days. Are movies $15? Does Walmart still have the $5 bin?

What about books? I know that the last time I was in a book store I was somewhat shocked by the prices but that was years ago.

Although the local used-goods store probably still sells both media for $1/ea. If that's the case, there's an easy frugality argument in the 90 minute movie being worth ~130k words against most novels topping out under 100k.
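The pseudo-proof's numbers can be reproduced with a short sketch. The inputs (24 fps, 90 minutes, a $15 ticket) come from the comment; the names are made up for illustration. Note that the $0.000116 figure and the ~130k total both fall out when each frame is counted as one unit; taking "1 picture = 1000 words" literally would make the per-word price 1000 times smaller, which the last lines show.

```python
# Inputs quoted in the comment, treated as assumptions.
FPS = 24
RUNTIME_MINUTES = 90
TICKET_USD = 15.0
WORDS_PER_PICTURE = 1000  # the proverb, applied literally

# Number of pictures in the film.
frames = FPS * 60 * RUNTIME_MINUTES

# Rental price of one frame, and how many frames a penny buys.
price_per_frame = TICKET_USD / frames
frames_per_penny = 0.01 / price_per_frame


def novel_price(words, unit_price=price_per_frame):
    """Price of a novel if each word costs the same as one film frame."""
    return words * unit_price


# Per-word price if every frame really counts as 1000 words.
price_per_word_literal = price_per_frame / WORDS_PER_PICTURE

print(f"frames in film:        {frames}")
print(f"USD per frame:         {price_per_frame:.6f}")
print(f"frames per penny:      {frames_per_penny:.1f}")
print(f"70k-word paperback:    ${novel_price(70_000):.2f}")
print(f"100k-word paperback:   ${novel_price(100_000):.2f}")
print(f"USD per word, literal: {price_per_word_literal:.9f}")
```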

CrzyLngPwd 10 hours ago | parent | prev | next [-]

30 pictures a second for reasonable video, haha

Just burn money.

latexsalesman2 10 hours ago | parent | prev [-]

[dead]

PaulHoule 11 hours ago | parent | prev | next [-]

Well I guess you could say there is some amount of text that entertains you as much as a 10s Sora video. Judged in terms of time a fast reader might read 50 words in 10s and that is what, 100 tokens? If somebody wants to fudge that up by a factor of 10 (picture is worth a thousand words or something) you get where they are.

Now personally I am not entertained by motion-for-the-sake-of-motion Instagram reels; they actually make me queasy despite having a cast iron stomach and having taught myself to not get sick in VR. So if that's 10s of entertainment, leave me out. I don't care if Tom Cruise is whaling on Brad Pitt or the other way around for that matter, but boy do I want to see the body thetans burst out of Cruise's body when OTIII goes horribly wrong.

My reaction to the article was funny. I mean, I saw that 160x thing and thought it was bogus, and of course it is all AI generated and poorly formatted to boot but I did like the overall message. It does remind me of the early 2010s when a lot of sites with photo-based content (including mine) were going out of business because the revenue wasn't enough to pay the hosting costs and a few newcomers like Instagram were survivors and Google was obviously cleaning up with video on YouTube. From the viewpoint of business models for AI video I think there are two questions:

(i) how many times can you get people to watch the same video? I mean, no matter how expensive it is, if you get enough views/ad impressions/other revenue you are OK

(ii) how does it compete with some other way to generate the video?

The picture that the $20 subscription costs $65 to serve doesn't sound too crazy to me. I mean, there might be somebody who can get 3x the value out of a 10s Sora video than somebody else, or they could get the cost down to roughly a third.

Aedelon 10 hours ago | parent | next [-]

[dead]

cindyllm 7 hours ago | parent | prev [-]

[dead]

Aedelon 11 hours ago | parent | prev [-]

[dead]