motbus3 10 hours ago

Let me type and think

(I put this through Gemini for English translation.)

The most expensive tier, 1080p, is 0.70 USD per second. Since Sora 2 runs at 30 FPS, each frame costs roughly 2.3 cents. While a single 1920x1080 static image is 765 tokens, video models use spacetime compression. Instead of a raw 22,950 tokens per second (765 tokens x 30 frames), a second of 1080p video equates to roughly 10,000 "latent tokens" due to temporal redundancy. Adding 20 tokens per second of audio, we get roughly 10,020 tokens per second of output.

At $0.70 per second for ~10,020 tokens, the cost is approximately $0.00007 per token for Sora 2. 10 seconds of Sora 2 video would cost $7.00 for roughly 100,200 tokens.

In comparison, GPT-5.4-pro at 15 USD per 1M output tokens costs $0.000015 per token. To generate 100,200 tokens of text, it would cost only about $1.50. This puts Sora 2 at roughly 4.7x more expensive than GPT-5.4-pro per token generated.

However, if we ignore video compression and treat every frame as a unique 1080p image (765 tokens each), the same $0.70 is spread over 22,950 tokens per second, or about $0.00003 per token. On that basis the gap narrows, and Sora 2 is only roughly 2x more expensive than GPT-5.4-pro per token.
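If anyone wants to check the arithmetic, here is a minimal Python sketch of it. Every constant is an assumption copied from the comment (price per second, frame rate, tokens per image, the ~10,000 latent tokens estimate, the text model price), not official pricing data, and `estimate` is just a hypothetical helper name.

```python
# Sanity check of the per-token cost estimate.
# Every constant is an assumption taken from the comment, not official pricing.

PRICE_PER_SECOND_USD = 0.70        # 1080p tier price quoted in the comment
FPS = 30                           # frame rate assumed in the comment
TOKENS_PER_IMAGE = 765             # tokens per 1080p still, per the comment
LATENT_TOKENS_PER_SECOND = 10_000  # comment's estimate after compression
AUDIO_TOKENS_PER_SECOND = 20       # comment's estimate for audio
TEXT_PRICE_PER_MILLION_USD = 15.0  # text output price quoted in the comment


def estimate() -> dict:
    """Return the derived figures as a dict (hypothetical helper)."""
    text_cost_per_token = TEXT_PRICE_PER_MILLION_USD / 1_000_000
    raw_tokens = TOKENS_PER_IMAGE * FPS
    compressed_tokens = LATENT_TOKENS_PER_SECOND + AUDIO_TOKENS_PER_SECOND
    return {
        "cost_per_frame_usd": PRICE_PER_SECOND_USD / FPS,
        "raw_tokens_per_second": raw_tokens,
        "compressed_tokens_per_second": compressed_tokens,
        "video_cost_per_token_usd": PRICE_PER_SECOND_USD / compressed_tokens,
        "text_cost_per_token_usd": text_cost_per_token,
        # Video vs. text cost per token, using the compressed token count.
        "ratio_compressed": (PRICE_PER_SECOND_USD / compressed_tokens)
        / text_cost_per_token,
        # Same ratio, treating every frame as a full 765-token image.
        "ratio_raw_frames": (PRICE_PER_SECOND_USD / raw_tokens)
        / text_cost_per_token,
    }


if __name__ == "__main__":
    for name, value in estimate().items():
        print(f"{name}: {value:.6g}")
```

With these inputs the compressed ratio comes out near 4.66 and the raw-frame ratio near 2.03. Changing any constant (for example a different latent token estimate) shifts both ratios, so treat the result as only as good as the assumptions.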