simonw a day ago
I got Claude 4 Opus to summarize this thread on Hacker News when it had hit 319 comments: https://gist.github.com/simonw/0b9744ae33694a2e03b2169722b06...

Token cost: 22,275 input, 1,309 output = 43.23 cents - https://www.llm-prices.com/#it=22275&ot=1309&ic=15&oc=75&sb=...

Same prompt run against Sonnet 4: https://gist.github.com/simonw/1113278190aaf8baa2088356824bf...

22,275 input, 1,567 output = 9.033 cents https://www.llm-prices.com/#it=22275&ot=1567&ic=3&oc=15&sb=o...
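For anyone who wants to check the arithmetic, here is a minimal sketch. It assumes the `ic`/`oc` values in the llm-prices.com links (15/75 for Opus, 3/15 for Sonnet) are USD per million input/output tokens; the function name `cost_cents` is made up for illustration.

```python
def cost_cents(input_tokens, output_tokens, input_usd_per_mtok, output_usd_per_mtok):
    """Return the cost in US cents, given prices in USD per million tokens (assumed unit)."""
    usd = (input_tokens * input_usd_per_mtok
           + output_tokens * output_usd_per_mtok) / 1_000_000
    return usd * 100

# Token counts and prices are taken from the comment above.
opus = cost_cents(22_275, 1_309, 15, 75)
sonnet = cost_cents(22_275, 1_567, 3, 15)

print(f"Opus 4:   {opus:.2f} cents")             # 43.23 cents
print(f"Sonnet 4: {sonnet:.3f} cents")           # 9.033 cents
print(f"Sonnet / Opus: {sonnet / opus:.1%}")     # 20.9%
```

Input tokens dominate both bills here (about 77% of the Opus cost and 74% of the Sonnet cost), which is what you would expect for a summarization prompt with a long input and a short output.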
mrandish a day ago
Interesting, thanks for doing this. Both summaries are serviceable and quite similar, but I had a slight preference for Sonnet 4's summary, which, at just ~20% of the cost of Claude 4 Opus, makes Sonnet 4 quite the value leader.

This just highlights that, with compute requirements for meaningful traction against hard problems spiraling skyward for each additional increment, the top models on current hard problems will continue to cost significantly more. I wonder if we'll see something like an automatic "right-sizing" feature that uses a less expensive model for easier problems. Or maybe knowing whether a problem is hard or easy (with sufficient accuracy) is itself hard.
swyx a day ago
analysis as the resident summaries guy:

- sonnet has better summary formatting "(72.5% for Opus)" vs "Claude Opus 4 achieves "72.5%" on SWE-bench". especially Uncommon Perspectives section
- sonnet is a lot more cynical
- opus at least included a good performance and capabilities and pricing recap, sonnet reported rapid release fatigue
- overall opus produced marginally better summaries but probably not worth the price diff

i'll run this thru the ainews summary harness later if that's interesting to folks for comparison