oliver236 3 hours ago

isn't this insane? why aren't people freaking out? the jump in capability is outrageous. anyone?

RivieraKid an hour ago | parent | next [-]

I've been increasingly "freaking out" since about 3 - 4 years ago and it seems that the pessimistic scenario is materializing. It looks like it will be over for software engineers in a not so distant future. In January 2025 I said that I expect software engineers to be replaced in 2 years (pessimistic) to 5 years (optimistic). Right now I'm guessing 1 to 3 years.

kypro an hour ago | parent [-]

I assure you it will soon become very clear that mass job losses are one of the least concerning side effects of developing the magic "everything that can plausibly be done within the constraints of physics is now possible" machine.

We're opening a can of worms which I don't think most people have the imagination to understand the horrors of.

MattRix 43 minutes ago | parent [-]

yeesh yep, though it's more Pandora's Box than a can of worms, since it can't exactly be closed once it's opened

Eufrat 2 hours ago | parent | prev | next [-]

Anthropic needs to show that its models continually get better. If the model showed minimal to no improvement, it would cause significant damage to their valuation. We have no way of validating any of this; there are no independent researchers who can back any of the assertions made by Anthropic.

I don’t doubt they have found interesting security holes, the question is how they actually found them.

This System Card is just a sales whitepaper and just confirms what that “leak” from a week or so ago implied.

nsingh2 3 hours ago | parent | prev | next [-]

It's going to be expensive to serve (also not generally available), considering they said it's the largest model they've ever trained.

I suspect it's going to be used to train/distill lighter models. The exciting part for me is the improvement in those lighter models.

azan_ an hour ago | parent | next [-]

What's interesting is that scaling appears to continue to pay off. Gwern was right - as always.

AstroBen 2 hours ago | parent | prev [-]

It seems inevitable that costs will come down over time. Expensive models today will be cheap models in a few years.

mofeien 3 hours ago | parent | prev | next [-]

I am freaking out. The world is going to get very messy extremely quickly in one or two further jumps in capability like this.

RivieraKid an hour ago | parent [-]

Messy in a way that would affect you?

anuramat 3 hours ago | parent | prev | next [-]

"some model I don't get to use is much better at benchmarks"

pick one or more: comically huge model, test time scaling at 10e12W, benchmark overfit

estearum 3 hours ago | parent [-]

So... you're not excited because it might take a few months before we can use it or something? I don't get your comment.

RivieraKid an hour ago | parent | next [-]

Whether you're excited depends on what you do for a living and how close you are to financial independence.

estearum an hour ago | parent [-]

I agree there are other valid reasons not to be excited about this, I just can't make sense of the ones provided above.

randomgermanguy 2 hours ago | parent | prev [-]

I think the general question is if they'll release it at all, haven't yet read anything stating that they would

estearum 2 hours ago | parent [-]

Well let me introduce people to a few brand new concepts:

https://en.wikipedia.org/wiki/Capitalism

https://en.wikipedia.org/wiki/Race_to_the_bottom

https://en.wikipedia.org/wiki/Arms_race

Of course they'll release it once they can de-risk it sufficiently and/or a competitor gets close enough on their tail, whichever comes first.

yrds96 2 hours ago | parent | prev | next [-]

I think there's no SOTA advance on this one worthy of "freaking out".

Looks like they just built a way larger model, with the same quirks as Claude 4. Seems like a super expensive "Claude 4.7" model.

I have no doubt that Google and OpenAI have already done that for internal (or even government) usage.

nozzlegear 2 hours ago | parent | prev | next [-]

Freak out about what? I read the announcement and thought "that's a dumb name, they sure are full of themselves" – then I went back to using Claude as a glorified commit message writer. For all its supposed leaps, AI hasn't affected my life much in the real world except to make HN stories more predictable.

oliver236 2 hours ago | parent [-]

LOL!

RobertDeNiro 2 hours ago | parent | prev | next [-]

Well for one, it’s a PDF

dysoco 3 hours ago | parent | prev | next [-]

Wait until you see real usage. Benchmark numbers do not necessarily translate to real world performance (at least not by the same amount).

risyachka an hour ago | parent | prev [-]

the time to freak out was 2 years ago.