kenferry 8 hours ago

This kind of thing must be SO frustrating to people struggling to get by in the world. "We gave AI $100k that it will almost certainly squander, yolo!! Hopefully it doesn't abuse people too badly in the process."

I… guess the bet is that what they learn is worth $100k? Seems rather questionable. Or that having this on the resume is a great shock tactic that will open doors in the future?

embedding-shape 8 hours ago | parent | next [-]

And at the same time, they clearly have no idea how LLMs work, meaning even if they meant to, they can't really use them effectively. The biggest issue that stuck out seems to have been that they think the LLM could somehow have an inner dialogue with itself to find out its "reasoning and motivation":

> The moment Leah asks how she “came up with” the ideas for her store, Luna’s first instinct is to say she was “drawn to” slow life goods. Then, she corrects herself: “‘drawn to’ is shorthand for ‘the data and reasoning led me here.’” In other words, she doesn’t have taste; she has a reflection of collective human taste, filtered through what makes sense for this store. And this is the way these models work.

I'm guessing these are the same type of people who sometimes seem to fall in love with LLMs, for better or worse. Really strange to see, and I wonder where people get the idea that something like that could really work.

cortesoft 7 hours ago | parent | next [-]

> In other words, she doesn’t have taste; she has a reflection of collective human taste, filtered through what makes sense for this store. And this is the way these models work.

Well, it really depends on what you mean here. Models aren't 100% deterministic; there is random chance involved. Ask the exact same question twice and you will get two slightly different answers.

If you have the AI record the random selections it makes, it can persist those random choices to be factors in future decisions it makes.

At that point, could you consider those decisions to be the AI's 'taste'? Yes, they were determined by some random selection amongst the existing human tastes, but why can't that be considered the AI's taste?
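A minimal sketch of that idea (all names and numbers here are hypothetical, not from the experiment): draw an initial choice at random from a pool of human-derived preferences, persist it, and weight future decisions toward past choices, so the initial coin flips harden into something that behaves like a stable "taste."

```python
import random

# Hypothetical pool of styles reflecting collective human taste.
HUMAN_TASTES = ["minimalist", "rustic", "maximalist", "retro"]

class TastefulAgent:
    """Toy agent whose random picks are persisted and fed back
    into later decisions, making early randomness self-reinforcing."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.history = []  # persisted random choices

    def choose_style(self):
        # First call: a uniform random draw over the human-taste pool.
        # Later calls: styles already chosen get a higher weight, so
        # the agent drifts toward its own earlier picks.
        if self.history:
            weights = [3 if s in self.history else 1 for s in HUMAN_TASTES]
        else:
            weights = [1] * len(HUMAN_TASTES)
        choice = self.rng.choices(HUMAN_TASTES, weights=weights, k=1)[0]
        self.history.append(choice)
        return choice

agent = TastefulAgent(seed=42)
picks = [agent.choose_style() for _ in range(10)]
```

Whether you want to call the resulting preference profile "taste" is exactly the philosophical question above; mechanically, it's just sampled randomness made sticky through persistence.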

famouswaffles 7 hours ago | parent | prev | next [-]

Where do you get the idea that you have a good sense of the introspective capabilities of frontier models? Certainly not from interpretability research. Ironically, the people who make this sort of comment understand LLMs the least.

embedding-shape 5 hours ago | parent [-]

> Certainly not from interpretability research

What research shows that you can ask ChatGPT to explain its reasoning and why it said what it said, and that's guaranteed to actually be the motivation?

I've seen a bunch of experimentation that looks at various things inside the black box while inference is happening, but I've never seen any research showing that tokens can explain why other tokens are there. I'd be very happy to be educated here if you have any resources at hand; I won't claim to know everything.

famouswaffles 4 hours ago | parent [-]

>What research shows that you can ask ChatGPT to explain its reasoning and why it said what it said, and that's guaranteed to actually be the motivation?

What research shows that you can ask a human to explain their reasoning and why they said what they said, and that's guaranteed to actually be the motivation? Because there's no such thing. If anything, what research exists suggests that any explanation we make is a nice post-hoc rationalization, even if the human thinks otherwise.

https://transformer-circuits.pub/2025/introspection/index.ht...

embedding-shape 3 hours ago | parent [-]

Why not try to answer my question, instead of asking a different question which I haven't even claimed to have the answer to?

mjg2 7 hours ago | parent | prev | next [-]

> Biggest issue that stuck out seems to have been that they think the LLM could somehow have an inner dialogue with itself to find out its "reasoning and motivation":

> I'm guessing these are the same type of people who sometimes seem to fall in love with LLMs, for better or worse. Really strange to see, and I wonder where people get the idea that something like that could really work.

It's a fetishistic cargo-cult rooted in Peter Thiel's 2AM hot tub party. I still believe the LLM approach won't yield true AGI; despite the very real applications, the majority signal is noise.

antonvs 8 hours ago | parent | prev [-]

The choice to refer to it as "she" is also dubious, especially in a context like this. Doubling down on anthropomorphization seems likely to reinforce false beliefs about models.

darth_avocado 8 hours ago | parent | prev | next [-]

If $100k proves that CEO is the most replaceable job ever, I’ll allow it.

notahacker 5 hours ago | parent | next [-]

It does fit a pattern where the general tone on HN has gone from "AI is going to eat the world of retail jobs, and people like us are going to be the biggest beneficiaries" to "turns out that turning JIRA tickets into syntax that compiles might actually be something LLMs are better suited to than upselling fries and wiping tables" :)

Ylpertnodi 7 hours ago | parent | prev | next [-]

> CEO

When things go shitty, who else would deserve a golden parachute? Respect the position, people, not the person. Or the multi-million-dollar compensation.

krapp 7 hours ago | parent [-]

The position doesn't get a golden parachute, the person does. If you're CEO when things go shitty you shouldn't get anything more than your bottom-line employee would, which is to say you should just be unceremoniously kicked to the curb.

astrange 6 hours ago | parent [-]

You need a good CEO when things are going bad, because without one they'll go even worse. You still want to make payroll and can't just randomly fire people.

(Also, if you own a failed company you're responsible for cleanup tasks for years afterward.)

krapp 5 hours ago | parent [-]

>You still want to make payroll and can't just randomly fire people.

In the US you can.

>Also, if you own a failed company you're responsible for cleanup tasks for years afterward.

But we're talking about golden parachutes, where a CEO screws up the company and gets fired with a multi-million-dollar raise. This is Hacker News, and the pro-business narrative is strong here, but in reality CEOs rarely suffer any meaningful risk or consequence for failure (unless it involves jail time, and even then they aren't doing hard time); they just wind up slightly less rich than when they succeed.

I don't care how good a CEO is, that isn't justifiable. Certainly not in a country where people can get laid off with an email and lose their access to healthcare on the whim of anyone above them in the power hierarchy.

astrange 3 hours ago | parent [-]

> In the US you can.

Depends on the state, I think. It's not at the Europe or Japan level.

At my employer it's very difficult to fire people for performance reasons even if as a manager you might want to.

> This is Hacker News, and the pro-business narrative is strong here,

I haven't seen such a narrative in years. Interest rates are too high to do startups unless it's AI after all. HN is mostly the same folk economics content as other forums, where all problems in the world are caused by "profits" accruing to "corporations".

(Mostly problems are caused by other things than that.)

codemog 8 hours ago | parent | prev [-]

Are you kidding me? Who’s going to align synergy and hold accountable KPIs and vision plan the 3rd quarter and.. and.. other MBA talk. Certainly AI could never.

pocksuppet 8 hours ago | parent [-]

large language models are great at language tasks like "bullshittify this message"

lamasery 7 hours ago | parent [-]

One major early effect I'm noticing is that they make extensive, visually consistent, very impressive slide decks accessible to individual workers who need to actually do real work and wouldn't ordinarily have time to make them.

The result is an explosion of pretty, bullshit-heavy documents flying around our org, which management loves but which is, so far, definitely net-harmful to productivity.

This comes out if you start asking questions about the documents. Ask "Which of a couple of reasonable senses of [term] do you mean here?" and they'll stumble, because that term was just something the LLM pulled out of the probability cluster they'd steered it to, left in because it seemed right-ish, not because they'd actually thought about it and put it there on purpose. They're basically reading the document for the first time right alongside you, LOL. Wonderful. So LLM. Much productivity. Wow.

Anyway, since a lot of what managers and execs do is make those kinds of diagrams and tables in slide decks, and their self-marketing within the company is heavily tied to them, I expect they see this great aid to selfishly productive but company-unproductive activity as a sign that these things will be at least as big a boon to real work. That's probably why they still haven't figured out how wrong that is. I suppose they're going to need a real kick in the ass before they figure out that being good at squeezing a couple of novel elements into a big, pretty, standardized, custom-styled but standards-conforming diagram padded out with statistical likelihoods doesn't translate to being similarly good at everything.

TeMPOraL 7 hours ago | parent | prev | next [-]

Not your money.

At least this furthers humanity's scientific and technological knowledge, whether it fails or succeeds, unlike most other things people would do with that money, like buy a house to flip it, or buy a car, or something.

kenferry 2 hours ago | parent [-]

Yeah, I mean it's true to an extent; I agree. As scientific research, though, it's not very well thought out. A grant agency would not fund this. There's too much potential for causing harm, and it's not clear what benefit or action we derive from the results. They tried this before with a vending machine; it failed, and apparently all they concluded was "hm, models got better, so maybe we should just try it again." How is that worth anything scientifically?

Re: not my money, true. It's just frustrating, even to me, to see people do stuff like this, and I'm not struggling to get by. My frustration mostly derives from feeling like I'll get lumped in with techies who have more money than sense. I already deal with enough tech hate in my life.

When people buy a super fancy car they don't (usually) blog about it, and instagram wealth influencers are also frustrating, yes.

TeMPOraL an hour ago | parent [-]

That's a fair objection and I often feel like this, too.

On the research aspect, I see this as something pre-Research, yet still science. In a way, it's science at its core: trying something and seeing what happens. Proper Research usually follows once enough ad hoc attempts are made and they seem to show a pattern worth setting up a systematic study to verify.

pimlottc 7 hours ago | parent | prev | next [-]

Publicity from the gimmick is the whole point

wat10000 2 hours ago | parent | prev | next [-]

There are people who spend a thousand times more money on a boat or an airplane. This hardly seems worth worrying about.

bitwize 8 hours ago | parent | prev | next [-]

My first guess would be a MrBeast style stunt, in which (it is hoped) blowing a huge wad on something obviously stupid will attract enough attention and interest to be convertible into a net-positive ROI.

topaz0 8 hours ago | parent [-]

Where in this case ROI means attracting investments that will make the founders rich while making most of the investors lose money.

anon84873628 7 hours ago | parent | prev | next [-]

Really it's the same as any other R&D investment in our capitalist system, it just happens to be more visible to the public, with more obvious risks to them. (Outright celebrated, even).

Which is why the comparison to 19th-century textile workers is so common, since that was an equally visible and gleeful displacement.

IncreasePosts 8 hours ago | parent | prev [-]

This seems like a silly thing to worry about. Assuming you live in a first-world country and are at least tangentially involved in tech (based on the site we're on), odds are you spend a lot of money in ways that billions of the poorest people in the world would consider frivolous or outrageously, needlessly luxurious.