rvnx 3 days ago

Sadly, to go beyond an exercise, money is what you really need if you want to build LLMs now, not time.

Nowadays, training very powerful LLMs is easy because all the tooling, source code, training datasets, and teaching agents are available.

Getting access to tens of millions of USD or more is not easy, and for the big players this is just a drop in the ocean.

contrast 3 days ago | parent | next [-]

You seem to be talking about a production-grade model rather than building an LLM as an exercise? Or if not, why do you disagree with the article's example of building a small LLM for $100?

rvnx 3 days ago | parent [-]

I think I should have replied as a totally separate comment; that was my mistake.

It is nice that the author shared the results of his exercise/experiment. I just got sad when the $100 figure was mentioned, reminded that this whole game is 90%+ about money and hardware rather than skills.

That being said, I really like the author's initiative.

jbs789 3 days ago | parent | next [-]

I understand the emotional aspect of feeling like it’s out of reach for you.

Thing is, if you focus on your own skill development and apply it at even a small scale, very few people do that. Then you go for a job and guess what, the company has resources you can leverage. Then you do that, and ultimately you could be in a position to have the credibility to raise your own capital.

Play the long game and do what you can do now.

rvnx 2 days ago | parent [-]

Not at all. The current AI craze is mostly not about credibility or skills. It's like a kitchen.

Take a genius chef but give him rotten ingredients. He sweats, he tries, but the meal is barely edible. That's the $100 exercise, but only experts recognize the talent behind it.

Take an unskilled cook but give him A5 Wagyu and prepared truffles. The result tastes amazing to the average person, who will claim the chef is great (the investors).

It's about access to capital and selling a story (being an "ex-Googler" doesn't make you competent), not skills.

Great chefs in dark alleys go unnoticed.

Mediocre tourist traps near the Eiffel Tower are fully booked.

Look at Inflection AI. Average results, yet massive funding. They have the "location" and the backing, so they win. It's not about who cooks better; it's about who owns the kitchen and who sells a dream that tomorrow the food will be better.

We're not talking about small funding here; we're talking about 1.3 billion USD for that one example, yet it's a tourist trap (trading on name-dropping and reputation instead of talent).

Snake oil is rewarded as much as, or even more than, real talent. A lot of people cannot see the difference between the chef and the ingredients, and that is what I find sad.

meehai 3 days ago | parent | prev | next [-]

It's skills first, and then money and hardware for scale.

A more skilled person who understands all the underlying steps will always be more efficient at scaling up, because they know where to allocate resources.

Basically... you always need the skills; the money is the fine-tuning.

DeathArrow 3 days ago | parent | prev [-]

That is true for many kinds of software where you need a large amount of resources. No matter how skilled I am, I cannot build Facebook, Google, or Photoshop alone. But a tiny version of one just to learn? Why not!

victorbjorklund 3 days ago | parent [-]

You could 100% build Facebook. You don’t need any hardcore hardware before you have many users.

victorbjorklund 3 days ago | parent | prev | next [-]

Totally. While the LLMs today are amazing, it is a bit sad that you can't build SOTA models on your own (versus a few years ago, when someone with the skills and access to a dataset could build a state-of-the-art model).

Chabsff 3 days ago | parent [-]

In the grand scheme of things, we've only had about a quarter century in which prosumer hardware was adequate for all but a *very* specific kind of problem, across computer science as a whole.

It's kind of amazing we got that at all for a while.

djmips 2 days ago | parent [-]

If you discard the early days of gigantic, expensive computers, that is. I guess it's come full circle, after a fashion.
