timeattack 12 hours ago
My problem with LLMs (apart from the philosophical aspects and economic impact) is that it would be unlikely for any of us to train something functional locally (toy-like LLMs, sure, but something really useful, no). Apart from requiring immense computing power, it also requires a dataset which is, for the most part, obtained illegally.
kibwen 11 hours ago
This seems overly pessimistic. I may personally be of modest intelligence, but to acquire the intelligence that I do have, I did not need to train on every book ever written, every Wikipedia article ever written, every blog post ever written, every reference manual ever written, every line of code ever written, and so on. In fact, I didn't train on even 1% of those materials, or even 0.00000000001% of those. The texts themselves were demonstrably not a prerequisite for intelligence.

At minimum, given that it only took me about 20 years of casual observation of my surroundings to approximate intelligence, this is proof positive that the only "dataset" you need is a bunch of sensors and the world around you.

And yes, of course, the human brain does not start from zero; it had a few million years of evolution to produce a fertile plot for intelligence to take root. But that fundamental architecture is fairly generic, and does not at all seem predicated on any sort of specific training set. You could feasibly evolve it artificially.
| ||||||||||||||||||||||||||||||||
dlcarrier 11 hours ago
Not the whole thing, at least with current technology, but LoRAs are really good at fine-tuning and can be generated in a few hours on a high-end gaming computer. So, as long as the base model is in your language, you likely have enough spare computing power, in whatever electronics you own, to train a few LoRAs a month. In the future, when regular home computers have the capabilities of modern servers, we'll be able to train entire LLMs at home.
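A rough back-of-the-envelope sketch of why LoRA fits on consumer hardware: instead of updating a full weight matrix, you train two small low-rank factors. The matrix size and rank below are illustrative assumptions, not figures from any particular model.

```python
# LoRA sketch: a d_out x d_in weight matrix W is frozen, and only a
# low-rank update B @ A is trained, where B is d_out x r and A is
# r x d_in. Trainable parameters per layer drop from d_in * d_out
# to r * (d_in + d_out). Dimensions below are made up for illustration.

def full_params(d_in: int, d_out: int) -> int:
    """Parameters updated by full fine-tuning of one weight matrix."""
    return d_in * d_out

def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters updated by a rank-r LoRA adapter on the same matrix."""
    return rank * (d_in + d_out)

# Example: a hypothetical 4096 x 4096 projection with rank-8 adapters.
full = full_params(4096, 4096)                      # 16,777,216
lora = lora_trainable_params(4096, 4096, rank=8)    # 65,536
print(f"full: {full:,}  lora: {lora:,}  reduction: {full // lora}x")
```

At these assumed dimensions the adapter trains roughly 256x fewer parameters than full fine-tuning, which is why a single gaming GPU can handle it while full training cannot.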
pronik 10 hours ago
There is so much technology that we are unable to reproduce locally; I don't think LLMs are any different. There will be large LLM manufacturers, small LLM manufacturers, artisanal LLM makers, LLM enthusiasts, and of course LLM consumers, just like with everything else.
krupan 10 hours ago
And this is important because even though you are running a model locally, it's still a proprietary model. You have no say in what it was trained on, how that training data was labeled, what the guardrails are, what biases it might have: none of that.
woah 9 hours ago
Can you make your own CPU, locally?
Ucalegon 11 hours ago
Depends on the domain. There are plenty of use cases where the data needed for training is available for personal or non-commercial use. At that point it comes down to the compute/time to do the training, and if you are willing to wait, consumer-grade hardware is perfectly capable of producing useful models.
RataNova 10 hours ago
That's a fair concern, but I'd separate training from inference here.
cyanydeez 11 hours ago
That sounds like government. So your problem is mostly that you expect a collective social effort, but not enough to pay for it as a public good.