weego 7 days ago

There's no value in Amazon burning money to 'compete' when there's no clear endgame. Right now the competition seems to be who can burn a hundred billion dollars the fastest.

Once a use case and platform have stabilized, they'll provide it via AWS, at which point the SME market will eat it up.

bbarnett 7 days ago | parent [-]

Not only that, but all the compute spent, and hardware bought, will be worthless in 5 years.

Just the training. Training off of the internet! Filled with extremists, made-up nuttery, biased BS, dogma. A large portion of the internet is stupids talking to stupids.

Just look at all the gibberish scientific papers!

If you want a hallucination prone dataset, just train on the Internet.

Over the next few years, we'll see training on encyclopedias and other data sources from pre-Internet. And we'll see it done on increasingly cheaper hardware.

This tiny branch of computer science is decades old, and hasn't even taken off yet. There's plenty of opportunity for new players.

wiredpancake 7 days ago | parent [-]

How exactly do you foresee "pre-internet" data sources being the future of AI?

We already train on these encyclopedias; we've trained models on a massive percentage of all published book content.

None of this will be helpful either. It will be outdated and won't have modern findings or understandings. Nor will it help me diagnose a DHCP issue on Windows Server 2019, or similar.

bbarnett 7 days ago | parent [-]

We're certainly not going to get accurate data via the internet, that's for sure.

Just take a look at Python. How often does the AI know it's Python 2.7 vs 3? You may think all the shebang lines say /usr/bin/python3, but they don't. And code snippets don't have one at all.
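To illustrate the ambiguity (this snippet is hypothetical, not from any real training set): the same line of source code means different things depending on the interpreter version, and an unlabeled web snippet carries no hint of which was intended.

```python
def average(total, count):
    # In Python 2, dividing two ints floors the result (7 / 2 == 3);
    # in Python 3, / is true division (7 / 2 == 3.5).
    # A model trained on unlabeled snippets can't tell which behavior
    # the original author relied on.
    return total / count

print(average(7, 2))  # 3.5 under Python 3, 3 under Python 2
```

A human reader hits the same problem, which is the commenter's point: without version metadata attached to the snippet, the "correct" output is genuinely undefined.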

How many coders have read something, then realised it wasn't applicable to their version of the language? My point is, we need to train with certainty, not with random gibberish off the net. We need curated data, to a degree, and even SO isn't curated enough.

And of course, that's even with good data, just not categorized enough.

So one way is to create realms of trust: some data trusted more deeply, other data less so. And we need more categorization of data, and yes, that reduces model complexity and therefore some capabilities.

But we keep aiming for that complexity, without caring about where the data comes from.

And this is where I think smaller companies will come in. The big boys are focusing on brute force. We need subtlety.

ipaddr 7 days ago | parent [-]

New languages will emerge, or at least versions of existing languages will come with codenames. What about Thunder Python or Uber Python for the next release?