infotainment 19 hours ago

I’m glad to see a new platform that isn’t completely locked down, allowing analysis like this.

The trend toward everything being a walled garden is unfortunate.

afavour 17 hours ago | parent | next [-]

I’m conflicted. I agree with everything you say, but I’m concerned about Bluesky eventually being flooded with AI posts trained on its public dataset. Being open could very easily lead to its downfall.

tptacek 17 hours ago | parent [-]

I feel like LLMs have had the opportunity to be trained on a sufficient amount of social media posts at this point that it's unlikely to matter.

afavour 17 hours ago | parent [-]

I’m curious to see how things look, say, ten years from now. The way people use social networks, the language they use, even the memes they trade in change over time. I can absolutely imagine an out-of-date AI giving itself away by repeating today's equivalent of “rawr xD” to a future audience.

ilrwbwrkhv 18 hours ago | parent | prev [-]

[flagged]

nl 18 hours ago | parent | next [-]

This is wonderful! Openness means open to all.

And open data decreases barriers to entry.

If all data had to be paid for, only large companies could use it. With open data, it's very foreseeable that in 10 years' time a hobbyist will very likely be able to train a performant LLM at home from scratch.

jph00 18 hours ago | parent | prev | next [-]

Actually, since this isn’t locked up by the big copyright holders, we can all use it and profit.

mplewis 18 hours ago | parent [-]

How will you use it to profit? You don’t have sweetheart cloud deals on ML training clusters. This benefits big players, not us.

nl 18 hours ago | parent | next [-]

"us" is relative.

There are plenty of people on HN who have their own ML training clusters and aren't really big tech. For example, natfriedman has https://andromeda.ai/

And right now, today, I can fine-tune LLMs on this scale of data at home. In 5 or 10 years people will be able to train from scratch at home.

Computational resource barriers are temporary. Licensing is forever.

tomrod 18 hours ago | parent | prev [-]

Most impactful ML can be created on Colab. Not ChatGPT, but most of the stuff that isn't on the long tail.

unshavedyak 18 hours ago | parent | prev [-]

Not all the profit, really. All would imply there was no value to begin with. I get the dislike, but I still comment on the open web because it has value to me. I'm still willing to answer questions on SO/reddit/etc. because it has value to at least one person (and hopefully more). That hasn't changed.

Not sure what to say about companies making money off of my data, but the posting itself doesn't seem to be that much of a negative.

Thoughts? I see this sentiment a lot and it almost feels like "open" is bad these days. If anything, I feel it is almost more important than ever, as we're on the cusp of never needing to go to forums, interact, etc.