giancarlostoro 3 hours ago

Everyone seems to be leaving GitHub, and forgetting the entire spirit of what git is in my eyes. Git was always meant to be decentralized, the problem here is that all the tooling around git was centralized to GitHub because it was a cleaner experience, they scaled nicely, and were properly maintained. I would prefer to still see mirrors on GitHub that are auto-synched because I've seen projects for years either self-host or go somewhere niche, then the GitHub mirror dies or is removed, and said projects go poof to the sands of time for one reason or another, completely gone. Everyone seems to be picking some random git host alternative, and some of them are really simple to use.

Git is decentralized, GitHub is just another place you can host your code in, but you can push your code to multiple remote servers.
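As a rough sketch of the "push to multiple remote servers" point: git lets a single remote carry several push URLs via `git remote set-url --add --push`, so one `git push` updates every mirror. The helper below only builds the command lines and does not run git; the remote name and both URLs are made-up placeholders.

```python
from typing import List


def mirror_push_commands(remote: str, urls: List[str]) -> List[List[str]]:
    """Build the git commands that make one remote push to every URL in `urls`."""
    if not urls:
        raise ValueError("need at least one push URL")
    # Once any push URL is set, git pushes only to the push URLs, so the
    # primary host has to be listed here as well.
    commands = [
        ["git", "remote", "set-url", "--add", "--push", remote, url]
        for url in urls
    ]
    # A single push now goes to every registered push URL.
    commands.append(["git", "push", remote])
    return commands


if __name__ == "__main__":
    # Placeholder URLs, not real repositories.
    example_urls = [
        "git@git.example.org:me/project.git",
        "git@github.com:me/project.git",
    ]
    for command in mirror_push_commands("origin", example_urls):
        print(" ".join(command))
```

Fetches still come from the remote's original URL, so the self-hosted copy can stay the source of truth while the others act as mirrors.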

bayindirh 3 hours ago | parent | next [-]

While I'm not forgetting the spirit of what Git is, I'm also remembering how GitHub used "all open repositories" to train their first Copilot without telling anyone.

So, no thanks. I'll not be committing any personal code there anymore.

And no, I don't care for the social aspects either. Discoverability, stars, and AI bot powered issue bombardment.

I'm fine like this.

Also, remember, "Open Source is not about You".

thisislife2 an hour ago | parent | next [-]

I completely share your sentiment about feeling irked about open source code being used to train commercial AI models. However, I think the battle is already lost - the nature of copyright and open source code philosophy (currently) means that there isn't any real way of preventing your code being used to train AI. Look at the legal precedents being set in courts where many of the BigTechs have actually pirated copyrighted media to train their AI, and the court has said "that's acceptable". (Of course, the actual act of piracy - like Facebook did by downloading copyrighted material through torrents - may not be legal, but the courts may be lenient here too as there seems to be an undercurrent of government approval to do anything to win the "AI Race").

And, even if you move your repository somewhere else, can you really prevent anyone from uploading it to GitHub? To do so, you may have to create your own open source license.

lelanthran 34 minutes ago | parent [-]

> However, I think the battle is already lost - the nature of copyright and open source code philosophy (currently) means that there isn't any real way of preventing your code being used to train AI.

Laws should make it a double-edged sword, make distillation explicitly legal.

Not much else they can do.

chrischen an hour ago | parent | prev | next [-]

What exactly did they train? Copilot is powered by Claude, Gemini, or ChatGPT these days.

Did they train autocomplete? I mean the code is open source so anyone can scrape it and train it too. I'm kind of glad they did train it because otherwise we'd still be stuck with Apple level AI models right now.

The whole reason we have so many models, including open weight models, that are all competitive with each other is because the data is free and anyone can be training off it. If the goal was to monetize the source code I guess the authors shouldn't make it open source.

chris_money202 12 minutes ago | parent | next [-]

Yeah, have to agree here. GitHub Copilot itself doesn't have any first-party models; it uses the frontier models. So they didn't "train" using public repos, but they probably allowed (or didn't prevent) the frontier labs from pulling the repos along with the rest of the internet when creating their models.

skinfaxi an hour ago | parent | prev [-]

> "GitHub Copilot is powered by generative AI models developed by GitHub, OpenAI, and Microsoft. It has been trained on natural language text and source code from publicly available sources, including code in public repositories on GitHub."

https://azure.microsoft.com/en-us/products/github/copilot#fa...

PaulKeeble 2 hours ago | parent | prev | next [-]

It did so in direct violation of the licenses of the code held there as well and then sold code snippets they had no rights to and still do.

mannanj 11 minutes ago | parent | prev | next [-]

The training on "all open repositories" is the only training we heard about. I wouldn't be surprised if it wasn't beneath these greedy companies to train on other data, and respond "oops! we didn't think that would happen" (by which they mean get found out).

Leaving is still the right move. But this applies to all centralized large services: Our use of Google and Google Drive, any Microsoft products, Adobe products, etc.

saurik 2 hours ago | parent | prev | next [-]

I mean, I never put my code on GitHub, but other people put it there, as they use GitHub: you can't not use GitHub. (Hell: even closed source projects, even ones that were never distributed even as a binary, if the code leaks, end up mirrored on GitHub.)

dandellion 2 hours ago | parent | prev | next [-]

Don't forget the achievement badges.

locknitpicker an hour ago | parent | prev [-]

> (...) I'm also remembering how GitHub used "all open repositories" to train their first Copilot without telling anyone.

This is a silly opinion to hold, isn't it? I mean, you release projects under a license with the express purpose of freely distributing your code among anyone in the world that may have any interest whatsoever, and even allow they themselves to share it with anyone they feel fit. But you are somehow outraged if people actually use said code?

Please make it make sense.

dylan604 43 minutes ago | parent | next [-]

Because there's no way the code is distributed properly according to any of the OSS licenses. In fact, it claims authorship with nonsense bylines saying the LLM wrote it.

locknitpicker 39 minutes ago | parent [-]

> Because there's no way the code is distributed properly according to any of the OSS licenses.

What are you talking about? There is no distribution, only read access.

lelanthran 27 minutes ago | parent | prev [-]

> This is a silly opinion to hold, isn't it? I mean, you release projects under a license with the express purpose of freely distributing your code among anyone in the world that may have any interest whatsoever, and even allow they themselves to share it with anyone they feel fit. But you are somehow outraged if people actually use said code?

You're making things up: the outrage is not that people used it, it's that the licence requires attribution at least, and opening the derivative product at worst. Token providers that trained on open source did neither.

> Please make it make sense.

I am skeptical that you didn't know the reason for the outrage because it's been repeated in every single thread where this was discussed.

I myself repeated it multiple times each time this feigned confusion you display appears.

Like I am doing now, yet again.

marxism 10 minutes ago | parent | prev | next [-]

Something nobody's really calling out: Forgejo is genuinely hackable. I just added a "showcase" mode to my instance: private repos can show their README and root file listing publicly (so I can advertise that a project exists and what it does), but viewing actual code, cloning, issues, PRs are all locked behind group membership.

About an hour of work, small and frankly trivial diff: https://peoplesgrocers.com/code/forks/forgejo/pulls/1

I didn't have to fight the architecture at all; the seams were right where I needed them. I added a migration that adds a boolean column to the repo config table, made a few tweaks in the permission middleware, and voilà, it just worked. Really excellent decoupling in the Forgejo codebase [1]

You can't do anything like this with GitHub. That's the actual freedom! Separate from the where-do-I-host-my-git question. There is a big difference between software that "sure technically I can change it since I have access to the source" vs software that's been constructed specifically to be customized and changed.

[1] Permission checks live in obvious places, the template system let me modify UI without touching unrelated code. Someone (many someones) clearly cared a lot about keeping this codebase modifiable by outsiders, and it shows. That's hard to do and should be more celebrated.
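To make the "showcase" idea concrete, here is a minimal sketch of that kind of permission check. It is not Forgejo's code (Forgejo is written in Go, and the real diff is in the linked pull request); the `Repo` fields, the `Action` names, and `can_access` are all hypothetical and only illustrate how one boolean flag can open the README and root listing while everything else stays behind membership.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional, Set


class Action(Enum):
    VIEW_README = "view_readme"
    LIST_ROOT = "list_root"
    VIEW_CODE = "view_code"
    CLONE = "clone"
    VIEW_ISSUES = "view_issues"
    VIEW_PULLS = "view_pulls"


# The only things a non-member may see on a showcased private repo.
SHOWCASE_ACTIONS = {Action.VIEW_README, Action.LIST_ROOT}


@dataclass
class Repo:
    name: str
    is_private: bool = True
    showcase: bool = False  # hypothetical stand-in for the new boolean column
    members: Set[str] = field(default_factory=set)


def can_access(repo: Repo, user: Optional[str], action: Action) -> bool:
    """Decide whether `user` (None means anonymous) may perform `action`."""
    if not repo.is_private:
        return True
    if user is not None and user in repo.members:
        return True
    # Private repo, non-member: only the showcase subset, and only if enabled.
    return repo.showcase and action in SHOWCASE_ACTIONS
```

With `showcase=True`, an anonymous visitor passes `VIEW_README` and `LIST_ROOT` but fails `CLONE`; with the flag off, the repo behaves like any other private one.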

gchamonlive 3 hours ago | parent | prev | next [-]

Yes, but GitHub is more than just git. The most important aspect of the platform that everybody seems to forget is the social component and how easy it made it to create a persistent, off-site repository and collaborate across repos.

MrFurious 3 hours ago | parent [-]

The "social component" is a big problem in actual FOSS.

rapnie 3 hours ago | parent | next [-]

People forget what FOSS is, and you get a world of unclear expectations. FOSS is code + a copyright license. How the code is created is an entirely different matter, and that is where FOSS projects often fall short. As FOSS projects go, Forgejo is well organized around a community governance model.

brunoborges 2 hours ago | parent [-]

Indeed, the fact that maintainers only recently got the ability to disable the Pull Requests tab in a GitHub repo is what drove a lot of issues in FOSS collaboration over the past decade.

FOSS and open source licenses never granted contributors any entitlement to have their proposals reviewed or merged by maintainers. Nor did they ever entitle users to ask for free support.

FOSS is about giving people access to source code so they can do with it whatever they want, and maintainers/authors should have always had the ability to "publish and forget" the source code, without having to deal with those "entitlements".

marcosdumay 32 minutes ago | parent | prev | next [-]

Yes, what's one more reason to abandon the largest platform.

locknitpicker 41 minutes ago | parent | prev | next [-]

> The "social component" is a big problem in actual FOSS.

You're confusing things. The "social component" refers to people interacting with each other. Such as two developers working on a bug or a feature. Or a tester reporting a bug.

This is a big part of actual professional software development work.

bbor an hour ago | parent | prev [-]

IDK, it's hard to criticize the community too much given how wildly, absurdly successful it is. If I arrived on Earth yesterday and you tried to tell me how much software is Free/free in an otherwise-capitalist economy, I wouldn't believe you!

I really really am not trying to start a political argument, but just as food for thought: this is exactly why I have faith in socialism (read: 'prosocial institutions and norms'). And whether socialism is eu- or dys-topian, it certainly cannot happen in the first place without a "social component"!

limagnolia 3 hours ago | parent | prev | next [-]

Forgejo is doing a lot of work to make the tooling decentralized, too. They are using open protocols and standards to link self hosted forges together.

hperrin 3 hours ago | parent [-]

I can’t wait for federation in Forgejo. With that, there’s honestly no reason not to host your own forge.

donmcronald 10 minutes ago | parent | next [-]

I would love to see it happen, but an internal service vs something exposed to the internet can be challenging.

I think services like Cloudflare could play a role if they were able to provide some kind of forward auth and preferential treatment of core users during overload. My self hosted systems would have to be the source of truth and Cloudflare would have to be replaceable for me to consider using it.

Think along the lines of automated pre-auth that coordinates with the origin based on some standard.

Ritewut 2 hours ago | parent | prev [-]

The reason will be that not everyone wants to deal with maintaining a self-hosted box.

trueno 2 hours ago | parent | next [-]

my eyes have been glazing over it feels like our infra/devops dudes have proverbially given up and they're just looking to buy cloud services to do everything now. security guy looks like he wants to jump off a bridge and i keep trying to nudge them into waking up to not needing 99.9% uptime we'll settle with 95% uptime and no one needs to be on call, and you can go to sleep at night knowing all the code lives behind your damn fort knox firewall company intranet and 75 layers of authentication.

it's interesting because the more paid services these guys bring on board the more complex the security shit gets for them. the head of our IT is a fucking lunatic though and he is steering shit towards utter disaster, he's obsessed with being the guy who picks the next cloud service that "makes things so much better".

my small team is actually considering just getting some mac minis and making a cluster of servers. we decided we don't need infinite uptime for hosting m-f office tools and we can just ... not interface with our infra/devops guys who have lost their damn minds and say no to everything now. they're supposed to be the compute tower under the tragedy known as TBM and they haven't approved a single VM in like 2 years.
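For scale, the gap between the 99.9% and 95% uptime figures above is simple arithmetic. A small sketch, assuming a 365-day year and counting all hours equally (the function name is made up for illustration):

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours_per_year(uptime_percent: float) -> float:
    """Hours per year a service may be down while still meeting the target."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime must be a percentage between 0 and 100")
    return HOURS_PER_YEAR * (100.0 - uptime_percent) / 100.0


if __name__ == "__main__":
    for target in (99.9, 95.0):
        print(f"{target}% uptime allows {downtime_hours_per_year(target):.2f} h/year down")
```

That works out to roughly 8.76 hours a year at 99.9% versus 438 hours (a little over 18 days) at 95%, which is the difference between needing someone on call and not.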

lelanthran 23 minutes ago | parent [-]

What would you use a cluster of mac minis for?

I mean, if you're going that far, a couple of refurbished servers gives you far more compute and far more capacity and much better maintainability.

hirako2000 2 hours ago | parent | prev [-]

it's just a few clicks, starting at 2 bucks a month.

https://www.pikapods.com/apps

the__alchemist 9 minutes ago | parent | prev | next [-]

I do not mean for this to come across as a nit, but think it's worth stating explicitly:

> Everyone seems to be leaving GitHub

A small minority is leaving Github; this group is more likely to write articles about the choice than those who still use Github.

Daviey 31 minutes ago | parent | prev | next [-]

This was the original model of launchpad.net: it was supposed to be a hub of FOSS that pulled in from the decentralised VCSs and provided them all via bzr.

But bzr lost the battle, Canonical was slow to adopt Git, and the platform lacked investment, so it was another lunch that got taken from them.

perkovsky 3 hours ago | parent | prev | next [-]

I agree with this. Moving the git repo is easy, moving the whole project surface is the hard part.

Issues, releases, CI, docs, security advisories, search and discoverability all tend to get coupled to GitHub over time.

For open-source projects, I like the idea of self-hosted as the source of truth, but still keeping a read-only GitHub mirror so people can actually find it.

giancarlostoro 3 hours ago | parent | next [-]

...Maybe that's the answer, we need a "hub" for the smaller missing things to start, you pop in your git repository when you join, and it can sit as a thin layer over your repo with issues, releases, etc... Sounds like a lot of work, but doing it piecemeal would do it.

I think trying to re-host git itself might be more trouble than it's worth. My kingdom for someone to build this so I don't have to use ADO boards anymore.

radlad 2 hours ago | parent [-]

Like some kind of UI over a database scraped by code which understands Github, Forgejo, Gitlab, sr.ht, etc?

One issue is that issues tend to be monotonically increasing numbers, and references to old issues vs. new issues get confusing over time.
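One way to picture the numbering problem: once issues from several forges land in one database, a bare `#12` no longer says which tracker it came from. A minimal sketch, assuming a made-up `forge#number` notation (this is an illustration, not an existing convention of any of the forges named above), that qualifies bare references at import time:

```python
import re

# A bare reference: '#' plus digits, not already preceded by a word
# character, a slash, or another '#'.
_BARE_ISSUE_REF = re.compile(r"(?<![\w/#])#(\d+)\b")


def qualify_issue_refs(text: str, source: str) -> str:
    """Rewrite bare '#N' references as '<source>#N' so they stay unambiguous."""
    return _BARE_ISSUE_REF.sub(lambda m: f"{source}#{m.group(1)}", text)


if __name__ == "__main__":
    print(qualify_issue_refs("Fixes #12, see also #340.", "github"))
```

Already-qualified references are left alone, so running the rewrite twice gives the same result as running it once.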

cmrdporcupine an hour ago | parent | prev [-]

The ideal situation is to eliminate thinking that the thought process for "actually finding" a project == GitHub.

We let Microsoft parasitize our brains with this. The software community has long had alternate forums. GitHub isn't even a particularly good one, and it's recently just become a swamp of generated content, fake stars, and mining your content.

In the last couple months at least once a week I get some LLM generated phishing spam from some bot that "found your projects on GitHub and want to collaborate" etc.

And it's well documented now how you can just go out and "buy" GitHub stars.

Please. Cut the umbilical.

pixlmint 3 hours ago | parent | prev | next [-]

GitHub centralizes 2 things: Authentication, as well as Repository Hosting.

Does the code really need to be hosted in a central location like this? (Clearly not, which is why people are leaving GitHub in the first place)

But the one part GitHub provides that's genuinely valuable is the social aspect, and when you get a PR from a user named torvalds you can trust that this is in fact Linus. This isn't the case with more distributed systems.

That's why I'd really like to see some entity handle just the auth/identity providing. Forgejo/ Gitea/ Gitlab instances can then choose to use that. Then, for example if you want to take on another contributor and they have their own forgejo instances, you can invite them through this provider, when they fork your repo it ends up in their own forgejo, and they can easily create PR's into your repo.

chris_money202 8 minutes ago | parent | next [-]

I would argue GitHub does a lot more centralization than just those two. It's an entire developer platform centered around Git. It does hundreds of other things that some developers use, and some don't.

mjw1007 2 hours ago | parent | prev | next [-]

GitHub also centralises abuse detection. I'm not thinking about sophisticated attacks here so much as dealing with plain old spam. That's fairly easy to deal with on a tiny scale, and possible on a huge scale, but it's a great pain at a medium scale.

Ritewut 2 hours ago | parent | prev | next [-]

Tangled is working on something like that. I believe they are federating on the @protocol.

https://tangled.org/

Zambyte an hour ago | parent [-]

I am very active on bsky and I also use some other ATProto applications like tangled. I think this is the first time I have seen anyone refer to ATProto with an '@'

giancarlostoro 2 hours ago | parent | prev | next [-]

> That's why I'd really like to see some entity handle just the auth/identity providing. Forgejo/ Gitea/ Gitlab instances can then choose to use that. Then, for example if you want to take on another contributor and they have their own forgejo instances, you can invite them through this provider, when they fork your repo it ends up in their own forgejo, and they can easily create PR's into your repo.

Agree, I feel like a true alternative should focus on this missing piece to bridge the gap.

ndriscoll 2 hours ago | parent [-]

The "missing" piece is just everyone implementing OAuth Dynamic Client Registration. Then kernel.org could be its own OAuth provider, and Linus could log into someone's Forgejo with his kernel.org login.

Just like "log in with Google", you should be able to do "log in with OAuth", you type your email or domain (or your browser fills it), and it triggers a redirect flow for login. Then people can use GitHub or Google or Apple or their own provider, just like email. Every email provider could also be an OAuth provider.
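A sketch of the first step of that "type your email or domain" flow, under stated assumptions: the login's domain is treated as the OAuth issuer (an assumption, not something a standard mandates), the metadata document lives at the RFC 8414 well-known path, and `registration_endpoint` is where RFC 7591 dynamic client registration would happen. No network calls are made; the function names and the example metadata are hypothetical.

```python
from typing import Dict

# Well-known path for authorization server metadata (RFC 8414).
WELL_KNOWN_PATH = "/.well-known/oauth-authorization-server"


def domain_from_login(identifier: str) -> str:
    """Accept 'user@example.org' or 'example.org' and return the domain part."""
    domain = identifier.strip().lower().rsplit("@", 1)[-1]
    if not domain or "/" in domain or " " in domain:
        raise ValueError(f"cannot derive a domain from {identifier!r}")
    return domain


def metadata_url(identifier: str) -> str:
    """URL a forge would fetch to discover the provider's endpoints."""
    return f"https://{domain_from_login(identifier)}{WELL_KNOWN_PATH}"


def login_endpoints(metadata: Dict[str, str]) -> Dict[str, str]:
    """Pick out the endpoints needed to self-register and start the redirect."""
    needed = ("registration_endpoint", "authorization_endpoint", "token_endpoint")
    missing = [key for key in needed if key not in metadata]
    if missing:
        raise KeyError(f"provider metadata lacks: {', '.join(missing)}")
    return {key: metadata[key] for key in needed}


if __name__ == "__main__":
    print(metadata_url("user@example.org"))
```

A provider that omits `registration_endpoint` is exactly the gap being described: the forge would need a manually issued client ID before the redirect could start.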

bombcar 2 hours ago | parent | prev | next [-]

GitHub is to git like Reddit was to forums. Centralized usernames and such were very nice, but it also has downsides that we’re now living with.

GitHub is still really, really nice in that it’s five seconds to throw up a repo that’s accessible worldwide (98% of the time lol) and everyone’s on there. Whatever replaces it (just like whatever replaces twitter) may be better in many ways, but it will be “worse” in others, even if just in splintering.

lorecore 2 hours ago | parent | prev [-]

Signed commits could solve this in a more decentralized way if people post their public keys on their own domains.

skydhash 2 hours ago | parent [-]

Own domains is the real deal. My preferred model is tarball releases with checksums, or better yet, with signatures (like remind[0] or msmtp[1]). Such pages are trivial to host properly and load quickly.

[0]: https://dianne.skoll.ca/projects/remind/

[1]: https://marlam.de/msmtp/download/
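For the checksum half of that model, the check on the downloader's side is short. A minimal sketch, assuming the project publishes a SHA-256 hex digest next to the tarball (which hash a given project actually publishes is an assumption here, and signature verification is a separate step not shown):

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large tarballs need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tarball_matches(path: str, published_hex: str) -> bool:
    """Compare the local file's digest with the one published by the project."""
    expected = published_hex.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)
```

A checksum only shows the download matches what the page lists; it is the signature that ties the release to the author's key, which is why it is the "better yet" option.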

dewey 3 hours ago | parent | prev | next [-]

I don't think anyone is forgetting that, but most people don't care that much about the decentralized part. They care about it being user friendly and free, and companies care whether it has all the enterprise features / SSO etc. that they need.

mamcx 2 hours ago | parent | prev | next [-]

"Git is decentralized"

Because it is a kind of filesystem.

How a TEAM operates IS NOT.

And that is the point of Github.

There is no escape to the coordination problem!

(And if you say mails, patches, and other asynchronous ways: same thing, more complex)

_flux 3 hours ago | parent | prev | next [-]

I think you're forgetting issue tracking and CI.

shimman 3 hours ago | parent [-]

Forgejo has both these things, I'd even argue Forgejo has a better runner than GitHub actions as it's less resource heavy and easier to debug when issues arise (only ran into one, and it was self inflicted).

_flux 2 hours ago | parent | next [-]

I have no trouble believing it is better :), but it is not as easy to mirror GitHub issues, or CI configuration, to Forgejo or back as it is to handle the git side.

I think Radicle is interesting. It doesn't solve the CI bit, at least not yet, but I suppose it's possible to hook up some local runner for it.

There's also a bug tracker which I believe was called bug, but I can't find it ;), that tries to bridge different issue trackers and provide an offline mode for working with them.

People of course also love free CI capacity where they can run even untrusted code, so in that sense Microsoft resources might be difficult to compete against.

treyd 2 hours ago | parent | prev [-]

I really wish people would drop the GHA model because it's so bad and insecure by design. GitLab's CI is miles better and easier to use.

shimman 2 hours ago | parent [-]

True, but GitLab is going to run into the same issues as GitHub, maybe even worse because GitLab doesn't have a trillion-dollar multinational benefactor. Public corporations owning developer tooling has never boded well; a current look at GitLab reflects this sentiment perfectly.

Which is why we should always champion FOSS for dev tooling as it's the only way a community can have a say in an industry dominated by unregulated tech behemoths.

locknitpicker an hour ago | parent | prev [-]

> Everyone seems to be leaving GitHub, and forgetting the entire spirit of what git is in my eyes.

And here lies your misconception: services such as GitHub are really not about git. That's a red herring. It's not about tooling either. People use services such as GitHub because of things like issue management, access control, release management, project pages, and CICD integration. You click on a button and you create a repository that's automatically added to your organization, with all access controls sorted out. You click on a button and you grant read access to someone. You click on a button and you onboard a whole team.

Underneath it all, it's completely irrelevant if you are even using Git. Some people even use github's CLI interface instead. Does it matter if it's git or not? Do you even care?

I have personal projects hosted and mirrored across GitHub, Gitlab, and BitBucket. That works, but only as far as backups are concerned. Even in projects that onboarded onto a third party CICD system, git is really not the reason for picking one service over another.