| ▲ | floating-io 3 days ago |
| My fear of this sort of thing happening is why I don't use github or gitlab.com for primary hosting of my source code; only mirrors. I do primary source control in house, and keep backups on top of that. It's also why nothing in my AWS account is "canonical storage". If I need, say, a database in AWS, it is live-mirrored to somewhere within my control, on hardware I own, even if that thing never sees any production traffic beyond the mirror itself. Plus backups. That way, if this ever happens, I can recover fairly easily. The backups protect me from my own mistakes, and the local canonical copies and backups protect me from theirs. Granted, it gets harder and more expensive with increasing scale, but it's a necessary expense if you care at all about business continuity issues. On a personal level, it's much cheaper though, especially these days. |
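For the database piece, a minimal sketch of the idea using PostgreSQL logical replication (it assumes a Postgres primary in the cloud that permits logical replication and that the on-prem mirror already has a matching schema; hostnames, database name, and user are placeholders):

    # on the cloud primary: publish everything
    psql "host=db.example-cloud.internal dbname=appdb" \
         -c "CREATE PUBLICATION mirror_pub FOR ALL TABLES;"

    # on the on-prem mirror: subscribe, pulling changes continuously
    psql "host=localhost dbname=appdb" -c "
      CREATE SUBSCRIPTION mirror_sub
        CONNECTION 'host=db.example-cloud.internal dbname=appdb user=replicator'
        PUBLICATION mirror_pub;"

    # independent backups on top of the live mirror
    pg_dump -Fc appdb > "/backups/appdb-$(date +%F).dump"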
|
| ▲ | wewewedxfgdf 3 days ago | parent | next [-] |
| I once asked the CTO of the company I worked for, "Do we back up our source code?" He said, "No, it's on GitHub." I said no more. |
| |
▲ | gerdesj 3 days ago | parent | next [-] | | I would suggest your CTO needs some gentle reminders about risk management and how the cloud really works. If your boss is that daft, it's probably a sign to bail out. Remember to do your own personal due diligence; it's not something you do only for someone else. Do your own risk assessment: if the code were lost, who would be held accountable, you or them? EDIT: PS - I am a CTO ... | | |
▲ | wewewedxfgdf 3 days ago | parent [-] | | But it's GitHub ... it can't be lost. Unless GitHub closes the account, or a hacker gets access, or a rogue employee gets mad and deletes everything, or some development accident wipes the repo, and so on. | | |
▲ | SanjayMehta 3 days ago | parent | next [-] | | Or the one employee who created the account, and was paying for it on his personal credit card, got laid off, and no one else in the company knew what GitHub was. | | | |
| ▲ | weikju 3 days ago | parent | prev [-] | | Gotta add AI agents to that list |
|
| |
| ▲ | OutOfHere 3 days ago | parent | prev | next [-] | | I don't understand how the VC financiers got so rich while being so stupid as to hire such stupid people at the executive level, e.g. your CTO. | | |
▲ | pintxo 3 days ago | parent [-] | | If only 10 out of 100 of your investments make it, does it matter whether one of the 90 failed because it lacked backups? Their risk strategy is diversification across investments, not making each individual investment bulletproof. | | |
▲ | OutOfHere 3 days ago | parent [-] | | Yes, it is in effect a gamble. The issue is that this strategy doesn't really prove profitable for the majority of VCs. Fewer than 30% of VCs get to a unicorn or IPO deal, and 46% of VCs don't profit at all, per the recent post "(Only) half of senior VCs make at least one successful deal." That's ignoring the ones who drop out and no longer contribute to the active statistics. The strategy is about as silly as having ten babies and expecting that one of them will make it. It is what you would expect out of the worst poverty-ridden parts of Africa. An alternative is to select and nurture your investments really well, so the rate of success is much higher. I'd like to see the script flipped, with 90% of investments going on to become profitable, and with their stable cash income preferred over big exits. |
|
| |
| ▲ | Dylan16807 3 days ago | parent | prev | next [-] | | If nobody has the repo checked out, what are the odds it's important? | | |
| ▲ | bryant 3 days ago | parent | next [-] | | > If nobody has the repo checked out, what are the odds it's important? Oh boy. Tons of apps in maintenance mode run critical infrastructure and see few commits in a year. | | |
| ▲ | Dylan16807 3 days ago | parent [-] | | And the people using it multiple times a year delete it afterwards? | | |
▲ | RealStickman_ 3 days ago | parent | next [-] | | Relying on random local copies is not a backup strategy. | | |
| ▲ | shakna 3 days ago | parent | prev | next [-] | | They often only have a binary that you would have to reverse engineer. Source code gets lost. To step outside just utility programs, the reason why Command & Conquer didn't have a remaster was: > I'm not going to get into this conversation, but I feel this needs to be answered. During this project of getting the games on Steam, no source code from any legacy games showed up in the archives. | |
| ▲ | bryant 3 days ago | parent | prev | next [-] | | > And the people using it multiple times a year delete it afterwards? The people wouldn't, but in the environments I'm thinking of, security policies might. What you're leaning into is a high-risk backup strategy that would rely mostly on luck to get something remotely close to the current version back online. It's pretty reckless. | | |
| ▲ | darkwater a day ago | parent [-] | | > The people wouldn't, but in the environments I'm thinking of, security policies might. In environments that go so far (deleting local checkouts of code out of security concerns), I bet they do have a mirror/copy of the version controlled code. |
| |
| ▲ | Lammy 3 days ago | parent | prev [-] | | More like “none of the people who worked on it are at the company any more” |
|
| |
| ▲ | NewJazz 3 days ago | parent | prev | next [-] | | Devs clean up their workstation sometimes. You can get fancy about deleting build artifacts or just remove the whole directory. Devs move to new machines sometimes and don't always transfer everything. Devs leave. Software still runs, and if you don't have the source then you'll only have the binary or other build artifact. | |
▲ | burnt-resistor 3 days ago | parent | prev | next [-] | | Popularity != importance. There is plenty of absolutely critical FOSS code that receives very little maintenance and attention, yet is mission critical to society functioning efficiently. The same happens inside organizations too, with, say, the bootloader firmware for their hardware products. |
| ▲ | OutOfHere 3 days ago | parent | prev [-] | | You clearly haven't worked much with code over many years. When laptops change, not all existing projects get checked out. In fact, in VSCode, one can use a project without cloning and checking it out at all. | | |
| ▲ | Dylan16807 3 days ago | parent [-] | | Honestly I'm just really wondering what the odds are. In particular for code that made it onto git. | | |
▲ | OutOfHere 3 days ago | parent [-] | | Over the long term, the odds approach 100% that it won't be checked out, because people mostly work on newer projects. Mature older projects, even while still running in production, cease to see many (or any) updates, so they don't get cloned onto newer laptops. That doesn't make them less important: if they ever need to be re-deployed to production, only the source code will allow it. |
|
|
| |
▲ | simondotau 3 days ago | parent | prev | next [-] | | If the repo is on GitHub and two or more developers keep reasonably up-to-date checkouts on their local computers, the "3-2-1" principle of backups is satisfied. On top of that, if any of those developers back up their local computer, those backups also cover the source code. | | |
| ▲ | wewewedxfgdf 3 days ago | parent | next [-] | | CTO explaining that to the CEO when your source code is completely gone: CTO: "I know our entire github repo is deleted and all our source code is gone and we never took backups, but I'm hoping the developers might have it all on their machines." CEO: "Hoping developers had it locally was your strategy for protecting all our source code?" CTO: "It's a sound approach and ticks all the boxes." CEO: "You're fired." Board Directors to CEO: "You're fired." | |
▲ | Tohsig 3 days ago | parent | prev | next [-] | | Technically true, but only if we count dev checkouts as "backups". In the majority of cases they probably are, but that's not guaranteed: the local copy could be in a wildly different state from the primary origin, be a shallow clone, and so on. While the odds of that mattering are very low, they're not zero. I personally prefer to have a dedicated mirror as a failsafe. |
▲ | koonsolo 3 days ago | parent | prev [-] | | That's the benefit of a DVCS. Losing the GitHub copy when the source is on everyone's local computer is the least of your problems. |
| |
▲ | burnt-resistor 3 days ago | parent | prev | next [-] | | LMAO. Must be one of those MBA CTOs. At least mirror the crown jewels to Bitbucket, Tarsnap, or somewhere else, keeping two weeks' to three months' worth of independent copies made daily. If it isn't the MBA factor, the problem may also stem from the gradual atrophy of, and disrespect shown towards, the sysadmin profession. |
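For instance, a rough sketch of such a daily job (the repo paths, Bitbucket org, and archive name are placeholders; it assumes bare repos under /srv/git and an already-configured Tarsnap key):

    #!/bin/sh
    # Daily job: mirror every bare repo to a second host, then take an
    # independent snapshot outside any git hosting provider.
    set -eu
    for repo in /srv/git/*.git; do
        name=$(basename "$repo" .git)
        git -C "$repo" push --mirror "git@bitbucket.org:example/$name.git"
    done
    # date-stamped archive; prune old ones with "tarsnap -d" as needed
    tarsnap -c -f "git-$(date +%F)" /srv/git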
| ▲ | klysm 3 days ago | parent | prev | next [-] | | I mean it’s also on everybody’s laptop. Recovering from GitHub going away would be trivial for me | |
|
|
| ▲ | burnt-resistor 3 days ago | parent | prev | next [-] |
| Exactly. Technofeudal overlords can switch off all "your" stuff at any time. Always have a personal and a business disaster recovery plan, including isolated backups (not synchronized replication) on N >= 2 separate services/modalities. Options to consider for various circumstances include:
- Different object storage clouds under different accounts (different names, emails, and payment methods), potentially in different geographies too
- Tarsnap (which uses AWS under the hood, but under someone else's account(s))
- MEGA
- Onsite warm and/or cold media
- A geographically separate colo DR site, despite the overly-proud trend of "we're 100% (on someone else's SPoF) cloud now"
- Offsite cold media (personal home and/or Iron Mountain) |
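As a minimal sketch of the N >= 2 idea (the rclone remote names and bucket paths below are made up, and each remote is assumed to be configured under a separate account):

    # push the same backup set, date-stamped, to two unrelated services
    rclone copy /backups "b2-acct-a:company-dr/$(date +%F)"
    rclone copy /backups "mega-acct-b:company-dr/$(date +%F)"

Copying into dated paths keeps the sets isolated from each other; a plain sync would just replicate whatever mistake or deletion happened upstream.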
|
| ▲ | cnst 3 days ago | parent | prev | next [-] |
| How do you distinguish a mirror from not a mirror on GitHub? I often have my git configured to push to multiple upstreams, which means that basically all of your mirrors can be primaries. This is a really good part about Git: every copy is effectively a mirror, too, and it's cryptographically verified as well, so you don't have to worry about a mirror going rogue without anyone noticing. |
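Roughly like this (the URLs are placeholders; note that once any push URL is set, git stops using the fetch URL for pushes, so the original host gets re-added explicitly):

    # one "origin" that pushes to two hosts at once
    git remote set-url --add --push origin git@github.com:example/project.git
    git remote set-url --add --push origin git@gitlab.com:example/project.git
    git push origin main    # updates both remotes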
| |
▲ | floating-io 3 days ago | parent [-] | | I use GitLab locally and push only to that. GitLab itself is configured to mirror outbound to the public side of things. In a collaborative scenario, doing it that way makes sure everything is always properly synchronized; no individual's missing config can break things. |
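The outbound (push) mirror can be set up in the GitLab UI under Settings > Repository > Mirroring repositories, or scripted against the remote-mirrors API; a rough sketch, where the GitLab host, project ID, tokens, and GitHub URL are all placeholders:

    # create a push mirror on a self-hosted GitLab instance
    curl --request POST \
         --header "PRIVATE-TOKEN: $GITLAB_TOKEN" \
         --data-urlencode "url=https://GITHUB_USER:$GITHUB_TOKEN@github.com/example/project.git" \
         --data "enabled=true" \
         "https://gitlab.example.internal/api/v4/projects/42/remote_mirrors"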
|
|
| ▲ | ransom1538 3 days ago | parent | prev | next [-] |
| IMHO lawyers get creative: a GitHub account can show a ton of work activity, NDA violations, etc. Your "private repo" is just a phone call away from being a public repo. |
| |
| ▲ | roncesvalles 3 days ago | parent [-] | | Your whole GitHub account is a phone call away from being suspended due to frivolous IP/DMCA/what-have-you claims. |
|
|
| ▲ | nucleardog 3 days ago | parent | prev [-] |
| > Granted, it gets harder and more expensive with increasing scale, but it's a necessary expense if you care at all about business continuity issues. On a personal level, it's much cheaper though, especially these days.

I don't go as far as "live mirror", but I've been advocating _for years_ on here and in meatspace that this is the most important thing you can be doing. You can rebuild your infrastructure. You cannot rebuild your users' data.

An extended outage is bad, but in many cases not existential. In many cases customers will stick around. (I work with one client that was down for over a month without a single cancellation, because their line-of-business application was that valuable to their customers.)

Once you've lost your users' data, they have little incentive to stick around. There's no longer any stickiness as far as "I would have to migrate my data out" and... you've completely lost their trust as far as leaving any new data in your hands. You've destroyed all the effort they've invested in your product, and they're going to be hesitant to invest it again. (And that's assuming you're not dealing with something like people's money, where losing track of who owns what may result in some existence-threatening lawsuits all on its own.)

The barrier to keeping a copy of your data "off site" is often fairly small. What would it take you right now to set up a scheduled job to dump a database and sync it into B2 or something? Even if that's too logistically difficult (convincing auditors about the encryption used, or anything else), what would it take to set up a separate AWS account under a different legal entity, with a different payment method, that just synced your snapshots and backups to it?

Unless you're working on software where people will die when it's offline, you should prioritize durability over availability. Backups of backups are more important than your N-tier Web-3 enterprise scalable architecture that allows deployment across 18*π AZs with zero-downtime failover.

See, as a case study, UniSuper's incident on GCP: https://www.unisuper.com.au/about-us/media-centre/2024/a-joi... |
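To make the "scheduled job" point concrete, a minimal sketch (the database name, rclone remote, and bucket path are placeholders; it assumes an rclone remote pointing at a B2 or other S3-compatible bucket owned by a separate account):

    #!/bin/sh
    # nightly: dump the database and copy the dump off-site
    set -eu
    stamp=$(date +%F)
    pg_dump -Fc appdb > "/var/backups/appdb-$stamp.dump"
    rclone copy "/var/backups/appdb-$stamp.dump" offsite-b2:dr-backups/db/
    # the "separate AWS account" variant could be as simple as something like:
    #   aws s3 sync s3://prod-backups s3://dr-backups --profile dr-account
    # run from cron, e.g.:  0 3 * * *  /usr/local/bin/offsite-dump.sh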