perihelions 5 days ago

I wouldn't put it past the US to coerce Microsoft into injecting malicious payloads into these types of projects. EFF is putting complete trust in Microsoft's infrastructure: there's no out-of-band verification not served up by Microsoft itself (is there? It's just GitHub.com's TLS, and in-band SHA-1 hashes stored in the repo itself, which Microsoft controls; it can serve whatever bytes it wants, or different bytes on different requests...)

Microsoft has billions of dollars in US intelligence-cloud contracts and should leap at a chance to get an edge in on those. They've done things like this before; they provided incredible (and illegal!) cooperation with the NSA back at the time of the Snowden Leaks[0].

[0] https://www.theguardian.com/world/2013/jul/11/microsoft-nsa-... ("Microsoft handed the NSA access to encrypted messages" (2013))

throw0101d 4 days ago | parent | next [-]

> I wouldn't put it past the US to coerce Microsoft into injecting malicious payloads into these types of projects. EFF is putting complete trust in Microsoft's infrastructure: there's no out-of-band verification not served up by Microsoft itself

Isn't a git commit trail basically a Merkle tree of checksums? If any developer tried to do a pull or fetch they'd suddenly get a bunch of strange commit messages, wouldn't they?
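To make the "checksums" concrete, here is a minimal sketch of how git derives an object id from content, assuming the classic SHA-1 object format (a header of type and byte length, a NUL byte, then the body). The function name `git_object_id` is made up for this illustration.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Return the SHA-1 id for a git object of the given type and body.

    The hashed bytes are "<type> <length>\\0" followed by the raw body,
    so the id is a pure function of the content.
    """
    header = f"{obj_type} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


if __name__ == "__main__":
    # Changing a single byte of the file yields an unrelated id.
    print(git_object_id("blob", b"hello\n"))
    print(git_object_id("blob", b"hello!\n"))
```

Because commits hash their tree and parent ids in the same way, a change to any file propagates up into every later commit id.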

Also: code signing is / can become a thing.

untitaker_ 4 days ago | parent | next [-]

I think GP is talking about a scenario where Microsoft would serve either malicious source tree or binaries to just one user, not all of them. that would be fairly hard to detect. but in such scenarios we'd also have to start asking questions about the state of the entire CA ecosystem.

tstenner 4 days ago | parent [-]

Or detected easily with package builders like Arch Linux's makepkg that ship a hash along with the source URL. As soon as one user gets a different file, he gets an alert and has the compromised package for later analysis.
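A rough sketch of that kind of check, assuming the packager recorded a SHA-256 digest out of band when the package recipe was written. The function name and the sample bytes are hypothetical; this is not makepkg's actual code.

```python
import hashlib
import hmac


def matches_pinned_sha256(data: bytes, pinned_hex: str) -> bool:
    """Return True only if the downloaded bytes hash to the pinned digest."""
    actual = hashlib.sha256(data).hexdigest()
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(actual, pinned_hex.strip().lower())


if __name__ == "__main__":
    genuine = b"release-1.0 tarball bytes"       # stand-in for a real download
    pinned = hashlib.sha256(genuine).hexdigest()  # recorded by the packager earlier
    print(matches_pinned_sha256(genuine, pinned))
    print(matches_pinned_sha256(genuine + b" + payload", pinned))
```

The check only helps if the pinned digest comes from somewhere the file host cannot rewrite, such as the distribution's own package repository.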

untitaker_ 4 days ago | parent [-]

like I said, if you assume your adversary is the US government then they might as well start issuing rogue TLS certs to target individuals.

stephen_g 4 days ago | parent | prev | next [-]

It'd be a lot of trouble to interfere with the source, yes.

I think the release files are the place they could most easily tamper with - generally they're stored on GitHub infra, so the files could be changed and the checksum on the download page also altered (or different files and different checksums provided to different people, if targeted).

Unless the builds are totally reproducible it'd be tricky to catch.

philihp 4 days ago | parent [-]

Possible, yes, but pretty damning to Microsoft's reputation if there's proof that their infrastructure has been compromised and anyone realizes it's happening. This sort of thing killed SourceForge when they started shipping adware bundled into installers of the programs they distributed.

type0 3 days ago | parent [-]

You can't compare it to SourceForge; MS is too big to fail.

some_furry 4 days ago | parent | prev | next [-]

> Also: code signing is / can become a thing.

To that end, I started a project last month so that code signing can be done in multiple geographical locations at once: https://github.com/soatok/freeon

therein 4 days ago | parent | prev | next [-]

GP was probably referring to the binary releases on the GitHub repo.

perihelions 4 days ago | parent | prev | next [-]

I don't know why you'd trust a checksum structure your adversary has complete control over.

That Merkle tree prevents the naive case where the adversary tries to serve a version of a repo, to a client who already has an older version, differing in a part the client already has. (The part the client has local checksums for). They shouldn't do that. The git client tells the server what commits it doesn't have, so this is simple to check.

Code signing could be a safeguard if people did it, but here they don't so it's moot. I found no mention of a signing key in this repo's docs.

The checksum tree could be a useful audit if there were a transparency log somewhere that git tools automatically checked against, but there isn't so it's moot. We put full trust in Microsoft's versions.

Lots of things could be helpful, but here and now in front of us is a source tree fully in Microsoft's control, with no visible safeguards against Microsoft doing something evil to it. Just like countless others. It's the default state of trust today.

bbarnett 4 days ago | parent | next [-]

> Lots of things could be helpful, but here and now in front of us is a source tree fully in Microsoft's control, with no visible safeguards against Microsoft doing something evil to it. Just like countless others

But it's written in rust.

Aloisius 4 days ago | parent | prev | next [-]

> The git client tells the server what commits it doesn't have, so this is simple to check.

That won't work. The first thing the client does is ask the server for a list of references with their oids (ls-refs). It only asks for oids and reports what oids it has after the server responds.

You'd need another way to identify that the client asking for references was the same one you vended the tampered source tree to, otherwise, you'd need to respond with the refs' real oids and the fetch would fail since there's no way to get from the oid the user has to the real one.

cyberpunk 4 days ago | parent [-]

Or use signed commits?

marginalia_nu 4 days ago | parent | prev | next [-]

Because the developers have just that on their local machine...?

Git is a distributed vcs after all. Every checkout is its own complete git "hub".

perihelions 4 days ago | parent [-]

Because GitHub can serve different bytes to different people. You log in as one of the project's devs, you get your own consistent, correct view of your project; some other people get malware instead. How do you reconcile the full picture? No one distrusts GitHub. There's no public log which git tools generically check against to see if GitHub is attempting something evil, the way they do with certificate transparency. GitHub is the public log.

Git may be designed as a distributed VCS; and it'd be a different situation if it were used that way in practice. For many projects, GitHub has a full MITM. They could even—forget about the checksums—bifurcate the views in between devs—accept commits from one dev, send over those commits with translated Merkle trees to another dev who has a corrupted history, and they'd never figure it out.
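As a sketch of what such a cross-check could look like: if several parties each recorded the ref heads they were served and compared notes over a channel the host does not control, a split view would show up as disagreement on the same ref. Everything here (the function name, the observer labels, the shortened oids) is hypothetical, and a real log would also have to account for observers fetching at different times.

```python
from collections import defaultdict


def find_split_views(observations: dict[str, dict[str, str]]) -> dict[str, dict[str, list[str]]]:
    """Return refs on which observers disagree.

    observations maps observer name -> {ref name: oid that observer was served}.
    The result maps each disputed ref -> {oid: sorted list of observers who saw it}.
    """
    by_ref = defaultdict(lambda: defaultdict(list))
    for observer, refs in observations.items():
        for ref, oid in refs.items():
            by_ref[ref][oid].append(observer)
    return {
        ref: {oid: sorted(who) for oid, who in oids.items()}
        for ref, oids in by_ref.items()
        if len(oids) > 1  # more than one oid for the same ref means a split
    }


if __name__ == "__main__":
    seen = {
        "dev": {"refs/heads/main": "aaa111"},
        "ci": {"refs/heads/main": "aaa111"},
        "victim": {"refs/heads/main": "bbb222"},
    }
    print(find_split_views(seen))
```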

BobaFloutist 4 days ago | parent | next [-]

What happens when a dev tries to patch a bug in the malware and nobody can tell what the hell they're talking about?

saagarjha 4 days ago | parent | prev [-]

Yes, but the moment you try to push your local git will complain that you are not aligned with the upstream repo.

perihelions 4 days ago | parent [-]

Not so. GitHub would remember who you are; advertise to you and to you only a set of fake checksums consistent with your fake view of the repo. Your git client would see nothing amiss—your local fake checksums are consistent with the fake checksums the server sent you. Having accepted your push, the server would ignore the fake checksums, extract the content of your patch, apply it to the genuine repo, and compute a new set of checksums, extending the other checksum tree as if you had pushed directly to it. That's what an MITM is.

saagarjha 4 days ago | parent [-]

This falls apart instantly if you share a hash with anyone else, though. Which is exactly what happens when you send in a PR

account42 4 days ago | parent [-]

Most projects on GitHub have you submit PRs via GitHub infrastructure, so they have total control over who sees what there as well.

rstuart4133 4 days ago | parent | prev [-]

> I don't know why you'd trust a checksum structure your adversary has complete control over.

I think the point is they don't have complete control over it. Sure, they have complete control over the version that is on GitHub. But git is distributed, and the developers will have their own local copies. If Microsoft screwed with the checksums, and git checks them, the next developer pull or push would blow up.

perihelions 4 days ago | parent [-]

> "The next developer pull or push would blow up."

If they're pushing or pulling to/from GitHub, then GitHub has a total MITM and is able to dynamically translate checksum trees in between devs' incompatible views of the repo.

cycomanic 4 days ago | parent [-]

I don't understand. Can you explain how that would work? I thought the checksums are calculated on the contents, so how can they translate checksum trees that remain valid without changing the content (or vice versa)? This is my naive understanding, so I might be completely wrong, hence I ask.

perihelions 4 days ago | parent [-]

That they'd change the content is the point—offer malware content for select targets, with corresponding malware checksums that are consistent with that malware and its entire history.

Those checksums would seem valid to the victims, as they're a self-consistent history of checksum trees they got directly from GitHub. The devs would be working with different checksum trees. GitHub would maintain both versions, serving different content and different checksums depending on who asks.
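A toy model of this, deliberately simplified: each commit id is a hash over the parent id plus the commit's content. This is not git's real commit format, and the names `commit_id`, `build_history`, and `is_self_consistent` are invented for the sketch. It shows that a tampered history checks out on its own terms, while every id from the tampered commit onward differs from the genuine one.

```python
import hashlib


def commit_id(parent_id: str, content: bytes) -> str:
    """Toy commit id: a hash covering the parent's id and this commit's content."""
    return hashlib.sha1(parent_id.encode() + b"\0" + content).hexdigest()


def build_history(contents: list[bytes]) -> list[str]:
    """Chain the commits in order and return their ids, oldest first."""
    ids, parent = [], ""
    for content in contents:
        parent = commit_id(parent, content)
        ids.append(parent)
    return ids


def is_self_consistent(contents: list[bytes], ids: list[str]) -> bool:
    """What a client can verify locally: the ids match the content it was given."""
    return build_history(contents) == ids


if __name__ == "__main__":
    genuine = [b"init", b"add feature", b"fix bug"]
    tampered = [b"init", b"add feature + backdoor", b"fix bug"]
    g, t = build_history(genuine), build_history(tampered)
    print(is_self_consistent(genuine, g), is_self_consistent(tampered, t))
    print(g[0] == t[0], g[1] == t[1], g[2] == t[2])
```

The victim can only detect the swap by comparing an id with someone who holds the other history.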

rstuart4133 3 days ago | parent [-]

This seems to boil down to them keeping two repositories - presenting one to the logged in dev, and one to the public.

That might work for a while if the dev isn't active. He would, for example, have to not notice there was a new release, with an incremented version number that triggers updates. Even that doesn't work forever. Downstream devs often look at the changes; for example, a Debian maintainer usually runs his eye over the changes.

But if the dev is active this is going to be noticed pretty quickly. The branches will diverge, commit messages, feature announcements, bug reports, line numbers not matching up. It would require a skilled operator to keep them loosely in sync, and that's the best they could do.

Either way, sooner or later Microsoft's subterfuge would be discovered, and that is the death knell for this scenario. The outrage here and elsewhere would boil over. Open source would leave GitHub en masse, Microsoft's reputation would be destroyed, they would lose top engineers. I don't have a high opinion of Microsoft's technical skills and leadership, as they have consistently demonstrated themselves to be inconsistent and unreliable. But the company is too large and too successful to be psychotic. The shareholders, customers, and lawyers would have someone's guts for garters if they pulled a stunt like that.

RS-232 4 days ago | parent | prev [-]

Technically a Merkle DAG

goku12 4 days ago | parent [-]

Both are correct. The commit history is a Merkle DAG. The tree under each commit is a Merkle tree.
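For the tree half of that, a small sketch of how a directory's id is built from its files' ids. It assumes the SHA-1 object format and covers only a flat directory of regular files (mode 100644), with each entry serialized as mode, name, a NUL byte, and the raw 20-byte blob id. The helper names are made up here.

```python
import hashlib


def git_object_digest(obj_type: str, body: bytes) -> bytes:
    """Raw SHA-1 digest over "<type> <length>\\0" followed by the body."""
    header = f"{obj_type} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).digest()


def tree_id(files: dict[str, bytes]) -> str:
    """Id of a flat tree whose entries are regular files (name -> content)."""
    body = b""
    for name in sorted(files, key=lambda n: n.encode()):
        blob_digest = git_object_digest("blob", files[name])
        # Each entry embeds the child's digest, which is what makes this a Merkle tree.
        body += b"100644 " + name.encode() + b"\0" + blob_digest
    return git_object_digest("tree", body).hex()


if __name__ == "__main__":
    before = tree_id({"README": b"hi\n", "main.rs": b"fn main() {}\n"})
    after = tree_id({"README": b"hi\n", "main.rs": b"fn main() { evil(); }\n"})
    print(before != after)
```

A commit object then records its tree id and its parent commit ids, and since a merge commit has more than one parent, the history forms a DAG rather than a tree.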

aduffy 4 days ago | parent | prev | next [-]

You’re welcome to read the code yourself once you check it out, it’s not very big. Supply chain attacks are a thing but I don’t think this is one.

untitaker_ 4 days ago | parent | prev | next [-]

I don't think there are many options to host source code and binaries in a way that is safe against an adversary like the US, and especially in such a way that technically illiterate users are protected. Because you'd have to assume that CAs are not off-limits either then.
