__MatrixMan__ 6 days ago

It's not a bad idea; it might help in certain cases.

But the real solution to this kind of attack is to stop resolving packages by name and instead resolve them by hash, then bind a name to that hash for local use.

That would of course be a whole different, mostly unexplored, world, but there's just no getting around the fact that blindly accepting updated versions of something based on its name is always going to create a juicy attack surface around the resolution of that name to some bits.
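
To make that concrete, here's a rough sketch of the shape I mean (the package name, the URL parameter, and the pinned digest are all made up; the only real machinery is Node's crypto and fetch):

    import { createHash } from "node:crypto";

    // Hypothetical local pin set: the name -> hash binding lives with you,
    // not in the registry's mutable metadata.
    const pins: Record<string, string> = {
      // placeholder digest -- you'd record the real one when you audit the tarball
      fooapp: "sha512-REPLACE_WITH_AUDITED_DIGEST",
    };

    async function fetchByHash(name: string, url: string): Promise<Buffer> {
      const expected = pins[name];
      if (!expected) throw new Error(`no pin recorded for ${name}`);

      const res = await fetch(url); // any mirror or peer will do
      const bytes = Buffer.from(await res.arrayBuffer());

      const digest =
        "sha512-" + createHash("sha512").update(bytes).digest("base64");
      if (digest !== expected) {
        throw new Error(`hash mismatch for ${name}: got ${digest}`);
      }
      return bytes; // verified tarball, safe to unpack
    }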

mort96 6 days ago | parent | next [-]

The problem here isn't "someone introduced malware into an existing version of a package". The problem is "people want to stay up to date, so when a new patch version is released, everyone upgrades to that new patch version".

__MatrixMan__ 6 days ago | parent [-]

The problem is that they implicitly do so. If they had to enter the hash of the latest and greatest version, the onus would be on them at that time to scrutinize it. At worst the spread of the malicious package would be slowed, and at best it would be stopped.

mort96 5 days ago | parent [-]

Surely you'd achieve the same thing by making people manually enter a new version number?

I'm not inherently against the idea of specifying a hash; it would protect against NPM's hosting infrastructure being compromised. But again, that's not what we're seeing here.

__MatrixMan__ 5 days ago | parent [-]

If you end up with bits that hash to 0xabc123 and I end up with bits that hash to 0x456def and we both think we installed fooapp version 7.8.9, there's nothing about the version number that tells us which one of us has been hacked.

But if we both attempt to install 0x456def, it's clear that whoever has 0xabc123 is in trouble. This is especially important in cases where you might need a package while you're on a different network partition from npm. Any peer can provide it, and you know it hasn't been tampered with, because if it had been it would have a different hash.

mort96 5 days ago | parent [-]

> If you end up with bits that hash to 0xabc123 and I end up with bits that hash to 0x456def and we both think we installed fooapp version 7.8.9, there's nothing about the version number that tells us which one of us has been hacked.

If this happens, then either the NPM registry has been compromised or my system's 'npm' program or your system's 'npm' program has been compromised.

If the discrepancy is due to one of us running a malicious 'npm', then that malicious 'npm' CLI program could've just not flagged the hash mismatch, so specifying a hash doesn't help us at all.

If both of us are using a non-compromised 'npm' program, and the NPM registry isn't compromised, then we will never be in a situation where my "fooapp version 7.8.9" and your "fooapp version 7.8.9" differ. If a malicious actor compromises an account with publish access for the fooapp package, then all the malicious actor can do is publish a new "fooapp version 7.8.10" that has malware. That's what has happened in every single one of these high-profile NPM hacks. You can't retroactively change old versions of your own packages. So to protect against this kind of attack (which, again, means every single NPM hack to date), just not auto-upgrading between minor versions is enough.

To protect against a compromised NPM registry, I agree that we should have package checksums living in our git repositories. But NPM already has that: the package-lock.json contains checksums. I don't understand what it would give us to have that checksum in the package.json instead of the package-lock.json.
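
For reference, each entry in a modern package-lock.json carries an "integrity" field ("sha512-" plus a base64 digest of the tarball), and you can recheck it yourself. A rough sketch, where the package name and tarball path are just illustrative:

    import { createHash } from "node:crypto";
    import { readFileSync } from "node:fs";

    // Recompute the digest of a tarball you fetched and compare it with what
    // the lockfile recorded.
    const lock = JSON.parse(readFileSync("package-lock.json", "utf8"));
    const entry = lock.packages["node_modules/debug"]; // lockfileVersion 2+ layout

    const tarball = readFileSync("debug.tgz"); // the bits you actually downloaded
    const digest =
      "sha512-" + createHash("sha512").update(tarball).digest("base64");

    console.log(digest === entry.integrity ? "matches lockfile" : "MISMATCH");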

__MatrixMan__ 3 days ago | parent [-]

package-lock.json only protects you after the package has been locked, though. If I want to say to you:

> I've audited fooapp==7.8.9 and I believe it is both functional and free of malware

You might act on that information by installing it and locking it to a hash. In this scenario we've missed an opportunity to compare hashes. Maybe you end up with a different one due to a problem with NPM or with one of our connections to it (it's a high-value target, so CAs behaving badly isn't out of the question, nor is NPM itself being compromised).

If we instead deal in hashes up front, we know that we're talking about the same thing. Also, maybe we're not on the same network partition as NPM for some reason. If you've got a hash for the package, you can get it from whoever happens to have it, and you can know it hasn't been tampered with. If you're using names, you can't really trust it unless you got it from NPM. In addition to the maybe-NPM-is-inaccessible problems, names create additional load on NPM. Odds are we're in the same room when this conversation happens, so the network path between us is much more likely to be stable, high bandwidth, and free of attackers than the path between each of us and NPM.

mirekrusin 6 days ago | parent | prev | next [-]

name + version are immutable: you can't republish a package to npm under an existing version.

you can only unpublish.

content hash integrity is verified in lockfiles.

the problem is with dependencies using semver ranges, especially wide ones like "debug": "*"
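
for example, here's how far npm's own semver package lets a range float (the version list is made up; imagine the newest entry is the compromised release):

    import { maxSatisfying } from "semver"; // npm's own range implementation

    // Imaginary publish history; pretend 4.4.2 is the freshly compromised release.
    const published = ["4.3.7", "4.4.0", "4.4.1", "4.4.2"];

    console.log(maxSatisfying(published, "*"));      // -> 4.4.2, grabs whatever is newest
    console.log(maxSatisfying(published, "^4.4.0")); // -> 4.4.2, caret ranges float too
    console.log(maxSatisfying(published, "4.4.0"));  // -> 4.4.0, an exact pin stays put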

initiatives like provenance statements [0] / code signing are also a good complement to delayed dependency updates.

also, not running postinstall scripts by default / whitelisting them, as pnpm does, is a good default.

modifying (especially adding) keys on npmjs.org should be behind dedicated 2fa (as should changing the 2fa settings themselves)

[0] https://docs.npmjs.com/generating-provenance-statements

__MatrixMan__ 6 days ago | parent [-]

Those are promises that npm intends to keep, but whether they do or not isn't something that you as a package user can verify. Plus there's also the possibility that the server you got those bits from was merely masquerading as npm.

The only immutability that counts is immutability that you can verify, which brings us back to cryptographic hashes.

mirekrusin 5 days ago | parent [-]

...which are already present in lockfiles and available in the registry (e.g. https://registry.npmjs.org/debug) - it's not a problem.

frankdejonge 6 days ago | parent | prev [-]

Resolving by hash is a half solution at best. Not having automated dependency upgrades also has severe security downsides. Apart from that, lock files basically already do what you describe: they contain the hashes, and the resolution is based on the name while the hash ensures the integrity of the resolved package. The problem is upgrade automation and supply chain scanning. The biggest issue there is that scanning is not done where the vulnerability is introduced, because there is no money for it.

__MatrixMan__ 6 days ago | parent [-]

Do you suppose that automated dependency upgrades are less likely to introduce malicious code than to remove it? They're about compliance, not security. If I can get you to use malicious code in the first place I can also trick you into upgrading from safe code to the vulnerable code in the name of "security".

As for lock files, they prevent skulduggery after the maintainer has said "yeah, I trust this thing and my users should too", but the attacks we're seeing are upstream of that point, because maintainers are auto-trusting things based on their name+version pair, not based on their contents.

debazel 6 days ago | parent [-]

> If I can get you to use malicious code in the first place I can also trick you into upgrading from safe code to the vulnerable code in the name of "security".

Isn't the whole point that malicious actors usually only have a very short window in which they can actually get you to install anything before being shut out again? That's the whole point of having a delay in the package manager.

__MatrixMan__ 5 days ago | parent [-]

Who is going to discover it in that time? Not the maintainers, they've already released it. Their window for scrutiny has passed.

There is some sense in giving the early adopters some time to raise the alarm and opting into late adoption, but isn't that better handled by defensive use of semantic versioning?

Consider the xz-utils backdoor. It was introduced a month before it was discovered, and it was discovered by a user.

If that user had waited a few days, it would just have been discovered a few days later, during which time it may have been added to an even wider set of downstream packages. That is, supposing they didn't apply reduced scrutiny out of a perception that the soak period had made it safe.

It's not nothing, but it's susceptible to creating a false sense of security.

chuckadams 5 days ago | parent | next [-]

The xz backdoor went undetected for so long partly because the build scripts were already so hairy and baroque that no one noticed the extra obfuscations that ran code out of a binary blob in the test data. None of that was even in the source repo; it was dropped into the package build scripts externally just before they were pushed to the apt/rpm package repositories.

debazel 5 days ago | parent | prev [-]

The maintainers did notice in both of the recent attacks, but it takes time to regain access to your compromised account to take the package down, contact npm, etc.

All recent attacks have also been noticed within hours of release by security companies that automatically scan all newly released packages published to npm.

So as far as I know all recent attacks would have been avoided by adding a short delay.