Hi @LoadingAlias,

> Constant-time MAC, AEAD, and signature verification.

That sounds suspiciously incomplete to me.

Which cryptographic algorithms in the library are currently not implemented in constant time?

Where did the speedup come from? How were these optimizations achieved?

What motivated you to write the library? Why not contribute to existing Rust crypto libraries instead? How is the work financed?

What peer review strategy are you following with the library? Who else but yourself has verified this code?

CodesInChaos 2 days ago | parent | next [-]

"Constant-time signature verification" stands out, since unlike signature creation, verification doesn't involve secrets, and thus doesn't need to run in constant time in most threat models.

	LoadingALIAS a day ago | parent [-]
		[dead]

sevenoftwelve 2 days ago | parent | prev | next [-]

Why do the different sha2 variants not share code? This seems like a lot of opportunities for small mistakes/discrepancies; especially considering the many architectures.

Was any of this generated using AI?

	LoadingALIAS a day ago | parent [-]
		The SHA2 variants DO share the compression layer where I felt it mattered:

		- SHA-224 uses the SHA-256 compression kernels w/ different IV/output truncation.
		- SHA-384 and SHA-512/256 use the SHA-512 compression kernels w/ different IV/output truncation.

		There IS some duplicated wrapper/finalization/state code per public type, and I agree that is probably the first place where small discrepancies/mistakes can creep in over time. I appreciate you pointing it out; I've added it to the backlog and will look it over as soon as possible. The reason it exists today is more about keeping monomorphized public types simple. I’m not religious about it; if I can reduce that wrapper duplication w/o making the dispatch/type story worse, I should - and I will. The guardrail is that SHA2 has official vectors + differential/proptest coverage against the sha2 crate for one-shot and streaming paths.

		Yes, I use an LLM daily and have for a few years now. It's used as an assistant during parts of the project, especially for drafting, refactoring passes, test scaffolding, and review prompts. I use an LLM to write markdown files for the public - it's not something I'm great at. I do not treat generated code as trusted... in fact, it's the exact opposite. It has to compile, pass vectors/differentials/fuzz/Miri where applicable, and survive manual review. Also, this is crypto: the tests are not decoration; they are the bar before code counts. I know that our industry is drowning in vibe-coded nonsense; this is not that. This is about a year of my life so far... and I'll be maintaining it for many years to come.

		A final point I wanted to leave... this is pre-v1. The point of sharing today was to get people to dig into it and find the problems. If there are other issues, inefficiencies, or smells you find - please, share them. Thank you!
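For readers following along, here is a minimal sketch of what "same compression kernel, different IV/output truncation" looks like for SHA-224 and SHA-256. It is a plain portable implementation written from FIPS 180-4, not rscrypto's code; the names `compress`, `sha2_32`, and `hex` are hypothetical and chosen only for illustration.

```rust
/// SHA-256 round constants (FIPS 180-4, section 4.2.2).
const K: [u32; 64] = [
    0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
    0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
    0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
    0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
    0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
    0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
    0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
    0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2,
];

// The two variants differ only in their initial state and output length.
const SHA256_IV: [u32; 8] = [
    0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a, 0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19,
];
const SHA224_IV: [u32; 8] = [
    0xc1059ed8, 0x367cd507, 0x3070dd17, 0xf70e5939, 0xffc00b31, 0x68581511, 0x64f98fa7, 0xbefa4fa4,
];

/// Shared compression kernel: absorbs one 64-byte block into the state.
fn compress(state: &mut [u32; 8], block: &[u8]) {
    let mut w = [0u32; 64];
    for (i, word) in block.chunks_exact(4).enumerate() {
        w[i] = u32::from_be_bytes([word[0], word[1], word[2], word[3]]);
    }
    for i in 16..64 {
        let s0 = w[i - 15].rotate_right(7) ^ w[i - 15].rotate_right(18) ^ (w[i - 15] >> 3);
        let s1 = w[i - 2].rotate_right(17) ^ w[i - 2].rotate_right(19) ^ (w[i - 2] >> 10);
        w[i] = w[i - 16].wrapping_add(s0).wrapping_add(w[i - 7]).wrapping_add(s1);
    }
    let [mut a, mut b, mut c, mut d, mut e, mut f, mut g, mut h] = *state;
    for i in 0..64 {
        let s1 = e.rotate_right(6) ^ e.rotate_right(11) ^ e.rotate_right(25);
        let ch = (e & f) ^ (!e & g);
        let t1 = h.wrapping_add(s1).wrapping_add(ch).wrapping_add(K[i]).wrapping_add(w[i]);
        let s0 = a.rotate_right(2) ^ a.rotate_right(13) ^ a.rotate_right(22);
        let maj = (a & b) ^ (a & c) ^ (b & c);
        let t2 = s0.wrapping_add(maj);
        h = g; g = f; f = e; e = d.wrapping_add(t1);
        d = c; c = b; b = a; a = t1.wrapping_add(t2);
    }
    for (s, v) in state.iter_mut().zip([a, b, c, d, e, f, g, h]) {
        *s = s.wrapping_add(v);
    }
}

/// One-shot hash over the shared kernel, parameterized only by the IV.
fn sha2_32(iv: [u32; 8], data: &[u8]) -> [u8; 32] {
    let mut state = iv;
    // Standard padding: 0x80, zeros, then the bit length as a big-endian u64.
    let mut msg = data.to_vec();
    msg.push(0x80);
    while msg.len() % 64 != 56 {
        msg.push(0);
    }
    msg.extend_from_slice(&(data.len() as u64).wrapping_mul(8).to_be_bytes());
    for block in msg.chunks_exact(64) {
        compress(&mut state, block);
    }
    let mut out = [0u8; 32];
    for (chunk, word) in out.chunks_exact_mut(4).zip(state) {
        chunk.copy_from_slice(&word.to_be_bytes());
    }
    out
}

pub fn sha256(data: &[u8]) -> [u8; 32] {
    sha2_32(SHA256_IV, data)
}

/// SHA-224: same kernel, different IV, output truncated to 28 bytes.
pub fn sha224(data: &[u8]) -> [u8; 28] {
    let full = sha2_32(SHA224_IV, data);
    let mut out = [0u8; 28];
    out.copy_from_slice(&full[..28]);
    out
}

/// Lowercase hex, only for comparing against published test vectors.
pub fn hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{b:02x}")).collect()
}
```

The only per-variant pieces are the IV constant and the truncating wrapper, which is the part of the design where wrapper duplication can drift. Checking `hex(&sha224(b"abc"))` and `hex(&sha256(b"abc"))` against the official vectors covers both paths through the one kernel.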

LoadingALIAS 2 days ago | parent | prev | next [-]

Hey! Thank you for taking a second. Really, I appreciate it. So... fair criticism. The constant-time line is too compressed and should probably be replaced w/ some kind of matrix.

Please give me a few hours. I can't devote the time these comments deserve right now. I'm nearly home, so give me a bit, please.

Thanks!

LoadingALIAS a day ago | parent | prev [-]

Okay, I finally have a second to breathe. Sorry for the delayed response - life and all that.

I am not claiming the entire crate is constant-time, and if the README reads that way, that is my mistake. My intended claim is MUCH narrower... secret-bearing compare/open/verify/private-op paths avoid secret-dependent early exits where it matters.
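To make "avoid secret-dependent early exits" concrete, here is a minimal sketch of a tag comparison that always walks the full tag and reports failure opaquely. It is an illustration under assumptions, not rscrypto's code: `ct_eq`, `verify_tag`, and `VerifyError` are hypothetical names, and a plain Rust loop like this carries no compiler-level guarantee. Production code usually relies on a vetted helper such as the `subtle` crate.

```rust
/// Opaque verification error: deliberately carries no detail about
/// where or why the check failed.
#[derive(Debug, PartialEq, Eq)]
pub struct VerifyError;

/// Compare two tags without returning at the first mismatching byte.
/// The length is treated as public, so rejecting on it early is fine.
pub fn ct_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b) {
        // Accumulate differences; no branch depends on the byte values.
        diff |= x ^ y;
    }
    diff == 0
}

/// Hypothetical verify wrapper: every failure maps to the same error.
pub fn verify_tag(expected: &[u8], received: &[u8]) -> Result<(), VerifyError> {
    if ct_eq(expected, received) {
        Ok(())
    } else {
        Err(VerifyError)
    }
}
```

A naive `expected == received` may stop at the first differing byte, which is the early exit the narrower claim is about.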

NO global constant-time claim for:

- parsers/importers/DER/PHC decoding
- algo/profile negotiation
- keygen and OS randomness paths
- public RSA verification/encryption work
- hashes/checksums/fast hashes as whole APIs
- length/shape rejection before a primitive boundary
- Argon2d/scrypt as blanket CT primitives

With respect to AEAD/MAC verification, the important pieces are full tag comparison and opaque failure. For RSA private ops, the relevant pieces are blinding, fixed-window exponentiation with constant-time table selection, public fault checks, failure accumulation, output clearing, and the release-mode leakage regression gate in CI (rsa.yaml). That is evidence, not proof.
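As a rough illustration of "fixed-window exponentiation with constant-time table selection", the sketch below uses `u64` values instead of big integers. It is not rscrypto's code and all names are hypothetical. It shows only the two structural ideas: every window does the same squarings plus one multiply, and the table read touches all entries. The `%` reduction on `u128` is not constant-time, so this toy is not itself side-channel safe.

```rust
/// All-ones mask if a == b, otherwise zero, computed without branching.
fn ct_eq_mask(a: u64, b: u64) -> u64 {
    let x = a ^ b;
    // (x | -x) has its top bit set exactly when x != 0.
    let nonzero = (x | x.wrapping_neg()) >> 63;
    (nonzero ^ 1).wrapping_neg()
}

/// Read table[idx] by scanning every entry, so the memory access
/// pattern does not depend on the (secret) index.
fn ct_select(table: &[u64; 16], idx: u64) -> u64 {
    let mut out = 0u64;
    for (i, &entry) in table.iter().enumerate() {
        out |= entry & ct_eq_mask(i as u64, idx);
    }
    out
}

/// Toy modular multiply. NOTE: `%` here is NOT constant-time.
fn mul_mod(a: u64, b: u64, m: u64) -> u64 {
    ((a as u128 * b as u128) % m as u128) as u64
}

/// base^exp mod m using a fixed 4-bit window over all 64 exponent bits.
pub fn pow_mod_fixed_window(base: u64, exp: u64, m: u64) -> u64 {
    assert!(m > 0, "modulus must be nonzero");
    // table[i] = base^i mod m for i in 0..16.
    let mut table = [1 % m; 16];
    for i in 1..16 {
        table[i] = mul_mod(table[i - 1], base % m, m);
    }
    let mut acc = 1 % m;
    // Process 16 windows from most to least significant.
    for w in (0..16).rev() {
        for _ in 0..4 {
            acc = mul_mod(acc, acc, m);
        }
        let digit = (exp >> (4 * w)) & 0xf;
        // Always multiply, even when digit == 0 (table[0] == 1).
        acc = mul_mod(acc, ct_select(&table, digit), m);
    }
    acc
}
```

The operation sequence is the same for every exponent of a given width, in contrast to square-and-multiply, which skips the multiply on zero bits.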

The speedups are not from one trick. This is about a year of work, w/ some general planning before that. The main sources are:

- arch dispatch with portable Rust as the reference path
- hardware AES/SHA/PMULL/CLMUL/CRC/etc. where available and measurably better
- tuned per-size dispatch tables instead of one backend for every length
- fused one-shot paths for small HMAC/HKDF/PBKDF2 cases
- reusable scratch APIs to avoid repeated allocation, especially RSA
- backend-specific kernels for SHA-2/SHA-3/BLAKE3/AEAD/checksums
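A minimal sketch of the per-size dispatch idea, assuming a trivial byte-sum "checksum" in place of a real primitive. Everything here is hypothetical: the kernel names, the table, and the 63-byte threshold are made up for illustration and are not taken from rscrypto. A real dispatcher would also consult CPU features (for example via `std::arch::is_x86_feature_detected!`), which is left out to keep the example portable.

```rust
type Kernel = fn(&[u8]) -> u32;

/// Portable reference path: the definition every other kernel must match.
fn sum_portable(data: &[u8]) -> u32 {
    data.iter().fold(0u32, |acc, &b| acc.wrapping_add(u32::from(b)))
}

/// Stand-in for a wider kernel: eight independent lanes plus a tail.
fn sum_lanes(data: &[u8]) -> u32 {
    let mut lanes = [0u32; 8];
    let mut chunks = data.chunks_exact(8);
    for chunk in chunks.by_ref() {
        for (lane, &b) in lanes.iter_mut().zip(chunk) {
            *lane = lane.wrapping_add(u32::from(b));
        }
    }
    // Leftover bytes go through the reference path.
    let tail = sum_portable(chunks.remainder());
    lanes.iter().fold(tail, |acc, &l| acc.wrapping_add(l))
}

/// (inclusive upper bound on input length, kernel to use).
/// The threshold is invented; real tables would come from benchmarks.
const DISPATCH: [(usize, Kernel); 2] = [
    (63, sum_portable as Kernel),
    (usize::MAX, sum_lanes as Kernel),
];

/// Pick the first kernel whose size bound covers `len`.
fn select_kernel(len: usize) -> Kernel {
    DISPATCH
        .iter()
        .find(|(max, _)| len <= *max)
        .map(|&(_, kernel)| kernel)
        .unwrap_or(sum_portable as Kernel)
}

pub fn checksum(data: &[u8]) -> u32 {
    select_kernel(data.len())(data)
}
```

Because every kernel must agree with the portable path, a differential check of each kernel against `sum_portable` exercises the whole table.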

It is not uniformly faster... but my God, it's close. Crucially, the README.md/OVERVIEW.md call out the losses too: small AEAD overhead, some X25519/RSA verify rows, PBKDF2-SHA256 at iters=1, and platform-sensitive SHA-3/SHAKE behavior... I'm also having some trouble w/ the macOS BLAKE3 perf. It's just been elusive af. The `benchmark_results/OVERVIEW.md` is the clearest source for the raw shape of the wins and losses.

My motivation was straightforward - I needed this. My company’s lead product benefits a LOT from removing C libs/FFI, reducing external deps, avoiding competing types, having a unified no_std/WASM story, and making checksums faster. I had worked on https://crates.io/crates/crc-fast previously and wanted to push that kind of direction much further... but it just wasn't going to happen. I contributed the no-std/wasm compat there and then realized... I need to do this myself; that's the point where I started really working out the details. I'd already been exploring it for a while at that point and was tackling BLAKE3 w/o C-libs head-on for months.

I did consider contributing more to existing Rust crypto libraries, but this was not a small patch series. The shape I wanted was a single pure-Rust primitive stack with small feature-selected/leaf builds, no mandatory C/OpenSSL/system-lib dependency, no_std support, portable fallbacks, and cross arch dispatch built in. The existing Rust crypto ecosystem is important and I use/compare against it heavily; rscrypto is exploring a different packaging/performance/control point.

This is 100% self-funded right now. If/when the OSS side of my startup is ready, rscrypto may become company-maintained, but it will remain open source forever. I cannot afford to start a FIPS validation process yet. I have a backlog, and the first item on it - at least as of before sharing it today - is deciding the FIPS structure, in case an opportunity for a subsidized audit presents itself.

No formal third-party audit yet, either. I guess I should have been more explicit about that? Current review evidence is purely the public source, official vectors, RSA Wycheproof (I will likely expand Wycheproof when I get the time), NIST CAVP subset coverage, differential tests against established crates/libs where possible, proptests, fuzzer/corpus replay, Miri, and the RSA CI gate in `.github/workflows/rsa.yaml`.

I’m posting it publicly because I want serious review before pretending it has had one. I know review matters; I simply cannot afford a proper external audit yet, or I would have done it already.