| ▲ | LoganDark 2 hours ago |
| I don't understand what stops an inference provider from giving you a hash of whatever they want. None of this proves that's what they're running; it only proves they know the correct answer. I can know the correct answer all I want, and then just do something different. |
|
| ▲ | rhodey 2 hours ago | parent | next [-] |
| Attestation always involves a "document" or a "quote" (two names for basically a byte buffer) and a signature from someone. Intel SGX & TDX => signature from Intel. AMD SEV => signature from AMD. AWS Nitro Enclaves => signature from AWS. Clients who want to talk to an attested service send a nonce and get back a doc with that nonce in it; the clients have a hard-coded certificate from Intel, AMD, or AWS somewhere in them and check that the doc has a good signature. |
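A minimal, self-contained sketch of that nonce/quote/signature flow (illustrative only: a freshly generated Ed25519 key stands in for the vendor's fused hardware key and hard-coded certificate chain, and a small JSON dict stands in for the real quote format):

```python
import json, os
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# --- vendor / enclave side (simulated for the sketch) ---
vendor_key = Ed25519PrivateKey.generate()      # in reality: fused into the CPU
vendor_pub = vendor_key.public_key()           # in reality: pinned Intel/AMD/AWS cert in the client

def produce_quote(nonce: bytes) -> tuple[bytes, bytes]:
    quote = json.dumps({
        "nonce": nonce.hex(),
        "measurement": "sha256-of-enclave-code-and-config",  # what the hardware measured
    }).encode()
    return quote, vendor_key.sign(quote)

# --- client side ---
nonce = os.urandom(32)                         # fresh per connection, defeats replay
quote, sig = produce_quote(nonce)
vendor_pub.verify(sig, quote)                  # raises InvalidSignature if forged
assert json.loads(quote)["nonce"] == nonce.hex()  # proves the quote is fresh, not replayed
```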
| ▲ | LoganDark 2 hours ago | parent [-] |
| Yes, though I see the term abused often enough that the word alone doesn't convince me a scheme is sound. Nowadays "attestation" is simply slang for "validate we can trust [something]". I didn't see any mechanism described in the article to validate that the weights actually being used are the same as the weights that were hashed. |
| In a real attestation scheme you would do something like have the attesting device generate a hardware-backed key to be used for communications to and from it, to ensure it is not possible to use an attestation of one device to authenticate any other device or a man-in-the-middle. Usually for these devices you can verify the integrity of the hardware-backed key as well. Of course, all of this is moot if you can trick an authorized device into signing or encrypting/decrypting anything attacker-provided, which is where many systems fail. |
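A sketch of the channel-binding idea described here, extending the previous example (same caveats: illustrative keys and document layout, not any specific vendor's quote format):

```python
import json, os
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

vendor_key = Ed25519PrivateKey.generate()       # stands in for the hardware root of trust

# --- inside the attested device/enclave ---
channel_key = Ed25519PrivateKey.generate()      # hardware-backed; never leaves the device
channel_pub = channel_key.public_key().public_bytes(
    serialization.Encoding.Raw, serialization.PublicFormat.Raw)

def attest(nonce: bytes) -> tuple[bytes, bytes]:
    # The public half of the channel key is baked into the attested document,
    # so the attestation only vouches for sessions keyed by this device.
    doc = json.dumps({"nonce": nonce.hex(), "channel_pub": channel_pub.hex()}).encode()
    return doc, vendor_key.sign(doc)

# --- client ---
nonce = os.urandom(32)
doc, sig = attest(nonce)
vendor_key.public_key().verify(sig, doc)        # the vendor vouches for this document
claims = json.loads(doc)
assert claims["nonce"] == nonce.hex()
# All later traffic must be authenticated against claims["channel_pub"]; a
# man-in-the-middle that merely relays this document cannot sign with the
# matching private key.
```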
|
|
| ▲ | FrasiertheLion 2 hours ago | parent | prev [-] |
| There are a few components that are necessary to make it work: |
| 1. The provider open sources the code running in the enclave and pins the measurement to a transparency log such as Sigstore. |
| 2. On each connection, the client SDK fetches the measurement of the code actually running (through a process known as remote attestation). |
| 3. The client checks that the measurement the provider claimed to be running exactly matches the one fetched at runtime. |
| We explain this more in a previous blog post: https://tinfoil.sh/blog/2025-01-13-how-tinfoil-builds-trust |
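An illustrative sketch of the three steps above (the function names, repo tag, and digest are placeholders, not Tinfoil's actual SDK; in practice the measurement comes out of a TDX/SEV-SNP quote after the hardware vendor's signature has been verified):

```python
def measurement_from_transparency_log(repo: str, tag: str) -> str:
    # Step 1: the provider published this when the enclave image was built,
    # e.g. as a Sigstore entry tied to the open-source revision.
    return "a1b2c3...expected"          # placeholder digest

def measurement_from_remote_attestation(quote: dict) -> str:
    # Step 2: fetched live from the running enclave on every connection.
    return quote["measurement"]

def verify(quote: dict) -> None:
    # Step 3: the two must match exactly, or the enclave is running
    # something other than the published code.
    expected = measurement_from_transparency_log("tinfoilsh/cvmimage", "v0.1")
    if measurement_from_remote_attestation(quote) != expected:
        raise RuntimeError("measurement mismatch: refusing to send data")

verify({"measurement": "a1b2c3...expected"})   # passes only when the digests match
```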
| ▲ | LoganDark 2 hours ago | parent [-] |
| What enclave are you using? Is it hardware-backed? |
| Edit: I found https://github.com/tinfoilsh/cvmimage which says AMD SEV-SNP / Intel TDX, which seems almost trustworthy. |
| ▲ | FrasiertheLion 2 hours ago | parent [-] |
| Yes, we use Intel TDX/AMD SEV-SNP with H200/B200 GPUs configured to run in Nvidia Confidential Computing mode. |
| ▲ | LoganDark 2 hours ago | parent [-] |
| I would be interested to see Apple Silicon in the future, given its much stronger isolation and integrity guarantees. But that is an entirely different tech stack. |
| ▲ | julesdrean an hour ago | parent [-] |
| Apple does something very similar with Apple Private Cloud Compute. It's interesting because their isolation argument is different. For instance, memory is not encrypted (so weaker protection against physical attacks), but they measure and guarantee the integrity of (and need to trust) all code running on the machine, not just inside the secure enclave. A good question is how many lines of code you need to trust, at the end of the day, between these different designs. |
| ▲ | LoganDark 32 minutes ago | parent [-] |
| Lines of code hardly mean anything, but I'd believe Apple has far fewer, given how aggressively they curtail their platforms rather than letting them collect legacy cruft. |