| ▲ | ok123456 4 hours ago |
| How do you prevent compression bomb attacks when files can define their own compression functions? You could have some kind of OOM killer, but that will be a "footgun" that people who are actually doing "big data" will constantly shoot. This pretty much kills any ingestion pipeline where the source is untrusted. |
|
| ▲ | computomatic 3 hours ago | parent | next [-] |
| It seems like the WASM is simply a fallback if no other decoder is available. If the data source is untrusted, simply don’t run the WASM decoders. “Some code is untrusted” does not mean code should never be executed. There are more use cases with trusted sources than untrusted. |
| |
| ▲ | ok123456 3 hours ago | parent [-] | | So I define the data type to be "asdklfjaslkdfjiolsadfjoiusadfoiasfoikasjfdoisadf" and give you a decoder for it. |
|
|
| ▲ | johncolanduoni 4 hours ago | parent | prev | next [-] |
| OOM killing in WebAssembly is trivial, since it’s all in a growable linear memory. All the runtimes I’m aware of have a simple maximum memory setting, and they’ll trap any allocation requests after that point. |
| |
| ▲ | blmarket 3 hours ago | parent | next [-] | | The attack is not just on the file format itself. Given the function signature, a single decoder can generate an infinite byte stream, which causes a lot of headaches for reader implementations: implementing STRLEN is no longer a trivial question. Either engines should impose a limit (e.g. VARCHAR(2000) caps the length at 2000, though some engines support unlimited BLOBs), or the decoder should give a hint about the maximum length it will yield. Unfortunately, the current research-level project does not implement such considerations yet... | |
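To make the "engine puts a limit on it" option concrete, here is a minimal sketch in Python. It uses the standard library's zlib as a stand-in for an embedded decoder, not the project's actual WASM interface; the `bounded_decompress` helper and the `OutputLimitExceeded` exception are hypothetical names made up for illustration.

```python
import zlib


class OutputLimitExceeded(Exception):
    """Raised when a decoder tries to produce more bytes than the reader allows."""


def bounded_decompress(data: bytes, max_output: int) -> bytes:
    """Decompress a zlib stream, refusing to produce more than max_output bytes.

    max_output is a reader-side policy (an assumption here, comparable to a
    VARCHAR(n) cap); nothing in the stream itself is trusted to declare it.
    """
    decoder = zlib.decompressobj()
    # Ask for one byte more than the cap: if we get it, the cap was exceeded.
    # The extra byte also keeps max_length above 0, which zlib treats as "unlimited".
    out = decoder.decompress(data, max_output + 1)
    if len(out) > max_output:
        raise OutputLimitExceeded(f"decoder output exceeds {max_output} bytes")
    return out


if __name__ == "__main__":
    bomb = zlib.compress(b"\0" * 10_000_000)  # small input, large output
    try:
        bounded_decompress(bomb, max_output=1_000_000)
    except OutputLimitExceeded as exc:
        print("rejected:", exc)
```

The reader never materializes more than `max_output + 1` bytes, so the cost of a hostile input is bounded by the cap rather than by whatever the decoder decides to emit.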
| ▲ | ok123456 3 hours ago | parent | prev | next [-] | | For images, it makes sense: people dealing with 16k x 16k PNGs are uncommon. Give them an error message that tells them the setting to bump. But what should be the threshold for "big data"? I'm sure it will follow Zipf's Law, but the tail will be fatter. | |
| ▲ | titzer 4 hours ago | parent | prev [-] | | And many of them have built-in gas metering, so you can time out the decode if it runs too many instructions. |
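A toy illustration of the metering idea, in plain Python rather than any real WASM runtime's API: the decoder is handed a fixed budget and aborts once it is spent. The run-length format, `rle_decode_metered`, and `OutOfFuel` are all hypothetical, and a real runtime would charge per executed instruction rather than per output byte.

```python
class OutOfFuel(Exception):
    """Raised when the decoder spends its whole budget before finishing."""


def rle_decode_metered(pairs, fuel: int) -> bytes:
    """Decode (count, byte_value) run-length pairs under a fixed fuel budget.

    One unit of fuel is charged per output byte, a simplification of the
    per-instruction accounting a runtime with gas metering would do.
    """
    out = bytearray()
    for count, value in pairs:
        for _ in range(count):
            if fuel == 0:
                raise OutOfFuel("decoder exceeded its budget")
            fuel -= 1
            out.append(value)
    return bytes(out)


if __name__ == "__main__":
    print(rle_decode_metered([(3, 65), (2, 66)], fuel=10))
    try:
        # A tiny input that claims a billion bytes of output.
        rle_decode_metered([(10**9, 0)], fuel=1000)
    except OutOfFuel as exc:
        print("aborted:", exc)
```

Combined with a memory cap, a budget like this bounds both the space and the time a hostile decoder can consume, which is what turns the bomb into an ordinary error.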
|
|
| ▲ | kibwen 3 hours ago | parent | prev [-] |
| Denial-of-service is bad, but it's not in the same ballpark, the same sport, the same planet, or the same universe of bad as RCE. |