Remix.run Logo
perching_aix 2 hours ago

By using it.

repelsteeltje 2 hours ago | parent | next [-]

It works == it's correct?

perching_aix 2 hours ago | parent | next [-]

Yes? What do you think fuzzing, unit testing, integration testing is for? It's an empirical evaluation of correctness. Literally just try and see.

For actual correctness verification in the strong sense, you'd need to start from a specification written in a formal language so that it's machine checkable, which if I had to guess not even win.rar GmbH has.

wavemode an hour ago | parent | next [-]

You're being needlessly dismissive.

From a philosophical perspective, there's no way to know that any piece of software is truly correct without formal verification.

But in the present, non-philosophical context, it's obvious that what we mean is, colloquially, "how well-tested is this against a variety of edge-case files which the official winrar handles correctly? Is there a test suite, and how robust is it? Plenty of software that claims to be compatible with the rar format, doesn't actually successfully read all rar files."

It's also equally obvious, in the present context, that we would prefer these steps to have been taken by the author of the software before we install it and run it on our own computers and data. The parent commenter wasn't just asking about the software's correctness for the sake of academic curiosity.

ameliaquining 13 minutes ago | parent [-]

The post mentions the existence of an extensive test suite, which you can peruse for yourself if you're so inclined: https://github.com/bitplane/rars/tree/master/crates/rars-for...

I don't know how all these test cases were generated, but at least some of them seem to have been copied (with attribution) from the test suites of earlier FOSS RAR implementations.

The ideal would be to test it against a representative corpus of real-world legacy RAR files, but I'm not sure where you'd find one.

fragmede 4 minutes ago | parent [-]

pirate bay

repelsteeltje 2 hours ago | parent | prev [-]

I hope the developers of, say, the brakes in my car don't interpret 'software correctness' the way you do.

Added, later: hey you changed your comment, added a whole paragraph.

perching_aix 2 hours ago | parent | next [-]

I added the second paragraph about formal verification at the same time you posted, in anticipation that you'd immediately dig your heels into it otherwise, despite me highlighting that the other methods are merely empirical.

I was immediately proven right once I pressed "update". That said, I have now deleted my snarky response that followed. Not in the game of capitalizing off of the human equivalent of a race condition.

I should make a browser addon to delay posting, this is the 2nd time this happens in the past few days.

Edit:

Nevermind, it's already a feature built into the site. Turned it on. I wonder if it applies to edits also...

Nope, doesn't seem to. Oh well, should still help.

repelsteeltje 2 hours ago | parent [-]

Haha, off course! The three major sources of software failures: off by one errors and race conditions.

fragmede 3 minutes ago | parent [-]

race off by one conditions

atiedebee 2 hours ago | parent | prev | next [-]

I hope the brakes in my car don't need developers

arcticbull 2 hours ago | parent | next [-]

ABS doesn't just appear organically.

pixl97 2 hours ago | parent | prev [-]

I think you underestimate the complexity of modern braking systems.

throw1234567891 2 hours ago | parent | prev | next [-]

They used to. Now they have systems, standards, and experience. There are only so many ways you can do brakes on the car.

2 hours ago | parent | prev [-]
[deleted]
mjr00 2 hours ago | parent | prev [-]

This is Rust we're talking about. It doesn't even need to work; as long as it compiles, it's correct.

speedgoose 2 hours ago | parent | next [-]

    use std::fs::File;
    use std::io::prelude::*;
    
    fn main() -> std::io::Result<()> {
        let mut file = File::create("content.txt")?;
        file.write_all(b"3!")?;
        Ok(())
    }
rakel_rakel 2 hours ago | parent [-]

; cat content.txt 3!;

dataflow 2 hours ago | parent | prev [-]

> This is Rust we're talking about. It doesn't even need to work; as long as it compiles, it's correct.

No, it doesn't even need to compile. The mere fact that it's in Rust means it's correct.

slopinthebag an hour ago | parent | prev | next [-]

Thus all software that can be used is correct?

You know what I meant: How can we have confidence that this implementation of RAR is functionally identical to what it's based on? What would give me the confidence to use it in a critical piece of infrastructure?

jaggederest an hour ago | parent | next [-]

Validating compression systems is usually really straightforward. There are 3 layers - decode known values from compressed files (or encode, same), round trip without any alterations, and fuzzing with arbitrary binaries

Because it's a defined format there can be binary exact comparisons between the input and output files - we already have an oracle in the form of proper RAR format software, so if they are identical, you don't need to look further for that specific case.

You can see a version of this that I did quite similarly, for postgresql wire format, here: https://github.com/pgdogdev/pgdog/tree/main/integration/sql

It validates that sql with the same setup, teardown, and test results in perfectly exact compatibility between raw postgresql as the control and various configurations of PgDog, with both the text format and binary format, so ultimately a 6-way multivariate test that should always result in binary-exact results.

slopinthebag an hour ago | parent [-]

Right, that's very different from "using it" and it's also different from "Have an LLM generate code that compiles".

perching_aix 39 minutes ago | parent | prev [-]

> Thus all software that can be used is correct?

You also know what I meant, since I spelled it out in more detail a comment later. But even though you're being facetious, yes, that really is the case. If it works it works. That's the bar for the vast, vast majority of software. Demonstrated practical correctness. If you stumble into a bug, you log it as a defect and fix it. That's all that regular people ever have. In the case of this project, this was achieved via fuzz testing.

It's literally no different to e.g. validating the NTFS driver that ships in the Linux kernel, or validating any other (re)implementation of anything. You just do a bunch of empirical testing and hope for the best. It is also why reimplementations always lag behind.

Hell, I'm 99% sure this is exactly what the actual vendor does too, or at least I sure hope that they do have tests at least. Cause they're sure as shit not using a formally verified compiler toolchain, meaning they definitely don't have a formal proof about whether even the official implementation in itself is correct. Only empirical data at best too.

TacticalCoder an hour ago | parent | prev [-]

I could be correct but way too slow in edge cases (unlikely with Rust but you never know), leaking temporary files, having security holes, etc.

There's much more about correctness of a piece of software than: "produces the same output as the original on x test cases".

I'm not saying it's a bad implementation and, if anything, LLMs are much better at translating/porting existing code (and finding bugs) than at writing things unheard of.

You're basically saying, if I may make a pun: "rust me bro, it's correct".

perching_aix 19 minutes ago | parent [-]

[dead]