And I'm not done optimizing. The perf will get better. Rust and Yolo-C will always be faster, but right now we can't know what the difference will be.

Top optimization opportunities:

- InvisiCaps 2.0. While implementing the current capability model, when I was about 3/4 of the way done with the rewrite, I realized that if I had done it differently I would have avoided two branch+compares on every pointer load. That's huge! I just haven't had the appetite for doing yet another rewrite recently. But I'll do it eventually.

- ABI. Right now, Fil-C uses a binary interface that relies on lowering to what ELF is capable of. This introduces a bunch of overhead on every global variable access and every function call. All of this goes away if Fil-C gets its own object file format. That's a lot of work, but it will happen if Fil-C gets more adoption.

- Better abstract interpreter. Fil-C already has an abstract interpreter in the compiler, but it's not nearly as smart as it could be. For example, it doesn't have an octagon domain yet. Giving it an octagon domain will dramatically improve the performance of loops.

- More intrinsics. Right now, a lot of libc functions that are totally memory safe but are normally implemented in assembly are implemented in plain Fil-C instead, just because of how the libc ports happened to work out. Like, say you call some <math.h> function that takes doubles and returns doubles - it's going to be slower in Fil-C today because you'll end up in the generic C code version compiled with Fil-C. No good reason for this! It's just grunt work to fix!

- The calling convention itself is trash right now - it involves passing things through a thread-local buffer. It's less trashy than the calling convention I started out with (that allocated everything in the heap lmao), but still. There's nothing fundamentally preventing a Fil-C register-based calling convention, but it would take a decent amount of work to implement.
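To make the calling-convention point concrete, here is a minimal sketch in plain C of the difference between passing arguments through a thread-local buffer and passing them as ordinary (register-eligible) parameters. Everything in it is an assumption made for illustration: `arg_buffer`, `ARG_BUFFER_BYTES`, and the function names are hypothetical, and this is not Fil-C's actual ABI.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration only; this is NOT Fil-C's real ABI. */
enum { ARG_BUFFER_BYTES = 256 };

/* One argument buffer per thread, shared by every call on that thread. */
static _Thread_local unsigned char arg_buffer[ARG_BUFFER_BYTES];

/* Callee side: loads both arguments back out of the thread-local buffer. */
static int64_t add_via_buffer_callee(void) {
    int64_t a, b;
    memcpy(&a, arg_buffer, sizeof a);
    memcpy(&b, arg_buffer + sizeof a, sizeof b);
    return a + b;
}

/* Caller side: every argument is stored to memory before the call. */
int64_t add_via_buffer(int64_t a, int64_t b) {
    assert(sizeof a + sizeof b <= ARG_BUFFER_BYTES);
    memcpy(arg_buffer, &a, sizeof a);
    memcpy(arg_buffer + sizeof a, &b, sizeof b);
    return add_via_buffer_callee();
}

/* Register-style call: the platform ABI can keep a and b in registers,
   so no stores or loads are needed just to pass the arguments. */
int64_t add_via_registers(int64_t a, int64_t b) {
    return a + b;
}
```

Both functions compute the same result; the buffer version simply pays for extra stores and loads around each call, which is the kind of overhead a register-based convention would avoid.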

There are probably other perf optimization opportunities that I'm either forgetting right now or that haven't been found yet. It's still early days!
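As a rough illustration of the abstract interpreter item in the list above: an octagon domain tracks relations between pairs of variables (constraints of the form ±x ± y ≤ c), which is what a compiler needs in order to prove that `i < count` and `count <= length` together imply `i < length`. The sketch below writes the before and after by hand in plain C. The function names are hypothetical and the explicit `abort()` checks only stand in for compiler-inserted bounds checks; this is not what Fil-C actually emits.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Without relational facts: each access carries its own bounds check. */
int64_t sum_checked_each_access(const int32_t *values, size_t length,
                                size_t count) {
    int64_t total = 0;
    for (size_t i = 0; i < count; i++) {
        if (i >= length) abort();   /* per-iteration check */
        total += values[i];
    }
    return total;
}

/* With the relation count <= length established once before the loop,
   the loop condition i < count implies i < length, so the per-iteration
   check is redundant and can be dropped. */
int64_t sum_checked_once(const int32_t *values, size_t length,
                         size_t count) {
    if (count > length) abort();    /* single hoisted check */
    int64_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}
```

For any in-bounds `count` the two functions return the same sum; the second just does one comparison up front instead of one per iteration.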

▲

jacquesm 3 days ago | parent | next [-]

This is such an interesting project.

I've always been firmly in the 'let it crash' camp for bugs, the sooner and the closer to the offending piece of code you can generate a crash the better. Maybe it would be possible to embed Fil-C in a test-suite combined with a fuzzing like tool that varies input to try really hard to get a program to trigger an abend. As long as it is possible to fuzz your way to a crash in Fil-C that would be a sign that there is more work to do.

That way 'passes Fil-C' would be a bit like running code under valgrind and move the penalty to the development phase rather than the runtime. Is this feasible or am I woolgathering, and is Fil-C only ever going to work by using it to compile the production code?

▲

SkiFire13 3 days ago | parent [-]

From what I understand some things in Fil-C work "as expected" instead of crashing (e.g. dereferencing a pointer to an out of scope variable will give you the old value of that variable), so it won't work as a sanitizer.

▲

1718627440 2 days ago | parent [-]

You can use the built-in sanitizer from your compiler though.

▲

SkiFire13 2 days ago | parent [-]

At that point why use Fil-C for this though?

▲

1718627440 2 days ago | parent [-]

Because you don't want to let it crash in production? Sanitizer for testing, Fil-C for shipping.

▲

pornel 2 days ago | parent | next [-]

Fil-C will crash on memory corruption too. In fact, its main advantage is crashing sooner.

All the quick fixes for C that don't require code rewrites boil down to crashing. They don't make your C code less reliable, they just make the unreliability more visible.

To me, Fil-C is most suited to be used during development and testing. In production you can use other sandboxing/hardening solutions that have lower overhead, after hopefully shaking out most of the bugs with Fil-C.

▲

jacquesm 2 days ago | parent | next [-]

The great thing about such crashes is that, if you have coredumps enabled, you can just load the crashed binary into GDB, type 'where', and most likely figure out immediately from inspecting the call stack what the actual problem is. This was/is my go-to method to find really hard to reproduce bugs.

▲

jitl 2 days ago | parent | prev [-]

I think the issue with this approach is it’s perfectly reasonable in Fil-C to never call `free` because the GC will GC. So if you develop on Fil-C, you may be leaking memory if you run in production with Yolo-C.

▲

pornel 2 days ago | parent [-]

Fil-C uses `free()` to mark memory as no longer valid, so it is important to keep using manual memory management to let Fil-C catch UAF bugs (which are likely symptoms of logic bugs, so you'd want to catch them anyway).
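A toy model of that idea in plain C, assuming a hypothetical `tracked_object` header that outlives its payload: freeing releases the payload and flips a flag, and every later access checks the flag. The names and layout are invented for illustration and say nothing about Fil-C's real object representation. A real implementation would stop the program on a bad access; this sketch returns `false` so the behavior is easy to observe.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical header that stays around after the payload is freed. */
typedef struct {
    int   *payload;
    size_t len;
    bool   freed;
} tracked_object;

tracked_object *tracked_alloc(size_t len) {
    tracked_object *obj = malloc(sizeof *obj);
    if (!obj) return NULL;
    obj->payload = calloc(len, sizeof *obj->payload);
    if (!obj->payload) { free(obj); return NULL; }
    obj->len = len;
    obj->freed = false;
    return obj;
}

/* "free" releases the payload and marks the object as no longer valid. */
void tracked_free(tracked_object *obj) {
    free(obj->payload);
    obj->payload = NULL;
    obj->freed = true;
}

/* Checked load: rejects use-after-free and out-of-bounds indexes. */
bool tracked_load(const tracked_object *obj, size_t index, int *out) {
    if (obj->freed || index >= obj->len) return false;
    *out = obj->payload[index];
    return true;
}

/* In a GC'd runtime the header would be reclaimed once unreachable;
   this model releases it by hand. */
void tracked_destroy(tracked_object *obj) {
    free(obj);
}
```

The point of the model is that the use-after-free is only detectable because the program called `tracked_free` in the first place; code that never frees gives the checker nothing to flag.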

The whole point of Fil-C is having C compatibility. If you're going to treat it as a deployment target on its own, it's a waste: you get overhead of a GC language, but with clunkiness and tedium of C, instead of nicer language features that ground-up GC languages have.

▲

jacquesm 2 days ago | parent [-]

I agree with you but jitl has a point: implicit reliance on the GC could creep in and you might not notice it until you switch back to regular C.

▲

estebank 2 days ago | parent [-]

Fil-C should have a(n on by default) mode where collecting an unfreed allocation is a crash, if it doesn't already.

▲

pizlonator 2 days ago | parent [-]

It's not that simple since some object allocations go unfreed. For example, Fil-C lifts all escaping locals to the heap, but doesn't free them.
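A hand-written approximation in plain C of what lifting an escaping local looks like, assuming a hypothetical function `make_counter_lifted`; this only models the idea and is not the compiler's actual output. The local's address escapes through the return value, so its storage is placed on the heap and the function itself never frees it.

```c
#include <stdlib.h>

/* Source-level intent (undefined behavior in ordinary C):
 *
 *     int *make_counter(void) {
 *         int counter = 41;
 *         counter += 1;
 *         return &counter;    // address of a local escapes
 *     }
 *
 * Hand-written model of the lifted version: the local lives in a heap
 * allocation, and nothing in this function frees it. Under a GC the
 * allocation would be reclaimed once it becomes unreachable. */
int *make_counter_lifted(void) {
    int *counter = malloc(sizeof *counter);
    if (!counter) return NULL;
    *counter = 41;
    *counter += 1;
    return counter;
}
```

An allocation like this has no matching `free` anywhere in the program, which is why treating every unfreed allocation as an error would produce false positives.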

▲

SkiFire13 2 days ago | parent | prev [-]

I think you're missing a bit of context from the parent comments:

> Maybe it would be possible to embed Fil-C in a test-suite

▲

qdotme 3 days ago | parent | prev | next [-]

Can you elaborate on what makes ELF (potentially with custom sections/extension and maybe custom ld.so plugin) insufficient?

A lot of remarkably unusual stuff has been shoved into the format without breaking the tooling, so wondering what the restrictions are.

▲

baq 2 days ago | parent | prev | next [-]

> Rust and Yolo-C will always be faster

graydon points in that direction, but since you're here: how feasible is a hypothetical Fil-Unsafe-Rust? would you need to compile the whole program in Fil-Rust to get the benefits of Fil-Unsafe-Rust?

▲

zozbot234 2 days ago | parent | next [-]

It's reasonably easy if you can treat the Safe Rust and Fil-Unsafe-Rust code as accessing different address spaces (in the C programming sense of "a broad subset of memory that a pointer is limited to", not the general OS/hardware sense), since that's essentially what the bespoke Fil-C ABI amounts to in the first place. Which of course is not really a good fit for every use of Unsafe Rust, but might suffice for some of them.

▲

pizlonator 2 days ago | parent | prev | next [-]

What is Fil-Rust and Fil-Unsafe-Rust?

▲

kobebrookskC3 2 days ago | parent [-]

in my mind it would be doing what fil-c does for c to unsafe rust: a hypothetical memory safe implementation of unsafe rust using the same methods fil-c does e.g. gc

▲

pizlonator 2 days ago | parent [-]

Can't do GC unless you go all-in.

So that implies just running all of Rust through the Fil-C transformation

▲

zozbot234 2 days ago | parent | next [-]

> Can't do GC unless you go all-in.

It can be done, especially with a safe non-GC language that can meaningfully guarantee it won't corrupt GC metadata or break its invariants. You only have real issues (and then only wrt. excess overhead, not unsoundness) with pervasive mutual references between the GC and non-GC parts of the program. You do need to promote GC pointers to a root anytime that non-GC code has direct access to them, and add finalizers to GC objects that may need to drop/run destructors on non-GC data.

▲

baq 2 days ago | parent | prev [-]

that's what I expected, thanks for making this clear!

▲

kobebrookskC3 2 days ago | parent | prev [-]

what would fil-rust do that miri doesn't?

▲

baq 2 days ago | parent [-]

e.g. validate safety across safe/unsafe boundaries

▲

estebank 2 days ago | parent [-]

Miri does do that? It is not aware of the distinction to begin with (which is one of the use cases of the tool: it lets us exercise safe code to ensure there aren't memory violations caused by incorrect MIR lowering). I might be mistaking what you mean. Miri's big limitation is not being able to interface with FFI.

▲

baq 2 days ago | parent [-]

hmmm I thought miri was used in the compiler for static analysis, wasn't aware it's a runtime interpreter.

I guess the primary reason would be running hardened code in production without compromising performance too much, same as you would run Fil-C compiled software instead of the usual way. I've no idea if it's feasible to run miri in prod.

▲

estebank 2 days ago | parent [-]

I guess the confusion happens because MIR (the representation), mir (the stage), stable MIR (the potential future API for hooking into the compiler stage), and miri (the MIR interpreter) all share pretty much the same name. Const evaluation uses MIR and that's the most likely culprit.

Miri is an interpreter (as you found out on your own now), and it is not meant for production workloads because of the slowdown it introduces, which limits it to test suites and debugging.

From my understanding Fil-C is an LLVM operation, so it should be possible to build an integration that produces a Fil-Rust binary that is slower but gives you some of the benefits of miri. I see value in doing something like that. There are plenty of other languages that would be well served by this too!

▲

kragen 3 days ago | parent | prev | next [-]

The savings of two conditional branches sounds interesting; what would the change be?

▲

pizlonator 3 days ago | parent [-]

- Don’t put flags in the high bits of the aux pointer. Instead if an object has flags, it’ll have a fatter header. Most objects don’t have flags.

- Give up on lock freedom of atomic pointers. This is a fun one because theoretically, it’s worse. But it comes with a net perf improvement because there’s no need to check the low bit of lowers.
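For readers unfamiliar with the trick in the first bullet, here is a generic tagged-pointer sketch in plain C. It assumes addresses fit in the low 48 bits of a 64-bit word and keeps flags in the high 16 bits; `FLAG_SHIFT`, `ADDR_MASK`, and the function names are hypothetical and do not reflect the real InvisiCaps layout. It shows why packing flags into a pointer word costs a mask plus a compare-and-branch on loads, which is the kind of per-load work that moving flags into a fatter header would avoid for the common flag-free case.

```c
#include <stdint.h>

/* Hypothetical layout: low 48 bits hold the address, high 16 bits hold flags. */
#define FLAG_SHIFT 48
#define ADDR_MASK  ((UINT64_C(1) << FLAG_SHIFT) - 1)

/* Pack an address and a flag set into one 64-bit word. */
uint64_t pack_aux(uint64_t address, uint16_t flags) {
    return (address & ADDR_MASK) | ((uint64_t)flags << FLAG_SHIFT);
}

/* Every use of the word must first mask the flags off... */
uint64_t aux_address(uint64_t word) {
    return word & ADDR_MASK;
}

uint16_t aux_flags(uint64_t word) {
    return (uint16_t)(word >> FLAG_SHIFT);
}

/* ...and a load must branch on whether any flag is set, even though
   most objects have none. */
int aux_needs_slow_path(uint64_t word) {
    return aux_flags(word) != 0;
}
```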

▲

kragen 3 days ago | parent [-]

Scary! I'm excited to see how it turns out.

▲

pjmlp 2 days ago | parent | prev | next [-]

Love your Yolo-C remark. :)

▲

senderista 3 days ago | parent | prev [-]

So you'd have to implement binfmt_misc for the new binary format? Will you need to write your own ld.so?

▲

pizlonator 3 days ago | parent [-]

Yes and yes