▲ | aw1621107 3 days ago
One thing I've been wondering recently about Fil-C - why now? And I don't mean that in a dismissive way at all, I'm genuinely curious about the history. Was there some relatively recent fundamental breakthrough or other change that prevented a Fil-C-like approach from being viable before? Was it a matter of finding the right approach/implementation (i.e., a "software" problem), or is there something about modern hardware without which the approach would be impractical? Something else?
|
▲ | rwmj 2 days ago
I wrote a bounds checking patch to GCC (mentioned in a link from the article) back in 1995. It did full bounds checking of C & C++ while being compatible with existing libraries and ABIs, making it a bit more practical than Fil-C to deploy in the real world. You only had to recompile your application if you trusted the libraries (although the bounds checking obviously didn't extend into the libraries unless you recompiled them too). It didn't do the GC thing, but instead detected use-after-free at the point of use. https://www.doc.ic.ac.uk/~phjk/BoundsChecking.html
▲ | aw1621107 2 days ago
Interesting! How much interest did your work attract at the time?
▲ | rwmj 2 days ago
My supervisor got a few papers out of it and they are fairly widely cited even today, and as academics that was (unfortunately) the best outcome for us. The patch itself was maintained for many years, well into the mid 2000s, out of tree (actually by another person in the end), but as it never went upstream it was hard to keep doing that maintenance.

There were several problems in hindsight: C programmers at the time absolutely weren't willing to accept a large slow-down in order to get bounds checking. But also we didn't optimize our changes well (or very much at all) and I'm sure we could have got the delta down a bit if we'd put the work in. The main work that dominated performance was the lookup that you have to do from the raw pointer to the fat struct that stores the pointer bounds (and you have to do this on every pointer operation). We used a splay tree for this, which was clever but not very fast. A plain hash or some other data structure could have been much faster.
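For readers who want to picture the mechanism, here is a minimal sketch of the per-access lookup being described. The names are made up and it uses a deliberately naive linear scan rather than the patch's actual splay tree; it is an illustration of the idea, not the real GCC patch code.

    /* Hypothetical sketch of the per-access bounds lookup described above.
       Not the real patch: the descriptor table is scanned linearly for
       clarity, where the patch used a splay tree keyed on address ranges. */
    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct {
        uintptr_t base;   /* start of the tracked allocation */
        size_t    size;   /* length in bytes */
        int       live;   /* cleared on free, to catch use-after-free at the point of use */
    } obj_desc;

    #define MAX_OBJS 1024
    static obj_desc objs[MAX_OBJS];

    /* Find the descriptor whose [base, base+size) range contains addr. */
    static obj_desc *lookup(uintptr_t addr) {
        for (size_t i = 0; i < MAX_OBJS; i++) {
            obj_desc *d = &objs[i];
            if (d->size && addr >= d->base && addr - d->base < d->size)
                return d;
        }
        return NULL;
    }

    /* Instrumentation inserted before every pointer dereference of n bytes. */
    static void check_access(const void *p, size_t n) {
        uintptr_t a = (uintptr_t)p;
        obj_desc *d = lookup(a);
        if (!d || !d->live || a + n > d->base + d->size) {
            fprintf(stderr, "bounds or use-after-free violation at %p\n", p);
            abort();
        }
    }

Because this check runs on every pointer operation, the choice of lookup structure (splay tree, hash, shadow table) ends up dominating the slowdown, which is the point being made above.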
|
|
|
▲ | zozbot234 3 days ago
> Was there some relatively recent fundamental breakthrough or other change that prevented a Fil-C-like approach from being viable before?

The provenance model for C is very recent (and still a TS, not part of the standard). Prior to that, there was a vague notion that the C abstract machine has quasi-segmented memory (you aren't really allowed to do arithmetic on a pointer to an "object" to reach a different "object"), but this was not clearly stated in usable terms.
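To make the quasi-segmented idea concrete, here is a small illustrative example (mine, not from the comment): pointer arithmetic that leaves one object to reach another is undefined behavior under the abstract machine, even when the two objects happen to sit next to each other in flat memory.

    /* Illustration of the inter-object arithmetic the abstract machine forbids. */
    #include <stdio.h>

    int main(void) {
        int a[4] = {1, 2, 3, 4};
        int b[4] = {5, 6, 7, 8};

        int *p = &a[0];

        /* Undefined behavior: p + 7 points outside a, so even if b happens to
           sit right after a in memory, using it to reach b is not allowed.
           (Commented out so the example stays well-defined.) */
        /* int bad = *(p + 7); */

        /* Well-defined: the arithmetic stays within the same object. */
        int ok = *(p + 3);
        printf("%d %d\n", ok, b[0]);
        return 0;
    }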
▲ | actionfromafar 3 days ago
Also in practical terms, you have a lot more address space to "waste" in 64-bit. It would have been frivolous in 32-bit and downright offensive in 16-bit code.
▲ | uecker 2 days ago
The memory model always had segmented memory in mind, and safe C approaches are not new. The provenance model makes this more precise, but the need for it was to deal with corner cases such as pointer-to-integer round trips or access to the representation bytes of a pointer. Of course, neither GCC nor clang gets this right, to the extent that those compilers are internally inconsistent and miscompile even code that did not need any clarification to be considered correct.
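A minimal illustration (not from the comment) of the pointer-to-integer round-trip corner case: under the provenance TS, casting a pointer to an integer exposes the object, so a pointer later reconstructed from that integer may legitimately alias it, and a compiler that assumes otherwise miscompiles the code.

    /* Pointer-to-integer round trip: the kind of corner case the provenance
       model clarifies. Under the TS, casting &x to an integer "exposes" x,
       so a pointer later reconstructed from that integer may access x. */
    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        int x = 1;
        uintptr_t ip = (uintptr_t)&x;  /* pointer escapes into an integer */
        int *q = (int *)ip;            /* ...and is reconstructed later */
        *q = 2;                        /* the compiler must allow that this writes x */
        printf("%d\n", x);             /* should print 2 */
        return 0;
    }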
|
|
▲ | pizlonator 3 days ago
I’ve been thinking about this problem since 2004. Here’s a rough timeline:

- 2004-2018: I had ideas of how to do it but I thought the whole premise (memory safe C) was idiotic.
- 2018-2023: I no longer thought the premise was idiotic but I couldn’t find a way to do it that would result in fanatical compatibility.
- 2023-2024: early Fil-C versions that were much less compatible and much less performant
- end of 2024: InvisiCaps breakthrough that gives current fanatical compatibility and “ok” performance.

It’s a hard problem. Lots of folks have tried to find a way to do it. I’ve tried many approaches before finding the current one.
▲ | remexre 3 days ago
Beyond the Git history, is there any write-up of the different capability designs you've gone with? I'm interested in implementing a safe low-level language with less static information around than C has (e.g. no static pointer-int distinction), but I'd rather keep around the ability to restrict capabilities to only refer to subobjects than have the same compatibility guarantees InvisiCaps provide, so I was hoping to look into Monocaps (or maybe another design, if there's one that might fit better).
▲ | aw1621107 3 days ago
That's a really interesting timeline! Sounds like it's been stewing for a lot longer than I expected.

Was there anything in particular around 2018 that changed your opinion on the idiotic-ness of the premise? If a hypothetical time machine allowed you to send the InvisiCaps idea back to your 2004-era self, do you think the approach would have been feasible back then as well?
▲ | pizlonator 2 days ago
> Was there anything in particular around 2018 that changed your opinion on the idiotic-ness of the premise?

The observation that the C variants used on GPUs are simplistic takes on memory safe C.
|
|