Remix.run Logo
dbremner 7 months ago

Here is a real safety issue that I found and fixed a couple weeks ago. This is an ancient language runtime which originally ran on MS-DOS, Amiga, Atari, and at least a dozen now-discontinued commercial Unices. I've been porting it to 64-bit OSes as a weekend hack. While this particular issue is unlikely to appear in modern applications, a similar pattern might manifest today as a use-after-free error with std::string_view or an invalid iterator.

Background:

    typedef int Text; 
The Text type is used to encode different kinds of string-like values. Negative values represent a generated dictionary. Valid indexes in the string intern table (https://en.wikipedia.org/wiki/String_interning) represent a stored string. Other values represent generated variable names.

const char *textToStr(Text t) - This takes a Text value and returns a pointer to a null-terminated string. If t is a string intern index, then it returns a pointer to the stored string. If t represents either a generated dictionary or generated variable name, then it calls snprintf on a static buffer and returns the buffer's address.

Problem:

The use of a static buffer in textToStr introduces a temporal safety issue when multiple calls are made in the same statement. Here’s an excerpt from a diagnostic error message, simplified for clarity:

    printf(stderr, "Invalid use of \"%s\" with \"%s\"",
                textToStr(e),
                textToStr(s));
If both e and s are generated dictionaries or variables, then each call to textToStr overwrites the static buffer used by the other. Since the evaluation order of function arguments in C++ is undefined, the result is unpredictable and depends on the compiler and runtime.
scintill76 7 months ago | parent [-]

If you're saying C++ won't automatically save you from design mistakes, then I agree. This is a poorly-specified function. The caller can't know exactly how they need to handle the returned pointer.

Potential solutions:

* Return std::string and eat the cost of sometimes copying interned strings

* std::string with your own allocator could probably deal with the two cases fairly cheaply and transparently to the caller

* overload an operator<< or similar and put your conversion result directly into a stream, rather than a buffer that then goes into a stream

* put your generated values in a container that can keep pointers into it valid under all the operations you do on it (I think STL sets would work), keep them there for the rest of the lifetime of the program, and return your pointers to them (or to the interned constant strings)

I think many of these could be termed RAII, so I lean toward the idea in this subthread that RAII and other C++ idioms will help you stay safe, if you use them.

P.S. The function is also not safe if called by multiple threads concurrently. Maybe thread-local storage could be an easy fix for that part, but the other solutions should fix it too. [But if you have any shared storage that caches the generated values such as the last solution, it needs to be protected from concurrency.]

dbremner 7 months ago | parent [-]

This is a 30+ year-old codebase that originally started as pre-ANSI C. The design tradeoffs were very different at the time. I've managed to compile it as C++23 with unreasonably high warning levels, and it's now warning-free. However, there are still many unresolved issues.

One notable limitation is the use of a SIGINT handler implemented with setjmp/longjmp, which reduces the effectiveness of RAII.

Regarding textToStr, there were four instances where it was called twice in the same expression. To avoid potential problems, I replaced these calls with a temporary std::string to store the result of the first call before the second is made. In hindsight, none of these cases likely would have caused issues, but I preferred not to take risks. The SigIntGuard ensures that the std::string destructor runs even if a SIGINT occurs:

    {
        SigIntGuard guard;
        std::string const member{textToView(textOf(fst(b)))};
        std::print(errorStream, R"(No member "{}" in class "{}")", member, textToView(klass(c).text));
    }