potato-peeler 5 days ago

I am curious. Generally, basic structures like maps are not thread safe, and care has to be taken while modifying them. This is pretty well documented in the Go spec. In your case at Dropbox, what was essentially going on?

tsimionescu 5 days ago | parent | next [-]

I think the surprise here is that failing to synchronize writes leads to a SEGFAULT, not a panic or an error. This is the point GP was making, that Go is not fully memory safe in the presence of unsynchronized concurrent writes. By contrast, in Java or C#, unsynchronized writes will either throw an exception (if you're lucky and they get detected) or let the program continue with some unexpected values (possibly ones that violate some invariants). Getting a SEGFAULT can only happen if you're explicitly using native code, raw memory access APIs, or found a bug in the runtime.

elcritch 4 days ago | parent [-]

Segfault sounds better than running with inconsistent data.

tsimionescu 4 days ago | parent | next [-]

No, SEGFAULT means you're lucky and your corrupted memory caused you to access something the OS knew you can't access. But every SEGFAULT means that you have a memory safety violation, and so if you get unlucky, the exact same code that SEGFAULTED once will read or write random objects in your memory (which might include code areas, GC data structures, etc).

Inconsistent data is pretty bad, but it's not as bad as memory corruption.

foldr 4 days ago | parent | prev | next [-]

The inconsistent data thing could happen in Go too. A segfault is not guaranteed, it’s just one of the more likely possibilities.

Someone 4 days ago | parent [-]

> A segfault is not guaranteed, it’s just one of the more likely possibilities.

Is it? It will depend on the code, but my gut feeling is that you typically would get a few (if not a lot of) unnoticed non-segfaulting issues before you get the segfaulting one that tells you straight to your face that you have a problem.

foldr 4 days ago | parent [-]

It probably depends on how exactly the corruption happens. If you overwrite a pointer with an integer value, then the integer is statistically unlikely to correspond to a valid memory address. On the other hand, if you overwrite a pointer with a pointer, or an integer with an integer, all bets are off.

Someone 3 days ago | parent [-]

> If you overwrite a pointer with an integer value, then the integer is statistically unlikely to correspond to a valid memory address

On 64-bit systems, and even then, it depends on the system’s memory layout (I think most integer values in programs are < 2³²)

foldr 2 days ago | parent [-]

Right. It’s unlikely both because the 64-bit value space is huge and because on most systems pointers have some of the high bytes set whereas typical integer values don’t. IIRC this combination of factors is what makes conservative GCs like BoehmGC quite effective on 64-bit architectures.

qcnguy 4 days ago | parent | prev | next [-]

Go/C programs that race can also run with inconsistent data. Nothing guarantees you a segfault under torn writes.

In practice, Java programs tend to pick up on data races very quickly because they mutate some collection and the collections framework has safety checks for this.

to11mtm 4 days ago | parent | prev [-]

Well it depends on what we mean by 'inconsistent'.

In C#, for example, if a structure is larger than the CPU's word size (i.e. 32 or 64 bits) then you could get a torn read while it's being written. However, object refs themselves are always word size, so you'll never have a torn pointer read on those.

However, in either case there is still a need in multithreaded environments to remember the CPU's memory ordering rules and put proper fences (or, to be safe, locks, since memory barrier rules are different between ARM and x86 for example).

But that second bit is a fairly hard problem to solve for without having the right type of modelling around your compiler.

tsimionescu 4 days ago | parent [-]

When I said "inconsistent", I was referring to things like getting a length field updated by one thread, but the actual list contents by another - if you have thread safety violations you will end up with exactly this type of issue in any language that allows unsafe threading code (Rust wouldn't outside `unsafe` blocks, for example), even in fully memory safe ones like Java or C#, and even without any bugs in the VM.

maxlybbert 5 days ago | parent | prev | next [-]

I thought the same thing. Maybe the point of the story isn’t “we were surprised to learn you had to synchronize access” but instead “we all thought we were careful, but each of us made this mistake no matter how careful we tried to be.”

nine_k 5 days ago | parent | prev [-]

In Java, there are separate synchronized collections, because acquiring a lock takes time. Normally one uses thread-unsafe collections. Java also gives a very ergonomic way to run any fragment under a lock (the `synchronized` keyword).

Rust avoids all this entirely, by using its type system.

layer8 4 days ago | parent | next [-]

Java has separate synchronized collections only because that was initially the default, until people realized that it doesn’t help for the common cases of check-and-modify operations or of having consistency invariants with state outside a single collection (besides the performance impact). In practice, synchronized collections are rarely useful, and instead accesses are synchronized externally.
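The check-and-modify case is language independent, so here is a rough sketch of it in Go, since that is what the thread is about. `Registry` and `Claim` are hypothetical names, not from any library. The point is that the lookup and the write sit under one externally held lock; a collection that only locks each individual call would still let two goroutines both pass the check before either writes.

```go
package main

import (
	"fmt"
	"sync"
)

// Registry is a hypothetical type: a plain map plus the mutex guarding it.
type Registry struct {
	mu    sync.Mutex
	owner map[string]int
}

func NewRegistry() *Registry {
	return &Registry{owner: make(map[string]int)}
}

// Claim records id as the owner of name only if name is still unclaimed.
// The check and the write happen under the same lock, so at most one
// caller can ever succeed for a given name.
func (r *Registry) Claim(name string, id int) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, taken := r.owner[name]; taken {
		return false
	}
	r.owner[name] = id
	return true
}

func main() {
	r := NewRegistry()
	results := make([]bool, 10) // each goroutine writes only its own slot

	var wg sync.WaitGroup
	for i := range results {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = r.Claim("job-42", i)
		}(i)
	}
	wg.Wait()

	wins := 0
	for _, ok := range results {
		if ok {
			wins++
		}
	}
	fmt.Println(wins) // 1
}
```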

noisem4ker 5 days ago | parent | prev [-]

Golang has a synchronized map:

https://pkg.go.dev/sync#Map