layer8 8 hours ago

> But careful: == looks at internal state, which isn’t always what the object represents, so for “is this the same data” comparisons keep using equals.

So == for value classes will basically be like memcmp(). That is a bit unfortunate, as it breaks encapsulation, exposing implementation details. Client code can use this to do case distinctions based on how a given value is internally represented. In a way, it’s worse than identity comparison, because identity comparison at least doesn’t expose internal state.

usrusr 8 hours ago | parent | next [-]

Value types are a concept very far away from the "magic black box organism" school of OOP thinking. It's not a novel way of doing classic OOP (does anyone still do that?), it's a way for a language born in OOP ideology to get one step further into the post-OOP world.

layer8 7 hours ago | parent | next [-]

That’s just not true, you can have a completely value-based language without OOP that still doesn’t leak implementation details of the values, while also supporting UDTs.

jstimpfle 7 hours ago | parent [-]

OOP isn't just about values vs objects. Yes, the idea that everything needs identity is a big part of the problem. But another big problem is the idea that the implementation and representation of types should be hidden by default. The mindset that there isn't a known and useful data representation for a given type. That everything is done by methods parameterized by a type. It's a misguided idea. There is a place for objects and implementation hiding. But the idea that this should be done on a type granularity is a complete and utter failure.

To see why, consider that to do any useful work, data from different objects (also from different types) has to be combined. To be able to do that in the OOP framework, the encapsulation has to be unwrapped. That's why Java code is littered with getters and setters that don't do any useful work at all, they just make it too painful to get any real work done.

Again, there is a place for objects and implementation hiding, but it's at the highest levels of an architecture where different components get integrated.

tsimionescu 3 hours ago | parent | next [-]

All of this would be valid, except that value classes still pretend that their fields can be private.

This also has huge implications in a language that emphasises dynamic loading like Java. And it also flies in the face of all the pretenses, often touted by the design team, that ABI compatibility is sacrosanct and no feature that breaks it can be considered.

usrusr 2 hours ago | parent [-]

Why pretend? "private" on value types just means nothing to see here except when you happen to be one of the functions conveniently namespaced with the struct.

But I'd say that GP's complaint about inequality leaking makes no sense anyways, because what could be more unequal than different implementation, or different internal state implying different behavior down the line? The public subset isn't some arbitrary interface that could have different implementations. And even then, "equals under interface I1" would have to be considered a very special type of "equality", not the general case.

rowls66 3 hours ago | parent | prev [-]

There is no requirement in the Java language to use getters and setters.

jstimpfle 3 hours ago | parent [-]

But why are there so many of them?

DarkNova6 5 hours ago | parent | prev [-]

Not if you do DDD, where a value type has exactly those semantics, and for record types this is actually a free lunch.

ahartmetz 8 hours ago | parent | prev | next [-]

If your bags of data have internal state, there's something wrong with your bags of data. I assume that the Java guys thought far enough to either exclude padding from comparisons or force padding bytes to be zero.

It should work even for strings: They will surely continue to be heap-allocated, and memcmp-ing pointers (inside the new "structs") is exactly an identity comparison.

layer8 8 hours ago | parent [-]

There’s nothing wrong with having non-normalized representations, that’s why there is equals().

For example, you might have a value class for representing (limited-precision) fractions using two longs internally, for the numerator and denominator. For efficiency trade-off reasons, you don’t want to always shorten the fraction. But now client code can distinguish 2/3 from 4/6 using ==.

Scenarios of that sort are conceivable where this actually leaks sensitive information. In any case, it creates dependencies on implementation details where you don’t want to have them.

When designing a value class, you are now in the dilemma of either always having to normalize the representation, costing performance, or having your class be a funnel for leaking implementation details.
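
A minimal sketch of the fraction scenario above, written as an ordinary final class so it runs on current JDKs. The `Fraction` type and its `sameRepresentation` method are hypothetical names for illustration; `sameRepresentation` stands in for the field-by-field comparison that `==` is described as performing on value classes, and it is not an actual JDK API.

```java
import java.util.Arrays;

// Hypothetical example type, not part of the JDK or the article under discussion.
final class Fraction {
    private final long num;
    private final long den;

    Fraction(long num, long den) {
        if (den == 0) throw new IllegalArgumentException("zero denominator");
        // Deliberately NOT reduced: 4/6 is stored as (4, 6).
        this.num = num;
        this.den = den;
    }

    // Reduced form with a positive denominator, computed on demand.
    // Sketch only: the Long.MIN_VALUE edge case of Math.abs is ignored.
    private long[] reduced() {
        long g = gcd(Math.abs(num), Math.abs(den));
        long n = num / g, d = den / g;
        return d < 0 ? new long[] {-n, -d} : new long[] {n, d};
    }

    private static long gcd(long a, long b) {
        while (b != 0) { long t = a % b; a = b; b = t; }
        return a;
    }

    // Logical equality: 2/3 equals 4/6.
    @Override
    public boolean equals(Object o) {
        return o instanceof Fraction && Arrays.equals(reduced(), ((Fraction) o).reduced());
    }

    // Hashes the reduced form so that equal fractions share a hash code.
    @Override
    public int hashCode() {
        return Arrays.hashCode(reduced());
    }

    // Stand-in for a state-based comparison: sees the stored fields as-is.
    boolean sameRepresentation(Fraction other) {
        return num == other.num && den == other.den;
    }
}
```

With this sketch, `new Fraction(2, 3).equals(new Fraction(4, 6))` holds while `sameRepresentation` reports a difference, which is the distinction client code could start to depend on.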

ahartmetz 7 hours ago | parent | next [-]

Well. I'd be upset if custom operator==() for plain-old-data structs was removed from C++, but Java never had it to begin with, so for Java, it just means that you have to fall back to using traditional classes (or compare using something other than ==) if you need such "fancy" features.

inigyou 6 hours ago | parent | prev | next [-]

Java can also distinguish a 2/3 object from a 4/6 object using == when they are not value types. It can even distinguish a 2/3 object from a different 2/3 object.

jstimpfle 7 hours ago | parent | prev [-]

> There’s nothing wrong with having non-normalized representations

There is a lot wrong with that: complexity, bloat, and slowness.

> But now client code can distinguish 2/3 from 4/6 using ==

That's a great way to obfuscate code. Not a good idea. The right way to do the comparison is, just make a function called CompareRational().

bishabosha 6 hours ago | parent | prev | next [-]

The whole point of value classes is that they should not encapsulate state, i.e. a value class is a totally transparent data holder.

jmyeet 4 hours ago | parent | prev [-]

I wanted to comment on this as well. The article mentions it but if you've never used Java in anger (is there any other way?) then readers may not understand the true implications of this because it's a breaking change, something Java rarely does. I'll explain for the non-Java people.

Java separates checking identity and equality for objects. == basically checks if two pointers are the same. Equality is a subjective concept based on an interface (ie equals/hashCode). So this means:

    new Integer(1000) == new Integer(1000) // true, used to be false
    new Integer(1000).equals(new Integer(1000)) // true
    new Integer(10) == new Long(10) // compiler error, used to be false
    new Integer(10) == new Integer(10) // true

There's a lot going on here. The complication is that in previous versions of Java (and I'm not sure when this changed), integers below a certain value would be replaced with canonical instances. I think it was 128 but it's been a while. This led to the difference between 10 and 1000. That's now changed, I suspect because the above comparisons are being implicitly unboxed. That didn't use to happen either. I say this because the Integer/Long comparison used to return false and it's now a compiler error, so there must be unboxing going on.

You may still be able to get the old behavior through variables too.

Anyway, if value classes lose identity then == changes from pointer equality to bitwise equality. That will hopefully resolve a bunch of corner cases like this but it is a breaking change, technically.

papercrane 3 hours ago | parent [-]

    new Integer(10) == new Integer(10) // true

Before value classes this would always be false. The only time comparing Integer objects with == could be true is if the Integer object was created by going through Integer.valueOf (or obviously if they were the same object reference). By default the cached values were -127 to 127, but that is tuneable at runtime.

https://github.com/openjdk/jdk/blob/jdk-27%2B27/src/java.bas...

tsimionescu 3 hours ago | parent | next [-]

It could also be true if the instances were created through auto-boxing (e.g. arrayList.add(10); arrayList.add(10); arrayList.get(0) == arrayList.get(1) //would return true, but false if you used 1000 instead of 10).
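
A small sketch of that auto-boxing case, for readers who want to try it. `BoxingDemo` and its method names are made up for illustration. Only the small-value result is guaranteed by the language specification (boxed `int` values from -128 to 127 are cached); for larger values such as 1000 the outcome of `==` depends on the JVM's cache settings and on whether value classes are enabled, so no result is claimed for them here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not taken from the thread or the JDK.
final class BoxingDemo {
    // Auto-boxes the same int twice and compares the two list elements with ==.
    static boolean sameBoxedReference(int value) {
        List<Integer> list = new ArrayList<>();
        list.add(value); // auto-boxing, equivalent to Integer.valueOf(value)
        list.add(value);
        return list.get(0) == list.get(1);
    }

    // Compares the same two boxed elements with equals(), which looks at the int value.
    static boolean equalBoxedValue(int value) {
        List<Integer> list = new ArrayList<>();
        list.add(value);
        list.add(value);
        return list.get(0).equals(list.get(1));
    }
}
```

`sameBoxedReference(10)` is true on any conforming JVM, and `equalBoxedValue(1000)` is true regardless of caching, which is why `equals` remains the safe choice for boxed numbers.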

papercrane 2 hours ago | parent [-]

Yes, because auto-boxing just compiles to Integer.valueOf under the hood; the bytecode for Integer.valueOf(1) and ((Integer) 1) is the same. They'll both compile to something like:

   iconst_1
   invokestatic java/lang/Integer.valueOf:(I)Ljava/lang/Integer;

tsimionescu 2 hours ago | parent [-]

Sure, I was just talking about Java syntax, not the bytecode internals.

jmyeet an hour ago | parent | prev [-]

So you've made my point in showing how complex this is because you're incorrect [1][2]:

> By default, Java maintains a cache of Integer objects for values between -128 and +127.

[1]: https://stackoverflow.com/questions/3130311/weird-integer-bo...

[2]: https://dev.to/marzuk16/understanding-integer-caching-in-jav...

xxs an hour ago | parent [-]

It's easier to remember that it originated from the Byte range, where every possible byte value could be kept in the cache. Character doesn't have negative values, so it uses [0-128) instead. Long and Short are the same as Byte.

Years before the autoboxing/Integer.valueOf() caching stuff (and before generics), I used to have an IntegerProvider that did similar caching for higher ranges. Personally, I have considered autoboxing on integers net-negative for Java.