mathisfun123 8 hours ago

> it is nuts that in an object method, there is a performance enhancement through caching a member value

i don't understand what you think is nuts about this. it's an interpreted language and the word `self` is not special in any way (it's just convention - you can call the first param to a method anything you want). so there's no way for the interpreter/compiler/runtime to know you're accessing a field of the class itself (let alone that that field isn't a computed property or something like that).
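A minimal sketch of the "`self` is just convention" point. The class and the names `Counter`, `this`, and `bump_twice` are made up for illustration; the first parameter is deliberately not called `self`, and one method is a plain function attached after the class is created.

```python
class Counter:
    # The first parameter is named `this`; Python only cares about position.
    def __init__(this, start):
        this.value = start

    def bump(this):
        this.value += 1
        return this.value


def bump_twice(obj):
    # A plain function with an arbitrary parameter name.
    obj.bump()
    return obj.bump()


# Attaching it to the class afterwards turns it into a method like any other.
Counter.bump_twice = bump_twice

c = Counter(10)
result = c.bump_twice()
```

Nothing in the function body tells the interpreter ahead of time that `obj` or `this` will be "the instance", which is roughly the argument being made above.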

lots of hottakes that people have (like this one) are rooted in just a fundamental misunderstanding of the language and programming languages in general <shrugs>.

mattclarkdotnet 8 hours ago | parent | next [-]

What's nuts is that the language doesn't guarantee that successive references to the same member value within the same function body are stable. You can look it up once, go off and do something else, and look it up again and it's changed. It's dynamism taken to an unnecessary extreme. Nobody in the real world expects this behaviour. Making it just a bit less dynamic wouldn't change the fundamentals of the language but it would make it a lot more tractable.

1718627440 3 hours ago | parent | next [-]

> What's nuts is that the language doesn't guarantee that successive references to the same member value within the same function body are stable. You can look it up once, go off and do something else, and look it up again and it's changed.

There is no such thing as 'successive references to the same member value' here. It's not that you look up the same object and it can change, it's that you are not referring to the same object at all.

self.x is effectively self.__getattr__('x') (strictly, `__getattribute__` runs first and `__getattr__` is the fallback hook), which can in fact return a different thing each time. `self.x` IS a string lookup, and that is not an implementation detail but a major design goal. This dynamism is one of the selling points of Python: it allows you to change and modify interfaces to reflect state. It's nice for some things and it is what makes Python Python. If you don't want that, use another language.
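A small sketch of an attribute that returns a different thing on each lookup. `Shifting` is a hypothetical class, not from any library. It relies on `__getattr__`, which Python calls only after the normal lookup (via `__getattribute__`) has failed to find the name.

```python
class Shifting:
    """Hypothetical class whose attribute `x` is computed on every lookup."""

    def __init__(self):
        self.calls = 0

    def __getattr__(self, name):
        # Only reached when normal lookup fails; `x` is never stored anywhere.
        if name == "x":
            self.calls += 1
            return self.calls
        raise AttributeError(name)


obj = Shifting()
first = obj.x   # looks like a field read, but runs __getattr__
second = obj.x  # runs it again and gets a different object back
```

From the caller's side, `obj.x` is written the same way both times, so caching it in a local variable would change what the program does.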

gpderetta 13 minutes ago | parent [-]

ok, then it is nuts that __getattr__ (itself a specially blessed function) is not required to be pure at least from the caller point of view.

rtpg 7 hours ago | parent | prev | next [-]

In Python attribute accesses aren't stable! `self.x` where `x` is a property is not guaranteed to refer to the same thing.

And getting rid of descriptors would be a _fundamental change to the language_. An immense one. Loads of features are built off of descriptors or descriptor-like things.
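To make "loads of features are built off of descriptors" a bit more concrete, here is a rough sketch. `Always42` and `Demo` are hypothetical names; the point is only that a hand-written descriptor, a `property`, and an ordinary function all expose `__get__`.

```python
class Always42:
    """Hypothetical non-data descriptor: computes the value on every access."""

    def __get__(self, instance, owner=None):
        return 42


class Demo:
    answer = Always42()

    def method(self):
        return "hi"

    @property
    def prop(self):
        return "p"


d = Demo()
answer = d.answer  # goes through Always42.__get__

# Plain functions are descriptors too: that is how bound methods are produced.
function_is_descriptor = hasattr(Demo.__dict__["method"], "__get__")
property_is_descriptor = hasattr(Demo.__dict__["prop"], "__get__")

# Doing by hand what `d.method` does during attribute lookup.
bound = Demo.__dict__["method"].__get__(d, Demo)
called = bound()
```

So removing the hook would also remove the mechanism behind ordinary method binding, not just exotic uses.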

And what you're complaining about is also not true in the JavaScript world either... I believe you can build descriptor-like things in JS now as well.

_But_ if you want that you can use stuff like mypyc + annotations to get that for you. There are tools that let you get to where you want. Just not out of the box because Python isn't that language.

Remember, this is a scripting language, not a compiled language. Every optimization for things you talk about would be paid on program load (you have pyc stuff but still..)

Gotta show up with proof that what you're saying is verifiable and works well. Up until ~6 or 7 years ago CPython had a concept of being easy to onboard onto. Dataflow analyses make the codebase harder to deal with.

Having said all of that.... would be nice to just inline RPython-y code and have it all work nicely. I don't need it on everything and proving safety is probably non-trivial but I feel like we've got to be closer to doing this than in the past.

I ... think in theory the JIT can solve for that too. In theory

Someone 6 hours ago | parent | prev | next [-]

> What's nuts is that the language doesn't guarantee that successive references to the same member value within the same function body are stable.

The language supports multiple threads and doesn’t have private fields (https://docs.python.org/3/tutorial/classes.html#private-vari...), so the runtime cannot rule out that the value gets changed in-between.

And yes, it often is obvious to humans that’s not intended to happen, and almost never what happens, but proving that is often hard or even impossible.

gpderetta 12 minutes ago | parent [-]

wouldn't a concurrent change without synchronization be UB anyway? Also parent wants to cache the address, not the value (but you have to cache the value if you want to optimize manually)

fulafel 7 hours ago | parent | prev | next [-]

> Nobody in the real world expects this behaviour.

For example, numbers and strings are immutable objects in Python. If self.x is a number and its numeric value is changed by a method call, self.x will be a different object after that. I'd dare say people expect this to work.
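A minimal sketch of that case, using a made-up `Account` class. Because ints are immutable, `+=` on the attribute cannot change the existing object and has to rebind the attribute to a new one.

```python
class Account:
    def __init__(self, balance):
        self.balance = balance

    def deposit(self, amount):
        # `+=` on an int builds a new int and rebinds the attribute to it.
        self.balance += amount


acct = Account(1000)
before = acct.balance  # reference to the original int object
acct.deposit(500)
after = acct.balance   # the attribute now names a different object
```

If the runtime assumed `acct.balance` was stable across the `deposit()` call, the second read would still see 1000.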

codesnik 7 hours ago | parent | prev | next [-]

basically all object oriented languages work like that. You access a member; you call a method which changes that member; you expect that change is visible lower in the code, and there're no statically computable guarantees that particular member is not touched in the called method (which is potentially shadowed in a subclass). It's not dynamism, even c++ works the same, it's an inherent tax on OOP. All you can do is try to minimize cost of that additional dereference. I'm not even touching threads here.
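A rough sketch of the "shadowed in a subclass" problem, with hypothetical `Base` and `Noisy` classes. Looking only at `Base.total`, the call to `self.log()` seems harmless, but a subclass can make it touch the member in between the two reads.

```python
class Base:
    def __init__(self):
        self.items = [1, 2, 3]

    def log(self):
        # The base version does not touch `items`.
        return None

    def total(self):
        first = len(self.items)
        self.log()  # which `log` runs depends on the runtime type
        second = len(self.items)
        return first, second


class Noisy(Base):
    def log(self):
        # The override mutates the member that `total` reads twice.
        self.items.append(0)


base_result = Base().total()
noisy_result = Noisy().total()
```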

now, functional languages don't have this problem at all.

cherryteastain 5 hours ago | parent [-]

OOP has nothing to do with it. In your C++ example, foo(bar const&); is basically the same as bar.foo();. At the end of the day, whether passing it in as an argument or accessing this via the method call syntax it's just a pointer to a struct. Not to mention, a C++ compiler can, and often does, choose to put even references to member variables in registers and access them that way within the method call.

This is a Python specific problem caused by everything being boxed by default and the interpreter does not even know what's in the box until it dereferences it, which is a problem that extends to the "self" object. In contrast in C++ the compiler knows everything there's to know about the type of this which avoids the issue.

adrian17 3 hours ago | parent | next [-]

That's not true. I mean: it's true that it has little to do with OOP, but most imperative languages (only exception I know is Rust) have the issue, it's not "Python specific". For example (https://godbolt.org/z/aobz9q7Y9):

    #include <cstdio>

    struct S {
        const int x;
        int f() const;
    };

    int S::f() const {
        int a = x;
        printf("hello\n");
        int b = x;
        return a-b;
    }

The compiler can't reuse 'x' unless it's able to prove that it definitely couldn't have changed during the `printf()` call - and it's unable to prove it. The member is loaded twice. C++ compilers can usually only prove it for trivial code with completely inlined functions that don't mutate any external state, or that mutate it in a definitely-not-aliasing way (strict aliasing). (And the `const` doesn't make any difference here at all.)

In Python the difference is that it can basically never prove it at all.

josefx 3 hours ago | parent | prev | next [-]

> This is a Python specific problem caused by everything being boxed

I would say it is part python being highly dynamic and part C++ being full of undefined behavior.

A C++ compiler will only optimize member access if it can prove that the member isn't overwritten in the same thread. Compatible pointers, opaque method calls, ... the list of reasons why that optimization can fail is near endless. C even added the restrict keyword because just having write access to two pointers of compatible types can force the compiler to reload values constantly. In Python anything is a function call to some unknown code, and any function could get access to any variable on the stack (manipulating Python stack frames is fun).

Then there is the fun thing the C++ compiler gets up to with variables that are modified by different threads: while(!done) turning into while(true) because you didn't tell the compiler that done needs to be threadsafe is always fun.
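A sketch of the stack-frame point, assuming CPython (`sys._getframe` is documented as an implementation detail and may not exist elsewhere). `Widget` and `innocent_looking_helper` are hypothetical; the helper takes no arguments yet still reaches the caller's `self`.

```python
import sys


class Widget:
    def __init__(self):
        self.size = 1

    def measure(self):
        before = self.size
        innocent_looking_helper()  # receives no arguments at all
        after = self.size
        return before, after


def innocent_looking_helper():
    # Depth 1 is the caller's frame; f_locals exposes its local variables.
    caller_locals = sys._getframe(1).f_locals
    caller_locals["self"].size = 99


result = Widget().measure()
```

Nothing at the call site hints that `self.size` could change, which is why the interpreter cannot safely assume it did not.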

1718627440 3 hours ago | parent [-]

What is going on here is not just that an attribute might be changed concurrently so the interpreter can't optimize the access. That is also a consideration. But the major issue is that an attribute doesn't really refer to a single thing at all, but instead means whatever object is returned by a function call that implements a string lookup. __getattr__ is not an implementation detail of the language, but something that an object can implement however it wants to, just like __len__ or __gt__. It's part of the object behaviour, not part of the static interface. This is a fundamental design goal of the Python language.

1718627440 3 hours ago | parent | prev [-]

> This is a Python specific problem caused by everything being boxed by default and the interpreter does not even know what's in the box until it dereferences it

That's not the whole story of what is going on. Every attribute access is a function call to __getattr__, which can return whatever object it wants.

bar.foo (...) is actually bar.__getattr__ ('foo') (bar, ...)

This dynamism is what makes Python Python and it allows you to wrap domain state in interface structure.
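A rough sketch of that lookup-then-call split, with a hypothetical `Bar` class. Strictly, the hook that runs first is `__getattribute__`, and the bound method it returns already carries `bar`, so the instance is not passed a second time.

```python
class Bar:
    def foo(self, n):
        return n * 2


bar = Bar()

direct = bar.foo(21)

# Step 1: a lookup by string name that returns a bound method object.
bound = getattr(bar, "foo")
# Step 2: an ordinary call on whatever the lookup returned.
via_getattr = bound(21)

# The same lookup spelled through the hook on the type.
via_hook = type(bar).__getattribute__(bar, "foo")(21)

carries_instance = bound.__self__ is bar
```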

mathisfun123 7 hours ago | parent | prev [-]

> same member value within the same function body are stable

Did you miss the part where I explained to you there's no way to identify that it's a member variable?

> Nobody in the real world expects this behaviour

As has already been explained to you by a sibling comment you are in fact wrong and there are in fact plenty of people in the real world who do actually expect this behavior.

So I'll repeat myself: lots of hottakes come from just pure, unadulterated, possibly willful, ignorance.

coldtea 4 hours ago | parent [-]

The above is a very thick response that doesn't address the parent's points, just sweeps them under the rug with "that's just how it was designed/it works".

"Did you miss the part where I explained to you there's no way to identify that it's a member variable?"

No, you did miss the case where that in itself can be considered nuts - or at least an unfortunate early decision.

"this just how things are dunn around diz here parts" is not an argument.

1718627440 3 hours ago | parent | next [-]

> No, you did miss the case where that in itself can be considered nuts - or at least an unfortunate early decision.

This is not a side implementation detail that they got wrong; this is a fundamental design goal of Python. You can find that nuts, but then just don't use Python, because that is (one of) the things that make Python Python.

mathisfun123 3 hours ago | parent | prev [-]

> considered nuts - or at least an unfortunate early decision

Please explain to us then how exactly you would infer a variable with an arbitrary name is actually a reference to the class instance in an interpreted language.

coldtea 2 hours ago | parent [-]

>Please explain to us then how exactly you would infer a variable with an arbitrary name is actually a reference to the class instance in an interpreted language.

Did I stutter when I wrote about "an unfortunate early decision"? Who said it has to be "an arbitrary name"?

Even so, you could add a bloody marker announcing an arbitrary name (which 99% of the time would be self anyway) as such, as an instruction to the interpreter. If it fails, it fails, like countless other things that can fail during runtime in Python today.

EE84M3i 3 hours ago | parent | prev | next [-]

> the word `self` is not special in any way (it's just convention - you can call the first param to a method anything you want).

The name `self` is a convention, yes, but interestingly in python methods the first parameter is special beyond the standard "bound method" stuff. See for example PEP 367 (New Super) for how `super()` resolution works (TL;DR the super function is a special builtin that generates extra code referencing the first parameter and the lexically defining class)
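A small sketch of that behaviour under CPython, with made-up `Base` and `Child` classes. Zero-argument `super()` works with a first parameter of any name, but only inside a function that was lexically defined in a class body, because that is where the compiler adds the hidden `__class__` cell.

```python
class Base:
    def greet(self):
        return "base"


class Child(Base):
    def greet(this):  # first parameter deliberately not called `self`
        return "child+" + super().greet()


result = Child().greet()

# The compiler gave the method a hidden `__class__` free variable.
has_class_cell = "__class__" in Child.greet.__code__.co_freevars


def outside(this):
    # Defined outside any class body, so there is no `__class__` cell.
    return super().greet()


Child.outside = outside
try:
    Child().outside()
    error = None
except RuntimeError as exc:
    error = str(exc)
```

So the first positional parameter is treated specially by position, not by name, and the lexical class matters as well.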

bmitc 8 hours ago | parent | prev [-]

I don't think it's a hot take to say much of Python's design is nuts. It's a very strange language.