Remix.run Logo
mattclarkdotnet 11 hours ago

Python really needs to take the Typescript approach of "all valid Python4 is valid Python3". And then add value types so we can have int64 etc. And allow object refs to be frozen after instantiation to avoid the indirection tax.

Sensible type-annotated python code could be so much faster if it didn't have to assume everything could change at any time. Most things don't change, and if they do they change on startup (e.g. ORM bindings).

BerislavLopac 2 hours ago | parent | next [-]

> python code could be so much faster if it didn't have to assume everything could change at any time

Definitely, but then it wouldn't be Python. One of the core principles of Python's design is to be extremely dynamic, and that anything can change at any time.

There are many other, pretty good, strictly dynamically typed languages which work just as well if not better than Python, for many purposes.

oblio 2 hours ago | parent [-]

I feel that this excuse is being trotted out too much. Most engineers never get to choose the programming language used for 90% of their professional projects.

And when Python is a mainstream language on top of which large, globally known websites, AI tools, core system utilities, etc are built, we should give up the purity angle and be practical.

Even the new performance push in Python land is a reflection of this. A long time ago some optimizations were refused in order to not complicate the default Python implementation.

drob518 2 hours ago | parent [-]

You’re always free to create your own Python-like language that caters more toward your goals. No excuses, then.

oblio an hour ago | parent [-]

If you're a contributor to Python, my apologies.

drob518 42 minutes ago | parent [-]

I’m not a Python contributor, so no need to apologize to me. But if you have strong ideas about what Python should be, perhaps you should step up and contribute that code rather than saying that others are offering excuses for why they won’t deliver what you want. I have worked on other open source projects where users were very entitled, to the point of demanding that the project team deliver them certain features. It’s not fun. It’s ironic that open source often brings out both the best and the worst in people. Suggesting changes and new features is fine, even critical to a strong roadmap. But we all need to realize that maintainers may have other goals and there’s no obligation on their part to implement anything. The beauty of open source is that you can customize or fork as much as you want to match your goals. But then you’re responsible for doing the work and if your changes are public you may have your own set of users demanding their own favorite changes.

mattclarkdotnet 11 hours ago | parent | prev | next [-]

To clarify, it is nuts that in an object method, there is a performance enhancement through caching a member value.

  class SomeClass
    def init(self)
      self.x = 0
    def SomeMethod(self)
      q = self.x
      ## do stuff with q, because otherwise you're dereferencing self.x all the damn time
1718627440 4 hours ago | parent | next [-]

This is not just a performance concern, this describes completely different behaviour. You forgot that self.x is just Class.__getattr__(self, 'x') and that you can implement __getattr__ how you like. There is no object identity across the values returned by __getattr__.

dekelpilli 10 hours ago | parent | prev | next [-]

Java also has a performance cost to accessing class fields, as exampled by this (now-replaced) code in the JDK itself - https://github.com/openjdk/jdk/blob/jdk8-b120/jdk/src/share/...

anematode 9 hours ago | parent | next [-]

Any decent JIT compiler (and HotSpot's is world class) will optimize this out. Likely this was done very early on in development, or was just to reduce bytecode size to promote inlining heuristics that use it

vanderZwan 4 hours ago | parent | next [-]

String is also a pretty damn fundamental object, and I'm sure trim() calls are extremely common too. I wouldn't be surprised if making sure that seemingly small optimizations like this are applied in the interpreter before the JIT kicks are not premature optimizations in that context.

There might be common scenarios where this had a real, significant performance impacts, E.G. use-cases where it's such a bottle-neck in the interpreter that it measurably affects warm-up time. Also, string manipulation seems like the kind of thing you see in small scripts that end before a JIT even kicks in but that are also called very often (although I don't know how many people would reach for Java in that case.

EDIT: also, if you're a commercial entity trying to get people to use your programming language, it's probably a good idea to make the language perform less bad with the most common terrible code. And accidentally quadratic or worse string manipulation involving excessive calls to trim() seems like a very likely scenario in that context.

LtWorf 8 hours ago | parent | prev [-]

But what if whatever you call is also accessing and changing the attribute?

anematode 7 hours ago | parent [-]

If what you call gets inlined, then the compiler can see that it either does or doesn't modify the attribute and optimize it accordingly. Even virtual calls can often be inlined via, e.g., class hierarchy analysis and inline caches.

If these analyses don't apply and the callee could do anything, then of course the compiler can't keep the value hoisted. But a function call has to occur anyway, so the hoisted value will be pushed/popped from the stack and you might as well reload it from the object's field anyway, rather than waste a stack slot.

LtWorf 38 minutes ago | parent [-]

Another thread can access it and do that, how could the compiler possibly know about it?

Kamii0909 6 hours ago | parent | prev [-]

That was a niche optimization primarily targeting code at intepretor. Even the most basic optimizing compiler in HotSpot tiered compilation chain at that time (the client compiler or C1) would be able to optimize that into the register. Since String is such an important class, even small stuffs like this is done.

duskdozer 7 hours ago | parent | prev | next [-]

You mean even if x is not a property?

mathisfun123 10 hours ago | parent | prev [-]

> it is nuts that in an object method, there is a performance enhancement through caching a member value

i don't understand what you think is nuts about this. it's an interpreted language and the word `self` is not special in any way (it's just convention - you can call the first param to a method anything you want). so there's no way for the interpreter/compiler/runtime to know you're accessing a field of the class itself (let alone that that field isn't a computed property or something like that).

lots of hottakes that people have (like this one) are rooted in just a fundamental misunderstanding of the language and programming languages in general <shrugs>.

mattclarkdotnet 9 hours ago | parent | next [-]

What's nuts is that the language doesn't guarantee that successive references to the same member value within the same function body are stable. You can look it up once, go off and do something else, and look it up again and it's changed. It's dynamism taken to an unnecessary extreme. Nobody in the real world expects this behaviour. Making it just a bit less dynamic wouldn't change the fundamentals of the language but it would make it a lot more tractable.

1718627440 4 hours ago | parent | next [-]

> What's nuts is that the language doesn't guarantee that successive references to the same member value within the same function body are stable. You can look it up once, go off and do something else, and look it up again and it's changed.

There is no such thing as 'successive references to the same member value' here. It's not that you look up the same object and it can change, it's that you are not referring to the same object at all.

self.x is actually self.__getattr__('x'), which can in fact return a different thing each time. `self.x` IS a string lookup and that is not an implementation detail, but a major design goal. This is the dynamism, that is one of the selling points of Python, it allows you to change and modify interfaces to reflect state. It's nice for some things and it is what makes Python Python. If you don't want that, use another language.

gpderetta 2 hours ago | parent [-]

ok, then it is nuts that __getattr__ (itself a specially blessed function) is not required to be pure at least from the caller point of view.

rtpg 9 hours ago | parent | prev | next [-]

In Python attribute access aren't stable! `self.x` where `x` is a property is not guaranteed to refer to the same thing.

And getting rid of descriptors would be a _fundamental change to the language_. An immeense one. Loads of features are built off of descriptors or descriptor-like things.

And what you're complaining about is also not true in Javascript world either... I believe you can build descriptor-like things in JS now as well.

_But_ if you want that you can use stuff like mypyc + annotations to get that for you. There are tools that let you get to where you want. Just not out of the box because Python isn't that language.

Remember, this is a scripting language, not a compiled language. Every optimization for things you talk about would be paid on program load (you have pyc stuff but still..)

Gotta show up with proof that what you're saying is verifiable and works well. Up until ~6 or 7 years ago CPython had a concept of being easy to onboard onto. Dataflow analyses make the codebase harder to deal with.

Having said all of that.... would be nice to just inline RPython-y code and have it all work nicely. I don't need it on everything and proving safety is probably non-trivial but I feel like we've got to be closer to doing this than in the past.

I ... think in theory the JIT can solve for that too. In theory

Someone 8 hours ago | parent | prev | next [-]

> What's nuts is that the language doesn't guarantee that successive references to the same member value within the same function body are stable.

The language supports multiple threads and doesn’t have private fields (https://docs.python.org/3/tutorial/classes.html#private-vari...), so the runtime cannot rule out that the value gets changed in-between.

And yes, it often is obvious to humans that’s not intended to happen, and almost never what happens, but proving that is often hard or even impossible.

gpderetta 2 hours ago | parent [-]

wouldn't a concurrent change without synchronization be UB anyway? Also parent wants to cache the address, not the value (but you have to cache the value if you want to optimize manually)

fulafel 9 hours ago | parent | prev | next [-]

> Nobody in the real world expects this behaviour.

For example, numbers and strings are immutable objects in Python. If self.x is a number and its numeric value is changed by a method call, self.x will be a different object after that. I'd dare say people expect this to work.

codesnik 8 hours ago | parent | prev | next [-]

basically all object oriented languages work like that. You access a member; you call a method which changes that member; you expect that change is visible lower in the code, and there're no statically computable guarantees that particular member is not touched in the called method (which is potentially shadowed in a subclass). It's not dynamism, even c++ works the same, it's an inherent tax on OOP. All you can do is try to minimize cost of that additional dereference. I'm not even touching threads here.

now, functional languages don't have this problem at all.

cherryteastain 6 hours ago | parent [-]

OOP has nothing to do with it. In your C++ example, foo(bar const&); is basically the same as bar.foo();. At the end of the day, whether passing it in as an argument or accessing this via the method call syntax it's just a pointer to a struct. Not to mention, a C++ compiler can, and often does, choose to put even references to member variables in registers and access them that way within the method call.

This is a Python specific problem caused by everything being boxed by default and the interpreter does not even know what's in the box until it dereferences it, which is a problem that extends to the "self" object. In contrast in C++ the compiler knows everything there's to know about the type of this which avoids the issue.

adrian17 5 hours ago | parent | next [-]

That's not true. I mean: it's true that it has little to do with OOP, but most imperative languages (only exception I know is Rust) have the issue, it's not "Python specific". For example (https://godbolt.org/z/aobz9q7Y9):

struct S { const int x; int f() const; }; int S::f() const { int a = x; printf("hello\n"); int b = x; return a-b; }

The compiler can't reuse 'x' unless it's able to prove that it definitely couldn't have changed during the `printf()` call - and it's unable to prove it. The member is loaded twice. C++ compilers can usually only prove it for trivial code with completely inlined functions that doesn't mutate any external state, or mutates in a definitely-not-aliasing way (strict aliasing). (and the `const` don't do any difference here at all)

In Python the difference is that it can basically never prove it at all.

1718627440 4 hours ago | parent | prev | next [-]

> This is a Python specific problem caused by everything being boxed by default and the interpreter does not even know what's in the box until it dereferences it

That's not the whole thing, what is going on. Every attribute access is a function call to __getattr__, that can return whatever object it wants.

bar.foo (...) is actually bar.__getattr__ ('foo') (bar, ...)

This dynamism is what makes Python Python and it allows you to wrap domain state in interface structure.

josefx 5 hours ago | parent | prev [-]

> This is a Python specific problem caused by everything being boxed

I would say it is part python being highly dynamic and part C++ being full of undefined behavior.

A c++ compiler will only optimize member access if it can prove that the member isn't overwritten in the same thread. Compatible pointers, opaque method calls, ... the list of reasons why that optimization can fail is near endless, C even added the restrict keyword because just having write access to two pointers of compatible types can force the compiler to reload values constantly. In python anything is a function call to some unknown code and any function could get access to any variable on the stack (manipulating python stack frames is fun).

Then there is the fun thing the C++ compiler gets up to with varibles that are modified by different threads, while(!done) turning into while(true) because you didn't tell the compiler that done needs to be threadsafe is always fun.

1718627440 4 hours ago | parent [-]

What is going on here is not, that an attribute might be changed concurrently and the interpreter can't optimize the access. That is also a consideration. But the major issue is that an attribute doesn't really refer to a single thing at all, but instead means whatever object is returned by a function call that implements a string lookup. __getattr__ is not an implementation detail of the language, but something that an object can implement how it wants to, just like __len__ or __gt__. It's part of the object behaviour, not part of the static interface. This is a fundamental design goal of the Python language.

mathisfun123 8 hours ago | parent | prev [-]

> same member value within the same function body are stable

Did you miss the part where I explained to you there's no way to identify that it's a member variable?

> Nobody in the real world expects this behaviour

As has already been explained to you by a sibling comment you are in fact wrong and there are in fact plenty of people in the real world who do actually expect this behavior.

So I'll repeat myself: lots of hottakes from just pure. Unadulterated, possibly willful, ignorance.

coldtea 6 hours ago | parent [-]

The above is a very thick response that doesn't address the parent's points, just sweeps them under the rag with "that's just how it was designed/it works".

"Did you miss the part where I explained to you there's no way to identify that it's a member variable?"

No, you you did miss the case where that in itself can be considered nuts - or at least an unfortunate early decision.

"this just how things are dunn around diz here parts" is not an argument.

1718627440 4 hours ago | parent | next [-]

> No, you you did miss the case where that in itself can be considered nuts - or at least an unfortunate early decision.

This is not a side implementation detail, that they got wrong, this is a fundamental design goal of Python. You can find that nuts, but then just don't use Python, because that is (one of) that things, that make Python Python.

mathisfun123 4 hours ago | parent | prev [-]

> considered nuts - or at least an unfortunate early decision

Please explain to us then how exactly you would infer a variable with an arbitrary name is actually a reference to the class instance in an interpreted language.

coldtea 3 hours ago | parent [-]

>Please explain to us then how exactly you would infer a variable with an arbitrary name is actually a reference to the class instance in an interpreted language.

Did I stutter when I wrote about "an unfortunate early decision"? Who said it has to be "an arbitrary name"?

Even so, you could add a bloody marker announcing an arbitrary name (which 99% would be self anyway) as so, as an instruction to the interpreter. If it fails, it fails, like countless other things that can fail during runtime in Python today.

NetMageSCW 36 minutes ago | parent [-]

But now you are no longer talking about the way Python works, but the way you want Python to work - and that has nothing to do with Python.

EE84M3i 5 hours ago | parent | prev | next [-]

> the word `self` is not special in any way (it's just convention - you can call the first param to a method anything you want).

The name `self` is a convention, yes, but interestingly in python methods the first parameter is special beyond the standard "bound method" stuff. See for example PEP 367 (New Super) for how `super()` resolution works (TL;DR the super function is a special builtin that generates extra code referencing the first parameter and the lexically defining class)

bmitc 10 hours ago | parent | prev [-]

I don't think it's a hot take to say much of Python's design is nuts. It's a very strange language.

stabbles 7 hours ago | parent | prev | next [-]

That was how the Mojo language started. And then soon after the hype they said that being a superset of Python was no longer the goal. Probably because being a superset of Python is not a guarantee for performance either.

Hendrikto 2 hours ago | parent [-]

Being a superset would mean all valid Python 3 is valid Python 4. A valuable property for sure, but not what OP suggested. In fact, it is the exact opposite.

fermigier 3 hours ago | parent | prev | next [-]

I have made some experiments with P2W, my experimental Python (subset) to WASM compiler. Initial figures are encouraging (5x speedup, on specific programs).

https://github.com/abilian/p2w

NB: some preliminary results:

  p2w is 4.03x SLOWER than gcc (geometric mean)

  p2w is 5.50x FASTER than cpython (geometric mean)

  p2w is 1.24x FASTER than pypy (geometric mean)
giancarlostoro an hour ago | parent | prev | next [-]

I went sort of this route in an experiment with Claude.. I really want Python for .NET but I said, damn the expense, prioritize .NET compatibility, remove anything that isn't supported feasably. It means 0 python libs, but all of NuGet is supported. The rules are all signatures need types, and if you declare a type, it is that type, no exceptions, just like in C# (if you squint when looking at var in a funny way). I wound up with reasonable results, just a huge trade of the entire Python ecosystem for .NET with an insanely Python esque syntax.

Still churning on it, will probably publish it and do a proper blog post once I've built something interesting with the language itself.

coredog64 34 minutes ago | parent [-]

IronPython -> TitaniumPython?

wolvesechoes 6 hours ago | parent | prev | next [-]

> Python really needs to take the Typescript approach of "all valid Python4 is valid Python3"

It is called type hints, and is already there. TS typing doesn't bring any perf benefits over plain JS.

stabbles 6 hours ago | parent [-]

You really need dedicated types for `int64` and something like `final`. Consider:

    class Foo:
      __slots__ = ("a", "b")
      a: int
      b: float
there are multiple issues with Python that prevent optimizations:

* a user can define subtype `class my_int(int)`, so you cannot optimize the layout of `class Foo`

* the builtin `int` and `float` are big-int like numbers, so operations on them are branchy and allocating.

and the fact that Foo is mutable and that `id(foo.a)` has to produce something complicates things further.

wolvesechoes 5 hours ago | parent [-]

Maybe, but I quoted specific part I was replying to. TS has no impact on runtime performance of JS. Type hints in Python have no impact on runtime performance of Python (unless you try things like mypyc etc; actually, mypy provides `from mypy_extensions import i64`)

Therefore Python has no use for TS-like superset, because it already has facilities for static analysis with no bearing on runtime, which is what TS provides.

wiseowise 5 hours ago | parent [-]

What OP means is that they need to:

1) Add TS like language on top of Python in backwards compatible way

2) Introduce frozen/final runtime types

3) Use 1 and 2 to drive runtime optimizations

wolvesechoes 3 hours ago | parent [-]

Still makes no sense. OP demands introduction of different runtime semantics, but this doesn't require adding more language constructs (TS-like superset). Current type hints provide all necessary info on the language level, and it is a matter of implementation to use them or not.

From all posts it looks like what OP wants is a different language that looks somewhat like Python syntax-wise, so calling for "backwards-compatible" superset is pointless, because stuff that is being demanded would break compatibility by necessity.

bloppe 11 hours ago | parent | prev | next [-]

But that's just not what python is for. Move your performance-critical logic into a native module.

wiseowise 5 hours ago | parent | next [-]

I’ll be happy if over night all Python code in the world can reap 10-100x performance benefits without changing much of a codebase, you can continue having soup of multiple languages.

bloppe 30 minutes ago | parent | next [-]

Me too, but changing the referential semantics would be a massive breaking change. That doesn't qualify as "without changing much of a codebade". And tacking on a giant new orthogonal type system to avoid breaking existing code would be akin to creating a new language. Why bother when you can just write Python modules in Rust.

drob518 an hour ago | parent | prev | next [-]

I’d like to be good looking and drive a Ferrari. But that probably isn’t going to happen, either.

BerislavLopac 2 hours ago | parent | prev [-]

Any program written in Python of any significant size is literally a soup of multiple languages.

LtWorf 28 minutes ago | parent [-]

There's no project that isn't like that.

mattclarkdotnet 10 hours ago | parent | prev [-]

Performance is one part of the discussion, but cleanliness is another. A Python4 that actually used typing in the interpreter, had value types, had a comptime phase to allow most metaprogramming to work (like monkey patching for tests) would be great! It would be faster, cleaner, easier to reason about, and still retain the great syntax and flexibility of the language.

BerislavLopac 2 hours ago | parent | next [-]

> A Python4 that actually used typing in the interpreter, had value types, had a comptime phase to allow most metaprogramming to work (like monkey patching for tests) would be great! It would be faster, cleaner, easier to reason about, and still retain the great syntax and flexibility of the language.

And what prevents someone from designing such a language?

LtWorf 27 minutes ago | parent [-]

PSF has full time employees. If someone else does it as a personal project it would remain a personal project and we'd never hear about it.

mechsy 8 hours ago | parent | prev [-]

I too see potential in this - it started feeling a bit weird in recent years switching between Go, Python and Rust codebases with Python code looking more and more like a traditional statically typed language and not getting the performance benefits. I know I know, there are libraries and frameworks which make heavy use of fun stuff you can do with strings (leading to the breakdown of even the latest and greatest IDE tooling and red squiggly lines all over you code) and don’t get me started on async etc.

Funnily enough I’ve found Python to be excellent for modelling my problem domain with Pydantic (so far basically unparalleled, open for suggestions in Go/Rust), while the language also gets out of my way when I get creative with list expressions and the like. So overall, still it is extremely productive for the work I’m doing, I just need to spin up more containers in prod.

panzi 11 hours ago | parent | prev | next [-]

Isn't rpython doing that, allowing changes on startup and then it's basically statically typed? Does it still exist? Was it ever production ready? I only once read a paper about it decades ago.

mattclarkdotnet 11 hours ago | parent [-]

RPython is great, but it changes semantics in all sorts of ways. No sets for example. WTF? The native Set type is one of the best features of Python. Tuples also get mangled in RPython.

rich_sasha 10 hours ago | parent | prev | next [-]

I think sadly a lot of Python in the wild relies heavily, somewhere, on the crazy unoptimisable stuff. For example pytest monkey patches everything everywhere all the time.

You could make this clean break and call it Python 4 but frankly I fear it won't be Python anymore.

NetMageSCW 33 minutes ago | parent | next [-]

Perl 6 showed what happens when you do something like that.

fyrn_ 8 hours ago | parent | prev | next [-]

As a person who has spent a lot of time with pytest, I'm ready for testing framework that doesn't do any of that non-obvious stuff. Generally use unittest as much as I can these days, so much less _wierd_ about how it does things. Like jeeze pytest, do you _really_ need to stress test every obscure language feature? Your job is to call tests.

mattclarkdotnet 10 hours ago | parent | prev [-]

Allowing metaprogramming at module import (or another defined phase) would cover most monkey patching use cases. From __future__ import python4 would allow developers to declare their code optimisable.

musicale 9 hours ago | parent | prev | next [-]

> Python really needs to take the Typescript approach of "all valid Python4 is valid Python3

Great idea, but I'm not convinced that they learned anything from the Python 2 to 3 transition, so I wouldn't hold my breath.

If you want a language system without contempt for backward compatibility, you're probably better off with Java/C++/JavaScript/etc. (though using JS libraries is like building on quicksand.) Bit of a shame since I want to like Python/Rust/Swift/other modern-ish languages, but it turns out that formal language specifications were actually a pretty good idea. API stability is another.

musicale 8 hours ago | parent [-]

is that you, python core dev team? ;-)

dobremeno 4 hours ago | parent | prev | next [-]

SPy [1] is a new attempt at something like this.

TL;DR: SPy is a variant of Python specifically designed to be statically compilable while retaining a lot of the "useful" dynamic parts of Python.

The effort is led by Antonio Cuni, Principal Software Engineer at Anaconda. Still very early days but it seems promising to me.

[1] https://github.com/spylang/spy

BiteCode_dev 5 hours ago | parent | prev | next [-]

There will be not Python 4, and 3.X policy requires forward compat, so we are already there.

mattclarkdotnet 11 hours ago | parent | prev [-]

Oh, and while we're at it, fix the "empty array is instantiated at parse time so all your functions with a default empty array argument share the same object" bullshit.

zahlman 8 hours ago | parent | next [-]

We don't call them "arrays".

It has nothing to do with whether the list is empty. It has nothing to do with lists at all. It's the behaviour of default arguments.

It happens at the time that the function object is created, which is during runtime.

You only notice because lists are mutable. You should already prefer not to mutate parameters, and it especially doesn't make sense to mutate a parameter that has a default value because the point of mutating parameters is that the change can be seen by the caller, but a caller that uses a default value can't see the default value.

The behaviour can be used intentionally. (I would argue that it's overused intentionally; people use it to "bind" loop variables to lambdas when they should be using `functools.partial`.)

If you're getting got by this, you're fundamentally expecting Python to work in a way that Pythonistas consider not to make sense.

Revisional_Sin 6 hours ago | parent [-]

It's best practice to avoid mutable defaults even if you're not planning to mutate the argument.

It's just slightly annoying having to work around this by defaulting to None.

Izkata 10 hours ago | parent | prev | next [-]

Execution time, not parse time. It's a side effect of function declarations being statements that are executed, not the list/dict itself. It would happen with any object.

mattclarkdotnet 10 hours ago | parent | next [-]

It's still ridiculous. A hypothetical Python4 would treat function declarations as declarations not executable statements, with no impact on real world code except to remove all the boilerplate checks.

zahlman 8 hours ago | parent | next [-]

There is no such thing as a "function declaration" in Python. The keyword is "def", which is the first three letters of the word "define" (and not a prefix of "declare"), for a reason.

The entire point of it being an executable statement is to let you change things on the fly. This is key to how the REPL works. If I have `def foo(): ...` twice, the second one overwrites the first. There's no need to do any checks ahead of time, and it works the same way in the REPL as in a source file, without any special logic, for the exact same reason that `foo = 1` works when done twice. It's actually very elegant.

People who don't like these decisions have plenty of other options for languages they can use. Only Python is Python. Python should not become not-Python in order to satisfy people who don't like Python and don't understand what Python is trying to be.

1718627440 4 hours ago | parent | prev | next [-]

You are describing a completely different language, that differs in very major ways from Python. You can of course create that, but please don't call it Python 4 !

boxed 9 hours ago | parent | prev [-]

You think so but then you write a function with a default argument pointing to some variable that is a list and now suddenly the semantics of that are... what?

codesnik 8 hours ago | parent [-]

you could just treat argument initialization as an executable expression which is called every time you call a function. If you have a=[], then it's a new [] every time. If a=MYLIST then it's a reference to the same MYLIST. Simple. And most sane languages do it this way, I really don't know why python has (and maintain) this quirk.

1718627440 4 hours ago | parent [-]

What are the semantics of the following:

    b = ComplexObject (...)
    # do things with b

    def foo (self, arg=b):
        # use b

    return foo
Should it create a copy of b every time the function is invoked? If you want that right now, you can just call b.copy (), when you always create that copy, then you can not implement the current choice.

Should the semantic of this be any different? :

    def foo (self, arg=ComplexObject (...)):
Now imagine a:

    ComplexObject = list
codesnik 3 hours ago | parent [-]

I wonder, why that kind of ambiguity or complexity even comes to your mind at all. Just because python is weird?

def foo(self, arg=expression):

could, and should work as if it was written like this (pseudocode)

def foo(self, arg?): if is_not_given(arg): arg=expression

if "expression" is a literal or a constructor, it'd be called right there and produce new object, if "expression" is a reference to an object in outer scope, it'd be still the same object.

it's a simple code transformation, very, very predictable behavior, and most languages with closures and default values for arguments do it this way. Except python.

1718627440 3 hours ago | parent [-]

What you want is for an assignment in a function definition to be a lambda.

  def foo (self, arg=lambda : expression):
Assignment of unevaluated expressions is not a thing yet in Python and would be really surprising. If you really want that, that is what you get with a lambda.

> most languages with closures and default values for arguments do it this way.

Do these also evaluate function definitions at runtime?

codesnik 2 hours ago | parent [-]

yes they do. check ruby for example.

mattclarkdotnet 10 hours ago | parent | prev [-]

Let's not get started on the cached shared object refs for small integers....

zahlman 8 hours ago | parent [-]

What realistic use case do you have for caring about whether two integers of the same value are distinct objects? Modern versions of Python warn about doing unpredicatble things with `is` exactly because you are not supposed to do those things. Valid use cases for `is` at all are rare.

thaumasiotes 6 hours ago | parent [-]

> Valid use cases for `is` at all are rare.

There might not be that many of them, depending on how you count, but they're not rare in the slightest. For example, you have to use `is` in the common case where you want the default value of a function argument to be an empty list.

exyi 2 hours ago | parent | prev | next [-]

If you change this you break a common optimization:

https://github.com/python/cpython/blob/3.14/Lib/json/encoder...

Default value is evaluated once, and accessing parameter is much cheaper than global

zeratax 5 hours ago | parent | prev [-]

there is PEP 671 for that, which introduces extra syntax for the behavior you want. people rely on the current behavior so you can't really change it