| ▲ | fooker 9 hours ago |
| Counterintuitively: program in Python only if you can get away without knowing these numbers. When they start to matter, Python stops being the right tool for the job. |
|
| ▲ | libraryofbabel 9 hours ago | parent | next [-] |
| Or keep your Python scaffolding, but push the performance-critical bits down into a C or Rust extension, as numpy, pandas, PyTorch and the rest all do. I agree with the spirit of what you wrote, though - these numbers are interesting but aren't worth memorizing. Instead, instrument your code in production to see where it's slow in the real world with real user data (premature optimization is the root of all evil, etc.), profile your code (py-spy is the best tool for this if you're looking for CPU-hogging code), and if you find yourself worrying about how long it takes to add something to a list in Python, you really shouldn't be doing that operation in Python at all. |
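A minimal sketch of what pushing a hot loop down into a C extension looks like, assuming numpy and the standard-library timeit (the array size and any speedup you measure are illustrative, not from the article):

    import timeit
    import numpy as np

    data = list(range(1_000_000))
    arr = np.arange(1_000_000, dtype=np.int64)

    # Pure-Python loop: every addition is an interpreted bytecode step.
    def py_sum():
        total = 0
        for x in data:
            total += x
        return total

    # Same reduction pushed down into numpy's compiled loop.
    def np_sum():
        return int(arr.sum())

    print("pure Python:", timeit.timeit(py_sum, number=10))
    print("numpy      :", timeit.timeit(np_sum, number=10))

The relative gap, not the absolute timings, is the point - and profiling (py-spy, cProfile) is what tells you whether a loop like this is worth touching at all.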
| |
|
| ▲ | Demiurge 7 hours ago | parent | prev | next [-] |
| I agree. I've been living off Python for 20 years and have never needed to know any of these numbers, nor do I need them now for my work, contrary to the title. I also regularly use profiling for performance optimization and opt for Cython, SWIG, JIT libraries, or other tools as needed. None of these numbers would ever factor into my decision-making. |
| |
| ▲ | AtlasBarfed 5 hours ago | parent [-] | | ..... You don't see any value in knowing those numbers? | | |
| ▲ | Demiurge 11 minutes ago | parent | next [-] | | That's what I just said. There is zero value to me in knowing these numbers. I assume that all Python built-in methods are pretty much the same speed. I concentrate on I/O being slow and minimizing those operations. I think about CPU-intensive loops that process large data, and I try to use libraries like numpy, DuckDB, or other tools to do the processing. If I have a more complicated system, I profile its methods and optimize tight loops based on PROFILING. I don't care what the numbers in the article are, because I PROFILE, and I optimize the procedures that are the slowest, for example, using Cython. Which part of what I am saying does not make sense? | |
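The profile-first workflow described above, as a minimal standard-library sketch (cProfile plus pstats; the function and file names here are placeholders):

    import cProfile
    import pstats

    def process(records):
        # Stand-in for the real CPU-heavy loop over large data.
        return [r * r for r in records]

    # Profile one representative call and write the raw stats to a file.
    cProfile.run("process(list(range(1_000_000)))", "profile.out")

    # Print the ten entries with the highest cumulative time.
    pstats.Stats("profile.out").sort_stats("cumulative").print_stats(10)

Whatever shows up at the top of that listing is the candidate for numpy, DuckDB, or a Cython rewrite - not whatever an operation-cost table says should be slow.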
| ▲ | TuringTest 5 hours ago | parent | prev [-] | | As others have pointed out, Python is better used in places where those numbers aren't relevant. If they start becoming relevant, it's usually a sign that you're using the language in a domain where a duck-typed bytecode scripting-glue language is not well-suited. |
|
|
|
| ▲ | MontyCarloHall 9 hours ago | parent | prev | next [-] |
| Exactly. If you're working on an application where these numbers matter, Python is far too high-level a language to actually be able to optimize them. |
|
| ▲ | bathtub365 5 hours ago | parent | prev | next [-] |
| These basically seem like numbers of last resort - for after you've profiled and ruled out all of the usual culprits (big disk reads, network latency, polynomial or exponential time algorithms, wasteful overbuilt data structures, etc.) and still need to optimize at the level of individual operations. |
|
| ▲ | Quothling 7 hours ago | parent | prev [-] |
| Why? I've built some massive analytic data flows in Python with turbodbc + pandas which are basically C++ fast. It uses more memory, which supports your point, but on the flip side we're talking $5-10 in extra cost a year. It could frankly be $20k a year and still be cheaper than staffing more people like me to maintain these things, rather than having a couple of us and then letting the BI people use the tools we provide for them. Similarly, when we do embedded work, MicroPython is just so much easier for our engineering staff to deal with. The interoperability between C and Python makes it great, and you need to know these numbers in Python to know when to actually build something in C. With Zig getting really great interoperability, things are looking better than ever. Not that you're wrong as such. I wouldn't use Python to run an airplane, but I really don't see why you wouldn't care about the resources just because you're working with an interpreted or GC language. |
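A rough sketch of that kind of flow, assuming turbodbc's fetchallnumpy() (the DSN, table, and column names are invented): the driver hands whole columns back as NumPy arrays, so rows never pass one by one through Python objects before pandas takes over.

    import turbodbc
    import pandas as pd

    # Hypothetical DSN and query, for illustration only.
    connection = turbodbc.connect(dsn="analytics_dw")
    cursor = connection.cursor()
    cursor.execute("SELECT order_id, amount FROM orders")

    columns = cursor.fetchallnumpy()  # column name -> NumPy array
    frame = pd.DataFrame(columns)     # wrapped into a DataFrame without a Python-level row loop

    print(frame["amount"].sum())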
| |
| ▲ | fooker 7 hours ago | parent | next [-] | | > you need to know these numbers in Python to know when to actually build something in C People usually approach this the other way around: use something like pandas or numpy from the beginning if it solves your problem. Do not write matrix multiplications or joins in Python at all. If there is no library that solves your problem, that's a great indication that you should avoid Python - unless you are willing to spend 5 man-years writing a C or C++ library with good Python interop. | | |
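To make the "joins" half concrete, a toy sketch (the column names and sizes are invented): the list-comprehension version below walks every pair of rows in interpreted bytecode, while pandas' merge does the equivalent hash join in compiled code.

    import pandas as pd

    orders = pd.DataFrame({"user_id": [1, 2, 2, 3], "amount": [10.0, 5.0, 7.5, 3.0]})
    users = pd.DataFrame({"user_id": [1, 2, 3], "name": ["ann", "bo", "cy"]})

    # What not to do: a hand-rolled nested-loop join in pure Python.
    joined_slow = [
        {**o, **u}
        for o in orders.to_dict("records")
        for u in users.to_dict("records")
        if o["user_id"] == u["user_id"]
    ]

    # What to do instead: let pandas do the join in its C-backed internals.
    joined_fast = orders.merge(users, on="user_id")
    print(joined_fast)

At toy sizes it makes no difference; at millions of rows the nested loop is exactly the kind of thing the parent comment says not to write in Python.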
| ▲ | oivey 6 hours ago | parent [-] | | People generally aren't rolling their own matmuls or joins or whatever in production code. There are tons of tools like Numba, JAX, Triton, etc. that you can use to write very fast code for new, novel, and unsolved problems. The idea that "if you need fast code, don't write Python" has been totally obsolete for over a decade. | |
| ▲ | fooker 6 hours ago | parent [-] | | Yes, that's what I said. If you are writing performance-sensitive code that is not covered by a popular Python library, don't do it unless you are a megacorp that can dedicate a team to writing and maintaining a library. | |
| ▲ | oivey 6 hours ago | parent [-] | | It isn't what you said. If you want, you can write your own matmul in Numba and it will be roughly as fast as similar C code. You shouldn't, of course, for the same reason hand-rolling your own matmuls in C is stupid. Many problems can be performantly solved in pure Python, especially via the growing set of tools like the JIT libraries I cited. Even more will be solvable when things like free-threaded Python land. It will be a minority of problems that can't be, if it isn't already. |
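As an illustration of the Numba point - a sketch, not a benchmark (sizes are invented, and options like parallel=True or fastmath are left out): the triple loop below is compiled to machine code the first time it is called.

    import numpy as np
    from numba import njit

    @njit
    def matmul(a, b):
        n, k = a.shape
        m = b.shape[1]
        out = np.zeros((n, m))
        for i in range(n):
            for j in range(m):
                s = 0.0
                for p in range(k):
                    s += a[i, p] * b[p, j]
                out[i, j] = s
        return out

    a = np.random.rand(300, 300)
    b = np.random.rand(300, 300)

    result = matmul(a, b)              # first call pays the JIT compilation cost
    print(np.allclose(result, a @ b))  # later calls run at roughly C-loop speed

(As the comment says, you wouldn't actually roll your own matmul - BLAS via a @ b will still win - but the same decorator works on loops no library covers.)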
|
|
| |
| ▲ | its-summertime 6 hours ago | parent | prev [-] | | From the complete opposite side, I've built some tiny bits of near-irrelevant code where Python has been unacceptable, e.g. in shell startup / in bash's PROMPT_COMMAND, etc. It ends up having a very painfully obvious startup time, even if the code is nearing the equivalent of Hello World:

    time python -I -c 'print("Hello World")'
    real    0m0.014s

    time bash --noprofile -c 'echo "Hello World"'
    real    0m0.001s
| | |
| ▲ | dekhn 5 hours ago | parent | next [-] | | What exactly do you need 1ms instead of 14ms of startup time in a shell startup for? The difference is barely perceptible. Most of the time starting up is spent searching the filesystem for thousands of packages. | | |
| ▲ | NekkoDroid 4 hours ago | parent [-] | | > What exactly do you need 1ms instead of 14ms of startup time in a shell startup for? I think it's as they said: when dynamically building a shell input prompt it starts to become very noticeable if you have like 3 or more of these and you use the terminal a lot. | |
| ▲ | dekhn 2 hours ago | parent [-] | | Ah, I only noticed the "shell startup" bit. Yes, after 2-3 I agree you'd start to notice if you were really fast. I suppose at that point I'd just have Gemini rewrite the prompt-building commands in Rust (it's quite good at that) or merge all the prompt-building commands into a single one (to amortize the startup cost). |
|
| |
| ▲ | 6 hours ago | parent | prev [-] | | [deleted] |
|
|