gib444 9 hours ago

> A month ago, I went on a performance quest trying to optimize a PHP script that took 5 days to run. Together with the help of many talented developers, I eventually got it to run in under 30 seconds

That's a huge improvement! Out of curiosity, how much of it was low-hanging fruit unrelated to the PHP interpreter itself? (E.g. parallelism, faster SQL queries, etc.)

brentroose 8 hours ago | parent | next [-]

Almost all, actually. I wrote about it here: https://stitcher.io/blog/11-million-rows-in-seconds

A couple of things I did:

- Cursor-based pagination
- Combining insert statements
- Using database transactions to prevent fsync calls
- Moving calculations from the database to PHP
- Avoiding serialization where possible
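
For readers who want to see the first three items in code, here is a minimal sketch. It is not the code from the linked post: it uses Python and an in-memory SQLite database as a stand-in for PHP and the real database, and the `events` table, the function names, and the chunk size are all made up for illustration.

```python
import sqlite3

def make_db():
    # Hypothetical schema, only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
    return conn

def insert_combined(conn, rows, chunk_size=100):
    # One transaction for the whole batch, so the database commits once
    # instead of once per row.
    with conn:
        for start in range(0, len(rows), chunk_size):
            chunk = rows[start:start + chunk_size]
            # One multi-row INSERT per chunk instead of one INSERT per row.
            placeholders = ", ".join(["(?, ?)"] * len(chunk))
            params = [value for row in chunk for value in row]
            conn.execute(
                f"INSERT INTO events (id, payload) VALUES {placeholders}",
                params,
            )

def iter_pages(conn, page_size):
    # Cursor (keyset) pagination: remember the last id seen and ask for rows
    # after it, rather than using an OFFSET that grows with every page.
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id, payload FROM events WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, page_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        last_id = rows[-1][0]
```

Usage would look like `insert_combined(conn, rows)` followed by `for page in iter_pages(conn, 100): ...`. How much each step helps depends on the database engine and its durability settings, so treat this as a shape to adapt rather than a benchmark.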

tiffanyh 8 hours ago | parent [-]

Aren’t these optimizations less about PHP, and more about optimizing how you're using the database?

toast0 5 hours ago | parent | next [-]

PHP is kind of like C. It can be very fast if you do things right, and it gives you more than enough rope to tie yourself in knots.

Making your application fast is less about tuning your runtime and more about carefully selecting what you do at runtime.

Runtime choice does still matter: an environment where you can reasonably separate sending database queries from receiving the results (async communication), or that otherwise lets you pipeline requests, will tend to have higher throughput if used appropriately. Batching queries can narrow the gap, though. Languages with easy parallelism can make individual requests faster, at least while you have available resources. Etc.

A lot of popular PHP programs and frameworks start by spending lots of time assembling a beautiful sculpture of objects that will be thrown away at the end of the request. Almost everything is going to be thrown away at the end of the request; making your garbage beautiful doesn't usually help performance.

tiffanyh an hour ago | parent [-]

Would love to read more stories by you toast0 on things you've optimized in the past (given the huge scale you've worked on). Lessons learned, etc. I always find your comments super interesting :)

toast0 an hour ago | parent [-]

<3 I always love seeing your comments and questions, too!

Well on the subject of PHP, I think I've got a nice story.

The more recent one is about WordPress. One day, I had this conversation:

Boss: "will the blog stay up?"

toast0: "yeah, nobody goes to the blog, it's no big deal"

Boss: "they will"

toast0: "oh, ummmm we can serve a static index.html and that should work"

Later that day, he posted https://blog.whatsapp.com/facebook. I took a snapshot to serve as index.html and the blog stayed up. A few months later, I had a good reason to tear out WordPress (which I had been wanting to do for a long time), so I spent a week and made FakePress which only did exactly what we needed and could serve our very exciting blog posts in something like 10-20 ms per page view instead of whatever WordPress took (which was especially not very fast if you hit a www server that wasn't in the same colo as our database servers). That worked pretty well, until the blog was rewritten to run on the FB stack --- page weight doubled, but since it was served by the FB CDN, load time stayed about the same. The process to create and translate blog entries was completely different, and the RSS was non-compliant: I didn't want to include a time with the date, and there is/was no available timeless date field in any of the RSS specs, so I just left the time out ... but it was sooo much nicer to run.

Sadly, I haven't been doing any large scale optimization stuff lately. My work stuff doesn't scale much at the moment, and personal small scale fun things include polishing up my crazierl [1] demo (will update the published demo in the next few days or email me for the release candidate url), adding IPv6 to my Path MTU Discovery Test [2] since I have somewhere to run IPv6 at MTU 1500, and writing memdisk_uefi [3], which is like Syslinux's MEMDISK but in UEFI. My goal with memdisk_uefi is to get FreeBSD's installer images to be usable with PXE in UEFI ... as of FreeBSD 15.0, in BIOS mode you can use PXE and MEMDISK to boot an installer image; but UEFI is elusive --- I got some feedback from FreeBSD suggesting a different approach than what I have, but I haven't had time to work on that; hopefully soonish. Oh and my Vanagon doesn't want to run anymore ... but it's cold out and I don't seem to want to follow the steps in the fuel system diagnosis, so that's not progressing much... I did get a back seat in good shape though, so now it can carry 5 people nowhere instead of only two (caveat: I don't have seat belts for the rear passengers, which would be unsafe if the van was running)

[1] https://crazierl.org/

[2] http://pmtud.enslaves.us/

[3] https://github.com/russor/memdisk_uefi

hu3 8 hours ago | parent | prev | next [-]

It's still valid as an example to the language community of how to apply these optimizations.

swasheck 8 hours ago | parent | prev [-]

in all my years doing database tuning/admin/reliability/etc, performance problems have overwhelmingly been in the bad query/bad data pattern categories. the data platform is rarely the issue

tosti 6 hours ago | parent [-]

The worst offenders I've seen were looping over a shitty ORM
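
A rough sketch of what that pattern tends to look like, stripped of the ORM: one query for the list, then one more query per row inside the loop, versus a single join. This is Python with in-memory SQLite rather than PHP, and the `authors`/`posts` tables and the `CountingDB` helper are invented purely to count round trips.

```python
import sqlite3

class CountingDB:
    """Hypothetical helper that counts how many queries are issued."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.queries = 0

    def query(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()

def seed(db):
    db.conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
    db.conn.execute(
        "CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)"
    )
    db.conn.executemany(
        "INSERT INTO authors VALUES (?, ?)",
        [(i, f"author-{i}") for i in range(1, 6)],
    )
    db.conn.executemany(
        "INSERT INTO posts VALUES (?, ?, ?)",
        [(i, (i % 5) + 1, f"post-{i}") for i in range(1, 21)],
    )
    db.conn.commit()

def titles_looping(db):
    # The loop-over-the-ORM shape: 1 query for posts, then 1 per post.
    posts = db.query("SELECT id, author_id, title FROM posts ORDER BY id")
    result = []
    for _post_id, author_id, title in posts:
        name = db.query("SELECT name FROM authors WHERE id = ?", (author_id,))[0][0]
        result.append((title, name))
    return result

def titles_joined(db):
    # Same data in a single round trip.
    return db.query(
        "SELECT p.title, a.name FROM posts p "
        "JOIN authors a ON a.id = p.author_id ORDER BY p.id"
    )
```

With the 20 seeded posts, the looping version issues 21 queries and the joined version issues 1, for the same result. Against a database on another host, each of those extra queries also pays a network round trip.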

cobbzilla 5 hours ago | parent | next [-]

hey don’t forget, that shitty ORM also empowers you to write beautiful, fluent code that, under the hood, generates a 12-way join that brings down your entire database.

edoceo 5 hours ago | parent | prev [-]

And that is true across languages.

Joel_Mckay 7 hours ago | parent | prev [-]

In general, it is bad practice to touch transaction datasets in PHP script space. Like all foot-guns, it eventually leads to read-modify-write bugs.
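
To make the read-modify-write hazard concrete, here is a small sketch (Python with in-memory SQLite, not PHP; the `accounts` table and the amounts are invented). Two workers are simulated by interleaving their steps by hand on one connection, which is a simplification of real concurrent requests.

```python
import sqlite3

def setup():
    # Hypothetical single-row table, only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO accounts VALUES (1, 100)")
    conn.commit()
    return conn

def read_balance(conn):
    return conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]

def lost_update(conn):
    # Both workers read the balance into script space first...
    seen_by_a = read_balance(conn)
    seen_by_b = read_balance(conn)
    # ...then each writes back a value computed from its stale copy.
    conn.execute("UPDATE accounts SET balance = ? WHERE id = 1", (seen_by_a - 30,))
    conn.execute("UPDATE accounts SET balance = ? WHERE id = 1", (seen_by_b - 20,))
    conn.commit()
    return read_balance(conn)

def atomic_update(conn):
    # The arithmetic stays inside the database, so neither write is lost.
    conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = 1", (30,))
    conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = 1", (20,))
    conn.commit()
    return read_balance(conn)
```

Starting from 100, `lost_update` ends at 80 because the second write overwrites the first, while `atomic_update` ends at 50. Row locking (for example `SELECT ... FOR UPDATE` in engines that support it) is another common way to avoid this.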

Depending on the SQL engine, there are many PHP Cursor optimizations that save moving around large chunks of data.

Clean cached PHP can be fast for REST transactional data parsing, but it is also often used as a bodge language by amateurs. PHP is not slow by default, nor is it meant to run persistently (low memory use is nice), but it still gets a lot of justified criticism.

Erlang and Elixir are much better for clients/host budgets, but less intuitive than PHP =3