▲ | qdw 6 days ago |
One of their line items complains about being unable to bind 65k PostgreSQL placeholders (the linked post calls them "parameters") in a single query. This is a cursed idea to begin with, so I can't fully blame PostgreSQL.

From the linked GitHub issue comments, it looks like they adopted the sensible approach of refactoring their ORM so that it splits the big query into several smaller queries. Anecdotally, I've found 3,000 to 5,000 rows per write query to be a good batch size.

Someone else suggested first loading the data into a temp table and then joining against that, which would have improved performance further, especially if written as a COPY … FROM (sketched below). But the idea was scrapped (also sensibly) for requiring too many app code changes.

Overall, this was quite an illuminating tome of cursed knowledge, all good warnings to have. Nicely done!
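A minimal sketch of that temp-table-plus-COPY variant, with table and column names invented for illustration:

    -- One transaction, so the temp table lives long enough for the join.
    BEGIN;

    CREATE TEMP TABLE staged_items (id bigint, price numeric) ON COMMIT DROP;

    -- Stream the rows in through the driver's COPY support instead of
    -- binding tens of thousands of placeholders:
    COPY staged_items (id, price) FROM STDIN (FORMAT csv);

    -- Then one set-based write, with zero per-row parameters:
    UPDATE items
    SET    price = s.price
    FROM   staged_items s
    WHERE  items.id = s.id;

    COMMIT;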
▲ | e1g 6 days ago | parent | next [-]
Another strategy is to pass your values as a single array param (e.g., text[], int[], etc.) - PG is perfectly happy to handle those. Using ANY() is marginally slower than IN(), but you have one param with many IDs inside it. Maybe their ORM didn’t support that.
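Roughly this, with a made-up table:

    -- One bind parameter carries the whole list:
    SELECT *
    FROM   users
    WHERE  id = ANY($1::int[]);

    -- versus IN, which needs one placeholder per value:
    -- WHERE id IN ($1, $2, $3, ..., $65535)  -- and then you hit the cap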
▲ | motorest 6 days ago | parent | prev | next [-]
> This is a cursed idea to begin with, so I can't fully blame PostgreSQL.

After going through the list, I was left with the impression that the "cursed" list doesn't really refer to gotchas per se, but to lessons learned by the developers who committed them. Clearly a couple of the lessons are incomplete or still in progress, though. This doesn't take away from their value or significance, but it helps frame the "curses" as personal observations in an engineering log instead of statements of fact.
▲ | fdr 6 days ago | parent | prev | next [-]
That also popped out at me: binding that many parameters is cursed. You really gotta use COPY (in most cases).

I'll give you a real cursed Postgres one: prepared statement names are silently truncated to NAMEDATALEN-1 characters. NAMEDATALEN is 64. This goes back to 2001... or rather, that's when NAMEDATALEN was increased from 32; the truncation behavior itself is older still. It's something ORMs need to know about -- few humans are preparing statement names of sixty-plus characters.
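To make the failure mode concrete (statement names invented for illustration) -- these two names share their first 63 characters, so after truncation to NAMEDATALEN-1 they collide:

    PREPARE orm_generated_statement_for_the_users_table_with_a_long_suffix_1
        AS SELECT 1;
    -- NOTICE:  identifier "..._1" will be truncated

    PREPARE orm_generated_statement_for_the_users_table_with_a_long_suffix_2
        AS SELECT 2;
    -- ERROR:  prepared statement
    --   "orm_generated_statement_for_the_users_table_with_a_long_suffix_"
    --   already exists

At the SQL level you at least get the NOTICE; per the comment above, names prepared over the wire protocol truncate with no warning at all.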
▲ | Terr_ 6 days ago | parent | prev | next [-]
> One of their line items complains about being unable to bind 65k PostgreSQL placeholders (the linked post calls them "parameters") in a single query.

I've actually encountered this one. It involved an ORM upserting lots of records, where some tables had SQL array-of-T column types, and each array element being inserted consumes its own bind placeholder. That made it an intermittent/unreliable error: even though two runs might try to touch the same number of rows and columns, the number of bind variables needed for the arrays fluctuated.
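In other words, with an ORM that expands array values element by element (hypothetical table), the bind count depends on the data, not just the row count:

    -- Same table, same single row, different number of placeholders:
    INSERT INTO events (id, tags) VALUES ($1, ARRAY[$2, $3]);          -- 3 binds
    INSERT INTO events (id, tags) VALUES ($1, ARRAY[$2, $3, $4, $5]);  -- 5 binds

So a batch that fits under the parameter cap on one run can blow past it on the next, purely because the arrays got longer.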
▲ | burnt-resistor 6 days ago | parent | prev | next [-]
Or people who try to send every filename on a system through xargs in a single process invocation as arguments (argv) without NUL-terminated strings. Just hope there are no odd or corrupt filenames, and plenty of memory. Oopsie. find -print0 with parallel -0 / xargs -0 are usually your friends.

Also, sed and grep without LC_ALL=C can result in the fun "invalid multibyte sequence".
▲ | Aeolun 6 days ago | parent | prev [-]
I don’t think that makes intuitive sense. Whether I send 50k rows or 10×5k rows should make no difference to the database, but somehow it does. It’s especially annoying with PG, where this weird limit means you just cannot commit a whole lot of small values quickly.
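For what it's worth, the limit isn't arbitrary: the extended query protocol's Bind message counts parameters with a 16-bit integer, so a single statement tops out at 65,535 binds. The batching arithmetic, with a made-up table:

    -- 3 columns per row, so one multi-row INSERT can carry at most
    -- floor(65535 / 3) = 21845 rows:
    INSERT INTO measurements (sensor_id, ts, value)
    VALUES ($1, $2, $3),
           ($4, $5, $6);
           -- ... up to ($65533, $65534, $65535)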