Remix.run Logo
oftenwrong 5 days ago

A boolean is smaller, which is a relevant consideration for some workloads. For example, you may be pre-aggregating a large amount of data to serve a set of analytical queries which do not care about the associated timestamp. The smaller data type is more efficient both in storage and in query execution.

Additionally, there are situations where it is logical to store a boolean. For example, if the boolean denotes an outcome:

    process_executed_at timestamp not null
    process_succeeded boolean not null
maxbond 5 days ago | parent [-]

It's unlikely the boolean will result in better utilization, the savings will probably be consumed by padding. Most people don't know how to use structure packing to create a row which is actually smaller after it's been padded (though it's not very hard, anyone could learn). Columns are generally ordered by which features were shipped first and not by alignment (as is necessary to minimize padding).

I do try my best to pack my columns, but it's a fragile and likely premature optimization. Better to opt for something defensive at a cost of like, 7 bytes per row (in Postgres).