Remix.run Logo
dgb23 17 hours ago

If „select *“ breaks your code, then there‘s something wrong with your code. I think Rich Hickey talked about this. Providing more than is needed should never be a breaking change.

Certain languages, formats and tools do this correctly by default. For the others you need a source of truth that you generate from.

sql_nitpicker 17 hours ago | parent | next [-]

I don't see anything wrong with what the article is saying. If you have a view over a join of A and B, and the view uses "select *", then what is gonna happen when A adds a column with the same name as a column in B?

In sqlite, the view definition will be automatically expanded and one of the columns in the output will automatically be distinguished with an alias. Which column name changes is dependent on the order of tables in the join. This can absolutely break code.

In postgres, the view columns are qualified at definition time so nothing changes immediately. But when the view definition gets updated you will get a failure in the DDL.

In any system, a large column can be added to one of the constituent tables and cause a performance problem. The best advice is to avoid these problems and never use "select *" in production code.

naasking 2 hours ago | parent [-]

Seems like a database failure if it can't notify you that introduced a breaking change. All of the schema information is available to the database after all, so it should be able to tell you about the duplicate column breaking that view.

rileymat2 17 hours ago | parent | prev | next [-]

The reasoning is in the article, and true.

> Schema evolution can break your view, which can have downstream effects

Select * is the problem itself in the face of schema evolution and things like name collision.

0xbadcafebee 16 hours ago | parent | prev | next [-]

`select *` is bad for many reasons, but the biggest is that the "contract" your code has with the remote data store isn't immutable. The database can change, for many different reasons, independent of your code. If you want to write reliable code, you need to make as few assumptions as possible. One of those assumptions is what the remote schema is.

hvb2 15 hours ago | parent [-]

Sure but columns can change data types too which 'select column's doesn't protect you from either

layer8 14 hours ago | parent [-]

A column changing its data type is generally considering a breaking change for the schema (for obvious reasons), while adding more columns isn’t. Backwards-compatible schema evolution isn’t practical without the latter — you’d have to add a new secondary table whenever you want to add more columns.

This mirrors how adding additional fields to an object type in a programming language usually isn’t considered a breaking change, but changing the type of an existing field is.

tremon 17 hours ago | parent | prev [-]

If you have select * in your code, there already is something wrong with your code, whether it breaks or not: the performance and possibly output of your code is now dependent on the table definition. I'm pretty sure Rich Hickey has also talked about the importance of avoiding non-local dependencies and effects in your code.

onli 16 hours ago | parent [-]

The performance and partly the output of the code is always dependent on the table definition. * instead of column names just removes an output limiter, which can be useful or can be irrelevant, depending on the context.

Though sure, known to negatively affect performance, I think in some database systems more than in others?