sbrother 3 days ago

When I was most recently at Google (2021-ish) my team owned a bunch of SQL pipelines that had fairly effective SQL tests. Not my favorite thing to work on, but it was a productive way to transform data. There are lots of open source versions of the same idea, but I have yet to see them accompanied by ergonomic testing. Any recommendations or pointers to open source SQL testing frameworks?

physicles 3 days ago | parent [-]

Could you describe what made those tests effective? I just wrote some tools to write concise tests for some analytics queries, and some principles I stumbled on are:

- input data should be pseudorandom, so the chance of a test being “accidentally correct” is minimized

- you need a way to verify only part of the result set. Or, at the very least, a way to write tests so that if you add a column to the result set, your test doesn’t automatically break
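To make those two principles concrete, here is a rough sketch, not the actual tool. It assumes Python with the standard-library `sqlite3` module, and the `events` table, the query, and the helper names (`make_events`, `run_query`, `project`, `expected_totals`) are all made up for illustration. The input is generated from a seeded RNG so failures are reproducible, and the assertion looks only at the columns the test names.

```python
import random
import sqlite3

# Hypothetical query under test; table and column names are illustrative.
QUERY = """
SELECT user_id, COUNT(*) AS n_events, SUM(amount) AS total
FROM events
GROUP BY user_id
"""

def make_events(seed, n=200):
    # Seeded RNG: pseudorandom input that is identical on every run.
    rng = random.Random(seed)
    return [(rng.randint(1, 10), rng.randint(1, 1000)) for _ in range(n)]

def run_query(rows):
    # Load the generated rows into an in-memory database and run the query.
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.execute("CREATE TABLE events (user_id INTEGER, amount INTEGER)")
    conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
    result = [dict(r) for r in conn.execute(QUERY)]
    conn.close()
    return result

def project(result, columns):
    # Keep only the named columns, so extra columns in the result are ignored.
    return sorted(tuple(row[c] for c in columns) for row in result)

def expected_totals(rows):
    # Independent reference implementation in plain Python.
    totals = {}
    for user_id, amount in rows:
        totals[user_id] = totals.get(user_id, 0) + amount
    return sorted(totals.items())

rows = make_events(seed=42)
result = run_query(rows)
assert project(result, ["user_id", "total"]) == expected_totals(rows)
```

The test never mentions `n_events`, so adding or removing that column in the query leaves it passing. The reference computation in Python is one way to get expected values for random input; hand-checked fixtures would work too.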

In addition, I added CSV exports so you can verify the results by hand, and hot-reload for queries with CTEs — if you change a .sql file then it will immediately rerun each CTE incrementally and show you which ones’ output changed.
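The "which CTE's output changed" part can be sketched roughly like this. It is an illustration under several assumptions, not the real implementation: the CTEs are already split into an ordered list of `(name, body)` pairs (parsing the .sql file and watching it for changes are left out), the database is SQLite, and `cte_fingerprints` and `changed_ctes` are hypothetical names. It also reruns every CTE prefix rather than only the ones downstream of the edit.

```python
import hashlib
import sqlite3

def cte_fingerprints(conn, ctes):
    # ctes: ordered list of (name, body_sql). Returns {name: hash of its output}.
    prints = {}
    for i, (name, _) in enumerate(ctes):
        # Build a WITH clause from every CTE up to and including this one.
        with_clause = ", ".join(f"{n} AS ({body})" for n, body in ctes[: i + 1])
        rows = conn.execute(f"WITH {with_clause} SELECT * FROM {name}").fetchall()
        # Sort the repr of each row so row order does not affect the hash.
        canonical = "\n".join(sorted(map(repr, rows)))
        prints[name] = hashlib.sha256(canonical.encode()).hexdigest()
    return prints

def changed_ctes(before, after):
    # Names whose output hash differs from the previous run (or are new).
    return [name for name in after if before.get(name) != after[name]]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO events VALUES (?, ?)", [(1, 10), (1, 20), (2, 5)])

v1 = [
    ("big", "SELECT * FROM events WHERE amount > 5"),
    ("totals", "SELECT user_id, SUM(amount) AS total FROM big GROUP BY user_id"),
]
# Simulated edit to the .sql file: only the second CTE's body changes.
v2 = [
    v1[0],
    ("totals", "SELECT user_id, COUNT(*) AS total FROM big GROUP BY user_id"),
]

before = cte_fingerprints(conn, v1)
after = cte_fingerprints(conn, v2)
changed = changed_ctes(before, after)
```

Here `changed` contains only `totals`, since `big` produces the same rows in both versions. Hashing the output keeps the comparison cheap, at the cost of not showing which rows differ; that is where the CSV export would come in.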