jihadjihad 11 hours ago

> You did not write a single line of SQL. You did not set up an attribution model. You asked a question, in English, and got a table.

But nobody bothered to check if it was correct. It might seem correct, but I've been burned by queries exactly like these many, many times. What can often happen is that you end up with multiplied rows, and the answer isn't "let's just add a DISTINCT somewhere".

The answer is to look at the base table and the joins. You're joining customers to two (implied) one-to-many tables, charges and email_events. If there are multiple charges rows per customer, or an email can match multiple email_events rows, it can lead to a Cartesian multiplication of the rows since any combination of matches from the base table to the joined tables will be included.

If that's the case, the transactions and revenue values are likely to be inflated, and therefore the pretty pictures you passed along to your boss are wrong.
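The fan-out described above is easy to reproduce. Below is a minimal sketch using SQLite, with made-up toy schemas for customers, charges, and email_events (the table and column names are assumptions modeled on the comment, not a real dataset): one customer with two charges and two email events yields four joined rows, so the naive revenue sum doubles. The fix shown is to collapse each one-to-many table to one row per key before joining.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE charges (customer_id INTEGER, amount INTEGER);
    CREATE TABLE email_events (email TEXT, event TEXT);

    INSERT INTO customers VALUES (1, 'a@example.com');
    -- one customer, two charges, two email events
    INSERT INTO charges VALUES (1, 100), (1, 50);
    INSERT INTO email_events VALUES ('a@example.com', 'open'),
                                    ('a@example.com', 'click');
""")

# Naive join: every charges row pairs with every email_events row,
# so each charge appears twice and revenue is doubled.
inflated = con.execute("""
    SELECT SUM(ch.amount)
    FROM customers c
    JOIN charges ch ON ch.customer_id = c.id
    JOIN email_events e ON e.email = c.email
""").fetchone()[0]
print(inflated)  # 300, not the true 150

# Fix: aggregate each one-to-many table down to one row per key
# *before* joining, so there is nothing left to multiply.
correct = con.execute("""
    SELECT SUM(ch.revenue)
    FROM customers c
    JOIN (SELECT customer_id, SUM(amount) AS revenue
          FROM charges GROUP BY customer_id) ch
      ON ch.customer_id = c.id
    JOIN (SELECT DISTINCT email FROM email_events) e
      ON e.email = c.email
""").fetchone()[0]
print(correct)  # 150
```

Note that sprinkling DISTINCT onto the naive query would not help here: the four joined rows are already distinct combinations, so the sum stays inflated.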

Further reading, and a terrific resource:

https://kb.databasedesignbook.com/posts/sql-joins/#understan...

chairmansteve 3 hours ago | parent | next [-]

Yes. The correct answer is to ask an SQL expert to write the query. An LLM will make the expert much faster at writing the script, maybe.

bluefirebrand 27 minutes ago | parent | next [-]

SQL is not hard enough to require an LLM to think about for you

I will never understand Engineers who struggle with SQL lookups. The vast majority of queries are extremely basic set theory

mattmanser 3 hours ago | parent | prev [-]

It doesn't.

I can write that script faster than I can write the text asking the AI to write the script as SQL is concise and my IDE has auto-complete.

Axel2Sikov 7 hours ago | parent | prev [-]

Ok but… nobody said you didn't have to check either(?).

paulryanrogers 7 hours ago | parent [-]

How do you check if you don't have any other view into the data but SQL and you don't know SQL?

bluefirebrand 26 minutes ago | parent | next [-]

SQL takes at most an afternoon to learn well enough to navigate a database.

Axel2Sikov 5 hours ago | parent | prev [-]

Same way you do today; you trust whoever wrote the query.

I do not sell a wrapper on top of some LLM; you can absolutely write your SQL directly. There is an engine, there are Iceberg tables. You can just live your best life doing your own SQL by hand.

Now if you couldn't do it before and you have a sensible understanding, you can likely do a bit more with the CLI tooling. And if you know a lot more, you can still do that. The queries are not hidden or abstracted; if you need them, they are saved, transparently, in SQL.

So I don't know what the answer is to the question "how do people do things they don't know how to do?"

paulryanrogers 4 hours ago | parent [-]

> So I don't know what the answer is to the question "how do people do things they don't know how to do?"

The status quo has been to learn SQL, or to ask a human you trust who checks their own work, which hopefully you can reuse.

Now it's ask AIs that are intentionally a bit random, and less likely to (or incapable of) check(ing) their work. Perhaps without seeing the SQL at all, requiring you to trust it for every interaction. And in a culture that moves so fast that there is no checking by any(one|thing).

XenophileJKO 4 hours ago | parent [-]

If you think a language model can't check their work, then you are using the tools wrong. Plain and simple.

Modern models are quite capable at surfacing and validating their assumptions and checking correctness of solutions.

Oversight helps you build confidence in the solutions. Is it perfect? No, but it's way better than most of the engineers I ask to check things.

Bridged7756 2 hours ago | parent [-]

No, they can't. Being able to "check one's work" implies that they can be held accountable, that they can tell right from wrong, when in reality they're merely text predictors.

If you think an LLM can check its own work, then you are doing a terrible job at writing software. Plain and simple.

They even go as far as "cheating" when tests fail, writing incorrect tests, or straight-out leaking code (lol) like the latest Claude Code blunder. Is this the tool the original comment says we are "using wrong, plain and simple"? Or do you have access to some other model that works in a wildly different way than generating text predictions?