Remix.run Logo
sgarland 2 days ago

Genuinely do not understand the point of these tools. There is already a practically natural language to query RDBMS; it’s called SQL. I guarantee you, anyone who knows any other language could learn enough SQL to do 99% of what they wanted in a couple of hours. Give it a day of intensive study, and you’d know the rest. It’s just not that complicated.

brulard 2 days ago | parent | next [-]

SQL is simple for simple needs, basic joins and some basic aggregates. Even that you won't learn in 2 hours. And that is just scratching the surface of what can be done in SQL and what you need to query. With LLMs and tools like this you simply say what you need in english, you don't need to understand the normalizations, m:n relation tables, CTEs, functions, JSON access operators, etc.

sgarland 2 days ago | parent [-]

For reference, I’m a DBRE. IMO, yes, most people can learn basic joins and aggregates in a couple of hours, but that is subjective.

> you don’t need to understand the normalizations

You definitely should. Normalizing isn’t that difficult of a concept, Wikipedia has terrific descriptions of each level.

As to the rest, maybe read docs? This is my primary frustration with LLMs in general: people seem to believe that they’re just as good of developers as someone who has read the source documentation, because a robot told them the answer. If you don’t understand what you’re doing, you cannot possibly understand the implications and trade-offs.

aurareturn a day ago | parent [-]

Thank goodness 99% don’t want to understand everything. Otherwise, you wouldn’t be paid very well at your job, right?

physix 2 days ago | parent | prev | next [-]

Without having looked at it, I would assume the value comes from not having to know the data model in great detail, such that you can phrase your query using natural language, like

"Give me all the back office account postings for payment transfers of CCP cleared IRD trades which settled yesterday with a payment amount over 1M having a value date in two days"

That's what I'd like to be able to say and get an accurate response.

v5v3 2 days ago | parent | prev | next [-]

In a business, a management decision maker has to rely on a Db analyst if any query they have cannot be answered by any front end tool they have been given. And that introduces latency to the process

A 100% accurate ai powered solution would have many customers.

But can this generation of llms produce 100% accuracy?

nicktikhonov 2 days ago | parent | prev [-]

and yet this was on the front page of hacker news for an entire day :D

it's all about friction. why spend minutes writing a query when you can spend 5 seconds speaking the result you want and get 90-100% of the way there.

sgarland 2 days ago | parent [-]

Mostly because you don’t know if it’s correct unless you know SQL. It’s entirely too easy to get results that look correct but aren’t, especially when using windowing functions and the like.

But honestly, most queries I’ve ever seen are just simple joins, which shouldn’t take you 5 minutes to write.

AdieuToLogic 2 days ago | parent | next [-]

> Mostly because you don’t know if it’s correct unless you know SQL. It’s entirely too easy to get results that look correct but aren’t ...

This is the fundamental problem when attempting to use "GenAI" to make program code, SQL or otherwise. All one would have to do is substitute SQL with language/library of choice above and it would be just as applicable.

sgarland 2 days ago | parent [-]

Fully agree, I just harp on SQL because a. It’s my niche b. It always seems to be a “you can know this, but it doesn’t really matter” thing even for people who regularly interact with RDBMS, and it drives me bonkers.

brulard 2 days ago | parent | prev [-]

> most queries I’ve ever seen are just simple joins

Good for you. Some of us deal with more complex queries, even if it may not seems so from the outside. For example getting hierarchical data based on parent_id, while having non-trivial conditions for the parents and the children or product search queries which need to use trigram functions with some ranking, depending on product availability across stores and user preferences.

I agree knowing SQL is still useful, but more for double checking the queries from LLMs than for trying to build queries yourself.

sgarland 2 days ago | parent [-]

> getting hierarchical data based on parent_id

So, an adjacency list (probably, though there are many alternatives, which are usually better). That’s not complex, that’s a self-join.

> trigram functions

That’s an indexing decision, not a query. It’s also usually a waste: if you’re doing something like looking up a user by email or name, and you don’t want case sensitivity to wreck your plan, then use a case-insensitive collation for that column.

> I agree knowing SQL is still useful, but more for double checking the queries from LLMs

“I agree knowing Python / TypeScript / Golang is still useful, but more for double checking the queries from LLMs.” This sounds utterly absurd, because it is. Why SQL is seen as a nice-to-have instead of its reality - the beating heart of every company - is beyond me.

brulard 2 days ago | parent [-]

Your Python / TypeScript etc. argument is a strawman, thats why it sounds absurd. Your arguments would hold better if an average person was good and very quick at learning and memoizing complex new things. I don't know if you work with people like that, but that's definitely not the norm. Even developers know little SQL unless it's their specific focus.

In the original comment you said:

> I guarantee you, anyone who knows any other language could learn enough SQL to do 99% of what they wanted in a couple of hours. Give it a day of intensive study, and you’d know the rest. It’s just not that complicated.

Well your "guarantee" does not hold up. Where I live, every college level developer went through multiple semesters of database courses and yet I don't see these people proficient in SQL. In couple hours? 99% of what they need? Absurd

sgarland a day ago | parent [-]

It's not a strawman, it's reductio ad absurdum. SQL and Python are both languages that are commonly used. It would be (currently; who knows in a few years) laughable if someone said they didn't need to deeply understand Python to be able to correctly write Python at an employable level, modulo experience levels - I don't expect a Junior to know the vagaries of the language, e.g. that bools are aliased to integers.

> Even developers know little SQL unless it's their specific focus.

Yes, and I believe this to be deeply problematic. We don't generally allow people to use a language they don't understand in production, except for SQL.

> Where I live, every college level developer went through multiple semesters of database courses and yet I don't see these people proficient in SQL.

That's horrifying.

Look, while I would love it if everyone writing SQL knew relational algebra, basic set theory, and the ins and outs of their specific RDBMS implementation, I think the below suffices for the majority of work in web dev:

    SELECT: extract the columns that are named, optionally with an alias with AS (or simply a space)
    FROM: the [main] table to extract columns from
    [INNER] JOIN: an additional table to examine, returning only their intersection
    LEFT [OUTER] JOIN: an additional table to examine, returning everything in the LHS table, as well as any matches from the RHS table, with NULLs filling in missing data
    RIGHT [OUTER] JOIN: the same as LEFT JOIN, but with the logic swapped
    FULL [OUTER] JOIN: an additional table to examine, returning the union of both tables, regardless of matches
    ON: an expression to use for joining tables, generally consisting of at least one column from each table to match
    WHERE: a predicate (or series of predicates, with boolean operators joining them) to use for filtering the result set
    ORDER BY: one or more columns to order the result set by, in either ascending (ASC) or descending (DESC) order
    GROUP BY: one or more columns (though strictly speaking, this number must match the number of non-aggregated columns in the SELECT) to group the result set by
    LIMIT: a limit for the maximum number of rows returned

You're telling me that given a simple educational schema like Northwind Traders, and the documentation for their RDBMS, that someone who already knows a programming language couldn't use the above to figure it out in a fairly short order?
brulard a day ago | parent [-]

You made here an important assumption

> someone who already knows a programming language

I'm sure someone who can already code, can write a simple query. But my argument is

1. with AI assistance programmer would be quicker, with less friction, more productive, enabled to make queries beyond his current abilities

2. with AI assistance Non-programmer would be enabled to use SQL at all

3. real world queries are often not trivial (todays developers have simple queries covered by ORM / query building tools)

Regarding real life queries - I look at my last query that I crafted with difficulty and AI help - starts `WITH RECURSIVE`, uses `UNION ALL`, `GROUP_CONCAT`, `COALESCE` (even with SELECT statement inside), multiple CTEs. It would take me hours to get to that. I can have that in minutes with AI help. I don't even mention different dialects, feature support, arrays and JSONs, extensions, etc.