Remix.run Logo
ryoshu 3 days ago

I still struggle with ORMs. SQL is... declarative. If you're working with multiple RDBMSs, sure? Maybe I want my local dev to be sqlite and scaled be postgres? I've never run into that in production. A DSL on top of a DSL doesn't make a lot of sense.

t-writescode 3 days ago | parent | next [-]

Hand-rolling SQL inside another programming language comes with some unpleasantness, like protecting against SQL injection and making sure the SQL is valid, especially when hand-constructing the query based on input parameters: “sort ascending? Descending? Filter all but things in this group? etc.”

Parameter management in some languages are unpleasant, like how JDBC only has positional arguments; and any time you do string concat in a language, you start getting in danger of misformed SQL.

Ultra basic ORMs, like Exposed (Kotlin), are well-tested frameworks that do exactly what I want. Want a single value in the =? Or want it to be an “in”? Or what if it’s null? Handled. No special string management. Want parameters? Handled.

When I see pure ORM’d code, I can feel safe expecting it to be protected from injection and formatting issues. It’s reduced cognitive load and greater safety.

When I see raw string SQL management, I have to put another layer of care and attention to try and make sure (and maybe still fail) there’s no application-crashing mistakes in that part of code.

It’s kinda like working with typed and compiled code. Greater protection from error.

webstrand 3 days ago | parent | next [-]

It sounds like you're describing a query builders which, unlike true ORMs, don't attempt to mask the object-relational boundary. They only help you build queries in a convenient way and sometimes type-safe way. Query builders are great.

ORMs are not query builders. The problem with ORMs is that they hide the query logic from you. It's not clear what's getting joined in, how its getting joined in, or if its actually 1 or N queries. The further you get from a straight query builder, too, the fewer SQL features you have access to, such as parameterized joins, CTEs, window functions, etc. Sometimes you can hack those into the ORM, but often you have to resort to string concat and build the parameterized query and arguments manually.

I've never used Exposed, but from what I can tell it's kind of a hybrid? the query builder parts look great, but I'd still be wary of the ORM parts.

I've never had a good experience debugging performance issues in ORM-generated queries. Maybe I haven't used the right ones, but the libraries I've used have gone out of their way to hide how the query is actually being executed and only have obtuse levers to tweak that execution. Sure you can see the raw logs of what queries the ORM executed, but you can't easily figure out why its chosen a particular strategy. Or you can't easily make it stop using a pathological join ordering.

prmph 3 days ago | parent | next [-]

If one can update the underlying Db, I've developed an ORM pattern that I use in my projects that works very well.

They keys is to encapsulate most of the (possibly complex) CRUD logic in Db functions (for retrieval these would be table-valued functions) and access these from the application side as virtual tables.

I also have capable but easy to use filtering/sorting/paging operators and combinators in the ORM that are translated into SQL where/sort/limit clauses.

Because the heavy lifting is already done by the db functions (which you have full control of to use whatever SQL you need), the pattern is actually quite powerful but easy to use.

You can define virtual read only tables, several virtual tables that access the same actual table in different way, custom operators that transcend SQL, etc

webstrand 15 hours ago | parent [-]

Doesn't this break the object-relation mapping the ORM would be trying to do? Since the views would be producing resultsets that do not map to one single object? Or do you just define a new object type with virtual relations?

prmph 14 hours ago | parent [-]

Exactly. I define a pair of DTOs for each virtual table: an input type (the shape of a record that can be inserted) and an output one (the shape of each record returned when you select from the virtual table, which might include generated, computed, and foreign columns, etc).

These types do not necessarily have to map to the underlying table types.

andoando 3 days ago | parent | prev [-]

I HATE ORMs. I end up spending an hour or two trying to figure out why something isnt working for what should be a simple query.

Theyre also seem quite restrictive to what raw sql can do.

3 days ago | parent [-]
[deleted]
monkeyelite 3 days ago | parent | prev [-]

You’re arguing against a straw man. All major language sql libraries are not based on string manipulation and provide things like escaping, arguments, etc out of the box.

kaoD 3 days ago | parent | next [-]

Query builders are still an antipattern (what we traditionally think of when we say query builders) because they are still a DSL that (1) you have to learn along with SQL and (2) never map cleanly to actual SQL, so you always have to resort to `.raw` calls when you need anything more complex than a simple SELECT+JOIN.

Even for simple SELECTs, I lost count of how many times I had to enable runtime DEBUG logging in my query builders to get a query that I can copy-paste into my DB client for debugging, data exploring, etc. I should be able to copy-paste my query from code and run it in `psql` with minimal changes.

Raw SQL is the way to go[0]. Period. You just need a thin wrapper on top of it that adds escaping, arguments, type-safety and other QoL.

[0] https://gajus.medium.com/stop-using-knex-js-and-earn-30-bf41...

esafak 3 days ago | parent | next [-]

That thin wrapper is a query builder.

kaoD 3 days ago | parent [-]

> antipattern (what we traditionally think of when we say query builders)

3 days ago | parent [-]
[deleted]
monkeyelite 2 days ago | parent | prev [-]

I'm with you.

ameliaquining 3 days ago | parent | prev [-]

Only for parameterization over scalar values. If you want to do any kind of composition more sophisticated than that, you're either stitching together strings or using some kind of more heavyweight abstraction like an ORM.

monkeyelite 3 days ago | parent | next [-]

That’s because the composition is supposed to be inside sql. Views, functions, etc.

This is another reason why the ORM is a leaky abstraction - it hides all the best features from you.

ameliaquining 3 days ago | parent | next [-]

I suspect the biggest reason those aren't more popular is that they usually have to be stored as state in the database, which isn't what you want when developing an application. You want all of your query logic to be versioned with your application code.

branko_d 3 days ago | parent | next [-]

> You want all of your query logic to be versioned with your application code.

SQL can be stored in version control just as well as any other code. This can include application-level queries as well as the DDL SQL which defines the actual structure of your database.

It's sad that tooling for this kind of workflow doesn't seem to be particularly good across the board, Visual Studio being somewhat of an exception.

ameliaquining 3 days ago | parent [-]

The problem isn't version-controlling the SQL; it's making sure that, when a particular build of your app executes a query that calls out to a user-defined function, it uses the SQL function code from the same commit that the app itself was built from. Otherwise you have a potential source of version skew, and those are always really annoying and unpleasant to deal with.

I think Postgres sort of supports this but it's really clunky, and also I think you'd have to run the function-creation code on every connection; I don't know whether that would create any performance problems.

What does Visual Studio do?

snuxoll 3 days ago | parent | next [-]

You just mentioned PostgreSQL, which, like pretty much every RDMBS sans MySQL, DB2/400, and maybe DB2/ZOS (never worked with the ZOS variant) supports schemas. If you need to worry about this, keep your tables that actually contain your data in one schema, then keep views, functions/stored procedures, etc. in separate schemas every time you make incompatible changes.

The database is a separate component, the biggest mistake nearly every developer makes is trying to make a single application own it. Let me tell you, from experience, the chances that this one single application will be the only thing that every connects to your database past initial development is slim. "Oh, but we're doing microservices and everything only ever talks over HTTP or gRPC" - yeah, sure, and I don't have a backlog of tech debt a mile long.

branko_d 3 days ago | parent | prev [-]

> when a particular build of your app executes a query that calls out to a user-defined function, it uses the SQL function code from the same commit that the app itself was built from

I don't really see a "problem" here. Having everything in the same repo is probably the easiest way to ensure that the client cannot go out of sync with the database. When making a release, create a Git tag and deploy both client and database from there.

But you must make sure you know which tag is deployed at each customer. We use naming conventions for that (the name of the customer is part of the Git tag name), or you could hold that mapping externally if necessary.

Once you have that, making a client-level hotfix for a specific customer is (relatively) easy - just branch from the customer's release tag, do the changes you need, run the tests, and there is high probability everything will work properly. Once you release the hotfix, you create another tag and remember that this tag is now installed at the customer, and so on...

If you make changes to the database, then of course you still need to have an upgrade procedure from one version of the database to another, but you can be confident that the client will never query the "wrong" database version. And since both old and new database structure are just SQL files under different Git tags, you can discover exactly what changed by just by diffing.

> What does Visual Studio do?

Visual Studio has a special SQL project type, where you can keep you base table definitions, as well as all the SQL code (stored procedures, functions, views, user-defined types, indexes etc.). You can group the SQL project together with client projects, tools, automated tests and benchmarks in the same "solution" (kind of a project-of-projects). When working on the product, you load that solution, and you have all dependencies in one place, right in the Solution Explorer. This eases navigation and minimizes dependencies whose source code you cannot (easily) see.

After you make your changes, you deploy to the local database (we use the free SQL Server Developer Edition, other databases have their own free editions), run/debug tests, run/debug clients as needed. You can even start debugging a C# test or client application and step into T-SQL stored procedure seamlessly, which can be a very powerful tool for ferreting-out difficult bugs. When you are done with your changes, commit to Git, let the CI double-check the tests and make the build for the the last-row-of-defense manual QA testing. Then deploy, associate the new Git tag to the customer, rinse-and-repeat...

Basically, we treat SQL like every other code, and manage SQL dependencies not fundamentally unlike any other dependencies.

monkeyelite 3 days ago | parent | prev [-]

In most organizations a database is broader than any individual application - both in lifecycle and scope. So it makes sense that this state exists in a different way.

I suspect it’s because people never learned to use them, but they did learn to use the ORM.

ameliaquining 3 days ago | parent [-]

The textbooks all say that, but is it really true in practice? I suspect it's a bit of holdover conventional wisdom from an earlier era of software. Usually, when I've created a database for a particular app, it was never used by anything else besides that app (and manual queries, for which versioning is mostly not relevant).

You might choose to have a set of views and functions that are versioned separately from your app, for the same reasons you might choose to set up a microservice. But as with microservices, it should be a deliberate decision that you make because you're getting something out of it that outweighs the complexity costs and version-skew headaches; it should not be the default way to do composition.

baq 3 days ago | parent | next [-]

It’s absolutely true. Nobody should be building a microservice architecture for a product without a proven market fit, monolithic development is just faster at this stage, and when you get around to scaling it once you outgrow it (which should be much later than conventional microservice wisdom agrees) the database isn’t the bottleneck anyway and hence stays monolithic (don’t confuse with adding shards).

monkeyelite 2 days ago | parent | prev [-]

Making two views and deprecating one is actually the simplest possible compatibility strategy - just like adding a new version of a function to a header.

3 days ago | parent | prev [-]
[deleted]
branko_d 3 days ago | parent | prev [-]

> Only for parameterization over scalar values.

ADO.NET has full support for table-valued parameters.

ameliaquining 3 days ago | parent [-]

That's part of .NET Framework and therefore legacy, right? Do the database libraries from recent versions of .NET do this?

In any case, it's just one framework; previous comment said "all major languages". And it's useful to be able to abstract and compose over expressions and predicates and such, not just data values, which this still doesn't help with.

branko_d 3 days ago | parent [-]

> That's part of .NET Framework and therefore legacy, right? Do the database libraries from recent versions of .NET do this?

ADO.NET is available both in the legacy Windows-only .NET Framework and in the new cross-platform .NET (previously known as .NET Core).

> In any case, it's just one framework; previous comment said "all major languages".

Well, you are not implementing a piece of code in "all major languages" - you can pick the one that fits the problem best.

> And it's useful to be able to abstract and compose over expressions and predicates and such, not just data values, which this still doesn't help with.

You can do that via LINQ - there is even special query-like syntax built right into C# for that that looks like this:

    var companyNameQuery =
        from cust in nw.Customers
        where cust.City == "London"
        select cust.CompanyName;
This does NOT load the entire table in memory just to filter on City. It actually transpiles to SQL which does the filtering on the server.

But anything non-trivial is much better done in SQL proper, IMO. Most of the time, at least for OLTP, you'll be using static SQL - that is you will not need to change the text of the SQL query, just parameters. But dynamic SQL is a thing and can be very useful on occasion - which is string concatenation with all the problems that might bring.

never_inline 3 days ago | parent | prev | next [-]

How do you do conditional filters in pure SQL from a backend Java / Python app, without doing string concatenation?

Not a fan of all the proxy object circus ORMs do but I'd leave row-> domain object mapping and filter building to some library. Sweet spot is probably something akin to Android Room / Micronaut Data JDBC.

minitech 3 days ago | parent | next [-]

Query builders that operate at the SQL level. (A popular example of that in Python is SQLAlchemy Core, but there are better ways to do it, especially in better-typed languages.)

yencabulator 2 days ago | parent [-]

I was just pondering, for a little Rust project of mine, whether to suffer some weird ORM-smelling query builder, or to just build the AST with https://github.com/apache/datafusion-sqlparser-rs/ and convert to string..

foobazgt 3 days ago | parent | prev | next [-]

JOOQ (http://jooq.org) is pretty fantastic for this, and it's my go-to for working with RDBMs' on the JVM. It provides a DSL-like API that lets you write pretty much any SQL you need in a type-safe way (without string concatenation).

crazygringo 3 days ago | parent | prev | next [-]

What's wrong with string concatenation?

whatevaa 3 days ago | parent | next [-]

Guaranteed source of bugs in complex cases.

crazygringo 3 days ago | parent [-]

More complex cases are more likely to have bugs period, just in their logic.

String concatenation isn't really a major source of that. Just make sure your parentheses match, as you need to do no matter what, and include a space at the start and end of each string to make sure you don't accidentally smush terms together likethis.

t-writescode 3 days ago | parent | prev [-]

Simpler SQL injection risk and more testing to make sure all potential branching paths don’t result in invalid SQL.

webstrand 3 days ago | parent [-]

There's zero danger of sql injection so long as everything is being passed by parameters. You just concatenate placeholders when you need string concatenation to build the query.

crazygringo 3 days ago | parent [-]

Exactly this.

And if you're testing, you've got to test every query combination anyways. It's not just syntax that can be wrong, but logic and performance.

sgarland 3 days ago | parent | prev | next [-]

SQL has CASE statements, if you’d really like to have all branching logic in pure SQL.

paulddraper 3 days ago | parent | prev [-]

String concatenation

Xss3 3 days ago | parent [-]

No, we must build 16 more layers of pointless abstraction in a new DSL.

chillfox 3 days ago | parent | prev | next [-]

My main issue with ORMs is they always end up being just another thing to learn, adding needless complexity. They are not an alternative to SQL as you always end up having to understand what kind of SQL they create and how it works for either performance or complex queries.

dec0dedab0de 3 days ago | parent | prev | next [-]

I just want to write one language at a time if I can. I like sql when querying directly, almost as a UI of sorts, but it’s not my favorite when I am just trying to get my code to work, and the database is a relatively minor detail.

ozgrakkurt 3 days ago | parent | prev | next [-]

Can’t relate this comment to the article. They can’t just run user sql on DB because they are changing internal db schema between releases. And they can’t implement real sql because it is massive compared to some simple query dsl

lmm 3 days ago | parent | prev [-]

SQL is just extremely bad on top of being poorly integrated with the host language. Middle-endian order, terrible abstraction capabilities, no test support to speak of, essentially no project management tooling...

I use ORMs so that I can write the thing I want to do in a vaguely reasonable language, just like I manipulate XML datastructures in code instead of writing XSLT.