SQL can scale just as well as NoSQL?
1 point by jonasbrouthers 14 hours ago | 3 comments
I always believed (and people always told me) that popular NoSQL implementations scale better than RDBMSs because they are better at horizontal partitioning. But lately I've been wondering whether horizontal scaling is simply easier with NoSQL (i.e. less complex because there are fewer referential integrity constraints, and implemented automatically in many cloud offerings) rather than superior.

Can't SQL scale just as well if you implement sharding? You do the same thing these NoSQL DBs are already doing under the hood: add more machines and divide your data across them. Of course, you have to think about your access patterns to pick shard keys that don't spread data you need to join across different shards, and you have to implement and maintain the actual sharding, which I'm not saying is an easy task. But isn't this extra work only needed because SQL gives you relationships and joins in the first place? With NoSQL, to support similar queries, you have to denormalize (something you can do in SQL as well) or make multiple queries to multiple tables, which is inefficient.

Basically, I think of it like this:

In the best case for NoSQL, there are no joins and you're just reading from one big flat table. In that case you can easily achieve the same scalability with SQL by sharding on any key, without any complex sharding design.

In the worst case for NoSQL, you have to do a join-like query combining data from multiple tables (e.g. all comments for a user), which requires multiple queries. With SQL, if you design your sharding correctly, the join stays on one machine and is therefore efficient, AND relations are enforced.

Therefore, SQL is superior in any case, but just requires more thought and work to achieve the same scalability given its enforcement of relationships?

^ Is this an accurate take? :P
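To make the "design your sharding correctly" case concrete, here is a minimal sketch of application-level sharding in Python. It assumes a hypothetical two-table schema (`users`, `comments`), a fixed `NUM_SHARDS`, and in-memory SQLite databases standing in for separate machines; none of these names come from the post. Both tables are routed by `user_id`, so the "all comments for a user" join runs on a single shard and the foreign key is still enforced there.

```python
import sqlite3
import zlib

NUM_SHARDS = 4  # hypothetical cluster size; each connection stands in for one machine


def shard_for(user_id: int) -> int:
    """Map a shard key to a shard index with a hash that is stable across processes."""
    return zlib.crc32(str(user_id).encode()) % NUM_SHARDS


def make_shard() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled
    db.executescript(
        """
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE comments (
            id INTEGER PRIMARY KEY,  -- unique per shard only, not globally
            user_id INTEGER NOT NULL REFERENCES users(id),
            body TEXT NOT NULL
        );
        """
    )
    return db


shards = [make_shard() for _ in range(NUM_SHARDS)]


def add_user(user_id: int, name: str) -> None:
    db = shards[shard_for(user_id)]
    with db:  # commit on success, roll back on error
        db.execute("INSERT INTO users (id, name) VALUES (?, ?)", (user_id, name))


def add_comment(user_id: int, body: str) -> None:
    # Routed by user_id, not by comment id, so a user's comments
    # always land on the same shard as the user row.
    db = shards[shard_for(user_id)]
    with db:
        db.execute(
            "INSERT INTO comments (user_id, body) VALUES (?, ?)", (user_id, body)
        )


def comments_for_user(user_id: int) -> list:
    """Single-shard join: no cross-machine traffic is needed."""
    db = shards[shard_for(user_id)]
    return db.execute(
        """
        SELECT u.name, c.body
        FROM users AS u
        JOIN comments AS c ON c.user_id = u.id
        WHERE u.id = ?
        ORDER BY c.id
        """,
        (user_id,),
    ).fetchall()
```

The trade-off shows up as soon as a query is not keyed by `user_id`: something like "all comments containing a word" has to fan out to every shard and merge the results in the application, which is the design work the post is talking about.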
rawgabbit 13 hours ago
SQL can actually scale if you use a CQRS architecture built from two SQL instances. The first is an append-only instance; that is, you do nothing but inserts into a table. You then use database replication to create a second, read-only instance. You use the second instance for reports, long-running transactions, and other queries that take massive locks. You could use NoSQL for the first instance, but then you have to write your own integration sync.
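A rough sketch of that two-instance layout, again in Python with in-memory SQLite. The `order_events` table and the `replicate()` helper are hypothetical: in a real deployment the database engine's own replication would ship the rows, and `replicate()` here is only a hand-rolled stand-in so the example is runnable.

```python
import sqlite3

# Two separate databases: one takes only inserts, the other only serves reads.
write_db = sqlite3.connect(":memory:")
read_db = sqlite3.connect(":memory:")

write_db.execute(
    """
    CREATE TABLE order_events (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        order_id INTEGER NOT NULL,
        amount_cents INTEGER NOT NULL
    )
    """
)
read_db.execute(
    """
    CREATE TABLE order_events (
        id INTEGER PRIMARY KEY,
        order_id INTEGER NOT NULL,
        amount_cents INTEGER NOT NULL
    )
    """
)


def record_event(order_id: int, amount_cents: int) -> None:
    """Command side: append-only, never UPDATE or DELETE."""
    with write_db:
        write_db.execute(
            "INSERT INTO order_events (order_id, amount_cents) VALUES (?, ?)",
            (order_id, amount_cents),
        )


def replicate() -> int:
    """Stand-in for database replication: copy rows the read side has not seen."""
    last_seen = read_db.execute(
        "SELECT COALESCE(MAX(id), 0) FROM order_events"
    ).fetchone()[0]
    new_rows = write_db.execute(
        "SELECT id, order_id, amount_cents FROM order_events WHERE id > ? ORDER BY id",
        (last_seen,),
    ).fetchall()
    with read_db:
        read_db.executemany("INSERT INTO order_events VALUES (?, ?, ?)", new_rows)
    return len(new_rows)


def order_totals() -> list:
    """Query side: a report-style aggregate that never touches the write instance."""
    return read_db.execute(
        """
        SELECT order_id, SUM(amount_cents)
        FROM order_events
        GROUP BY order_id
        ORDER BY order_id
        """
    ).fetchall()
```

Because corrections are recorded as new rows (for example a negative `amount_cents`) rather than updates, the write side never contends with report queries. The cost is that reads lag behind writes until replication catches up.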
FrankWilhoit 14 hours ago
Any real-world business semantic model is going to have LOTS of joins. The game is to make them more efficient. But first and always, remember that it doesn't matter how fast you get the wrong answer.
wmf 13 hours ago
Yes, NoSQL is pointless.