Why are NoSQL databases more scalable than SQL?

Recently I have read a lot about NoSQL DBMSs. I understand the CAP theorem, ACID rules, BASE rules and the basic theory, but I didn’t find any resources on why NoSQL scales more easily than an RDBMS (e.g. in the case of a system that requires lots of DB servers).

I guess that maintaining constraints and foreign keys costs resources, and that this becomes a lot more complicated when a DBMS is distributed. But I expect there’s a lot more to it than this.

Can someone please explain how noSQL/SQL affects scalability?

4

NoSQL databases give up a massive amount of the functionality that a SQL database gives you by its very nature.

Things like automatic enforcement of referential integrity, transactions, etc. These are all things that are very handy to have for some problems, and which require some interesting techniques to scale outside of a single server (think about what happens if you need to lock two tables for an atomic transaction, and they are on different servers!).
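To make the cross-server case concrete, here is a minimal sketch of two-phase commit, one common technique for making a transaction atomic across servers. The `Participant` class and `two_phase_commit` function are hypothetical names used for illustration, not any real database's API, and a real implementation would also have to handle timeouts, crashes and recovery.

```python
class Participant:
    """Hypothetical stand-in for one database server holding one table."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.locked = False
        self.committed = False

    def prepare(self):
        # Phase 1: take the lock and vote on whether the commit can succeed.
        self.locked = True
        return self.can_commit

    def commit(self):
        # Phase 2 (success path): make the change visible and release the lock.
        self.committed = True
        self.locked = False

    def rollback(self):
        # Phase 2 (failure path): discard the change and release the lock.
        self.committed = False
        self.locked = False


def two_phase_commit(participants):
    """Commit on all participants or on none; returns True on commit."""
    # Every participant keeps its lock from prepare() until the decision arrives.
    votes = [p.prepare() for p in participants]
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


orders = Participant("orders")
stock = Participant("stock", can_commit=False)
print(two_phase_commit([orders, stock]))  # False
```

In this sketch each phase stands for a network round trip to every server, and all locks are held for the whole exchange, which is where the scaling cost comes from.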

noSQL databases don’t have all that. If you need that stuff, you need to do it yourself, but if you DON’T need it (and there are a lot of applications that don’t), then boy howdy are you in luck. The DB doesn’t have to do all of these complex operations and locking across much of the dataset, so it’s really easy to partition the thing across many servers/disks/whatever and have it work really fast.
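As a rough illustration of why partitioning becomes easy once no operation spans keys, here is a minimal sketch of hash-based sharding for a key-value store. `ShardedStore` and `node_for_key` are hypothetical names, and the "nodes" are plain in-memory dicts standing in for separate servers.

```python
import hashlib


def node_for_key(key, nodes):
    """Pick the node that owns `key` using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


class ShardedStore:
    """Each key lives on exactly one node, so no request touches two nodes."""

    def __init__(self, node_names):
        self.node_names = list(node_names)
        # One dict per node; in a real deployment these would be separate servers.
        self.shards = {name: {} for name in self.node_names}

    def put(self, key, value):
        self.shards[node_for_key(key, self.node_names)][key] = value

    def get(self, key):
        return self.shards[node_for_key(key, self.node_names)].get(key)


store = ShardedStore(["node-a", "node-b", "node-c"])
store.put("user:42", {"name": "Ada"})
print(store.get("user:42"))  # {'name': 'Ada'}
```

Because every read and write is routed by the key alone, adding nodes adds capacity without any cross-node locking. The simple modulo scheme used here moves most keys when the node count changes; it is only meant to show the routing idea.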

6

It’s not about NoSQL vs SQL, it’s about BASE vs ACID.

Scalable has to be broken down into its constituents:

  • Read scaling = handle higher volumes of read operations
  • Write scaling = handle higher volumes of write operations

ACID-compliant databases (like traditional RDBMS’s) can scale reads. They are not inherently less efficient than NoSQL databases because the (possible) performance bottlenecks are introduced by things NoSQL (sometimes) lacks (like joins and where restrictions) which you can opt not to use. Clustered SQL RDBMS’s can scale reads by introducing additional nodes in the cluster. There are constraints to how far read operations can be scaled, but these are imposed by the difficulty of scaling up writes as you introduce more nodes into the cluster.

Write scaling is where things get hairy. There are various constraints imposed by the ACID principle which you do not see in eventually-consistent (BASE) architectures:

  • Atomicity means that transactions must complete or fail as a whole, so a lot of bookkeeping must be done behind the scenes to guarantee this.
  • Consistency constraints mean that all nodes in the cluster must be identical. If you write to one node, this write must be copied to all other nodes before returning a response to the client. This makes a traditional RDBMS cluster hard to scale.
  • Durability constraints mean that in order to never lose a write you must ensure that before a response is returned to the client, the write has been flushed to disk.
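The cost of copying a write to every node before responding can be shown with a small back-of-the-envelope model. This is a simplified sketch: the function names and the millisecond figures are made up for illustration, and it assumes replicas are written in parallel.

```python
def sync_write_latency_ms(local_ms, replica_ms):
    """Client waits for the local write plus the slowest replica."""
    return local_ms + max(replica_ms, default=0)


def async_write_latency_ms(local_ms, replica_ms):
    """Client waits for the local write only; replicas catch up later."""
    return local_ms


replicas = [2, 3, 40]  # hypothetical per-replica round-trip times in ms
print(sync_write_latency_ms(1, replicas))   # 41
print(async_write_latency_ms(1, replicas))  # 1
```

Under this model, a synchronous write is only as fast as the slowest node, so every node added to the cluster is another chance for a slow acknowledgement.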

To scale up write operations or the number of nodes in a cluster beyond a certain point you have to be able to relax some of the ACID requirements:

  • Dropping Atomicity lets you shorten the duration for which tables (sets of data) are locked. Examples: MongoDB, CouchDB.
  • Dropping Consistency lets you scale up writes across cluster nodes. Examples: Riak, Cassandra.
  • Dropping Durability lets you respond to write commands without flushing to disk. Examples: Memcached, Redis.

NoSQL databases typically follow the BASE model instead of the ACID model. They give up the A, C and/or D requirements, and in return they improve scalability. Some, like Cassandra, let you opt into ACID’s guarantees when you need them. However, not all NoSQL databases are more scalable all the time.
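One common way that replicated stores let you opt into stronger guarantees per request is through quorum sizes: with N replicas, a read that contacts R replicas is guaranteed to overlap a write acknowledged by W replicas whenever R + W > N. The sketch below only checks that condition; the function name is hypothetical and is not Cassandra's API.

```python
def reads_see_latest_write(n_replicas, write_acks, read_replicas):
    """True when every read set must overlap every acknowledged write set."""
    if not (1 <= write_acks <= n_replicas and 1 <= read_replicas <= n_replicas):
        raise ValueError("quorum sizes must be between 1 and n_replicas")
    return write_acks + read_replicas > n_replicas


print(reads_see_latest_write(3, 2, 2))  # True
print(reads_see_latest_write(3, 1, 1))  # False
```

The trade-off is the same one described above: larger quorums give stronger read guarantees but make each request wait on more nodes. Note that this covers only whether a read overlaps the latest write, not multi-row transactions.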

The SQL API lacks a mechanism to describe queries where ACID’s requirements are relaxed. This is why the BASE databases are all NoSQL.

Personal note: one final point I’d like to make is that in most cases where NoSQL is currently being used to improve performance, a solution would be possible on a proper RDBMS by using a correctly normalized schema with proper indexes. As proven by this very site (powered by MS SQL Server), RDBMSs can scale to high workloads if you use them appropriately. People who don’t understand how to optimize RDBMSs should stay away from NoSQL, because they don’t understand what risks they are taking with their data.

Update (2019-09-17):

The database landscape has evolved since I posted this answer. While there is still a dichotomy between the RDBMS ACID world and the NoSQL BASE world, the line has become fuzzier. NoSQL databases have been adding features from the RDBMS world, like SQL APIs and transaction support. There are now even databases which promise SQL, ACID and write scaling, like Google Cloud Spanner, YugabyteDB or CockroachDB. Typically the devil is in the details, but for most purposes these are “ACID enough”. For a deeper dive into database technology and how it has evolved you can take a look at this slide deck (the slide notes have the accompanying explanation).

21

It’s true that NoSQL databases (MongoDB, Redis, Riak, Memcached, etc.) don’t maintain foreign key constraints, and atomic operations must be more explicitly specified. It’s also true that SQL databases (SQL Server, Oracle, PostgreSQL, etc.) can be scaled to handle very large performance requirements by seasoned DBAs.

NoSQL databases allow seasoned programmers, who are well aware of race conditions and atomic operations, to forgo a large amount of processing that is only required in a small percentage of today’s web application code. NoSQL databases certainly have atomic operations, and almost all transactional requirements present in SQL databases can also be met in NoSQL databases. The difference is the level of abstraction. NoSQL databases remove the higher levels of abstraction and hand that capability to the application programmer, resulting in faster code overall but with an increased probability of data corruption by unseasoned programmers.

As a result, we are likely to see NoSQL databases used more and more heavily in the web application space, where development time and performance are very important. Financial and corporate software is likely to retain its SQL heritage because hardware performance is relatively cheap, seasoned DBAs are on hand, and the increased risk caused by unseasoned programmers is not palatable.

1

From IBM developerWorks: Supply cloud-level data scalability with NoSQL databases

Scalability means that the system should be able to support very large databases with very high request rates at very low latency.

NoSQL systems have a number of design features in common:

  • The ability to horizontally scale out throughput over many servers.
  • A simple call level interface or protocol (in contrast to a SQL binding).
  • Support for weaker consistency models than the ACID transactions in most traditional RDBMS.
  • Efficient use of distributed indexes and RAM for data storage.
  • The ability to dynamically define new attributes or data schema.

Why relational databases may not be optimal for Scaling

In general, relational database management systems have been considered as a “one-size-fits-all solution for data persistence and retrieval” for decades. They have matured after extensive research and development efforts and very successfully created a large market and solutions in different business domains.

The ever-increasing need for scalability and new application requirements have created new challenges for traditional RDBMS, including some dissatisfaction with this one-size-fits-all approach in some web-scale applications. The answer to this has been a new generation of low-cost, high-performance database software designed to challenge dominance of relational database management systems. A big reason for the NoSQL movement is that different implementations of web, enterprise, and cloud computing applications have different requirements of their databases — not every application requires rigid data consistency, for example.

Another example: For high-volume websites like eBay, Amazon, Twitter, or Facebook, scalability and high availability are essential requirements that cannot be compromised. For these applications, even the slightest outage can have significant financial consequences and impact customer trust.

Over on DBA.SE: What does horizontal scaling mean?

Horizontal Scaling is essentially building out instead of up. You don’t go and buy a bigger beefier server and move all of your load onto it, instead you buy 1+ additional servers and distribute your load across them.

Horizontal scaling is used when you have the ability to run multiple instances on servers simultaneously. Typically it is much harder to go from 1 server to 2 servers than it is to go from 2 to 5, 10, 50, etc.

Once you’ve addressed the issues of running parallel instances, you can take great advantage of environments like Amazon EC2, Rackspace’s Cloud Service, GoGrid, etc., as you can bring instances up and down based on demand, reducing the need to pay for server power you aren’t using just to cover those peak loads.

Relational Databases are one of the more difficult items to run full read/write in parallel.
