Consistency in Databases

It is one of the ACID properties which is often traded-off between different database systems (relational/ nosql/ graph etc.) (heard of eventually consistent DB?).

It is often sacrificed for speed, scalability in some databases

It can be categorised in two parts:

  • Consistency in Data
  • Consistency in Read

Data Consistency

  • That a transaction leaves the updated database in a consistent state, i.e. where all applicable constraints and integrity (referential) rules on the data are satisfied.
  • The user is responsible for defining the constraints and integrity rules which in turn define the data-consistency in database
  • Atomicity of committed transactions is also maintained (assuming the transaction itself is atomic from application perspective)
  • Problems in Isolation of transactions may also cause inconsistency in data.

Example:

Let’s take below tables for our example database:

SALE_SUMMARY: the table is expected to summarise all the sales made through our system so far with items being sold and total quantity sold for each item

SALES: The table to persist individual sale transactions done in the system, with quantity sold in each transaction and price at which each item was sold

Dummy tables for Sales and Sale-summary data to understand Data consistency
Dummy tables for Sales and Sale-summary data to understand Data consistency

Based on our understanding of tables we expect that there were 10 quantities of item_id=1 (iPhone) were sold. But summing up the total qty_sold in the SALES table for ITEM_ID=1 we see that only 6 quantity sales were recorded in this. Similarly, there’s an entry for ITEM_ID=3 being sold in the SALES table whereas there’s no corresponding summary record in the SALE_SUMMARY table (or maybe there was a record earlier which got deleted). 

This kind of scenario depicts violation of the data-integrity or consistency as required from application’s business perspective.

Read Consistency

  • A change committed by a transaction should be visible to the next subsequent transaction immediately
  • It affects the system design and behaviour as a whole
  • Almost all types of databases (relational, NoSQL) suffer from this. More so when they’re scaled horizontally…
  • Eventual consistency comes in this part; with the assumption that data-consistency is maintained

Example:

Let’s consider the following database architecture setup. We have 1 primary database: P-DB (could be of any type: Relational, NoSQL etc) and two read replicas which are in auto-replication with the primary DB and are used only to serve read-requests. Note that the replication of data after writing to primary DB could take some time.

Database architecture with read replica setup for increased availability giving rise to read consistency phenomena
Database architecture with read replica setup for increased availability

So once a write request arrives and makes some write operation on the P-DB, it could take some time (Δt) to propagate to our read replica databases. Before this time, if a read request arrives looking for the change made by the previous write request, the result returned could be something old/ undesirable. And this basically is what inconsistency in “Read” is.

Above structure is also what gives rise to the term “Eventual Consistency”. The read replicas databases in above architecture would get the data after some time (eventually) it gets written in the Primary DB and thus becomes consistent with the Primary at some point of time later.

Rate this post

Leave a Reply