In this article we discuss about what is a Database Transaction, what are its intricacies and why it’s needed?
A transaction is simply “a collection of SQL queries which are treated as a single unit of work.“
What is the need of a db transaction?
Let’s think about the scenario where transactions were not possible, i.e. we are forced to write a single query to achieve every goal of a task… Let us take an example. Suppose we have a database of a retail store with two tables:
Now whenever a customer wants to purchase an item we check if the item has enough quantity available in our inventory (SELECT), if yes then we need to reduce it by the quantity that the customer is purchasing (UPDATE) and make an entry in our sales register (INSERT). Below are the representative queries to do these:
SELECT QTY FROM INVENTORY WHERE ITEM_ID = 1;
If QTY >= quantity needed by customer (2):
UPDATE INVENTORY SET QTY = QTY - QTY_SOLD WHERE ITEM_ID = 1;
INSERT INTO SALES (SALE_ID, ITEM_ID, QTY_SOLD, SALE_PRICE) VALUES(1, 1, 2, 999.0);
And all this work needs to be done as a single unit or work, i.e. do all or do none. Can you think of doing all this with a single SQL query? Well, I can’t think of any, besides stored-procedures may be…
Many times it becomes very difficult or impossible to do so… hence we are forced to break the task into multiple queries and execute them as a single unit of work.
Transaction lifespan – BEGIN to COMMIT
- A DB transaction starts with a keyword like BEGIN.
- Only when a transaction is “COMMITTED” successfully is when a database would guarantee that the changes made during the transaction would be persisted and would be available to further transactions.
- So the question is what happens when we “COMMIT” a transaction. Think about how you would implement the commit part -> write to disk with individual query change vs write to memory for individual query and then write to disk during ‘committing’. Consequently, what about rollback in each approach? In the later one it’s quite straight forward, you just discard the changes in-memory and rollback is done but it can be quite difficult to rollback in the first approach.
- What happens if during “COMMIT” we have a database crash?
When do transactions matter
- Only when we change/ add data?
- How about a read-only transaction? (Hint: think of generating reports…)
The questions raised in this POST form some of the points of differences between various DBMS systems out in the market. Honestly, there is no right or wrong approach to answering these questions, well not really… It all boils down to trade-offs… What a DBMS implementer would like to optimise the system for.