Overview
Multi-Version Concurrency Control (MVCC) is a concurrency control mechanism widely used in modern database management systems, including PostgreSQL. If you are new to the topic, [^1] and [^2] are good introductions. While MVCC is a well-established concept, its intricacies can be challenging to grasp, especially when bridging the gap between theory and practical implementation.
This blog post aims to teach MVCC through a hands-on implementation in Java. The goal is a simplified yet functional MVCC system that illustrates its core principles and mechanisms.
Basic MVCC in Java
Let’s start by creating the necessary classes and then implement a simple MVCC system.
```java
import java.util.concurrent.ConcurrentHashMap;
// ...
```
This implementation provides a basic MVCC system similar to PostgreSQL’s approach. Here’s a breakdown of the main components:
- `MVCCDatabase`: The main class that handles the MVCC operations.
- `VersionedData`: An inner class that represents a version of data with its associated transaction ID.
- `ConcurrentHashMap<String, List<VersionedData>>`: Stores the versioned data for each key.
- `AtomicLong currentTransactionId`: Generates unique transaction IDs.
The key MVCC concepts implemented here are:
- Each write operation creates a new version of the data.
- Read operations retrieve the most recent version of the data that’s not newer than the reading transaction.
- Transactions are identified by monotonically increasing IDs.
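The components described above can be sketched as follows. This is a minimal reconstruction under the stated assumptions, not the post's exact listing; the method names `beginTransaction`, `put`, and `get` are assumptions:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class MVCCDatabase {
    // One version of a value, stamped with the ID of the transaction that wrote it.
    static class VersionedData {
        final long txId;
        final String value;
        VersionedData(long txId, String value) { this.txId = txId; this.value = value; }
    }

    private final ConcurrentHashMap<String, List<VersionedData>> store = new ConcurrentHashMap<>();
    private final AtomicLong currentTransactionId = new AtomicLong(0);

    // Transactions are identified by monotonically increasing IDs.
    public long beginTransaction() { return currentTransactionId.incrementAndGet(); }

    // Each write appends a new version; old versions are retained.
    public void put(long txId, String key, String value) {
        List<VersionedData> versions = store.computeIfAbsent(key, k -> new ArrayList<>());
        synchronized (versions) {
            versions.add(new VersionedData(txId, value));
        }
    }

    // A read returns the newest version whose txId is not newer than the reader's.
    public String get(long txId, String key) {
        List<VersionedData> versions = store.get(key);
        if (versions == null) return null;
        synchronized (versions) {
            for (int i = versions.size() - 1; i >= 0; i--) {
                if (versions.get(i).txId <= txId) return versions.get(i).value;
            }
        }
        return null;
    }
}
```

With this sketch, a transaction that began before a later write keeps seeing the older version, which is the core MVCC behavior described above.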
This implementation is simplified and doesn’t include features like commit/rollback or garbage collection of old versions. In a real-world scenario, you’d need to add these features and handle concurrent access more robustly.
Implementing Serializable Txns
To implement serializable transactions in our MVCC system, we need to add several key features to ensure that concurrent transactions behave as if they were executed serially. Here’s what we need to add:
```java
import java.util.concurrent.ConcurrentHashMap;
// ...
```
To implement serializable transactions in our MVCC system, we’ve added the following key features:
Transaction Management:
- We now keep track of active transactions using the `activeTransactions` map.
- Each transaction has its own `Transaction` object that stores read and write sets.

Conflict Detection:
- Write-Write Conflicts: We check if any other active transaction has written to the same key.
- Read-Write Conflicts: We check if any other active transaction has written to a key that we’re trying to read.

Optimistic Concurrency Control (OCC): What we actually implement is a form of OCC. In this approach, transactions proceed without locking data items; if there’s a conflict, the transaction aborts (in our case, by throwing a `TransactionConflictException`).
- Read Phase: When a transaction reads data (in our `get` method), it adds the key to its read set.
- Write Phase: When a transaction writes data (in our `put` method), it adds the key to its write set and stores the write in a local buffer (our `writes` map in the `Transaction` class).
- Validation Phase: This occurs during the `put`, `get`, and `commitTransaction` methods:
  - We check for write-write conflicts by seeing if any other active transaction has written to the same key.
  - We check for read-write conflicts by seeing if any other active transaction has written to a key that we’re trying to read.
- Commit Phase: If validation succeeds, we apply all the writes atomically during the `commitTransaction` method.

Commit and Rollback:
- `commitTransaction`: Performs a final conflict check and applies all writes atomically.
- `rollbackTransaction`: Discards all changes made by the transaction.

Error Handling: We introduce a `TransactionConflictException` to handle serialization conflicts.

Visibility Rules:
- Transactions only see committed data from other transactions that started before them.
- A transaction’s own writes are immediately visible to itself.
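Putting these pieces together, here is a minimal sketch of the serializable variant. It is a reconstruction, not the post's exact listing; the class name `SerializableMVCC` and method names `begin`, `commit`, and `rollback` are assumptions:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class SerializableMVCC {
    public static class TransactionConflictException extends RuntimeException {
        TransactionConflictException(String msg) { super(msg); }
    }

    // Per-transaction state: read set plus a local write buffer.
    static class Transaction {
        final long id;
        final Set<String> readSet = new HashSet<>();
        final Map<String, String> writes = new HashMap<>();
        Transaction(long id) { this.id = id; }
    }

    private final Map<String, String> committed = new ConcurrentHashMap<>();
    private final Map<Long, Transaction> activeTransactions = new ConcurrentHashMap<>();
    private final AtomicLong nextTxId = new AtomicLong(0);

    public long begin() {
        long id = nextTxId.incrementAndGet();
        activeTransactions.put(id, new Transaction(id));
        return id;
    }

    public synchronized String get(long txId, String key) {
        Transaction tx = activeTransactions.get(txId);
        // Read-write conflict: another active transaction has buffered a write to this key.
        for (Transaction other : activeTransactions.values()) {
            if (other.id != txId && other.writes.containsKey(key)) {
                throw new TransactionConflictException("read-write conflict on " + key);
            }
        }
        tx.readSet.add(key);
        // A transaction's own writes are immediately visible to itself.
        if (tx.writes.containsKey(key)) return tx.writes.get(key);
        return committed.get(key);
    }

    public synchronized void put(long txId, String key, String value) {
        Transaction tx = activeTransactions.get(txId);
        // Write-write conflict: another active transaction has written the same key.
        for (Transaction other : activeTransactions.values()) {
            if (other.id != txId && other.writes.containsKey(key)) {
                throw new TransactionConflictException("write-write conflict on " + key);
            }
        }
        tx.writes.put(key, value); // buffered locally until commit
    }

    public synchronized void commit(long txId) {
        Transaction tx = activeTransactions.get(txId);
        // Final validation: abort if another active transaction wrote a key we read.
        for (Transaction other : activeTransactions.values()) {
            if (other.id == txId) continue;
            for (String k : tx.readSet) {
                if (other.writes.containsKey(k)) {
                    activeTransactions.remove(txId);
                    throw new TransactionConflictException("conflict on " + k);
                }
            }
        }
        committed.putAll(tx.writes); // apply all writes atomically
        activeTransactions.remove(txId);
    }

    public void rollback(long txId) {
        activeTransactions.remove(txId); // discard buffered writes
    }
}
```

Note the coarse `synchronized` on every operation: it keeps validation and application atomic in this sketch, at the cost of serializing all database calls.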
This implementation provides serializable isolation by ensuring that concurrent transactions behave as if they were executed serially. It does this by detecting and preventing conflicts that would violate serializability.
Key points to note:
- This is still a simplified implementation. A production-ready system would need additional features like deadlock detection and prevention.
- Performance optimizations, such as predicate locking for range queries, are not implemented here.
- This implementation may be prone to high abort rates under contention. Real databases often use more sophisticated techniques to reduce conflicts.
To further improve this system, we might consider:
- Implementing a scheduler to manage transaction order and reduce conflicts.
- Adding support for different isolation levels.
- Implementing a more efficient conflict detection mechanism, possibly using timestamps or multi-version timestamps.
- Adding support for distributed transactions.
Optimization - Timestamp-based conflict detection
In our current implementation, we’re using a relatively simple approach to detect conflicts: we check all active transactions for potential conflicts every time we perform a read or write operation. This can become inefficient as the number of concurrent transactions increases. Using timestamps or multi-version timestamps can provide a more scalable approach to conflict detection. Here’s how these mechanisms work:
- Timestamp-based conflict detection:
In this approach, each transaction is assigned a unique timestamp when it starts. Additionally, each data item maintains two timestamps:
- Read timestamp (RTS): The highest timestamp of any transaction that has read the item.
- Write timestamp (WTS): The highest timestamp of any transaction that has written to the item.
When a transaction tries to read or write an item, we can use these timestamps to detect conflicts:
- For a read operation: If the transaction’s timestamp is less than the item’s WTS, it means a newer transaction has already written to this item, so we have a conflict.
- For a write operation: If the transaction’s timestamp is less than either the item’s RTS or WTS, we have a conflict.
This approach allows for quicker conflict detection without needing to check all active transactions.
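The RTS/WTS rules above can be sketched as a small per-item check. This is an illustrative sketch; the class and method names (`TimestampedItem`, `tryRead`, `tryWrite`) are assumptions:

```java
// One data item under basic timestamp ordering.
class TimestampedItem {
    long readTimestamp = 0;   // RTS: highest timestamp of any reader
    long writeTimestamp = 0;  // WTS: highest timestamp of any writer
    String value;

    // Conflict if a newer transaction has already written this item.
    synchronized boolean tryRead(long txTimestamp) {
        if (txTimestamp < writeTimestamp) return false; // abort reader
        readTimestamp = Math.max(readTimestamp, txTimestamp);
        return true;
    }

    // Conflict if a newer transaction has already read or written this item.
    synchronized boolean tryWrite(long txTimestamp, String newValue) {
        if (txTimestamp < readTimestamp || txTimestamp < writeTimestamp) return false;
        writeTimestamp = txTimestamp;
        value = newValue;
        return true;
    }
}
```

Each check is a pair of comparisons against the item's own timestamps, rather than a scan over all active transactions.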
- Multi-version timestamps:
This is an extension of the timestamp-based approach that maintains multiple versions of each data item, each with its own timestamp. It’s similar to what we’re already doing with our versioned data, but with more sophisticated timestamp management.
In this approach:
- Each write operation creates a new version of the data item with the transaction’s timestamp.
- Read operations find the latest version with a timestamp less than or equal to the reading transaction’s timestamp.
This allows for even better concurrency because:
- Read operations never conflict with write operations.
- Only conflicting write operations need to be serialized.
Here’s a basic example of how you might implement multi-version timestamps:
```java
class VersionedData {
    // ...
}
```
In this example:
- Each transaction gets a unique timestamp when it begins.
- Write operations create new versions with the transaction’s timestamp.
- Read operations find the latest version valid for the transaction’s timestamp.
- We update the read timestamp of the version that was read, which can be used for cleanup of old versions.
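The behavior described in these four points can be sketched per data item as follows. This is a hedged reconstruction; the names `MultiVersionItem`, `write`, and `read` are assumptions:

```java
import java.util.ArrayList;
import java.util.List;

// One data item holding multiple timestamped versions.
class MultiVersionItem {
    static class Version {
        final long writeTimestamp;
        long readTimestamp; // highest timestamp that read this version; useful for cleanup
        final String value;
        Version(long wts, String value) {
            this.writeTimestamp = wts;
            this.readTimestamp = wts;
            this.value = value;
        }
    }

    private final List<Version> versions = new ArrayList<>();

    // A write never blocks readers: it simply appends a new version.
    synchronized void write(long txTimestamp, String value) {
        versions.add(new Version(txTimestamp, value));
    }

    // A read returns the latest version with writeTimestamp <= the reader's timestamp,
    // and bumps that version's read timestamp.
    synchronized String read(long txTimestamp) {
        Version best = null;
        for (Version v : versions) {
            if (v.writeTimestamp <= txTimestamp
                    && (best == null || v.writeTimestamp > best.writeTimestamp)) {
                best = v;
            }
        }
        if (best == null) return null;
        best.readTimestamp = Math.max(best.readTimestamp, txTimestamp);
        return best.value;
    }
}
```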
This approach allows for efficient conflict detection and resolution, as well as potentially better performance in read-heavy workloads, as reads don’t block writes and vice versa.
Implementing such a system would make our MVCC more efficient and scalable, especially when dealing with a high number of concurrent transactions. However, it also introduces additional complexity in terms of version management and garbage collection of old versions.
Testing
Let’s focus on the following anomalies:
- Dirty Read
- Non-Repeatable Read
- Phantom Read
- Lost Update
We’ll use JUnit for our tests. Here’s an implementation with tests for these anomalies:
```java
import org.junit.jupiter.api.BeforeEach;
// ...
```
Let’s go through each test and explain what it’s checking:
- `testNoDirtyRead()`: This test ensures that a transaction cannot read uncommitted changes made by another transaction. It creates two transactions: one that updates a value but doesn’t commit, and another that reads the value. The read should return the original value, not the uncommitted update.
- `testNoNonRepeatableRead()`: This test verifies that within a single transaction, repeated reads of the same data return the same result, even if another transaction modifies and commits changes to that data in between the reads.
- `testNoPhantomRead()`: This test checks for phantom reads, which occur when a transaction reads a set of rows that satisfy a search condition and a second transaction inserts new rows matching that condition. The test ensures that the count of keys remains the same within a transaction, even if another transaction adds a new key.
- `testNoLostUpdate()`: This test checks for lost updates, which can occur when two transactions read and update the same data concurrently. It uses two concurrent threads to increment a counter and ensures that both increments are reflected in the final value.
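As a self-contained illustration of the dirty-read check, here is a plain-Java sketch using a minimal stand-in store with buffered writes; `MiniMVCC` and its methods are assumptions for this sketch, not the post's `MVCCDatabase`, and plain prints stand in for JUnit assertions:

```java
import java.util.HashMap;
import java.util.Map;

public class DirtyReadCheck {
    // Minimal store: writes stay in a per-transaction buffer until commit.
    static class MiniMVCC {
        final Map<String, String> committed = new HashMap<>();
        final Map<Long, Map<String, String>> buffers = new HashMap<>();
        void begin(long tx) { buffers.put(tx, new HashMap<>()); }
        void put(long tx, String k, String v) { buffers.get(tx).put(k, v); }
        String get(long tx, String k) {
            // A transaction sees its own buffered writes, otherwise only committed data.
            String own = buffers.get(tx).get(k);
            return own != null ? own : committed.get(k);
        }
        void commit(long tx) { committed.putAll(buffers.remove(tx)); }
    }

    public static void main(String[] args) {
        MiniMVCC db = new MiniMVCC();
        long setup = 1;
        db.begin(setup);
        db.put(setup, "balance", "100");
        db.commit(setup);

        long writer = 2, reader = 3;
        db.begin(writer);
        db.begin(reader);
        db.put(writer, "balance", "200"); // uncommitted update
        String seen = db.get(reader, "balance");
        // Should print the committed value "100", not the uncommitted "200".
        System.out.println("reader saw: " + seen);
    }
}
```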
These tests demonstrate that our MVCC implementation provides serializable isolation, preventing common transaction anomalies. However, there are a few important points to note:
- These tests assume that our `MVCCDatabase` class throws a `TransactionConflictException` when it detects a conflict. The tests might need to be adjusted if the conflict resolution mechanism is different.
- The `countKeys()` method is a simplified way to simulate a range query. In a real database, you’d have more sophisticated methods for range queries.
- The `testNoLostUpdate()` test uses a `CountDownLatch` to coordinate the two threads. This is a simplification; in a real-world scenario, you’d need more robust concurrency control.
- These tests focus on the core ACID properties, but they don’t cover all possible edge cases. In a production system, you’d want even more comprehensive tests, including stress tests with many concurrent transactions.
- The tests assume that the database operations (`get`, `put`, etc.) don’t throw checked exceptions. If they do, you’ll need to adjust the test methods to declare these exceptions.
To run these tests, you’ll need to have JUnit 5 in your project’s classpath. These tests provide a good starting point for verifying the correctness of your MVCC implementation, but remember that thorough testing of a concurrent system often requires more extensive scenarios and stress testing.
References
[^1]: https://www.interdb.jp/pg/pgsql05.html
[^2]: https://15445.courses.cs.cmu.edu/spring2023/notes/18-multiversioning.pdf