In a Monolithic system, “Time” is easy. You ask the OS for Date.now(), and that is the absolute truth.
In a Distributed System, “Time” is a nightmare.
Common questions like “Which event happened first?” become remarkably hard to answer.
The Problem with Physical Clocks
Distributed systems span different machines, often in different time zones (e.g., US, India, Australia). Each machine has its own Physical Clock.
Even with NTP (Network Time Protocol), these clocks drift. One server might think it’s 10:00:00.001 while another thinks it’s 10:00:00.005.
This Clock Skew means you cannot rely on timestamps to order events.
- Scenario: User A books a ticket at
10:00:01(Server US). User B books the same ticket at10:00:02(Server India). If the India server’s clock is 5 seconds slow, it might record the time as09:59:57, making it look like User B booked it first!
Ideally: Partial Ordering
We don’t need to know the exact time. We just need to know the Order of Events. This is called the Happened-Before Relationship.
The Solution: Lamport Logical Clocks
Leslie Lamport introduced a simple algorithm to solve this using Logical Counters instead of real time.
The Algorithm
Every process maintains a counter C, initially 0.
- Internal Event: Before executing an event, increment
C = C + 1. - Send Message: Increment
C = C + 1and sendCalong with the message. - Receive Message: When receiving a message with counter
C_msg, update your local counter:C = max(C, C_msg) + 1.
Visualizing the Flow
Let’s look at three processes (P1, P2, P3) interacting. Notice how P2’s clock jumps forward when it receives a message from P3.
%%{init: {'theme': 'dark'}}%%
sequenceDiagram
participant P1
participant P2
participant P3
Note over P1,P3: Initial State: All Clocks = 0
P1->>P1: Event A (C=1)
P2->>P2: Event B (C=1)
P1->>P2: Message (C=2)
Note right of P2: P2 receives C=2.<br/>max(1, 2) + 1 = 3
P2->>P2: Event C (C=3)
P3->>P3: Event D (C=1)
P3->>P2: Message (C=2)
Note right of P2: P2 receives C=2.<br/>max(3, 2) + 1 = 4
P2->>P2: Event E (C=4)
P2->>P1: Message (C=5)
Note left of P1: P1 receives C=5.<br/>max(1, 5) + 1 = 6
P1->>P1: Event F (C=6)
What This Tells Us
- If event
Acaused eventB, thenClock(A) < Clock(B). - This gives us a Partial Ordering of events.
- It does not give us real time (duration), but it ensures causality is respected.
Conclusion
In Distributed Systems, Logical Time > Physical Time. Tools like Lamport Clocks and Vector Clocks (an immense improvement on Lamport) are the bedrock of systems like Amazon Dynamo, Cassandra, and Google Spanner. They allow us to agree on the sequence of events without needing atomic clocks in every server.