Geek Logbook

Tech sea log book

Understanding Distributed Systems – Coordination


Our ultimate goal is to build a distributed application consisting of a group of processes that gives its users the illusion they are interacting with one coherent node. While achieving a perfect illusion may not always be possible or desirable, some degree of coordination is always needed to build a distributed application.

Chapter 6 – System Models

To reason about distributed systems, we need to define precisely what can and can’t happen. A system model encodes expectations about the behavior of processes, communication links, and timing.

  • The fair-loss link model assumes that messages may be lost and duplicated, but if the sender keeps retransmitting a message, it will eventually be delivered to the destination.
  • The reliable link model assumes that a message is delivered exactly once, without loss or duplication. A reliable link can be implemented on top of a fair-loss one by de-duplicating messages at the receiving side.
  • The authenticated reliable link model makes the same assumptions as the reliable link but additionally assumes that the receiver can authenticate the sender.
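The layering described above can be sketched in code. The snippet below is a minimal, illustrative simulation (the class and method names are my own, not from the book): the fair-loss link may duplicate messages, the sender would retransmit until acknowledged, and the receiver de-duplicates by sequence number so each message is delivered at most once.

```python
# Sketch: building a reliable link on top of a fair-loss one.
# Illustrative names; a real implementation would also handle
# retransmission timers and acknowledgments.

class FairLossLink:
    """Simulates a link that can duplicate messages (and could drop them)."""
    def __init__(self):
        self.inbox = []

    def send(self, seq, payload):
        # Duplicate every message to simulate fair-loss behavior.
        self.inbox.append((seq, payload))
        self.inbox.append((seq, payload))

class ReliableReceiver:
    """De-duplicates messages by sequence number at the receiving side."""
    def __init__(self):
        self.seen = set()
        self.delivered = []

    def on_message(self, seq, payload):
        # Deliver each sequence number at most once.
        if seq not in self.seen:
            self.seen.add(seq)
            self.delivered.append(payload)

link = FairLossLink()
receiver = ReliableReceiver()
link.send(1, "hello")
link.send(2, "world")
for seq, payload in link.inbox:
    receiver.on_message(seq, payload)
print(receiver.delivered)  # ['hello', 'world'] — duplicates filtered out
```

Even though the link duplicated every message, the receiver delivers each one exactly once, which is precisely the reliable-link guarantee.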

Similarly, we can model the behavior of processes based on the type of failures we expect to happen:

  • The arbitrary-fault model assumes that a process can deviate from its algorithm in arbitrary ways, leading to crashes or unexpected behaviors caused by bugs or malicious activity.
  • The crash-recovery model assumes that a process doesn’t deviate from its algorithm but can crash and restart at any time, losing its in-memory state.
  • The crash-stop model assumes that a process doesn’t deviate from its algorithm but doesn’t come back online if it crashes.
  • The synchronous model assumes that sending a message or executing an operation never takes more than a certain amount of time.
  • The asynchronous model assumes that sending a message or executing an operation on a process can take an unbounded amount of time.
  • The partially synchronous model assumes that the system behaves synchronously most of the time.

Chapter 7 – Failure Detection

Several things can go wrong when a client sends a request to a server. In the best case, the client sends a request and receives a response. But what if no response comes back after some time? In that case, it’s impossible to tell whether the server is just very slow, it crashed, or a message couldn’t be delivered because of a network issue.

  • A ping is a periodic request that a process sends to another to check whether it’s still available.
  • A heartbeat is a message that a process periodically sends to another.
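A common way to turn heartbeats into a failure detector is to pair them with a timeout: a process is suspected of having crashed if no heartbeat arrived within the timeout window. The sketch below is a simplified, single-process illustration with invented names; note that, as the chapter points out, a suspected process may just be slow.

```python
# Sketch: a timeout-based failure detector driven by heartbeats.
# A process is suspected if it has been silent longer than `timeout`.

class FailureDetector:
    def __init__(self, timeout):
        self.timeout = timeout
        self.last_heartbeat = {}  # process id -> time of last heartbeat

    def on_heartbeat(self, pid, now):
        self.last_heartbeat[pid] = now

    def suspected(self, pid, now):
        # Suspect a process we never heard from, or one that has been
        # silent for longer than the timeout.
        last = self.last_heartbeat.get(pid)
        return last is None or now - last > self.timeout

fd = FailureDetector(timeout=3)
fd.on_heartbeat("replica-1", now=10)
print(fd.suspected("replica-1", now=12))  # False: heartbeat 2s ago
print(fd.suspected("replica-1", now=15))  # True: silent for 5s
```

The choice of timeout is a trade-off: too short and slow-but-alive processes are falsely suspected; too long and real crashes take longer to detect.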

Chapter 8 – Time

Time is an essential concept in any software application, even more so in distributed ones. We have seen it play a crucial role in the network stack (e.g., DNS record TTL) and failure detection (timeouts). Another important use of it is for ordering events.

The flow of execution of a single-threaded application is simple to understand because every operation executes sequentially in time, one after the other. But in a distributed system, there is no shared global clock that all processes agree on that can be used to order operations. And, to make matters worse, processes can run concurrently.
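One classic way to order events without a shared physical clock is a logical clock. The sketch below shows Lamport clocks, one such mechanism: each process increments a counter on every local event and send, and on receive it takes the maximum of its own counter and the received timestamp, plus one, so a receive is always ordered after the corresponding send.

```python
# Sketch: Lamport logical clocks for ordering events without a
# shared physical clock.

class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event or message send: advance the counter.
        self.time += 1
        return self.time

    def receive(self, remote_time):
        # On receive, jump past both clocks to preserve causal order.
        self.time = max(self.time, remote_time) + 1
        return self.time

a, b = LamportClock(), LamportClock()
t_send = a.tick()            # process A sends a message at logical time 1
t_recv = b.receive(t_send)   # process B receives it: max(0, 1) + 1 = 2
print(t_send, t_recv)        # 1 2 — the receive is ordered after the send
```

Lamport timestamps give a partial causal order: if event X happened before event Y, X's timestamp is smaller, though the converse does not hold.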

Chapter 9 – Leader Election

There are times when a single process in the system needs to have special powers, like accessing a shared resource or assigning work to others. To grant a process these powers, the system needs to elect a leader among a set of candidate processes, which remains in charge until it relinquishes its role or becomes otherwise unavailable. When that happens, the remaining processes can elect a new leader among themselves.

A leader election algorithm needs to guarantee that there is at most one leader at any given time and that an election eventually completes even in the presence of failures. These two properties are also referred to as safety and liveness, respectively, and they are general properties of distributed algorithms.
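The safety property can be made concrete with majority voting, the mechanism used by algorithms such as Raft: if each process grants at most one vote per term, two candidates cannot both reach a majority in the same term. The simulation below is a heavily simplified, single-threaded sketch with invented names; real algorithms add randomized timeouts, terms that advance over time, and log checks.

```python
# Sketch: why "at most one leader per term" holds with majority voting.
# Each voter grants at most one vote per term.

class Voter:
    def __init__(self):
        self.voted_for = {}  # term -> candidate id

    def request_vote(self, term, candidate):
        # Grant at most one vote per term; repeat requests from the
        # same candidate are answered consistently.
        if term not in self.voted_for:
            self.voted_for[term] = candidate
        return self.voted_for[term] == candidate

def run_election(term, candidate, voters):
    votes = sum(v.request_vote(term, candidate) for v in voters)
    return votes > len(voters) // 2  # a strict majority wins

voters = [Voter() for _ in range(5)]
print(run_election(1, "A", voters))  # True: A gets all 5 votes
print(run_election(1, "B", voters))  # False: term-1 votes already went to A
```

Because a term's votes are spent once, at most one candidate can win it, which is the safety half of the guarantee; liveness requires eventually starting a new term in which an election can complete.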

Chapter 10 – Replication

Data replication is a fundamental building block of distributed systems. One reason for replicating data is to increase availability. If some data is stored exclusively on a single process, and that process goes down, the data won’t be accessible anymore. However, if the data is replicated, clients can seamlessly switch to a copy. Another reason for replication is to increase scalability and performance; the more replicas there are, the more clients can access the data concurrently.

Implementing replication is challenging because it requires keeping replicas consistent with one another even in the face of failures. In this chapter, we will explore Raft's replication algorithm, a protocol that provides the strongest consistency guarantee possible — the guarantee that, to the clients, the data appears to be stored on a single process, even if it's actually replicated.

Chapter 11 – Coordination Avoidance

Another way of looking at state machine replication is as a system that requires two main ingredients:

  • A broadcast protocol that guarantees every replica receives the same updates in the same order even in the presence of faults (aka fault-tolerant total order broadcast).
  • A deterministic function that handles updates on each replica.
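The second ingredient is easy to demonstrate: if the update function is deterministic and every replica applies the same log of updates in the same order, all replicas necessarily end up in the same state. The sketch below illustrates this with a toy key-value store (names are illustrative).

```python
# Sketch: state machine replication's second ingredient — a
# deterministic update function applied to an ordered log.

def apply_update(state, update):
    # Deterministic: the same state and update always produce
    # the same next state.
    key, value = update
    new_state = dict(state)
    new_state[key] = value
    return new_state

# Total order broadcast guarantees every replica sees this same log.
log = [("x", 1), ("y", 2), ("x", 3)]

replica_a = {}
replica_b = {}
for update in log:
    replica_a = apply_update(replica_a, update)
for update in log:
    replica_b = apply_update(replica_b, update)

print(replica_a == replica_b)  # True: identical state on both replicas
print(replica_a)               # {'x': 3, 'y': 2}
```

Note that determinism is essential: if `apply_update` depended on local time or randomness, replicas would diverge even with an identical log.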

Unsurprisingly, implementing a fault-tolerant total order broadcast protocol is what makes state machine replication hard to solve since it requires consensus. More importantly, the need for a total order creates a scalability bottleneck since updates need to be processed sequentially by a single process (e.g., the leader in Raft). Also, total order broadcast isn’t available during network partitions as the CAP theorem applies to it as well.

Chapter 12 – Transactions

Transactions provide the illusion that either all the operations within a group complete successfully or none of them do, as if the group were a single atomic operation.

If your application exclusively updates data within a single relational database, then bundling some changes into a transaction is straightforward. On the other hand, if your system needs to atomically update data that resides in multiple data stores, the operations need to be wrapped into a distributed transaction, which is a lot more challenging to implement.

12.1 ACID

Consider a money transfer from one bank account to another. If the withdrawal succeeds, but the deposit fails for some reason, the funds need to be deposited back into the source account. In other words, the transfer needs to execute atomically.

  • Atomicity guarantees that partial failures aren’t possible.
  • Consistency guarantees that application-level invariants always hold.
  • Isolation guarantees that a transaction appears to run in isolation as if no other transactions are executing.
  • Durability guarantees that once the database commits the transaction, the changes are persisted on durable storage so that the database doesn’t lose the changes if it subsequently crashes.
