Geek Logbook

Tech sea log book

Understanding Distributed Systems – Communication

Part I – Communication

Introduction

Interprocess communication (IPC) is fundamental to distributed systems, enabling processes to exchange data over networks. This communication relies on agreed-upon rules, which are specified by network protocols.

Protocol Stack

Network protocols are organized in a stack, with each layer building on the abstraction provided by the layer below. Lower layers are closer to the hardware.

  • Link Layer: Operates on local network links (e.g., Ethernet, Wi-Fi), providing an interface to the network hardware. Switches operate at this layer, forwarding frames based on their destination MAC address.
  • Internet Layer: Routes packets across the network from one machine to another.
  • Transport Layer: Transmits data between two processes.
  • Application Layer: Defines high-level communication protocols (e.g., HTTP, DNS).

Reliable Links (Chapter 2)

At the internet layer, nodes communicate by routing packets from one router to the next. This requires addressing nodes and routing packets across routers.

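  • IP (Internet Protocol): Handles node addressing and packet routing at the internet layer.
  • TCP (Transmission Control Protocol): A transport-layer protocol that provides a reliable communication channel on top of IP, ensuring that data arrives in order without gaps, duplication, or corruption.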

Reliability (2.1)

TCP partitions a byte stream into segments and numbers them sequentially, so the receiver can detect holes and duplicates. The receiver acknowledges each segment it gets. If the sender does not receive an acknowledgment in time, it retransmits the segment. A checksum lets the receiver verify that a segment was not corrupted in transit.
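The mechanism above can be sketched in a few lines of Python. This is a toy model, not how a real TCP stack is implemented: the names Segment, make_segments, and Receiver are made up for illustration, CRC32 stands in for TCP's actual 16-bit checksum, and retransmission timers are left out (a dropped segment simply gets no acknowledgment).

```python
from __future__ import annotations

import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    seq: int        # offset of the first payload byte within the stream
    payload: bytes
    checksum: int   # CRC32 here (an assumption); real TCP uses a 16-bit checksum


def make_segments(data: bytes, size: int) -> list[Segment]:
    """Split a byte stream into sequentially numbered, checksummed segments."""
    return [
        Segment(seq=i, payload=data[i:i + size], checksum=zlib.crc32(data[i:i + size]))
        for i in range(0, len(data), size)
    ]


class Receiver:
    """Accepts segments in any order and rebuilds the in-order byte stream."""

    def __init__(self) -> None:
        self.next_seq = 0                    # next byte offset expected in order
        self.buffer: dict[int, bytes] = {}   # out-of-order segments, keyed by seq
        self.stream = bytearray()            # bytes delivered so far, in order

    def receive(self, seg: Segment) -> int | None:
        """Return the cumulative ACK (next expected byte), or None if corrupt."""
        if zlib.crc32(seg.payload) != seg.checksum:
            return None  # corrupted: drop it, the sender will retransmit
        # Ignore duplicates: already delivered or already buffered.
        if seg.seq >= self.next_seq and seg.seq not in self.buffer:
            self.buffer[seg.seq] = seg.payload
        # Deliver every buffered segment that closes the hole at next_seq.
        while self.next_seq in self.buffer:
            chunk = self.buffer.pop(self.next_seq)
            self.stream += chunk
            self.next_seq += len(chunk)
        return self.next_seq
```

Feeding the receiver segments out of order, twice, or with a damaged payload still yields the original byte stream once every segment has arrived intact at least once.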

Connection Lifecycle (2.2)

Before data can be transmitted, a connection must be opened. The operating system manages the connection on both ends through a socket, which tracks its state over the connection's lifetime. At any time, a connection is in one of three states: opening, established, or closing.
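As a rough illustration of the three states, the sketch below opens a TCP connection over the loopback interface with Python's standard socket module, sends a few bytes, and closes it. The function names are hypothetical, and running both ends in one thread only works here because the kernel completes the handshake for a listening socket before accept() is called and because the message is small enough to fit in the socket buffers.

```python
import socket


def recv_exactly(sock: socket.socket, size: int) -> bytes:
    """Read until `size` bytes have arrived or the peer closes the connection."""
    data = b""
    while len(data) < size:
        chunk = sock.recv(size - len(data))
        if not chunk:
            break
        data += chunk
    return data


def send_over_loopback(message: bytes) -> bytes:
    """Open a TCP connection on 127.0.0.1, send `message`, then close it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.settimeout(5)
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        address = listener.getsockname()

        # Opening: connect() returns once the handshake has completed.
        with socket.create_connection(address, timeout=5) as client:
            server_side, _peer = listener.accept()
            with server_side:
                server_side.settimeout(5)
                # Established: data can now flow between the two sockets.
                client.sendall(message)
                received = recv_exactly(server_side, len(message))
        # Closing: leaving the `with` blocks closes both ends of the connection.
    return received
```

Calling send_over_loopback(b"ping") should return the same bytes as read by the accepting side.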

Secure Links (Chapter 3)

While TCP ensures reliable communication, it does not encrypt data, so anyone who intercepts the traffic can read it. Transport Layer Security (TLS) runs on top of TCP and secures the channel by providing encryption, authentication, and integrity.
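In Python, layering TLS over a TCP socket might look like the sketch below, which uses the standard ssl module. The helper names are made up for this post; ssl.create_default_context() is used because its defaults require a valid certificate and check that it matches the hostname.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client-side TLS context with certificate verification enabled."""
    # The default context verifies the server certificate chain and hostname.
    return ssl.create_default_context()


def open_tls_connection(host: str, port: int = 443) -> ssl.SSLSocket:
    """Open a TCP connection, then run the TLS handshake on top of it."""
    context = make_client_context()
    raw = socket.create_connection((host, port), timeout=5)
    try:
        # server_hostname is what the certificate is checked against.
        return context.wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()
        raise
```

The returned socket is used like a plain one (sendall, recv), but everything written to it is encrypted before it reaches the TCP layer.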

APIs (Chapter 5)

A client can communicate with a server directly, or indirectly through a broker.

Request-Response Communication

This chapter focuses on direct communication, specifically the request-response style. In this style, a client sends a request message to the server, which replies with a response message. This is akin to a function call, but across process boundaries and over the network.

Request and response messages contain data serialized in a language-agnostic format. The choice of format affects serialization and deserialization speed, whether the messages are human-readable, and how easily the format can evolve over time. For instance, JSON is human-readable but verbose, and it is slower to parse than a compact binary format.
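To make the trade-off concrete, here is a small sketch that serializes the same two fields as JSON and as a fixed binary layout. The product fields and the binary layout are assumptions chosen for the example, not something the book prescribes.

```python
import json
import struct


def to_json(product: dict) -> bytes:
    """Serialize to compact JSON: readable and self-describing, but verbose."""
    return json.dumps(product, separators=(",", ":")).encode("utf-8")


def from_json(data: bytes) -> dict:
    """Deserialize a JSON message back into a dictionary."""
    return json.loads(data.decode("utf-8"))


def to_binary(product: dict) -> bytes:
    """Serialize to a fixed layout: little-endian uint32 id, float64 price."""
    # Compact, but unreadable without knowing the layout, and harder to evolve.
    return struct.pack("<Id", product["id"], product["price"])


def from_binary(data: bytes) -> dict:
    """Deserialize the fixed binary layout produced by to_binary."""
    product_id, price = struct.unpack("<Id", data)
    return {"id": product_id, "price": price}


product = {"id": 42, "price": 9.99}
json_bytes = to_json(product)
binary_bytes = to_binary(product)
```

Both encodings round-trip to the same dictionary, but the JSON message carries the field names with every payload, which is why it is larger.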

Synchronous and Asynchronous Communication

When a client sends a request, it can either block and wait for the response (synchronous), or carry on with other work and ask the outbound adapter to invoke a callback when the response arrives (asynchronous).
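The difference can be sketched with Python's concurrent.futures. Here a thread pool plays the role of the outbound adapter, and fetch_product is a hypothetical stand-in for a network call; neither name comes from the book.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable


def fetch_product(product_id: int) -> dict:
    """Hypothetical remote call; the sleep stands in for network latency."""
    time.sleep(0.05)
    return {"id": product_id, "name": f"product-{product_id}"}


def get_sync(product_id: int) -> dict:
    """Synchronous style: the caller blocks until the response is available."""
    return fetch_product(product_id)


def get_async(
    executor: ThreadPoolExecutor,
    product_id: int,
    on_response: Callable[[dict], None],
) -> Future:
    """Asynchronous style: return immediately, run the callback on completion."""
    future = executor.submit(fetch_product, product_id)
    future.add_done_callback(lambda done: on_response(done.result()))
    return future
```

With get_async, the caller is free to do other work while the request is in flight; the callback runs once the response is ready.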

RESTful API Design

Representational State Transfer (REST) is a set of design principles for creating elegant and scalable HTTP APIs. A RESTful API adheres to these principles, including:

  • Requests are stateless, containing all necessary information for processing.
  • Responses are labeled as cacheable or non-cacheable. Cacheable responses can be reused for equivalent future requests.

Given the prevalence of RESTful HTTP APIs, the chapter will guide you through creating an HTTP API.

HTTP Protocol

HTTP, a request-response protocol, encodes and transports information between clients and servers. It is foundational for RESTful APIs.

Resources, Request Methods, and Response Status Codes (Chapter 5)

Resources (5.2)

An HTTP server hosts resources, which can be physical or abstract entities like documents, images, or collections of other resources. A resource is identified by a URL, which describes its location on the server.

In our catalog service, the collection of products is a type of resource, accessible via a URL like https://www.example.com/products?sort=price, where:

  • https is the scheme, which names the protocol used to access the resource
  • www.example.com is the hostname
  • products is the resource name
  • ?sort=price is the query string, containing additional parameters affecting how the service handles the request, such as the sort order of returned products.

The URL without the query string is also known as the API’s /products endpoint.
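The same breakdown can be reproduced with Python's urllib.parse, shown here as a quick sketch. The helper name describe_url is made up for this example.

```python
from urllib.parse import parse_qs, urlsplit


def describe_url(url: str) -> dict:
    """Split a URL into the components discussed above."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,          # e.g. "https"
        "hostname": parts.hostname,      # e.g. "www.example.com"
        "path": parts.path,              # e.g. "/products"
        "query": parse_qs(parts.query),  # each parameter maps to a list of values
    }


info = describe_url("https://www.example.com/products?sort=price")
```

Note that parse_qs maps every parameter to a list, because a query string may repeat the same key.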

Request Methods (5.3)

HTTP requests use methods to create, read, update, and delete (CRUD) resources. When a client requests a resource, it specifies which method to use. The method can be thought of as the action to perform on the resource.

Common methods include POST, GET, PUT, and DELETE. For example, the API for our catalog service could be defined as:

  • POST /products: Create a new product and return the URL of the new resource.
  • GET /products: Retrieve a list of products. The query string can filter, paginate, and sort the collection.
  • GET /products/42: Retrieve product 42.
  • PUT /products/42: Update product 42.
  • DELETE /products/42: Delete product 42.
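The endpoints above could be dispatched roughly as follows. This is an in-memory sketch with no real HTTP server behind it: the Catalog class is hypothetical, the query string is ignored, and the specific status codes returned (201, 204, 404, 405) are common conventions rather than something this post specifies.

```python
import itertools
from typing import Any, Optional, Tuple

Response = Tuple[int, Any]  # (status code, payload)


class Catalog:
    """Toy in-memory catalog that maps HTTP methods to CRUD operations."""

    def __init__(self) -> None:
        self._products: dict = {}
        self._next_id = itertools.count(1)

    def handle(self, method: str, path: str, body: Optional[dict] = None) -> Response:
        # Drop the query string, then split the path into its segments.
        segments = [s for s in path.split("?", 1)[0].split("/") if s]
        if not segments or segments[0] != "products" or len(segments) > 2:
            return 404, None
        if len(segments) == 1:
            return self._handle_collection(method, body)
        try:
            product_id = int(segments[1])
        except ValueError:
            return 404, None
        return self._handle_item(method, product_id, body)

    def _handle_collection(self, method: str, body: Optional[dict]) -> Response:
        if method == "POST":  # create a product and return its URL
            product_id = next(self._next_id)
            self._products[product_id] = dict(body or {})
            return 201, {"location": f"/products/{product_id}"}
        if method == "GET":   # list all products
            return 200, [{"id": i, **p} for i, p in sorted(self._products.items())]
        return 405, None

    def _handle_item(self, method: str, product_id: int, body: Optional[dict]) -> Response:
        if method not in ("GET", "PUT", "DELETE"):
            return 405, None
        if product_id not in self._products:
            return 404, None
        if method == "PUT":   # replace the stored product
            self._products[product_id] = dict(body or {})
        elif method == "DELETE":
            del self._products[product_id]
            return 204, None
        return 200, {"id": product_id, **self._products[product_id]}
```

For example, handle("POST", "/products", {"name": "Widget"}) creates product 1, after which handle("GET", "/products/1") returns it.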

Response Status Codes (5.4)

After receiving a request, the server processes it and sends a response back to the client. The HTTP response contains a status code indicating whether the request succeeded.

  • Status codes 200-299: Success.
  • Status codes 300-399: Redirection.
  • Status codes 400-499: Client errors.
  • Status codes 500-599: Server errors.
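A client usually branches on the class of the status code rather than on each individual value. A minimal sketch, with a hypothetical function name:

```python
def classify_status(code: int) -> str:
    """Map an HTTP status code to the classes listed above."""
    if 200 <= code <= 299:
        return "success"
    if 300 <= code <= 399:
        return "redirection"
    if 400 <= code <= 499:
        return "client error"
    if 500 <= code <= 599:
        return "server error"
    # Anything else (including 1xx informational codes) is outside this list.
    return "other"
```

One practical use of the distinction is deciding how to react to a failure: a client error generally means the request itself must change, whereas a server error may succeed if the same request is sent again later.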
