Table of Contents
Minimum Spanning Tree in Distributed Systems: Gallager-Humblet-Spira (GHS) Algorithm
Distributed systems can be highly complex due to the extensive communication and interaction between server nodes, leading to potential failures within the subsystems. Consider a scenario where a node needs to broadcast a message across a distributed network efficiently. This is where the concept of a Minimum Spanning Tree (MST) becomes crucial.
An MST is a subset of the edges of a connected, weighted, undirected graph that satisfies the following properties:
-
Number of edges: An MST contains exactly V-1 edges, where V is the number of vertices in the graph.
-
Unique MST: If all the edge weights in the graph are distinct, then there is a unique MST. Otherwise, multiple valid MSTs may exist with the same total weight.
In distributed systems, MSTs can significantly enhance efficiency, throughput, and minimize latency for message transmissions. The Gallager-Humblet-Spira (GHS) algorithm is a specialized distributed MST algorithm that achieves these goals.
Overview of the GHS Algorithm
The GHS algorithm is designed to construct an MST for a connected, weighted, undirected graph in a distributed setting. This algorithm is particularly useful when nodes operate as independent entities within a distributed system, communicating through messages along the edges of the graph.
Nodes in the graph: Machines/servers communicating with each other.
Edges in the graph: Communication links.
Edge weights: The cost associated with transmitting data across the communication lines.
To achieve efficient message broadcasting, we need to cover all nodes in the network with edges such that the overall cost is minimized.
Distributed Setup
In a distributed computing environment, no single node will have complete topological information. Instead, each node only knows the weights of its adjacent edges and can maintain information about these weights.
Problem Statement: Identify the minimum-weight spanning tree for the entire network so that each node can determine which of its adjacent edges are part of the spanning tree.
Key Strategies and GHS Algorithm
Core Concepts of the GHS Algorithm:
-
Fragments: Initially, each node forms its own fragment (a subtree with a single node). These fragments are merged iteratively to form larger fragments until a single MST is obtained.
-
Minimum Weight Edge (MWE): Each fragment identifies its minimum outgoing edge (MWE), the edge with the smallest weight connecting it to another fragment.
-
Levels: Fragments have levels. Initially, all fragments are at level 0. Merging two fragments increases their level.
-
Leader: Each fragment has a leader node responsible for managing and coordinating the merging process.
-
Message Notation: Nodes communicate using specific message types such as INIT, MERGE, FIND, REPORT, CONNECT.
-
Termination: The algorithm terminates when all nodes are part of a single fragment spanning the entire graph.
Steps of the GHS Algorithm:
-
Initialization: Each node starts as its own fragment and acts as its own leader. Nodes identify the minimum outgoing edge (MWE) among their neighbors.
-
Finding MWE: Each fragment identifies its MWE and exchanges FIND and REPORT messages among the nodes within the fragment.
-
Merging Fragments: Fragments are merged through their MWEs. When two fragments of the same level merge, the new fragment’s level increases. If levels differ, the merged fragment adopts the higher level.
-
Repeat Until MST is Formed: The process of finding MWEs and merging fragments repeats until all nodes belong to a single fragment.
Scenario: GHS in a Distributed Environment (Data Center)
Data Center Setup: A data center consists of servers represented as nodes in the graph, with communication links between servers having weights representing data transmission costs.
Objective: Construct an MST using the GHS algorithm to minimize the overall communication cost.
Example: Consider 6 servers connected as follows (S1, S2, S3, S4, S5, S6).
Initial Graph:
10
S1 ------- S2
| |
15 5
| |
S3 S4
| |
20 10
| |
S5 ------- S6
30
Using the GHS Algorithm:
-
Initialization: Each server (S1, S2, …, S6) initializes its fragment: Fragment ID = Server ID, Level = 0, State = NULL.
-
Identify MWE: Each server identifies the minimum cost link connecting it to another fragment. S1: MWE = (S1-S2: 10), S2: MWE = (S2-S4: 5), etc.
-
Merge Fragments: Fragments merge through their MWEs. S1 and S2 merge through edge (S1-S2: 10), forming a new fragment (S1, S2). Similarly, S2 and S4 merge through edge (S
Source Link: medium.com
Source: medium.com
Via: medium.com