
Fabric Topologies

Choosing the right topology is critical for performance, scalability, and cost-effectiveness in an InfiniBand fabric.

The Leaf-Spine architecture is the most common foundation for InfiniBand fabrics.

  • Leaf Switches: Connect directly to the compute nodes (servers).
  • Spine Switches: Connect to all Leaf switches, aggregating traffic.
  • Traffic Flow: Traffic between any two nodes on different leaves travels up to a spine and back down to the destination leaf.

Benefits:

  • Predictable Latency: Deterministic hop count. Any two nodes on different leaves are always separated by the same path (leaf, spine, leaf).
  • Scalability: Easy to expand by adding more spines or leaves.
  • Redundancy: All leaves connect to all spines; losing a spine only reduces bandwidth, not connectivity.
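
The redundancy property can be illustrated with a minimal sketch, assuming a small fabric of 4 leaves and 2 spines. The function names and switch labels are hypothetical and exist only for this example; they do not correspond to any InfiniBand tool or API.

```python
def build_leaf_spine(num_leaves, num_spines):
    """Return the (leaf, spine) links of a fabric where every leaf reaches every spine."""
    return {(f"leaf{l}", f"spine{s}")
            for l in range(num_leaves) for s in range(num_spines)}

def spines_between(links, src_leaf, dst_leaf):
    """List the spines that can carry traffic from src_leaf to dst_leaf."""
    src = {s for (l, s) in links if l == src_leaf}
    dst = {s for (l, s) in links if l == dst_leaf}
    return sorted(src & dst)

def fail_spine(links, spine):
    """Return the links that remain after one spine switch is lost."""
    return {(l, s) for (l, s) in links if s != spine}

links = build_leaf_spine(num_leaves=4, num_spines=2)  # illustrative sizes
healthy = spines_between(links, "leaf0", "leaf3")
degraded = spines_between(fail_spine(links, "spine0"), "leaf0", "leaf3")
print(healthy)   # ['spine0', 'spine1']
print(degraded)  # ['spine1']
```

With one spine removed, the two leaves still share a path, but only half of the original leaf-to-leaf capacity remains.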

For very large clusters, a third layer is added. Super Spines connect sets of Spine/Leaf groups, allowing the fabric to scale to thousands of nodes.
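
To see why a third layer raises the ceiling, here is a rough capacity estimate. It assumes a non-blocking design built from identical switches that split their ports evenly between downlinks and uplinks, with every top-tier port facing down. The radix of 40 is only an illustrative value, and `max_nodes` is a hypothetical helper, not part of any fabric management tool.

```python
def max_nodes(radix, tiers):
    """Upper bound on end nodes in a non-blocking fat-tree of identical switches.

    Each tier below the top uses radix/2 ports down and radix/2 ports up,
    which gives 2 * (radix / 2) ** tiers end nodes.
    """
    if radix < 2 or radix % 2:
        raise ValueError("radix must be a positive even number")
    if tiers < 1:
        raise ValueError("tiers must be at least 1")
    return 2 * (radix // 2) ** tiers

radix = 40  # assumed port count per switch, for illustration only
print(max_nodes(radix, 2))  # 800   (Leaf -> Spine)
print(max_nodes(radix, 3))  # 16000 (Leaf -> Spine -> Super Spine)
```

Under these assumptions, adding the Super Spine layer multiplies the maximum node count by half the switch radix.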

A Fat-Tree is a topology where links closer to the top (root) of the tree are “fatter” (have more bandwidth) to prevent congestion. It is the standard implementation of Leaf-Spine in HPC.

Fat-Tree performance is often defined by its Oversubscription Ratio: the ratio of downlink bandwidth (to servers) to uplink bandwidth (to spines) on each leaf switch.

  • 1:1 (Non-Blocking): For every 100 Gb/s of bandwidth to servers, there is 100 Gb/s of bandwidth to spines. Ensures full line-rate performance for all nodes simultaneously.
  • 2:1 (Blocking): For every 200 Gb/s to servers, there is only 100 Gb/s to spines. End nodes may not achieve full bandwidth if everyone transmits at once, but the hop count is unchanged, so latency remains low as long as the uplinks are not saturated.
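
The ratios above follow from simple arithmetic on a leaf switch's port split. The sketch below assumes all ports run at 100 Gb/s and that, in the worst case, every node on the leaf sends traffic to other leaves at the same time. The port counts and function names are hypothetical.

```python
from fractions import Fraction

def oversubscription(down_ports, up_ports, down_gbps=100, up_gbps=100):
    """Return the leaf's downlink:uplink bandwidth ratio as a reduced 'N:M' string."""
    ratio = Fraction(down_ports * down_gbps, up_ports * up_gbps)
    return f"{ratio.numerator}:{ratio.denominator}"

def worst_case_gbps_per_node(down_ports, up_ports, down_gbps=100, up_gbps=100):
    """Bandwidth each node gets if all nodes on the leaf send off-leaf at once."""
    fair_share = up_ports * up_gbps / down_ports
    return min(down_gbps, fair_share)

print(oversubscription(20, 20))          # 1:1
print(oversubscription(24, 12))          # 2:1
print(worst_case_gbps_per_node(24, 12))  # 50.0
```

In the 2:1 case each node is still attached at 100 Gb/s, but its guaranteed share of the uplinks under full load drops to half of that.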

Advantages:

  • Efficient for high-performance computing: at a 1:1 ratio, all nodes can communicate at full line rate simultaneously.
  • Lowest and most deterministic latency.
  • Scalable via multiple layers (Leaf -> Spine -> Super Spine).

Dragonfly+ connects groups of compute nodes in a highly scalable, cost-effective manner.

  • Groups: Inside a group, leaf and spine switches are connected in a full bipartite (Leaf-Spine) topology, with compute nodes attached to the leaves.
  • Inter-Group: Groups are connected to each other in a full mesh (all-to-all).

Requirement: Dragonfly+ requires Adaptive Routing to function efficiently due to the multiple path options between groups.
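
The "multiple path options" can be counted at the group level with a small sketch. It treats each group as a single vertex in a full mesh and considers only the direct route plus detours through one intermediate group. This is a simplification for illustration, not a model of how the subnet manager or switch hardware selects routes; all names are hypothetical.

```python
from itertools import combinations

def group_mesh(num_groups):
    """Return every group pair that needs a connection in a full mesh."""
    return list(combinations(range(num_groups), 2))

def group_level_routes(num_groups, src, dst):
    """Return the direct route plus every detour through one intermediate group."""
    routes = [(src, dst)]
    routes += [(src, mid, dst)
               for mid in range(num_groups) if mid not in (src, dst)]
    return routes

num_groups = 4  # illustrative size
print(len(group_mesh(num_groups)))            # 6
print(group_level_routes(num_groups, 0, 3))   # [(0, 3), (0, 1, 3), (0, 2, 3)]
```

With static routing, all traffic between two groups would follow one of these routes and could overload it. Adaptive routing lets the fabric spread traffic across the alternatives when the direct route is congested.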

Advantages:

  • Supports a larger number of hosts than Fat-Tree for the same switch count.
  • More cost-effective (fewer cables/switches) for large scales.
  • High bandwidth and low latency.

In a 3D Torus, switches are arranged in a three-dimensional grid, and each dimension (x, y, z) is closed into a ring.

  • Connections: Each switch connects to its 6 neighbors (2 in x, 2 in y, 2 in z).
  • Resilience: If a link breaks, traffic can wrap around the ring in the other direction.
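
The neighbor and wrap-around rules can be sketched with modular arithmetic. The example assumes a 4 x 4 x 4 torus with switches identified by (x, y, z) coordinates, and at least 3 switches per dimension so the six neighbors are distinct. The function names are hypothetical.

```python
def neighbors(coord, dims):
    """Return the 6 neighbors of a switch, wrapping around at the edges."""
    result = []
    for axis in range(3):
        for step in (-1, 1):
            n = list(coord)
            n[axis] = (n[axis] + step) % dims[axis]  # wrap around the ring
            result.append(tuple(n))
    return result

def hop_distance(a, b, dims):
    """Minimum number of links between two switches, taking the shorter way around each ring."""
    return sum(min(abs(p - q), d - abs(p - q)) for p, q, d in zip(a, b, dims))

dims = (4, 4, 4)  # illustrative torus size
print(neighbors((0, 0, 0), dims))
# [(3, 0, 0), (1, 0, 0), (0, 3, 0), (0, 1, 0), (0, 0, 3), (0, 0, 1)]
print(hop_distance((0, 0, 0), (3, 0, 0), dims))  # 1 (wraps around the x ring)
print(hop_distance((0, 0, 0), (2, 2, 2), dims))  # 6 (farthest switch)
```

The second result shows the wrap-around link in use, and the third shows how hop count grows with distance, unlike the fixed hop count of a Fat-Tree.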

Advantages:

  • Cost-Effective: Simple, short cabling. Ideal for massive installations (supercomputers).
  • Fault Tolerance: Highly resilient due to multiple paths.
  • Locality: Excellent for applications where communication is localized to neighbors.

Disadvantages:

  • Higher hop count for distant nodes (higher latency).
  • Typically has a higher oversubscription ratio.