Fabric Monitoring & HA
Ensuring the stability and availability of an InfiniBand fabric requires understanding how the Subnet Manager (SM) handles redundancy, failover, and continuous monitoring.
Subnet Manager High Availability
While a single Subnet Manager is required for the fabric to function, it represents a single point of failure. Therefore, it is recommended to have at least two SMs (one Master, one Standby).
- Master SM: The active instance managing the fabric.
- Standby SM: A passive instance that monitors the Master and is ready to take over. A fabric may have more than one.
Master SM Election
When multiple SMs are present, an election process determines the Master.
- Priority: Each SM is assigned a 4-bit priority value (0-15).
- 0: Lowest priority (default).
- 15: Highest priority.
- GUID Tie-Breaker: If multiple SMs have the same highest priority, the SM with the lowest GUID is elected Master.
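The election rule above can be sketched in a few lines of Python. This is only an illustration of the ordering (highest priority first, lowest GUID as tie-breaker), not the Subnet Manager's actual implementation; the `SubnetManager` class, the `elect_master` function, and the GUID values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubnetManager:
    name: str      # hypothetical label, for readability only
    priority: int  # 4-bit value: 0 (lowest) to 15 (highest)
    guid: int      # port GUID (made-up values below)


def elect_master(sms):
    """Return the SM with the highest priority; the lowest GUID breaks ties."""
    if not sms:
        raise ValueError("no Subnet Managers present")
    for sm in sms:
        if not 0 <= sm.priority <= 15:
            raise ValueError(f"priority out of range for {sm.name}")
    # Negating the priority lets min() prefer high priority, then low GUID.
    return min(sms, key=lambda sm: (-sm.priority, sm.guid))


candidates = [
    SubnetManager("sm-a", priority=5, guid=0x0002C90300A1B2C3),
    SubnetManager("sm-b", priority=15, guid=0x0002C90300A1B2FF),
    SubnetManager("sm-c", priority=15, guid=0x0002C90300A1B200),
]
print(elect_master(candidates).name)  # sm-c
```

Here `sm-b` and `sm-c` share the highest priority, so the lower GUID of `sm-c` decides the election.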
SMInfo Attribute
The SMInfo attribute acts as a heartbeat and information exchange mechanism between SMs.
- Used during subnet discovery and polling.
- Contains: SM Port GUID, Priority, and State (Master/Standby).
Failover and Handover
Failover Process
If the Master SM fails or becomes disconnected:
- A Standby SM detects the failure (via missing heartbeats).
- The Standby with the highest priority (lowest GUID if priorities tie) promotes itself to Master.
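A minimal sketch of these two steps, assuming each Standby records when it last received a reply from the Master. The `StandbySM` class, the `pick_new_master` function, and the 30-second timeout are hypothetical and chosen only for illustration; they are not documented defaults.

```python
from dataclasses import dataclass

POLL_TIMEOUT_S = 30.0  # hypothetical timeout, not a documented default


@dataclass
class StandbySM:
    name: str
    priority: int
    guid: int
    last_master_reply: float  # time (seconds) of the last reply seen from the Master


def master_is_dead(standby, now, timeout=POLL_TIMEOUT_S):
    """A Standby considers the Master failed once replies stop for `timeout` seconds."""
    return now - standby.last_master_reply > timeout


def pick_new_master(standbys, now, timeout=POLL_TIMEOUT_S):
    """Return the Standby that should promote itself, or None if the Master still responds."""
    detecting = [s for s in standbys if master_is_dead(s, now, timeout)]
    if not detecting:
        return None
    # Same ordering as the election: highest priority, then lowest GUID.
    return min(detecting, key=lambda s: (-s.priority, s.guid))


standbys = [
    StandbySM("sb-1", priority=3, guid=0x20, last_master_reply=100.0),
    StandbySM("sb-2", priority=7, guid=0x10, last_master_reply=100.0),
]
print(pick_new_master(standbys, now=110.0))       # None
print(pick_new_master(standbys, now=140.0).name)  # sb-2
```

At `now=110.0` only 10 seconds have passed since the last reply, so nobody promotes. At `now=140.0` both Standbys have timed out and the higher-priority one takes over.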
Impact:
- Existing Sessions: Generally not impacted.
- New Sessions: Must wait until the new Master is elected and the fabric is stable.
- LIDs: Usually do not change. The new Master attempts to retrieve the GUID-to-LID database from the old Master. If unavailable, it may trigger a new discovery and assignment phase.
Double Failover Scenario
A “double failover” occurs when a failed Master comes back online with a higher priority than the current Master, causing another handover.
Prevention:
To avoid unnecessary handovers, configure master_sm_priority. When a Standby promotes itself, it raises its priority to this value (for example 15, the highest), so the old Master (likely with a lower priority) does not immediately take back control when it returns.
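The effect of raising the priority on promotion can be modelled as follows. This is a simplified sketch: `outranks` and `promote` are hypothetical helpers, the GUIDs are made up, and using `max()` to model "raising" the priority is an assumption rather than confirmed SM behaviour.

```python
def outranks(challenger, incumbent):
    """True if challenger (priority, guid) beats incumbent: higher priority, then lower GUID."""
    c_prio, c_guid = challenger
    i_prio, i_guid = incumbent
    return (c_prio, -c_guid) > (i_prio, -i_guid)


def promote(standby_priority, master_sm_priority=None):
    """Priority a Standby advertises once it becomes Master (assumed model)."""
    if master_sm_priority is None:
        return standby_priority
    return max(standby_priority, master_sm_priority)


old_master = (10, 0xA1)  # (priority, GUID) of the Master that failed and later returns
standby_prio, standby_guid = 5, 0xB2

# Without the raise, the returning SM outranks the new Master: a second handover.
new_master = (promote(standby_prio), standby_guid)
print(outranks(old_master, new_master))  # True

# With master_sm_priority set to 15, the new Master keeps control.
new_master = (promote(standby_prio, master_sm_priority=15), standby_guid)
print(outranks(old_master, new_master))  # False
```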
Fabric Sweeps
The Subnet Manager continuously monitors the fabric using “sweeps”.
Light Sweep
- Frequency: Periodically (default every 10 seconds).
- Purpose: Checks for status changes without disrupting the fabric.
- Changes detected:
- Port status changes.
- New SM detected.
- Standby SM priority change.
- Outcome: If any significant change is detected, it triggers a Heavy Sweep.
Heavy Sweep
- Trigger: A Light Sweep that finds changes, or an InfiniBand Trap (e.g., a switch detecting a port state change).
- Process:
- Full fabric discovery (rediscover topology).
- New LIDs assigned (only if needed, e.g., for new hosts).
- Switch Linear Forwarding Tables (LFTs) are recalculated and reprogrammed.
- Impact:
- Traffic on affected routes may experience a brief disruption or added latency while the topology is recalculated.
- Host or Leaf switch failures typically trigger a Heavy Sweep.
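The relationship between the two sweep types can be sketched as below. This is a toy model under stated assumptions: `FabricSnapshot`, `light_sweep`, and `heavy_sweep` are hypothetical names, ports are identified by made-up GUIDs, and the routing (LFT) recalculation is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class FabricSnapshot:
    port_states: dict    # {port_guid: "Active" | "Down" | ...}
    sm_priorities: dict  # {sm_guid: priority} for every SM seen


def light_sweep(previous, current):
    """Return the reasons a heavy sweep is needed (empty list if nothing changed)."""
    reasons = []
    if previous.port_states != current.port_states:
        reasons.append("port state change")
    if set(current.sm_priorities) - set(previous.sm_priorities):
        reasons.append("new SM detected")
    if any(
        guid in previous.sm_priorities and previous.sm_priorities[guid] != prio
        for guid, prio in current.sm_priorities.items()
    ):
        reasons.append("SM priority change")
    return reasons


def heavy_sweep(snapshot, lids):
    """Rediscover ports and assign LIDs only to ports that do not have one yet."""
    updated = dict(lids)
    next_lid = max(updated.values(), default=0) + 1
    for port in sorted(snapshot.port_states):
        if port not in updated:
            updated[port] = next_lid
            next_lid += 1
    return updated  # a real SM would now recompute and reprogram the LFTs


before = FabricSnapshot({0x10: "Active", 0x11: "Active"}, {0xA: 15})
after = FabricSnapshot({0x10: "Active", 0x11: "Down", 0x12: "Active"}, {0xA: 15, 0xB: 0})

lids = {0x10: 1, 0x11: 2}
reasons = light_sweep(before, after)
print(reasons)  # ['port state change', 'new SM detected']
if reasons:
    lids = heavy_sweep(after, lids)
print(lids)  # {16: 1, 17: 2, 18: 3}
```

Existing ports keep their LIDs, and only the newly discovered port `0x12` receives a new one, matching the "only if needed" behaviour described above.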
Monitoring Utilities
The infiniband-diags package provides tools to monitor SM status.
- sminfo: Displays the Master SM’s LID, GUID, Priority, and State.
- smpquery: Queries subnet management (SMP) attributes of a node or port.
  - Example: smpquery nd 12 (Get Node Description of the node with LID 12).
- saquery: Queries the Subnet Administration database.
  - Example: saquery -s (List all active SMs, including Master and Standbys).
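For periodic checks, these commands can be wrapped in a small script. The sketch below only runs the three invocations shown above and returns their raw output; `run_diag` is a hypothetical helper, and no assumption is made about the output format of the tools, so nothing is parsed.

```python
import shutil
import subprocess


def run_diag(tool, *args, timeout=10):
    """Run a diagnostic command and return its stdout, or None if it is missing or fails."""
    path = shutil.which(tool)
    if path is None:
        return None  # tool not installed on this host
    try:
        result = subprocess.run(
            [path, *args],
            capture_output=True,
            text=True,
            timeout=timeout,
            check=False,
        )
    except subprocess.TimeoutExpired:
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    for cmd in (("sminfo",), ("smpquery", "nd", "12"), ("saquery", "-s")):
        out = run_diag(*cmd)
        print(" ".join(cmd), "->", out if out is not None else "unavailable")
```

On a host without the tools or without an InfiniBand device, every command is reported as unavailable instead of raising an exception.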