Skyway Gateway

The NVIDIA Skyway Gateway is an appliance that bridges InfiniBand (IB) fabrics to Ethernet networks, enabling IP-over-InfiniBand (IPoIB) connectivity between IB hosts and external Ethernet resources.

In most modern AI clusters, hosts are multi-homed — they have both InfiniBand and Ethernet interfaces. IB handles GPU-to-GPU RDMA traffic while Ethernet provides management, storage, and external connectivity. In these environments, Skyway is unnecessary.

Skyway becomes relevant when physical constraints limit the available network infrastructure to InfiniBand only — for example, when rack space, cabling, or switch budget prevents deploying a parallel Ethernet fabric, but hosts still need IP connectivity to external services.

Skyway Gateway operates as both an InfiniBand host on the fabric and an IP router. It facilitates communication by routing traffic from IB HCAs to Ethernet networks using the standard IPoIB protocol.

  • Protocol Support: Supports IPv4 addresses only.
  • Function: Acts as a bridge/gateway for IB hosts to access Ethernet services (storage, management, external connectivity).

The Skyway appliance is built on an x86 server platform equipped with multiple InfiniBand Host Channel Adapters (HCAs) and Ethernet ConnectX adapters.

  • Port Pairing: IB HCA ports and Ethernet adapter ports are bridged in pairs. Traffic entering a given IB port is forwarded out its corresponding Ethernet port.
  • Throughput: High-performance implementations utilize 8x HDR InfiniBand ports and 8x 200GbE ports (ConnectX-6) to deliver up to 1.6 Tb/s of throughput per appliance.
  • Scalability: Multiple appliances (e.g., up to 4) can be deployed in a single Skyway domain to scale total throughput.
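
The throughput figures above follow from simple arithmetic. A minimal sketch, assuming every port pair runs at line rate and that throughput scales linearly with appliance count (an idealization); the function name is hypothetical:

```python
def skyway_throughput_gbps(port_pairs: int = 8, port_speed_gbps: int = 200,
                           appliances: int = 1) -> int:
    """Theoretical aggregate throughput in Gb/s, assuming line rate on every port pair."""
    return port_pairs * port_speed_gbps * appliances

single = skyway_throughput_gbps()              # 8 x 200 Gb/s on one appliance
domain = skyway_throughput_gbps(appliances=4)  # four appliances in one domain
print(f"{single / 1000} Tb/s per appliance, {domain / 1000} Tb/s per domain (theoretical)")
```

Real-world throughput depends on traffic mix and how evenly flows spread across the port pairs, so treat these numbers as upper bounds.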

Skyway leverages SR-IOV (Single Root I/O Virtualization) to virtualize physical network resources.

  • Virtual Functions (VFs): Each physical IB port is divided into multiple Virtual Functions.
  • Addressing: Each VF is assigned its own unique Virtual GUID (V-GUID), Virtual GID (V-GID), and Virtual LID (V-LID).
  • Gateway Redundancy: The gateway configures its IB ports into a port-channel, assigning a single IP address to this logical interface. This IP acts as the default gateway for IPoIB-enabled hosts within the IB fabric.

The communication process involves address resolution (ARP) and path queries to the Subnet Manager (SM) to establish connections.

  1. Distribution: 64 VFs (V-GUIDs) are distributed across the InfiniBand ports on the gateway appliance.
  2. ARP Request: When an IB host needs to communicate with an Ethernet destination, it sends an ARP request for its default gateway (the Skyway IP). This broadcast is handled by the default HCA receiver on the Skyway appliance (typically ib0).
  3. Load Balancing: The Skyway kernel processes the ARP request. It selects a specific VF (and its V-GUID) to handle the traffic, load-balancing based on the source IP of the requesting host.
  4. Path Resolution: The IB host receives the V-GUID of the assigned gateway port. It then sends a Path Query to the Subnet Manager (SM) to resolve this V-GUID to a LID.
  5. SM Response: The Subnet Manager responds with the LID of the Skyway Gateway port.
  6. Data Transmission: The host encapsulates the IP packet into an InfiniBand packet and sends it to the gateway’s LID.
  7. Forwarding: The packet arrives at the specific IB port on the gateway. Hardware forwarding strips the IB headers and moves the IP packet to the paired Ethernet port.
  8. Egress: The Ethernet port routes the packet to the next hop or destination using its routing table.
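
Steps 1 to 3 above can be sketched as follows. The source does not say which hash Skyway uses, so the modulo on the integer source IP is an assumption, as are the round-robin VF-to-port layout, the V-GUID values, and every name in the block:

```python
import ipaddress

NUM_VFS = 64      # V-GUIDs distributed across the IB ports, per the text above
NUM_IB_PORTS = 8  # assumed: 8 IB ports, as in the HDR configuration

# Assumed layout: VFs spread round-robin across IB ports; V-GUID values are made up.
VFS = [
    {"vf": i, "ib_port": i % NUM_IB_PORTS, "vguid": f"0x0002c9000000{i:04x}"}
    for i in range(NUM_VFS)
]

def select_vf(src_ip: str) -> dict:
    """Pick a VF deterministically from the requester's source IP (hypothetical hash)."""
    return VFS[int(ipaddress.IPv4Address(src_ip)) % NUM_VFS]

for host in ("192.168.0.10", "192.168.0.11"):
    vf = select_vf(host)
    print(host, "-> VF", vf["vf"], "on IB port", vf["ib_port"], "V-GUID", vf["vguid"])
```

Because the choice depends only on the source IP, a given host keeps resolving to the same V-GUID, which the path query in step 4 then maps to a LID.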

For traffic in the reverse direction, from an Ethernet source to an IB host:

  1. Ingress: An external source sends an Ethernet packet destined for the IP address of an IB host.
  2. Routing: The external network routes the packet to the Skyway Ethernet port-channel.
  3. Internal Forwarding: The specific Ethernet port receiving the traffic forwards it to its paired IB port.
  4. ARP & Discovery: If the destination IB host’s MAC/GUID is not cached, the gateway sends an ARP request into the IB fabric.
  5. Host Response: The target IB host responds with its GUID.
  6. Path Query: The gateway queries the Subnet Manager to determine the LID for that GUID.
  7. SM Response: The Subnet Manager provides the destination LID.
  8. Encapsulation: The gateway encapsulates the Ethernet payload into an IPoIB packet and sends it to the IB host’s LID.
  9. Processing: The IB host receives and processes the packet.
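
Steps 4 to 7 above amount to a two-stage lookup with caching. A minimal sketch with stand-in functions for the ARP exchange and the SM path query; none of these names correspond to a real Skyway or OpenSM API, the GUID and LID values are invented, and caching the LID is an assumption (the source only mentions a cache for the host's address):

```python
from typing import Callable, Dict, List, Tuple

def make_resolver(arp: Callable[[str], str],
                  path_query: Callable[[str], int]) -> Callable[[str], int]:
    """Return a function that maps a destination IP to a LID, caching both lookups."""
    guid_cache: Dict[str, str] = {}  # IP -> GUID, filled by ARP replies
    lid_cache: Dict[str, int] = {}   # GUID -> LID, filled by SM responses

    def resolve(dst_ip: str) -> int:
        if dst_ip not in guid_cache:              # step 4: cache miss triggers ARP
            guid_cache[dst_ip] = arp(dst_ip)      # step 5: host replies with its GUID
        guid = guid_cache[dst_ip]
        if guid not in lid_cache:                 # step 6: query the Subnet Manager
            lid_cache[guid] = path_query(guid)    # step 7: SM returns the LID
        return lid_cache[guid]

    return resolve

# Stand-in fabric state, for illustration only.
calls: List[Tuple[str, str]] = []

def fake_arp(ip: str) -> str:
    calls.append(("arp", ip))
    return {"192.168.0.10": "0x0002c90300a1b2c3"}[ip]

def fake_sm(guid: str) -> int:
    calls.append(("sm", guid))
    return {"0x0002c90300a1b2c3": 17}[guid]

resolve = make_resolver(fake_arp, fake_sm)
first = resolve("192.168.0.10")
second = resolve("192.168.0.10")  # served from the caches, no new ARP or SM query
print(first, second, calls)
```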

A basic Skyway deployment involves configuring the IB port-channel, the Ethernet port-channel, the Subnet Manager for virtualization, and the IB hosts.

The Subnet Manager must have virtualization enabled for Skyway’s SR-IOV to function.

OpenSM — Add the following to opensm.conf:

virt_enabled 2
virt_max_ports_in_process 0
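
A quick way to confirm that a given opensm.conf carries both settings is to parse it as whitespace-separated option/value lines. This is a minimal sketch, assuming `#` starts a comment; the function name is hypothetical and the file location depends on your installation:

```python
REQUIRED = {"virt_enabled": "2", "virt_max_ports_in_process": "0"}

def check_opensm_virtualization(conf_text: str) -> dict:
    """Return {option: value_found_or_None} for each required option that is missing or differs."""
    found = {}
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        parts = line.split(None, 1)
        if len(parts) == 2:
            found[parts[0]] = parts[1].strip()
    return {opt: found.get(opt) for opt, want in REQUIRED.items() if found.get(opt) != want}

sample = """
# SR-IOV virtualization support
virt_enabled 2
virt_max_ports_in_process 0
"""
print(check_opensm_virtualization(sample))  # an empty dict means both settings match
```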

Managed IB Switch — If using the embedded SM on a managed switch:

configure terminal
ib sm virt enable
ib sm virt-max-ports-in-progress 0
write memory

Skyway uses a Cisco-like CLI (similar to MLNX-OS on NVIDIA IB switches). Configure the IB port-channel with a source IP and a virtual IP that IB hosts will use as their default gateway.

enable
configure terminal
interface ib port-channel 1 ip address 192.168.0.254/24
interface ib port-channel 1 virtual ip address 192.168.0.1/24
interface ib port-channel 1 mtu 4092
write memory

The MTU must be set to 4092 or lower. InfiniBand's maximum MTU is 4096 bytes, and the 4-byte IPoIB encapsulation header leaves at most 4092 bytes for the IP packet.

Configure each IB host with an IP on the IPoIB subnet and a default route pointing to the Skyway virtual IP.

ifconfig ib0 192.168.0.10/24
ip route add 0/0 via 192.168.0.1

For production, make these settings persistent using netplan or equivalent.
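
Before rolling the host settings out, it can help to sanity-check that the default gateway actually sits on the host's IPoIB subnet. A minimal sketch using Python's standard `ipaddress` module; the function name is hypothetical:

```python
import ipaddress

def check_host_config(host_cidr: str, gateway_ip: str) -> None:
    """Raise ValueError if the gateway is not usable from the host's IPoIB subnet."""
    iface = ipaddress.ip_interface(host_cidr)
    gw = ipaddress.ip_address(gateway_ip)
    if iface.version != 4 or gw.version != 4:
        raise ValueError("Skyway supports IPv4 only")
    if gw not in iface.network:
        raise ValueError(f"gateway {gw} is outside {iface.network}")
    if gw == iface.ip:
        raise ValueError("host address collides with the gateway virtual IP")

check_host_config("192.168.0.10/24", "192.168.0.1")  # values from the example above
print("host addressing is consistent")
```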

Configure the Ethernet port-channel on the Skyway appliance. LACP must be set to active mode on the remote Ethernet switch.

enable
configure terminal
interface ethernet port-channel 1 ip address 192.168.1.2/30
interface ethernet port-channel 1 mtu 4090
ip route 0.0.0.0/0 192.168.1.1
write memory

The Ethernet MTU should be lower than the IB port-channel MTU.
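
The two MTU rules (IB port-channel at 4092 or lower, Ethernet port-channel lower than the IB one) are easy to encode as a pre-deployment check. A minimal sketch with hypothetical names:

```python
IPOIB_MAX_MTU = 4092  # upper bound for the IB port-channel MTU, per the text above

def check_mtus(ib_mtu: int, eth_mtu: int) -> list:
    """Return a list of problems; an empty list means the MTU pair is acceptable."""
    problems = []
    if ib_mtu > IPOIB_MAX_MTU:
        problems.append(f"IB MTU {ib_mtu} exceeds {IPOIB_MAX_MTU}")
    if eth_mtu >= ib_mtu:
        problems.append(f"Ethernet MTU {eth_mtu} is not lower than IB MTU {ib_mtu}")
    return problems

print(check_mtus(4092, 4090))  # the values used in the examples above
print(check_mtus(4092, 9000))  # a jumbo-frame Ethernet MTU would violate the second rule
```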

An example configuration for the Ethernet switch connected to the Skyway appliance:

enable
configure terminal
interface port-channel 1
no shut
exit
interface ethernet 0/1-0/8
channel-group 1 mode active
no shut
exit
vlan 10
exit
interface port-channel 1 switchport access vlan 10
interface port-channel 1 switchport mode access
interface vlan 10 ip address 192.168.1.1/24
ip route 192.168.0.0/24 192.168.1.2
end
wr mem

The static route 192.168.0.0/24 points to the Skyway Ethernet IP as the next hop for the IPoIB subnet.

Skyway supports HA deployments with up to 4 appliances in a single gateway domain. All appliances share a common LACP port-channel across the Ethernet side, similar to a VPC/MLAG design.

Each appliance in the domain holds one of three roles:

  • Master Gateway — Only one per domain. Responsible for V-GUID assignment, load balancing, and overall domain coordination.
  • Active Backup Gateway(s) — Actively forwarding traffic and ready to assume the master role.
  • Non-Active Backup Gateway(s) — Standing by for failover.

Each domain member distributes its IB host list to all other members in the domain.

  • MLNX-GW OS version must be identical across all appliances.
  • All appliances must share the same L2 management subnet.
  • All appliances must use the same HA domain ID.
  • All Skyway Ethernet interfaces must be connected to L3 router interfaces.
  • Virtual IP and Ethernet port-channel configuration must be identical on all appliances.
  • Even single-appliance deployments should include HA configuration to simplify future scale-out.
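
Several of the prerequisites above boil down to "this value must be identical on every appliance", which lends itself to an automated check. A minimal sketch over a hypothetical inventory record; the class, field names, version labels, and management subnet are all placeholders, not Skyway data structures:

```python
from dataclasses import dataclass
from typing import List

MAX_APPLIANCES = 4  # domain size limit, per the text above

@dataclass(frozen=True)
class Appliance:
    """Hypothetical inventory record for one gateway appliance."""
    name: str
    os_version: str
    ha_domain_id: int
    mgmt_subnet: str
    virtual_ip: str

SHARED_FIELDS = ("os_version", "ha_domain_id", "mgmt_subnet", "virtual_ip")

def ha_problems(appliances: List[Appliance]) -> List[str]:
    """Report domain-size violations and fields that differ between appliances."""
    problems = []
    if len(appliances) > MAX_APPLIANCES:
        problems.append(f"{len(appliances)} appliances exceed the limit of {MAX_APPLIANCES}")
    for field in SHARED_FIELDS:
        values = {getattr(a, field) for a in appliances}
        if len(values) > 1:
            problems.append(f"{field} differs: {sorted(map(str, values))}")
    return problems

domain = [
    Appliance("gw1", "release-A", 1, "10.0.0.0/24", "192.168.0.1"),
    Appliance("gw2", "release-B", 1, "10.0.0.0/24", "192.168.0.1"),
]
print(ha_problems(domain))  # flags the OS version mismatch
```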

On the master appliance, set a higher priority to ensure election as master:

gw ha 1
gw ha priority 100

On all other appliances, join the same domain:

gw ha 1

All appliances require a reboot after HA configuration. Validate the setup with:

show gw ha

When multiple Skyway appliances are in the same domain, all their Ethernet ports join a single port-channel on the remote switch. For example, with two appliances (ports 0/1-0/8 for appliance 1, 0/9-0/16 for appliance 2):

enable
configure terminal
interface port-channel 1
no shut
exit
interface ethernet 0/1-0/16
channel-group 1 mode active
no shut
exit
vlan 10
exit
interface port-channel 1 switchport access vlan 10
interface port-channel 1 switchport mode access
interface vlan 10 ip address 192.168.1.1/24
ip route 192.168.0.0/24 192.168.1.2
end
wr mem

If a gateway appliance fails (hardware failure, cabling issue, or port configuration change), its V-GUIDs are automatically reassigned to other HCAs in the domain. IB hosts see no disruption — from their perspective, the gateway remains reachable.
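
The reassignment idea can be illustrated with a toy model. The source does not describe the actual policy, so the round-robin distribution, the gateway names, and the V-GUID labels below are all assumptions made for illustration:

```python
def distribute(vguids: list, gateways: list) -> dict:
    """Assign V-GUIDs round-robin to the given gateways (assumed policy)."""
    assignment = {gw: [] for gw in gateways}
    for i, vguid in enumerate(vguids):
        assignment[gateways[i % len(gateways)]].append(vguid)
    return assignment

def fail_over(assignment: dict, failed: str) -> dict:
    """Move the failed gateway's V-GUIDs to the survivors, keeping existing placements."""
    survivors = [gw for gw in assignment if gw != failed]
    if not survivors:
        raise RuntimeError("no surviving gateway in the domain")
    result = {gw: list(assignment[gw]) for gw in survivors}
    for i, vguid in enumerate(assignment[failed]):
        result[survivors[i % len(survivors)]].append(vguid)
    return result

vguids = [f"vguid-{i:02d}" for i in range(64)]
before = distribute(vguids, ["gw1", "gw2", "gw3"])
after = fail_over(before, "gw2")
print({gw: len(v) for gw, v in after.items()})
```

In this model the V-GUIDs themselves never change, only the gateway that serves them, which is why hosts can keep using the addresses they already resolved.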

Skyway supports multiple IPoIB subnets, one per partition key (P_Key), enabling multi-tenant environments where each partition operates as an isolated network segment (similar to VLANs).

  • Each P_Key gets its own IPoIB subnet, but all share the same Skyway domain.
  • A single domain supports up to 20 IPoIB subnets (best practice recommends 10 or fewer to avoid longer boot times).
  • P_Key interfaces only support IPv4.
  • All fabric nodes are connected to the management P_Key (0x7FFF) by default.

To configure a P_Key-specific IPoIB subnet on the Skyway IB port-channel (using P_Key 0x1 as an example):

configure terminal
interface ib port-channel 1 pkey 0x1
interface ib port-channel 1 pkey 0x1 ip address 192.168.0.254 255.255.255.0
interface ib port-channel 1 pkey 0x1 virtual ip address 192.168.0.1 255.255.255.0
write memory

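On the IB host, create an interface for the P_Key. By convention the interface is named ib0.<hex>, where <hex> is the P_Key with the full-membership bit (0x8000) set, so P_Key 0x1 becomes ib0.8001: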

ifconfig ib0.8001 192.168.0.10/24
ip route add 0/0 via 192.168.0.1
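
The interface name used above can be derived from the P_Key value. A minimal sketch, assuming the common Linux convention of naming the child interface after the P_Key with the full-membership bit (0x8000) set; the helper name is hypothetical:

```python
def pkey_interface_name(parent: str, pkey: int) -> str:
    """Build the IPoIB child interface name for a P_Key, with the full-membership bit set."""
    if not 0 < pkey <= 0x7FFF:
        raise ValueError("P_Key must be a 15-bit value between 0x0001 and 0x7FFF")
    return f"{parent}.{pkey | 0x8000:04x}"

print(pkey_interface_name("ib0", 0x1))
```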

View all P_Key interfaces on the Skyway appliance:

show interfaces ib port-channel 1 pkey brief

To display SM-configured P_Keys, check partitions.conf on the Subnet Manager. Running ifconfig on a host will show its configured P_Key interface.

On IB hosts:

| Command | Purpose |
|---|---|
| ifconfig / ip addr | Verify IPoIB interface IP configuration |
| route / ip route | Verify default route points to Skyway virtual IP |
| ping / ibping | Test connectivity |
| ibstat | Check local HCA link status |
| ibswitches | List switches in the fabric |
| iblinkinfo | Show link connectivity details |

On the Skyway appliance:

| Command | Purpose |
|---|---|
| show interfaces ib | Display all IB interfaces |
| show interfaces ib port-channel | Display IB port-channel status |
| show interfaces eth port-channel | Display Ethernet port-channel status |
| show gw vf-distribution | Show VF-to-HCA port assignments |
| show gw ha | Display HA domain status and roles |
| show asic version | Display HCA firmware versions |
| show images | List available OS images |
| show version | Show installed OS version |

ibdiagnet can be used to validate Virtual Functions. Compare its output file ibdiagnet2.vports with the output of the Skyway command show gw vf-distribution to confirm that the VF assignments match.

Virtualization settings are stored in opensm.conf on the Subnet Manager.