As the parameter scale of large AI models continues to expand, global computing infrastructure is undergoing a fundamental transformation centered on interconnect architecture upgrades. Industry analysis suggests that 2026 will mark a critical inflection point, as shipments of commercial GPUs rise in tandem with the large-scale deployment of proprietary ASICs by cloud service providers. This convergence is reshaping intra-data-center communication, with optical interconnects and copper cabling forming a complementary technology stack to support hyperscale AI clusters.
NVIDIA’s newly unveiled Rubin architecture has set a new industry benchmark. Its NVL144 rack leverages sixth-generation NVLink and NVSwitch technologies to deliver 129.6 TB/s of bidirectional bandwidth per rack. At the physical interconnect level, the architecture adopts 1.6T Active Electrical Cables (AECs) for high-density connections between GPUs and switching chips, pushing per-lane speeds to 224G. This design reinforces copper as the dominant solution for short-reach, in-rack transmission, while a fat-tree topology addresses network congestion in ultra-large-scale clusters. Estimates indicate that a 9,216-GPU cluster would require 288 spine switches, with roughly 12 optical modules per chip, directly stimulating demand for high-end optical modules.
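These figures are consistent with standard fat-tree arithmetic. The Python sketch below reproduces the 288-spine count under assumptions that are ours, not NVIDIA’s: radix-64 switches (e.g., 64 × 800G ports on a 51.2T ASIC), two independent fabric planes, a non-blocking two-tier leaf/spine design, and copper for the GPU-to-leaf hop.

```python
# Back-of-the-envelope fat-tree sizing. Assumptions (not from the article):
# radix-64 switches, two fabric planes, non-blocking two-tier leaf/spine,
# copper (AEC) for GPU-to-leaf links, optics for leaf-to-spine links.

GPUS = 9_216
RADIX = 64    # ports per switch (assumed)
PLANES = 2    # independent fabric planes (assumed)

down_per_leaf = RADIX // 2            # ports facing GPUs
up_per_leaf = RADIX - down_per_leaf   # ports facing spines (1:1, non-blocking)

leaves_per_plane = GPUS // down_per_leaf             # 288
uplinks_per_plane = leaves_per_plane * up_per_leaf   # 9,216
spines_per_plane = uplinks_per_plane // RADIX        # 144
total_spines = spines_per_plane * PLANES             # 288, matching the article

# Each optical leaf-to-spine link terminates in one module at each end.
optical_modules = uplinks_per_plane * 2 * PLANES     # 36,864

print(f"leaves/plane={leaves_per_plane}, total spines={total_spines}")
print(f"leaf-spine optical modules={optical_modules} "
      f"(~{optical_modules / GPUS:.0f} per GPU)")
```

Under these assumptions the leaf-to-spine tier alone accounts for about four optical modules per GPU; the roughly 12-per-chip ratio cited above would additionally count optical GPU-to-leaf links and/or a third switching tier.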
Google’s TPU cluster interconnect strategy highlights a highly customized approach. Its 64-accelerator rack adopts a 3D torus topology, achieving 4.8 TB/s of bidirectional bandwidth through a hybrid interconnect model. Within the rack, 80 copper cables handle direct communication between non-adjacent boards, while optical modules are deployed for inter-rack connectivity. At hyperscale, Google introduces Optical Circuit Switching (OCS), using 64 optical circuit switches, each with 300×300 ports, to dynamically reconfigure optical paths. This reduces power consumption by over 30% for training workloads at the 100,000-accelerator scale, while significantly improving network flexibility.
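The cable count follows from torus geometry. The sketch below enumerates the links of a 64-node 3D torus, assuming 4×4×4 dimensions, which the article does not specify; which hops ride PCB traces versus discrete copper cables is likewise an assumption, so the 80-cable figure plausibly covers the wraparound and other long runs rather than every link.

```python
from itertools import product

# Link census for a 64-accelerator 3D torus, assuming 4x4x4 dimensions
# (the article does not specify them).
DIMS = (4, 4, 4)

def torus_links(dims):
    """Yield each undirected link once, tagging the wraparound edges
    that close each ring (the longest physical runs in a torus)."""
    for node in product(*(range(d) for d in dims)):
        for axis, size in enumerate(dims):
            nbr = list(node)
            nbr[axis] = (node[axis] + 1) % size
            kind = "wraparound" if node[axis] == size - 1 else "adjacent"
            yield node, tuple(nbr), kind

links = list(torus_links(DIMS))
wrap = sum(1 for _, _, kind in links if kind == "wraparound")
print(f"total links={len(links)}, adjacent={len(links) - wrap}, wraparound={wrap}")
# -> total links=192, adjacent=144, wraparound=48 for a 4x4x4 torus
```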
Amazon’s Trainium3 networking architecture underscores innovation at the physical layer. In the NVL72×2 configuration, 144 chips are interconnected through a three-tier structure combining PCBs, backplane connectors and AEC copper cables. Each rack deploys 216 PCIe copper cables across 64 ports, forming an all-copper interconnect that sustains 400G low-latency bandwidth while cutting rack-level power consumption by approximately 25%. For cluster scaling, Amazon adopts a dual-plane network architecture: the ENA network handles front-end traffic, while the EFA network is dedicated to compute communication. Using a Clos topology, the system supports linear scaling to clusters exceeding 130,000 chips.
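The claim of scaling past 130,000 chips can be sanity-checked against the textbook bound for a non-blocking three-tier fat-tree, which supports k³/4 endpoints for switch radix k. The radix values in the sketch below are illustrative assumptions, not published AWS figures.

```python
# Endpoint capacity of a non-blocking three-tier fat-tree as a function
# of switch radix k: k**3 / 4 hosts (standard k-ary fat-tree bound).
# The radix values below are illustrative assumptions, not AWS figures.

def fat_tree_hosts(k: int) -> int:
    """Maximum endpoints of a k-ary three-tier fat-tree (k even)."""
    return k ** 3 // 4

TARGET = 130_000  # cluster size cited in the article

for radix in (32, 64, 128):
    hosts = fat_tree_hosts(radix)
    print(f"radix {radix:>3}: {hosts:>7,} endpoints "
          f"({'meets' if hosts >= TARGET else 'misses'} the 130k target)")

# radix 64 tops out at 65,536 endpoints, so exceeding 130,000 chips needs
# a higher radix or an extra tier; adding planes (as with the dual ENA/EFA
# design) multiplies bandwidth and redundancy rather than endpoint count.
```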
Meta’s Disaggregated Scheduled Fabric (DSF) offers another architectural blueprint. The Minerva rack combines Tomahawk 5 switching chips with 112G PAM4 copper backplanes to deliver 204.8 Tbps of symmetric bandwidth. In hyperscale deployments, the two-tier DSF switching network maintains a non-blocking 1:1 subscription ratio to ensure lossless transmission. Building an 18,432-chip cluster under this design requires approximately 184,000 800G optical modules. This configuration underscores the irreplaceable role of optical modules in long-reach interconnects, while reaffirming copper’s cost advantage in short-distance scenarios.
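The cited module count works out to roughly ten 800G modules per chip. The sketch below checks that arithmetic; the per-hop accounting (two modules per optical link, two optical tiers) is our assumption about where the optics sit, not a disclosed Meta design detail.

```python
# Sanity check of the cited module count for an 18,432-chip DSF build.
CHIPS = 18_432
MODULES = 184_000   # 800G optical modules cited in the article

per_chip = MODULES / CHIPS
print(f"{per_chip:.2f} modules per chip")   # ~9.98, i.e. roughly 10:1

# Assumed accounting (not a disclosed Meta figure): with two optical
# tiers (chip-to-leaf and leaf-to-spine) at 1:1 subscription, each
# 800G of per-chip injection bandwidth consumes 4 modules
# (2 per link x 2 tiers). Ten modules per chip then implies about
# 2.5 x 800G = 2 Tbps of injection bandwidth per accelerator.
print(f"implied injection bandwidth: {per_chip / 4 * 0.8:.1f} Tbps per chip")
```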
Industry analysts note that AI infrastructure development in 2026 will be characterized by the parallel advancement of optical and copper technologies. Within racks, high-speed copper solutions such as DACs and AECs are expected to dominate thanks to their lower power consumption and cost, driving capacity expansion across the supply chain. Meanwhile, as cluster sizes surpass the 100,000-accelerator threshold, demand for optical modules is set to grow exponentially. In high-specification deployments by companies such as NVIDIA and Meta, the number of optical chips and modules per system is rising sharply, potentially triggering a new wave of capacity competition.
Market participants are therefore advised to focus on leading players in optical communication chips and high-speed copper cable connectors. These companies have already secured first-mover advantages through technological iteration and forward-looking capacity expansion, positioning them to benefit from the next phase of AI-driven infrastructure growth.
Source: MSN
