    Nvidia Rubin’s Network Doubles Bandwidth

By Admin | January 11, 2026



Earlier this week, Nvidia surprise-announced its new Vera Rubin architecture (no relation to the recently unveiled telescope) at the Consumer Electronics Show in Las Vegas. The new platform, set to reach customers later this year, is advertised to offer a ten-fold reduction in inference costs and a four-fold reduction in the number of GPUs needed to train certain models, compared to Nvidia's Blackwell architecture.

The usual suspect for improved performance is the GPU. Indeed, the new Rubin GPU boasts 50 quadrillion floating-point operations per second (50 petaFLOPS) of 4-bit computation, compared to 10 petaFLOPS on Blackwell, at least for transformer-based inference workloads like large language models.

    However, focusing on just the GPU misses the bigger picture. There are a total of six new chips in the Vera-Rubin-based computers: the Vera CPU, the Rubin GPU, and four distinct networking chips. To achieve performance advantages, the components have to work in concert, says Gilad Shainer, senior vice president of networking at Nvidia.

    “The same unit connected in a different way will deliver a completely different level of performance,” Shainer says. “That’s why we call it extreme co-design.”

    Expanded “in-network compute”

    AI workloads, both training and inference, run on large numbers of GPUs simultaneously. “Two years back, inferencing was mainly run on a single GPU, a single box, a single server,” Shainer says. “Right now, inferencing is becoming distributed, and it’s not just in a rack. It’s going to go across racks.”

To accommodate these hugely distributed tasks, as many GPUs as possible need to work effectively as one. This is the aim of the so-called scale-up network: the connection of GPUs within a single rack. Nvidia handles this connection with its NVLink networking chip. The new line includes the NVLink6 switch, with double the bandwidth of the previous version (3,600 gigabytes per second for GPU-to-GPU connections, compared to 1,800 GB/s for the NVLink5 switch).
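As a rough illustration of what that doubling buys, the sketch below divides a hypothetical payload by the two bandwidths quoted above. The 80 GB payload is a made-up example, and real transfers also pay latency and protocol overhead, so treat the result as ideal wire time only.

```python
# Back-of-envelope: ideal wire time to move a payload between two GPUs
# at the per-GPU bandwidths cited above. The 80 GB payload is invented
# for illustration; real transfers add latency and protocol overhead.

def transfer_time_ms(payload_gb: float, bandwidth_gb_s: float) -> float:
    """Payload size divided by link bandwidth, in milliseconds."""
    return payload_gb / bandwidth_gb_s * 1e3

payload_gb = 80.0  # hypothetical slice of model weights

for name, bw in [("NVLink5", 1_800.0), ("NVLink6", 3_600.0)]:
    print(f"{name}: {payload_gb:.0f} GB at {bw:.0f} GB/s "
          f"-> {transfer_time_ms(payload_gb, bw):.1f} ms")
# NVLink5: 80 GB at 1800 GB/s -> 44.4 ms
# NVLink6: 80 GB at 3600 GB/s -> 22.2 ms
```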

In addition to the bandwidth doubling, the scale-up chips also include double the number of SerDes (serializer/deserializers, which allow data to be sent across fewer wires) and support an expanded set of calculations that can be done within the network.

    “The scale-up network is not really the network itself,” Shainer says. “It’s computing infrastructure, and some of the computing operations are done on the network…on the switch.”

The rationale for offloading some operations from the GPUs to the network is two-fold. First, it allows some tasks to be done only once, rather than requiring every GPU to perform them. A common example is the all-reduce operation in AI training. During training, each GPU computes a mathematical operation called a gradient on its own batch of data. To train the model correctly, all the GPUs need to know the average gradient computed across all batches. Rather than each GPU sending its gradient to every other GPU, with every one of them computing the same average, it saves computational time and power for that averaging to happen only once, within the network.
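The savings are easy to see in miniature. The sketch below simulates the averaging step with NumPy and counts messages under the two schemes; it is an illustration of the arithmetic, not Nvidia's implementation, and the GPU and parameter counts are invented.

```python
# A minimal NumPy sketch of the all-reduce averaging step described
# above. This illustrates the math and the message-count savings, not
# Nvidia's actual implementation; all sizes and counts are made up.
import numpy as np

n_gpus, n_params = 8, 4
rng = np.random.default_rng(0)
grads = [rng.standard_normal(n_params) for _ in range(n_gpus)]

# Naive scheme: every GPU sends its gradient to every other GPU, and
# each one redundantly computes the same average itself.
naive_messages = n_gpus * (n_gpus - 1)   # 56 messages, 8 identical sums

# In-network scheme: gradients are summed once as they pass through the
# switch, and the single averaged result fans back out to all GPUs.
in_network_messages = 2 * n_gpus         # ~16 messages, 1 sum
avg_grad = np.mean(grads, axis=0)        # the reduction done in-network

print("naive messages:", naive_messages)
print("in-network messages:", in_network_messages)
print("average gradient:", avg_grad)
```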

A second rationale is to hide the time it takes to shuttle data between GPUs by doing computations on the data en route. Shainer explains this via an analogy of a pizza parlor trying to speed up the time it takes to deliver an order. “What can you do if you had more ovens or more workers? It doesn’t help you; you can make more pizzas, but the time for a single pizza is going to stay the same. Alternatively, if you would take the oven and put it in a car, so I’m going to bake the pizza while traveling to you, this is where I save time. This is what we do.”
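The analogy translates into a simple timing model: split the work into chunks and let each chunk's transfer hide behind the previous chunk's computation. The numbers below are invented purely for illustration; real schedules depend on the workload and the network.

```python
# A toy timing model of the pizza-in-the-car idea: overlapping
# computation with data movement. All numbers are invented for
# illustration.

chunks = 10        # data split into chunks so work can be pipelined
compute_ms = 2.0   # time to process one chunk
transfer_ms = 2.0  # time to move one chunk between GPUs

# Sequential: move all the data first, then compute on it.
sequential = chunks * (transfer_ms + compute_ms)

# Overlapped: while chunk k is being processed, chunk k+1 is already in
# flight, so after the first transfer the two costs hide each other.
overlapped = transfer_ms + chunks * max(transfer_ms, compute_ms)

print(f"sequential: {sequential:.0f} ms, overlapped: {overlapped:.0f} ms")
# sequential: 40 ms, overlapped: 22 ms
```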

In-network computing is not new to this iteration of Nvidia’s architecture; in fact, it has been in common use since around 2016. But this iteration adds a broader swath of computations that can be done within the network, to accommodate different workloads and different numerical formats, Shainer says.

    Scaling out and across

    The rest of the networking chips included in the Rubin architecture comprise the so-called scale-out network. This is the part that connects different racks to each other within the data center.

Those chips are the ConnectX-9, a networking interface card; the BlueField-4, a so-called data processing unit, which is paired with two Vera CPUs and a ConnectX-9 card for offloading networking, storage, and security tasks; and finally the Spectrum-6 Ethernet switch, which uses co-packaged optics to send data between racks. The Ethernet switch also doubles the bandwidth of the previous generation, while minimizing jitter: the variation in arrival times of information packets.

    “Scale-out infrastructure needs to make sure that those GPUs can communicate well in order to run a distributed computing workload and that means I need a network that has no jitter in it,” he says. The presence of jitter implies that if different racks are doing different parts of the calculation, the answer from each will arrive at different times. One rack will always be slower than the rest, and the rest of the racks, full of costly equipment, sit idle while waiting for that last packet. “Jitter means losing money,” Shainer says.
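That cost can be sketched with a toy model: if every rack must wait for the slowest arrival each step, the idle fraction grows directly with the timing spread. The rack count and timings below are hypothetical.

```python
# A toy model of "jitter means losing money": in a synchronous step,
# every rack waits for the slowest arrival, so variation in packet
# timing becomes idle hardware. All figures are hypothetical.
import random

random.seed(1)
n_racks, base_ms = 16, 10.0

def idle_fraction(jitter_ms: float) -> float:
    """Share of total rack-time spent waiting on the slowest rack."""
    times = [base_ms + random.uniform(0.0, jitter_ms) for _ in range(n_racks)]
    slowest = max(times)
    idle = sum(slowest - t for t in times)
    return idle / (slowest * n_racks)

for jitter in (0.0, 1.0, 5.0):
    print(f"jitter up to {jitter:.0f} ms -> {idle_fraction(jitter):.0%} idle")
```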

None of Nvidia’s host of new chips is specifically dedicated to connecting data centers to one another, termed “scale-across.” But Shainer argues this is the next frontier. “It doesn’t stop here, because we are seeing the demands to increase the number of GPUs in a data center,” he says. “100,000 GPUs is not enough anymore for some workloads, and now we need to connect multiple data centers together.”
