Move Over InfiniBand and Fibre Channel, All Paths Lead to Ethernet

As we look forward to 2020 and reflect on the major data center infrastructure product & solution announcements of 2019, I can’t help but think about how the Ethernet fabric is going places it has not gone before. I remember the time when I managed a diverse portfolio of network fabrics – Fibre Channel for storage networks, Ethernet for server connectivity and InfiniBand for high-performance compute clusters. Today, however, modern data centers and IT solutions are deploying Ethernet networks alone.

I want to share some key product & solution announcements from 2019 to prove the point.

  • At the Supercomputing show in November 2019, the TOP500 list was updated. TOP500 ranks the world’s 500 most powerful computer systems. These supercomputers are typically built using a distributed compute architecture connected by a high-performance network fabric. In the past, these fabrics were based on proprietary technologies. In the November 2019 list, however, the majority of systems use high-performance Ethernet fabrics, and Ethernet continues to grow rapidly among TOP500 and other HPC cluster deployments.
  • In 2015, Intel launched Omni-Path, a high-performance interconnect for HPC customers. In the second half of 2019, however, it was widely reported that Intel decided to retreat from Omni-Path in favor of Ethernet for HPC customers.
  • Oracle Exadata is an engineered solution that delivers better performance, lower cost and higher availability for Oracle databases running OLTP, data warehousing and in-memory analytics workloads. It was built using scale-out compute servers and scale-out storage servers connected by an InfiniBand fabric. In late 2019, Oracle announced Exadata X8M, which now runs over a high-performance RoCE (RDMA over Converged Ethernet) network fabric.
  • Customers are increasingly embracing AI/ML technologies to harness the vast amount of data being generated everywhere. Training AI models on large datasets requires specialized AI processors connected by a very high-performance, low-latency network. Habana Labs, an AI startup, has developed a training processor with on-chip 100G RoCEv2 connectivity to a high-performance Ethernet network – a proof point of Ethernet becoming the fabric for AI/ML solutions.
  • At AWS re:Invent in December 2019, one of the AWS keynotes focused specifically on how the AWS platform is optimized for HPC and AI workloads. AWS infrastructure offers a range of compute options connected to a highly scalable Ethernet network fabric through EFA (Elastic Fabric Adapter), a network interface with kernel bypass that accelerates inter-node communication at scale for HPC & ML applications (a sketch of the kind of inter-node collective such fabrics accelerate appears after this list).
  • Earlier in 2019, Facebook shared details of F16, its next-generation data center fabric design, which uses 12.8Tbps Ethernet switches to deliver 4x the capacity of its previous design. Today, all cloud data centers exclusively use Ethernet fabrics built with the latest generation of Ethernet switch ASICs to connect hundreds of thousands of servers and storage nodes.
  • We are seeing the rise of NVMe-oF (NVMe over Fabrics) storage arrays with up to 12 x 100G network connections, so a server farm can share access to NVMe drives over a high-performance, low-latency Ethernet network. Leading-edge NVMe-oF products from Western Digital, Seagate, Dell EMC and others rely on an Ethernet fabric to serve data center customers.
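
The EFA and RoCE examples above exist to accelerate one pattern in particular: collective operations such as allreduce, which synchronize gradients across nodes in distributed AI training. The fragment below is a minimal, illustrative sketch of that pattern using mpi4py and NumPy; it assumes an MPI library built for the fabric in use (for example libfabric/EFA on AWS, or a RoCE-capable verbs provider), and the buffer size, script name and launch command are placeholders rather than a vendor-prescribed recipe.

    # allreduce_sketch.py -- illustrative sketch only; assumes mpi4py and numpy
    # are installed and the underlying MPI library is built for the fabric in
    # use (e.g., libfabric/EFA on AWS, or a RoCE-capable verbs provider).
    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    # Stand-in for a per-rank gradient buffer in data-parallel training.
    local_grad = np.full(1 << 20, float(rank), dtype=np.float32)
    summed = np.empty_like(local_grad)

    # The collective the fabric must make fast: every rank contributes its
    # buffer and receives the element-wise sum across all ranks.
    comm.Allreduce(local_grad, summed, op=MPI.SUM)

    if rank == 0:
        print(f"{size} ranks: allreduce of {local_grad.nbytes >> 20} MiB per rank complete")

Launched with something like mpirun -n <ranks> python allreduce_sketch.py across a cluster, the same script runs unchanged over plain TCP or over a kernel-bypass fabric; the fabric only changes how quickly the Allreduce completes, which is exactly why HPC and AI/ML clusters care so much about the network.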

Here are some key reasons why Ethernet will continue to increase its market dominance.

  • Ubiquitous, cost-effective solutions from multiple providers: Ethernet is a ubiquitous technology with solutions available from a range of providers who constantly innovate to offer the most compelling and cost-effective products. In contrast, only a single provider offers InfiniBand solutions and just two offer Fibre Channel solutions. Ethernet networks are also easier to deploy, manage and troubleshoot than the alternatives.
  • Higher performance, higher radix and low-latency solutions: Today, Ethernet switches offer connectivity of up to 400 Gbps, whereas InfiniBand HDR tops out at 200 Gbps and Fibre Channel at 32 Gbps. High-performance, low-latency Ethernet solutions are well suited to HPC workloads. Further, Fibre Channel and InfiniBand switch ASICs have not kept up with high-radix Ethernet switches, which let customers build a fabric with fewer tiers/hops and, hence, lower cost, power and latency – critical requirements for HPC clusters.
  • Enhancements for storage deployments: For storage applications, Ethernet offers capabilities such as DCB (Data Center Bridging), PFC (Priority Flow Control) and RoCE to deliver a high-throughput, scalable storage network.
  • High-performance Ethernet NICs: Today, customers have access to high-performance, low-latency NICs with robust support for capabilities such as RDMA and MPI (Message Passing Interface), which are critical requirements in HPC and AI/ML deployments (a simple MPI latency ping-pong sketch follows this list).
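
To make the latency point concrete, here is a minimal ping-pong sketch in mpi4py – the classic way to measure the node-to-node latency on which low-latency NICs and switch fabrics are judged. It is an illustrative example only; the 8-byte message size and iteration count are arbitrary placeholders, and the launch command in the comment assumes a generic mpirun.

    # pingpong_sketch.py -- minimal latency probe; illustrative only.
    # Run with exactly 2 ranks placed on different nodes, e.g.:
    #   mpirun -n 2 --map-by node python pingpong_sketch.py
    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    peer = 1 - rank                     # the other of the two ranks
    buf = np.zeros(8, dtype=np.uint8)   # tiny message, so latency dominates
    iters = 10000

    comm.Barrier()
    start = MPI.Wtime()
    for _ in range(iters):
        if rank == 0:
            comm.Send(buf, dest=peer, tag=0)
            comm.Recv(buf, source=peer, tag=0)
        else:
            comm.Recv(buf, source=peer, tag=0)
            comm.Send(buf, dest=peer, tag=0)
    elapsed = MPI.Wtime() - start

    if rank == 0:
        # Each iteration is one round trip; one-way latency is half of that.
        print(f"one-way latency ~ {elapsed / iters / 2 * 1e6:.2f} us")

Run back-to-back over a standard TCP stack and then over an RDMA-capable path, the same script makes the benefit of kernel-bypass NICs and low-latency switches directly visible in the number it prints.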

We are proud to have contributed to this Ethernet market momentum. In early 2018, Innovium delivered the world’s highest-performance 12.8Tbps TERALYNX switch silicon, with the lowest latency and unmatched telemetry. We enabled leading OEM and ODM system partners to develop a broad range of high-performance 100-400G switches that are deployed today by the world’s top public and private cloud data center customers. We continue to innovate and deliver unmatched telemetry & analytics, programmability, and scalable Ethernet solutions across all performance & price points for Cloud and Edge data centers.

Please contact us at [email protected] for information on our products.