Simplifying Data Center Networking Deployments

Summary: As the complexity of deploying applications and managing data across a multi-cloud infrastructure increases, data center operators must prioritize simplicity in design, infrastructure, and operations.

Modern Data Centers deploy networks that support applications and workloads in a multi-cloud environment. Applications are the lifeblood of any Data Center: they are the engines of the business, and application life-cycle management is a central driving force in a Data Center environment. Modern Data Centers run a complex array of applications – Business Intelligence, Payroll, CRM, Messaging, Marketing, and Customer Service, to name a few.

Business Critical Applications are one of the main sources of increasing complexity in Data Centers. Since these applications are essential and have to run as intended, most IT operators focus on managing the consequences of running them – namely, rising capital and operational expenses. For example, in a Data Center with tens of thousands of servers, adding memory to each server or adding a port on a switch can quickly add up. Capital expense considerations become increasingly important as the scale of the Data Center grows.

In addition to capital expenses, operational costs also need careful analysis. In a typical Data Center, operational expenses are roughly 4 times the capital expenses and are a major component of the Total Cost of Ownership (TCO). Lowering operational expenses has been a key ongoing goal of IT administrators and Data Center operators.

Data Center Networking at its core connects compute, storage, and application resources with each other. As application scale and requirements increase, network designs can become complex. Simplification is a must for any IT Manager/Architect in order to maintain the wide array of applications that support the business goals and strategy of the enterprise. Here are some areas to consider as you think about simplifying your data center networking:

  • Network design: Sub-optimal designs can lead to higher complexity and increasing operational expenses due to frequent troubleshooting. In the past, data center networking was implemented as a Layer 2 network fabric with Spanning Tree Protocol as the control plane. Unfortunately, this makes very inefficient use of the available capacity (many links may be blocked to prevent loops) and can cause broadcast storms – one of the top reasons for network meltdowns. Data Center network design has matured over many years, and most deployments have now standardized on a well-understood architecture.
  • Network Infrastructure: Expensive modular chassis with multi-chip designs are complex to implement and manage – the operational costs of managing devices with multiple hardware components and proprietary software can be high. Software complexity also increases exponentially with the number of components and layers. Oftentimes, when using a proprietary NOS, most of its components are not used but are nevertheless “running” in your data center; even though they are rarely exercised, they can cause collateral damage through unexpected outages.
  • Operations and Network Management: Manual operations and proprietary tools (which may not interoperate) increase complexity, leading to higher operational costs. Operational expenses can also rise with frequent updates of hardware or software components – whether due to failures or to new requirements making the current gear obsolete.

Network Design

Figure 1: Spine/Leaf Data Center Network

Modern Data Centers are increasingly deploying spine-leaf architectures (also called folded Clos) to meet application needs for high throughput and low latency. Advantages of spine-leaf architectures are:

  • Resiliency: Each leaf switch connects to every spine switch, so spanning tree is not needed and every uplink can be used concurrently.
  • Latency: Any East-West flow traverses at most 2 switch hops, so ultra-low latency is standard.
  • Performance: True active-active uplinks enable traffic to flow over the least congested high-speed links available.
  • Scalability: Add leaf switches to reach the desired port capacity and add spine switches as needed for uplink bandwidth (a sizing sketch follows this list).
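
To make the scalability point concrete, here is a minimal sizing sketch for a two-tier leaf-spine fabric. The port counts and speeds are illustrative assumptions, not recommendations from this text.

```python
# Rough leaf-spine sizing sketch. Port counts and speeds are assumptions.
def leaf_spine_capacity(leaf_downlinks=48, downlink_gbps=25,
                        leaf_uplinks=8, uplink_gbps=100,
                        spine_ports=64):
    """Estimate the size of a two-tier leaf-spine fabric."""
    # One uplink from every leaf to every spine: the spine count is bounded
    # by the uplinks per leaf, and the leaf count by the ports per spine.
    spines = leaf_uplinks
    leaves = spine_ports
    servers = leaves * leaf_downlinks
    oversub = (leaf_downlinks * downlink_gbps) / (leaf_uplinks * uplink_gbps)
    return {"leaves": leaves, "spines": spines,
            "servers": servers, "oversubscription": oversub}

print(leaf_spine_capacity())
# {'leaves': 64, 'spines': 8, 'servers': 3072, 'oversubscription': 1.5}
```

Adding spine switches (and leaf uplinks) lowers the oversubscription ratio; adding leaf switches grows server port capacity, up to the spine radix.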

In today’s Data Centers, the appetite for bandwidth is insatiable. Clustered systems such as HPC and machine-learning clusters require higher bandwidth and can accept higher per-hop latency, provided the fabric is built from fewer switches: eliminating hops from the switch fabric makes up for that higher per-hop latency.

In many parts of the network (e.g., at hyperscalers), there is a need for more bandwidth, but there is also sometimes a need to support more ports on a switch ASIC. Increasing the radix of the device enables flatter topologies, eliminating switching devices, and therefore cost, from the network without sacrificing performance or bandwidth.
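
A minimal sketch of why radix matters: for a non-oversubscribed two-tier Clos in which each leaf splits its ports evenly between servers and spine uplinks, doubling the switch radix quadruples the number of hosts the same flat topology can attach. The radix values below are illustrative.

```python
# Effect of switch radix on a non-oversubscribed two-tier (leaf-spine) Clos.
# Assumes each leaf splits its ports evenly between hosts and spine uplinks.
def two_tier_clos(radix):
    hosts_per_leaf = radix // 2      # half the ports face servers
    spines = radix // 2              # one uplink to each spine
    leaves = radix                   # each spine port terminates one leaf
    return {"hosts": leaves * hosts_per_leaf, "switches": leaves + spines}

for radix in (32, 64, 128):
    print(radix, two_tier_clos(radix))
# radix 32 -> 512 hosts / 48 switches; 64 -> 2048 / 96; 128 -> 8192 / 192
```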

Applications in modern data centers are increasingly cloud-native and based on a microservices architecture. These applications drive a massive increase in East-West traffic within the data center and place greater emphasis on low latency per hop. Network congestion or delays impact customer experience, so the infrastructure must provide rich telemetry and analytics support to diagnose and troubleshoot issues.

The right switch for your network should provide high bandwidth where needed but also support a higher radix to reduce the switch footprint in the infrastructure. Switch architectures designed to accommodate these considerations of bandwidth, radix, and latency will usually provide better tradeoffs and alternatives than legacy architectures.

Innovium’s TERALYNX Ethernet Switch Silicon family has been architected from the ground up to provide the most optimized high radix Ethernet solutions for data centers. Innovium delivers the world’s highest performance switch silicon with large buffers, unmatched analytics through fine-grain telemetry, low latency, and line-rate programmability. The monolithic silicon die leads to some of the industry’s best power efficiency in terms of performance per watt.

Network Infrastructure

Network disaggregation can be defined as the separation of networking equipment into functional components, allowing each component to be deployed individually:

  • separation of the software OS from the underlying hardware (see the sketch after this list)
  • open APIs to enable SDN (Software Defined Networking) control
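
As a conceptual illustration only (the class and method names below are hypothetical, not any real NOS or SAI API), disaggregation boils down to the NOS programming switch hardware through an abstraction layer, so the same software can run on different silicon:

```python
# Conceptual sketch of a disaggregated stack: the NOS talks to an abstraction
# layer, and each ASIC vendor supplies its own backend. All names here are
# hypothetical, for illustration only.
from abc import ABC, abstractmethod

class SwitchBackend(ABC):
    """Hardware abstraction the NOS programs against."""
    @abstractmethod
    def create_port(self, port_id: int, speed_gbps: int) -> None: ...
    @abstractmethod
    def add_route(self, prefix: str, next_hop: str) -> None: ...

class VendorAsicBackend(SwitchBackend):
    """One vendor's SDK-backed implementation (stubbed out here)."""
    def create_port(self, port_id, speed_gbps):
        print(f"ASIC: port {port_id} @ {speed_gbps}G")
    def add_route(self, prefix, next_hop):
        print(f"ASIC: route {prefix} via {next_hop}")

def provision(backend: SwitchBackend):
    # NOS-side logic is written once, against the abstraction.
    backend.create_port(1, 100)
    backend.add_route("10.0.0.0/24", "192.0.2.1")

provision(VendorAsicBackend())
```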

The goal of disaggregation is to bring unprecedented agility and choice while significantly driving down costs. While hyperscalers have embraced disaggregation for some time now, it is not yet prevalent in enterprises. The most common reason is that a typical enterprise does not have the resources to handle the integration among the components – hardware, optics, and software from different sources. Innovium has recognized this gap and launched TERACertified solutions to address it. Innovium TERACertified platforms are hardened to be production grade and support verified interoperability with a variety of connectivity partners. They are designed to be turn-key, with fast custom feature enablement that accelerates operational efficiency.

While disaggregation is a shift in thinking, other issues also deserve attention. Data Center operations expenses are a significant driver of costs – capacity changes and service agility are some of the activities that can increase operational costs. Troubleshooting, debugging, and customer satisfaction are, however, rated as the top operational concerns among data center operators. Fault containment and isolation are paramount in designing data center networks. Fault containment means running software components that are right-sized for the application. Monolithic Network Operating Systems, which bundle every feature under the sun into a single image, are inherently complex and expensive to manage.

Consider a leaf-spine network like the one shown in Figure 1:

The leaf and spine switches need only route packets – no other forwarding logic is needed here (for the data plane). Burdening the NOS on these switches with other software components would only increase the likelihood of system failures. In other words, run only what you need – not everything a vendor happens to bundle into a single NOS image. This is where disaggregation provides the greatest value: the separation of components also enables composing the NOS image to include only those components that are needed. A smaller NOS image implies a smaller fault domain. It’s that simple.
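
As a sketch of the “run only what you need” idea, assume the NOS is assembled from optional feature components; the component names and role profiles below are hypothetical, not any specific vendor's catalog:

```python
# Hypothetical sketch: compose a role-specific NOS image from a catalog of
# optional feature components, keeping the fault domain small.
FULL_CATALOG = {"bgp", "lldp", "telemetry", "vxlan", "mpls",
                "dhcp-relay", "nat", "pim", "stp"}

ROLE_PROFILES = {
    # A routed spine only needs to route packets plus basic plumbing.
    "l3-spine": {"bgp", "lldp", "telemetry"},
    "l3-leaf":  {"bgp", "lldp", "telemetry", "vxlan", "dhcp-relay"},
}

def compose_image(role: str) -> set:
    wanted = ROLE_PROFILES[role]
    missing = wanted - FULL_CATALOG
    if missing:
        raise ValueError(f"unknown components: {missing}")
    return wanted

print(sorted(compose_image("l3-spine")))                  # what ships
print(sorted(FULL_CATALOG - compose_image("l3-spine")))   # what stays out
```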

If Layer 2 extensions are needed, TOR (Top of Rack) switches and/or hypervisors can provide overlay networks for constructs such as VLAN stretching, micro-segmentation, and firewall security. Keeping the core/aggregation network IP-routed reduces complexity and helps achieve lower operational costs.
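
For example, on a Linux hypervisor or TOR the overlay segment can be built with standard iproute2 commands while the underlay between tunnel endpoints stays plain IP-routed. The VNI, interface names, and addresses below are illustrative.

```python
# Sketch: build a VXLAN overlay segment on a Linux hypervisor/leaf using
# standard iproute2 commands; the underlay stays IP-routed end to end.
# VNI, device names, and addresses are illustrative; requires root on Linux.
import subprocess

UNDERLAY_IF = "eth0"        # routed underlay uplink
VNI = 10100                 # VXLAN Network Identifier for the stretched segment
LOCAL_VTEP = "192.0.2.11"   # this host's tunnel endpoint address

commands = [
    # VXLAN tunnel endpoint bound to the routed underlay interface
    f"ip link add vxlan{VNI} type vxlan id {VNI} local {LOCAL_VTEP} "
    f"dstport 4789 dev {UNDERLAY_IF} nolearning",
    # bridge the overlay segment to the local VMs/containers
    "ip link add br100 type bridge",
    f"ip link set vxlan{VNI} master br100",
    f"ip link set vxlan{VNI} up",
    "ip link set br100 up",
]

for cmd in commands:
    subprocess.run(cmd.split(), check=True)
```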

When choosing physical network infrastructure, it is important to pay attention to some critical requirements. RDMA over Converged Ethernet (RoCE) is a table-stakes feature to support in Data Centers. RDMA allows servers to exchange data without involving the host processor, cache, or operating system, and is a key driver of faster data transfer rates and low-latency networking. RoCE places certain requirements on the switching infrastructure with respect to Quality-of-Service attributes – PFC, IP ECN bits, CNP frames, etc. These typically require support from the switch silicon, so care must be taken when choosing network silicon. Innovium switch silicon, for instance, supports RoCE v2.
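
As a hedged sketch (the structure and numbers are illustrative assumptions, not any switch's actual configuration schema), those RoCE requirements typically translate into an intent like this: a lossless priority with PFC enabled, ECN marking thresholds on its queue, and a separate priority for CNP frames.

```python
# Illustrative QoS intent for RoCE v2 traffic; values and layout are
# assumptions for discussion, not a real device's configuration model.
roce_qos_intent = {
    "lossless_priority": 3,             # RoCE traffic carried on PFC priority 3
    "pfc": {"enabled_priorities": [3]}, # pause only the lossless priority
    "ecn": {                            # mark CE on the RoCE queue as it fills
        "queue": 3,
        "min_threshold_kb": 150,
        "max_threshold_kb": 1500,
        "mark_probability_pct": 5,
    },
    "cnp": {"priority": 6},             # congestion notification packets kept high priority
    "trust": "dscp",                    # classify RoCE traffic by DSCP at the edge
}

# An automation layer would render this intent into the vendor's actual
# configuration (for example through SONiC's QoS tables or an SDK).
print(roce_qos_intent)
```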

SONiC is the de-facto open-source NOS for data centers. It is growing fast, has been adopted by leading companies and hyperscalers, and is gathering steam among enterprises and tier-2 cloud providers. SONiC gives customers faster innovation, agility, and greater control with an open-source network OS compared to proprietary network OS from traditional vendors. Innovium enables rich and hardened SONiC support across the entire product line. With an SDK built from the ground up for SAI and SONiC, Innovium has deployed SONiC at some of the biggest data centers. Innovium contributes to the SONiC community and creates targeted value-added features to enable complete solutions for end customers and partners that leverage TERALYNX’s unique capabilities, such as FLASHLIGHT™ telemetry. Innovium’s hardware architecture allows for flexible scaling, enabling customers to allocate available table space according to the requirements of the use case.
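
One reason SONiC suits programmatic management is that its running configuration lives in a Redis-backed CONFIG_DB, commonly seeded from /etc/sonic/config_db.json. The sketch below edits a port entry in that JSON file; the port name and values are illustrative, and on a live switch the change would typically go through the SONiC CLI or CONFIG_DB directly.

```python
# Sketch: adjust a port's speed and admin status in SONiC's CONFIG_DB seed
# file. Port name and values are illustrative.
import json

CONFIG = "/etc/sonic/config_db.json"

with open(CONFIG) as f:
    cfg = json.load(f)

port = cfg.setdefault("PORT", {}).setdefault("Ethernet0", {})
port["speed"] = "100000"        # 100G
port["admin_status"] = "up"

with open(CONFIG, "w") as f:
    json.dump(cfg, f, indent=4)
```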

Operations and Network Management

Reducing operational costs is a top priority for Data Center operators, and deployment, training, and time spent troubleshooting are significant drivers of these costs. Proprietary technologies cause a proliferation of management applications and become unmanageable in a short period of time. One way to reduce costs is to leverage existing automation and orchestration tools as much as possible: if the network infrastructure can be managed with the same toolset used for other areas such as compute and storage, personnel are already familiar with the tools, and that is a win. Much of today’s compute infrastructure is managed using Linux tools, so network equipment that can be managed by the same tools is a great advantage. Innovium’s open-source NOS can be managed using existing Linux management tools – providing a better experience and lower costs for Data Center administrators.

Automation is another top priority of cloud operators. It relies on all components being driven through an API, so an API-first approach to design helps greatly. API-first means developing products with the goal of being managed through programmatic interfaces from day one – not as an afterthought. The Innovium hardware SDK and SONiC operating system follow an API-first mindset and provide programmatic interfaces that unlock breakthrough telemetry and analytics capabilities. For example, Innovium silicon supports real-time path-level telemetry with the ability to measure delay/jitter on a per-hop basis – a level of detail that is unprecedented in the industry today.
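
To illustrate what per-hop delay and jitter measurements make possible, here is a small, self-contained sketch that reduces a stream of per-hop delay samples (synthetic numbers, not real telemetry output) to the statistics an operator might alert on:

```python
# Reduce per-hop delay samples (microseconds, synthetic data) to the
# statistics a telemetry pipeline might alert on: mean delay and jitter.
from statistics import mean, pstdev

# delay samples per hop for one path, e.g. leaf1 -> spine2 -> leaf7
path_samples_us = {
    "leaf1":  [1.2, 1.3, 1.1, 5.8, 1.2],
    "spine2": [0.9, 0.9, 1.0, 0.9, 1.1],
    "leaf7":  [1.4, 1.3, 1.5, 1.4, 1.3],
}

for hop, samples in path_samples_us.items():
    avg = mean(samples)
    jitter = pstdev(samples)          # spread of delay = jitter
    worst = max(samples)
    flag = "  <-- investigate" if jitter > 1.0 else ""
    print(f"{hop}: mean={avg:.2f}us jitter={jitter:.2f}us max={worst:.2f}us{flag}")
```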

NetDevOps describes the intersection of DevOps and networking. It aims to automate manual network processes by treating infrastructure as code. The network infrastructure must be configurable programmatically for all tasks, without requiring an operator to interact through a CLI or GUI. It leverages the API-first design approach to reduce manual steps and lower costs. Innovium is committed to supporting NetDevOps as an operating model by providing programmatic interfaces across the portfolio.
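
A minimal infrastructure-as-code sketch follows; the device names, intent format, and push function are hypothetical placeholders. The point is the workflow: desired state lives in version-controlled data, and a pipeline renders and applies it without anyone touching a CLI or GUI.

```python
# Minimal infrastructure-as-code sketch. Intent format, device names, and
# the apply step are hypothetical stand-ins for a real pipeline (CI tests,
# then a push over the device's programmatic API).

# Desired state, kept in version control alongside application code.
intent = {
    "leaf01": {"bgp_asn": 65001, "uplinks": ["spine01", "spine02"]},
    "leaf02": {"bgp_asn": 65002, "uplinks": ["spine01", "spine02"]},
}

def render(device: str, desired: dict) -> dict:
    """Turn high-level intent into a device-level configuration document."""
    return {
        "hostname": device,
        "bgp": {"asn": desired["bgp_asn"], "neighbors": desired["uplinks"]},
    }

def apply(device: str, config: dict) -> None:
    """Placeholder for a push over a programmatic interface (no CLI/GUI)."""
    print(f"pushing to {device}: {config}")

for device, desired in intent.items():
    apply(device, render(device, desired))
```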

Another business consideration is the recurring cost of upgrades and infrastructure replacement. These are not just the expenses of buying new hardware and software but also the loss of revenue due to downtime. Thus, care must be taken to deploy technologies with a longer shelf life – in other words, to “future-proof” your infrastructure to the extent possible. Innovium TERALYNX™ silicon has a highly programmable architecture, allowing operators to deploy new features and protocols (such as Geneve) without requiring a “fork-lift” upgrade of the infrastructure. Innovium silicon programmability can be used through the SDK, SAI, or SONiC as needed.
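
To give a concrete sense of what “a new protocol such as Geneve” means for a programmable pipeline, the sketch below parses the fixed 8-byte Geneve base header defined in RFC 8926; the sample bytes are fabricated for illustration.

```python
# Parse the fixed 8-byte Geneve base header (RFC 8926). A programmable
# pipeline can be taught such a parse graph in the field; the sample
# packet bytes here are fabricated for illustration.
import struct

def parse_geneve(header: bytes) -> dict:
    b0, b1, proto, vni_hi, vni_lo = struct.unpack("!BBHHB", header[:7])
    return {
        "version":  b0 >> 6,
        "opt_len":  (b0 & 0x3F) * 4,   # option length in 4-byte words -> bytes
        "oam":      bool(b1 & 0x80),
        "critical": bool(b1 & 0x40),
        "protocol": hex(proto),        # 0x6558 = Ethernet payload
        "vni":      (vni_hi << 8) | vni_lo,
    }

sample = bytes([0x00, 0x00, 0x65, 0x58, 0x00, 0x27, 0x10, 0x00])
print(parse_geneve(sample))
# {'version': 0, 'opt_len': 0, 'oam': False, 'critical': False,
#  'protocol': '0x6558', 'vni': 10000}
```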

Increasing complexity is a fact of life in modern data centers – what helps lower costs and increase ROI is to adopt designs and architectures that are simple to understand, implement and operate. The key to simplification is selecting infrastructure vendors who embrace this philosophy in product design.