Architecting for Resiliency

Designing solutions with resiliency is one of the most critical aspects of network and edge architecture. While there is no single correct answer to how much resiliency is needed, there are best practices, recommendations for different use cases, and specific services and features that Network Edge offers.

The underlying Network Functions Virtualization (NFV) platform that provides the infrastructure for Network Edge is inherently fault-tolerant from a single virtual instance standpoint. Still, you must design high availability into the overall solution to achieve the maximum redundancy possible. This document explains how to achieve high availability using the inherent nature of the platform complemented by Network Edge fault-tolerant features.

Levels of Redundancy

Consider a network design by following a data packet from its origin outward to its destination. Each point where traffic is processed or traversed is a possible point of failure. The key is to design against an impacting event at each of these traversal points.

In a simple network flow from a Network Edge device to an Equinix Fabric™ participant, as shown in the next graphic, the traffic flow has three distinct points: the Network Edge virtual device, Equinix Fabric, and the Fabric participant connection. There is inherent redundancy between the Network Edge virtual instance and Equinix Fabric because the leaf-and-spine architecture that interconnects the underlying NFV platform to the Fabric is redundant. There is no inherent redundancy between Equinix Fabric and the Fabric participant. To achieve maximum redundancy, you must deploy a solution that uses different planes of connectivity across the complete network flow.
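The flow described above can be written down as a small checklist. The following Python sketch is illustrative only: the `Segment` type, the `FLOW` list, and the function name are hypothetical and are not part of any Equinix API. It simply records which links the text describes as inherently redundant and reports the ones that you must design for yourself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One link in the traffic flow (hypothetical model, not an Equinix API)."""
    name: str
    inherently_redundant: bool


# Links of the simple flow, in traversal order, as described in the text.
FLOW = [
    Segment("Network Edge virtual device -> Equinix Fabric", True),
    Segment("Equinix Fabric -> Fabric participant", False),
]


def segments_needing_design(flow):
    """Return the names of links that have no inherent redundancy."""
    return [s.name for s in flow if not s.inherently_redundant]


if __name__ == "__main__":
    for name in segments_needing_design(FLOW):
        print("Design redundancy for:", name)
```

In this model, only the link between Equinix Fabric and the Fabric participant is reported, which matches the point where the text says redundancy must be designed in.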

Primary and Secondary Planes

The concept of Primary and Secondary planes is behind the deployment of Network Edge and Equinix Fabric. Network Edge achieves the Primary and Secondary planes using compute separation through affinity. Each virtual instance in a fault-tolerant pair is deployed in its respective cluster.

We deploy Equinix Fabric switches as part of a chassis group consisting of Primary and Secondary switches. The Primary and Secondary designations are simply a naming convention that identifies the switches in the chassis group; they do not indicate traffic flows. Active-to-Active or Active-to-Standby routing behavior is configured on the Network Edge device and is completely under your control.

By combining and connecting the Primary and Secondary planes, you can create a highly available architecture.
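As a rough illustration of why two planes help, the sketch below applies the standard formula for the availability of two paths in parallel. It assumes that the planes fail independently and that either plane alone can carry the traffic. The function name and the 0.99 input are hypothetical examples, not Equinix figures or service-level commitments.

```python
def parallel_availability(a_primary, a_secondary):
    """Availability of two independent paths where either one is sufficient.

    Both arguments are probabilities between 0 and 1. Independence of
    failures is an assumption of this sketch, not a platform guarantee.
    """
    for a in (a_primary, a_secondary):
        if not 0.0 <= a <= 1.0:
            raise ValueError("availability must be between 0 and 1")
    # The pair is down only when both paths are down at the same time.
    return 1.0 - (1.0 - a_primary) * (1.0 - a_secondary)


if __name__ == "__main__":
    # Hypothetical example: each plane is available 99% of the time.
    combined = parallel_availability(0.99, 0.99)
    print(f"{combined:.4f}")  # roughly 0.9999
```

The same reasoning explains why a single non-redundant link anywhere in the flow limits the availability of the whole path, no matter how redundant the other links are.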

Redundant Compared to Cluster or High-Availability (HA) Virtual Instance

Network Edge workflows ensure that paired virtual devices in a fault-tolerant deployment are placed on the Primary and Secondary compute planes. There are two types of fault-tolerant deployments: Redundant and Cluster (or HA). Redundant virtual instances are deployed on different compute planes for redundancy. They have no higher-level workflows, meaning the devices are unaware of each other after initial deployment and function as two distinct virtual devices.

Cluster (or HA) virtual instances have higher-level workflows that will deploy either an Active-to-Active or Active-to-Standby device pair as defined by the respective vendor. Check the documentation to verify Cluster (or HA) support for your virtual device.

Note: You can deploy redundant virtual instances across metros but can only deploy Cluster (or HA) devices within the same metro.

Virtual Connections

You create virtual connections from the Network Edge virtual device through the Fabric switch to the Fabric participant virtual instance. Virtual connection workflows are flexible and support multiple redundancy scenarios.

Note: All scenarios shown in the following illustrations assume the Provider participant connects to both the Primary and Secondary Fabric. If there are any questions about redundant Fabric connections, consult with your Equinix Global Solution Architect.

Connecting to Same Metro with Same Provider

Use this connection scenario to connect to the same provider using Network Edge devices in the same metro location. In this scenario, Network Edge supports both Redundant and Cluster (or HA) deployments. Virtual circuit workflows ensure each circuit is provisioned on the Primary and Secondary Fabric planes, respectively. An example is redundant connections to the same Fabric participant.

Connecting to Same Metro with Different Providers

Use this connection scenario to connect to different providers using Network Edge devices in the same metro location. In this scenario, Network Edge supports both Redundant and Cluster (or HA) deployments. Virtual circuit workflows ensure each circuit is provisioned on the Primary and Secondary Fabric planes, respectively. An example is redundant connections to different Fabric participants.

Connecting to Different Metro with Same Provider

Use this connection scenario to connect to the same provider using Network Edge virtual instances in different metro locations. In this scenario, Network Edge supports Redundant deployments but does not support Cluster (or HA) deployments. Virtual circuit workflows ensure each circuit is provisioned on the Primary and Secondary Fabric planes, respectively. An example is redundant connections from two different metros to the same Fabric participant.

Connecting to Different Metro with Different Providers

Use this connection scenario to connect to different providers using Network Edge devices in different metro locations. In this scenario, Network Edge supports Redundant deployments but does not support Cluster (or HA) deployments. Virtual circuit workflows ensure each circuit is provisioned on the Primary and Secondary Fabric planes, respectively. An example is redundant connections from two different metros to different Fabric participants.
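The four connection scenarios differ only in whether the two devices share a metro and whether they connect to the same provider. The following Python sketch summarizes the deployment support described for each scenario. All names in it (`Deployment`, `supported_deployments`, `scenario_name`, and the metro and provider strings) are hypothetical placeholders for illustration and do not correspond to Equinix API calls.

```python
from enum import Enum


class Deployment(Enum):
    REDUNDANT = "Redundant"
    CLUSTER = "Cluster (or HA)"


def supported_deployments(metro_a, metro_b):
    """Deployment types available for a device pair in the given metros.

    Cluster (or HA) pairs must share a metro; Redundant pairs may span metros.
    """
    if metro_a == metro_b:
        return {Deployment.REDUNDANT, Deployment.CLUSTER}
    return {Deployment.REDUNDANT}


def scenario_name(metro_a, metro_b, provider_a, provider_b):
    """Label the connection scenario for a pair of devices and providers."""
    metro = "Same Metro" if metro_a == metro_b else "Different Metro"
    provider = "Same Provider" if provider_a == provider_b else "Different Providers"
    return f"{metro} with {provider}"


if __name__ == "__main__":
    # Placeholder metro codes and provider names, for illustration only.
    pairs = [
        ("metro-1", "metro-1", "provider-x", "provider-x"),
        ("metro-1", "metro-1", "provider-x", "provider-y"),
        ("metro-1", "metro-2", "provider-x", "provider-x"),
        ("metro-1", "metro-2", "provider-x", "provider-y"),
    ]
    for m1, m2, p1, p2 in pairs:
        kinds = sorted(d.value for d in supported_deployments(m1, m2))
        print(scenario_name(m1, m2, p1, p2), "->", ", ".join(kinds))
```

Note that in this sketch the provider choice affects only the scenario label. Per the descriptions above, the supported deployment type depends on the metro placement alone.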

Recommendation

This document provides an overview of the fault-tolerant solutions that you can achieve using Network Edge. The available components are designed to allow maximum flexibility when used in tandem to deploy high-availability solutions. You can also add high availability later, once proof-of-value testing is complete and the solution is ready for production deployment. For any questions about building and deploying highly available solutions on Network Edge, contact your Equinix Solution Architect.