X-As-A-Y: What-As-A-Why!

Sometime in 2012, ETSI introduced NFV (Network Functions Virtualisation) to automate telecom services and applications using cloud technologies. ETSI soon realized that NFV alone was not sufficient for automation, so a few years later it started the MANO (Management and Orchestration) framework to complement it. Conceptually, buzzwords like NFV, cloud and orchestration (MANO) may look easy to grasp, but sorting out the details and building a real telecom automation solution is anything but simple.

To start with, let’s look at a traditional IT infrastructure. A legacy data-centre infrastructure is resource-centric, i.e. the operator plans for the capacity of resources. As a result, to launch a service or application, an operator must do three things. First, gather the resource requirements for compute, storage and network. Second, make sure the capacity for each resource is available. Finally, provision compute and storage, and configure the data and storage networks (often manually). The problem is that the resource-centric model is service/application-agnostic, i.e. the infrastructure knows nothing about the service. So someone has to translate the service/application requirements into resource requirements by hand and configure the resources accordingly.
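The three manual steps above can be sketched in a few lines of Python. This is only an illustration of the resource-centric mindset, not a real provisioning tool: the `Resources` type, the `shortfall` helper and all the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical resource bundle: vCPUs, storage in GB, network in Gbps."""
    vcpus: int
    storage_gb: int
    network_gbps: int


def shortfall(required: Resources, available: Resources) -> dict:
    """Return each resource that is missing, with the missing amount."""
    gaps = {}
    for name in ("vcpus", "storage_gb", "network_gbps"):
        missing = getattr(required, name) - getattr(available, name)
        if missing > 0:
            gaps[name] = missing
    return gaps


# Step 1: someone translates the service into resource requirements by hand.
billing_service = Resources(vcpus=64, storage_gb=2000, network_gbps=10)

# Step 2: check the requirements against the free capacity (made-up numbers).
free_capacity = Resources(vcpus=48, storage_gb=5000, network_gbps=40)
gaps = shortfall(billing_service, free_capacity)

# Step 3 (provisioning and network configuration) can start only when
# gaps is empty; here the data centre is 16 vCPUs short.
print(gaps)
```

Note that nothing in this model knows what "billing" is; the infrastructure only ever sees numbers, which is exactly the limitation described above.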

In contrast, cloud technologies are service-centric, based on a model called XaaS (X-As-A-Service). All public cloud providers offer a mix of three XaaS flavours: Infrastructure-As-A-Service (IaaS, VM-based virtualization), Platform-As-A-Service (PaaS, container-based virtualization) and Software-As-A-Service (SaaS).

To make ETSI NFV a success story, an operator needs IaaS (while adoption of PaaS depends on the readiness of telecom applications). An operator’s deployment of IaaS (or PaaS) for internal purposes is known as a private cloud.

So, how do you build an IaaS? The answer is Hyperconverged Infrastructure (HCI).

“Hyper” comes from hypervisor, the compute virtualization layer of a data centre. “Converged” means putting compute, storage and network into the same host (or cluster of hosts). The virtualization of storage and network is called SDS (software-defined storage) and SDN (software-defined networking), respectively. So the hypervisor, SDS and SDN together make a complete HCI. Going from HCI to IaaS (or PaaS) takes just one more step.

SDS is a must for HCI; a traditional SAN (Storage Area Network) can’t scale with growing east-west traffic. Precisely to handle east-west traffic, data networks are switching from the three-tier architecture to the Clos topology (the leaf-spine topology); sadly, there is no such topology for the storage network. As a result, the traditional SAN has to go away. SDS has both proprietary and open-source implementations, and in the details they differ considerably.

For example, Cisco HX performs SDS at the hypervisor layer, i.e. the hypervisor virtualizes both compute and storage. VMware and Ceph (an open-source project), on the other hand, both have a separate software stack for SDS, i.e. the hypervisor is not part of storage virtualization. All HCI implementations of SDS use DAS (direct-attached storage) inside the host. The locally attached storage forms a virtualized cache tier and capacity tier that services and applications can use. All vendor-specific SDS products support advanced SAN functionality such as replication, deduplication, compression and encryption, while an open-source SDS might not. Apart from hiding physical disks, another critical function of an SDS is virtualizing the storage type. All SDS implementations support persistent object storage alongside the traditional block and file types in the same capacity tier.
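To make the idea of pooling direct-attached disks more concrete, here is a toy sketch of a capacity tier with replication. It is not how Cisco HX, vSAN or Ceph place data; `Host`, `StoragePool`, the "most free space first" placement rule and the capacity figures are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """A host contributing its direct-attached storage to the pool."""
    name: str
    capacity_gb: int
    used_gb: int = 0

    @property
    def free_gb(self) -> int:
        return self.capacity_gb - self.used_gb


class StoragePool:
    """Toy capacity tier: hides the individual disks behind one pool."""

    def __init__(self, hosts, replicas=2):
        self.hosts = list(hosts)
        self.replicas = replicas

    def usable_gb(self) -> int:
        # Rough estimate only: every GB written consumes `replicas` GB of raw space.
        return sum(h.free_gb for h in self.hosts) // self.replicas

    def write(self, size_gb: int) -> list:
        """Place one copy on each of the `replicas` hosts with the most free space."""
        targets = sorted(self.hosts, key=lambda h: h.free_gb, reverse=True)[: self.replicas]
        if len(targets) < self.replicas or any(h.free_gb < size_gb for h in targets):
            raise RuntimeError("not enough free space for all replicas")
        for host in targets:
            host.used_gb += size_gb
        return [host.name for host in targets]


pool = StoragePool(
    [Host("host-a", 1000), Host("host-b", 1000), Host("host-c", 500)],
    replicas=2,
)
placement = pool.write(100)   # copies land on two different hosts
print(placement, pool.usable_gb())
```

The application only sees `write` and `usable_gb`; which physical disk holds which replica is the pool's business, which is the "hiding physical disks" part of the paragraph above.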

Whatever is usually said about SDN, the SDN in HCI has quite a different purpose. HCI needs SDN to make the network service/application-centric by using so-called micro-segmentation. Micro-segmentation allows network and security policy to be applied directly to a VM or application instead of to infrastructure resources. In a traditional data-centre network, policies and rules are applied to resources such as IP addresses and ports in the form of ACLs, route policies, PBR, etc. As a result, after a service is decommissioned, its configuration is left behind and forgotten. With micro-segmentation, network and security policies are created, moved and removed along with the lifecycle of the VM.
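The difference between the two models can be illustrated with a small Python sketch. It does not use any NSX, ACI or OVN API; `IpAclTable`, `MicroSegmentedFabric`, the VM name and the addresses are hypothetical and exist only to contrast rules keyed by IP address with rules that follow the VM.

```python
from dataclasses import dataclass, field


@dataclass
class IpAclTable:
    """Traditional model: rules are keyed by IP address, unaware of any VM."""
    rules: dict = field(default_factory=dict)

    def allow(self, ip: str, port: int) -> None:
        self.rules.setdefault(ip, set()).add(port)


@dataclass
class MicroSegmentedFabric:
    """Micro-segmentation model: policy is keyed by VM and shares its lifecycle."""
    vm_ip: dict = field(default_factory=dict)
    vm_policy: dict = field(default_factory=dict)

    def create_vm(self, name: str, ip: str, allowed_ports) -> None:
        self.vm_ip[name] = ip
        self.vm_policy[name] = set(allowed_ports)

    def migrate_vm(self, name: str, new_ip: str) -> None:
        # Only the address changes; the policy stays attached to the VM.
        self.vm_ip[name] = new_ip

    def delete_vm(self, name: str) -> None:
        # The policy dies together with the VM.
        self.vm_ip.pop(name)
        self.vm_policy.pop(name)

    def is_allowed(self, name: str, port: int) -> bool:
        return port in self.vm_policy.get(name, set())


# Traditional ACL: the VM behind 10.0.0.5 is decommissioned, the rule stays.
acl = IpAclTable()
acl.allow("10.0.0.5", 443)
stale_rules = dict(acl.rules)

# Micro-segmentation: the rule moves with the VM and disappears with it.
fabric = MicroSegmentedFabric()
fabric.create_vm("web-1", "10.0.0.5", [443])
fabric.migrate_vm("web-1", "10.0.1.9")
allowed_after_move = fabric.is_allowed("web-1", 443)
fabric.delete_vm("web-1")
allowed_after_delete = fabric.is_allowed("web-1", 443)
print(stale_rules, allowed_after_move, allowed_after_delete)
```

In the first model, cleaning up `stale_rules` is a separate manual task that someone has to remember; in the second there is nothing left to clean up.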

While SDS is host-based, SDN has two camps: host-based and switch-based networking. Cisco ACI (Application Centric Infrastructure) is a switch-based SDN, while VMware NSX and OVN (open-source) are host-based. As with SDS, the implementations differ widely. For example, VMware NSX is hypervisor-based, i.e. the NSX agents integrate with the ESXi control plane. OVN, on the other hand, is not hypervisor-based. It needs host-OS library support and a third-party controller (for example, the OpenStack Neutron server). OVN is a classical SDN; it uses the OVSDB protocol northbound toward controllers and OpenFlow southbound.

The last step is to integrate HCI with IaaS software. Such software (or software suite) manages the HCI end to end (hence the name VIM, Virtualised Infrastructure Manager) and exposes APIs toward the first automation layer. An ETSI VNFM (VNF Manager) interacts with the IaaS software to automate VNF deployment. Cisco and VMware both use vSphere. OpenStack is another IaaS software. Note that KVM (hypervisor), OVN (SDN), Ceph (SDS) and OpenStack (IaaS software) are separate open-source projects that together form the open-source league.

One crucial rationale for doing SDS and SDN inside the host is to leverage the enormous computing power of modern CPUs. However, when it comes to handling tens of gigabits of traffic that needs application-centric treatment, software enhancement and hardware acceleration become essential. Such improvements are happening in NIC and server-bus technology.

A Smart-NIC uses FPGAs and NPUs (Network Processing Units) to offload the CPU. UKB (Universal Kernel Bypass) is a software enhancement (DPDK is a well-known implementation) that lets traffic from the NIC bypass the kernel as well as the hypervisor for fast data processing. In bus technology, NVMe (Non-Volatile Memory Express, a protocol) over the PCIe bus allows an extremely fast local cache tier, and NVMe-oF (NVMe over Fabrics) allows a similarly fast distributed cache tier. The PCIe virtualization technology SR-IOV allows multiple VMs to share the same PCIe device (for example, to access the same PCIe NIC). In summary, HCI has a significant dependency on the host’s hardware capability, which goes against the current wisdom of building a private cloud with COTS hardware.

So, what do you need to build an IaaS? There are two paths to follow. The most comfortable one is an all-vendor solution; one example is a combination of Cisco HX, Cisco ACI, VMware ESXi and the software suite. Another example is an all-VMware solution with VMware-certified hardware, VMware ESXi, NSX, vSAN and software. The second path is the open-source league on COTS hardware, but for most operators, deploying and maintaining IaaS this way is nothing short of rocket science.

Once an operator adopts IaaS (or PaaS), it will eventually end up with multiple IaaS deployments. A service workflow then needs deployment across several clouds; for example, the application may run on one cloud, send telemetry to another, and talk to OSS/BSS on a third.

When a service or application crosses the boundary of a single cloud, we need something called IaaC (Infrastructure-As-A-Code). IaaC is what an orchestrator does. An orchestration platform takes multiple IaaS (or PaaS) deployments and codifies services inside them using the programming languages that each IaaS understands. As a result, orchestration platforms need SDK support (as opposed to API support) from the underlying IaaS platforms.
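A minimal sketch of that idea: a service workflow written as code and pushed to several clouds through per-cloud drivers. Nothing here is a real SDK. `CloudDriver`, `FakeCloud`, `orchestrate` and the cloud names are hypothetical stand-ins for whatever client library each IaaS actually ships.

```python
from typing import Protocol


class CloudDriver(Protocol):
    """What the orchestrator expects from each IaaS SDK (assumed interface)."""

    def deploy(self, component: str) -> str: ...


class FakeCloud:
    """Hypothetical stand-in for an IaaS SDK client; it only records calls."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.deployed = []

    def deploy(self, component: str) -> str:
        self.deployed.append(component)
        return f"{component}@{self.name}"


def orchestrate(workflow, clouds) -> list:
    """Deploy each (component, cloud_name) pair in order, via that cloud's driver."""
    results = []
    for component, cloud_name in workflow:
        if cloud_name not in clouds:
            raise KeyError(f"no driver registered for cloud {cloud_name!r}")
        results.append(clouds[cloud_name].deploy(component))
    return results


# The service is described once, as data, and spans three clouds.
clouds = {name: FakeCloud(name) for name in ("cloud-a", "cloud-b", "cloud-c")}
workflow = [
    ("app", "cloud-a"),
    ("telemetry", "cloud-b"),
    ("oss-bss", "cloud-c"),
]
results = orchestrate(workflow, clouds)
print(results)
```

The workflow itself stays cloud-neutral; only the drivers know how to talk to a particular IaaS, which is why the orchestrator depends on what each underlying platform exposes.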

If IaaS is the first automation layer, also known as task automation, then IaaC is the higher layer, the cloud-automation layer. IaaC depends on the underlying IaaS. IaaS, in turn, depends on the service (to be application-centric). This is why a single vendor offers multiple orchestration platforms. For example, Cisco has three orchestration products (UCSD, NSO and CloudCenter), and the same goes for other vendors.

So, to implement the ETSI vision, an operator needs two flavours of XaaY: for NFV, one or more IaaS (or PaaS); for MANO, one or more IaaC.