21 Dec 2015
SDN for the Cloud
paper, slide
- WAN
- Motivating scenario: network engineering
- Innovation: infer traffic, control routing, centralize control to
meet network-wide goals
- TomoGravity: ACM Sigmetrics 2013 Test of Time Award
- RCP: Usenix NSDI 2015 Test of Time Award
- 4D: ACM Sigcomm 2015 Test of Time Award
- Cloud
- VL2, Sigcomm 2009
- scale-out L3 fabric, SDN and NFV
- per customer virtual networks
- Cloud provided the killer scenario for SDN
- Virtualized Data Center for each customer — prior art fell short
- High scale fault tolerant distributed systems and data management
VL2 → Azure
Consistent cloud design principles for SDN
- Scale-out N-Active Data Plane
- Embrace and Isolate failures
- Centralized control plane: drive network to target state
- Resource managers service requests, while meeting system-wide
objectives
- Controllers drive each component relentlessly to the target state
- Stateless agents plumb the policies dictated by the controllers
- Challenges of Scale
- number of paths, dynamic set of gray failure modes, chasing app
latency
- Azure State Management
- Statesman
- App I: Automatic Failure Mitigation
- App II: Traffic Engineering Towards High Utilization (SWAN)
- Azure Scale Monitoring — Pingmesh
- Azure Cloud Switch — Open Way to Build Switch OS
- Azure Cloud Switch OS to manage the switches as we do servers
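A minimal sketch of the target-state principle listed above (controllers drive each component to the target state, stateless agents plumb the policies). Everything in it is hypothetical and for illustration only: the `Agent` class, the `diff` and `reconcile` functions, and the dictionary state model are assumptions, not Statesman's or Azure's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical stateless agent: it only applies what it is told."""
    name: str
    observed: dict = field(default_factory=dict)

    def apply(self, key, value):
        # Plumb one policy entry; a value of None means "remove the entry".
        if value is None:
            self.observed.pop(key, None)
        else:
            self.observed[key] = value


def diff(target, observed):
    """Return the changes needed to turn `observed` into `target`.

    Assumes target values are never None (None is reserved for removal).
    """
    changes = {k: v for k, v in target.items() if observed.get(k) != v}
    changes.update({k: None for k in observed if k not in target})
    return changes


def reconcile(target, agents, max_rounds=10):
    """Drive every agent to `target`; return the number of rounds used."""
    for round_no in range(1, max_rounds + 1):
        pending = 0
        for agent in agents:
            for key, value in diff(target, agent.observed).items():
                agent.apply(key, value)
                pending += 1
        if pending == 0:
            # A full pass found nothing left to change: converged.
            return round_no
    raise RuntimeError("did not converge")
```

The controller never sends an agent a sequence of commands to replay. It recomputes the difference between observed and target state on every pass, so a restarted or lagging agent is repaired by the next pass without special handling.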
Physical and Virtual networks, NFV
- Azure is the hub of your enterprise, reach to branch offices via VPN
- VNet is the right abstraction, the counterpart of the VM for compute
- Efficient and scalable communication within and across VNets
Hyperscale SDN: All Policy is in the Host

Challenges for Hyperscale SDN Controllers
- scale up to 500k+ Hosts in a region
- scale down to small deployments too
- millions of updates per day
- support frequent updates without downtime
Regional Network Controller Stats
API execution time
- Read : < 50 milliseconds
- Write : < 150 milliseconds
Varying deployment footprint
- Smallest : <10 Hosts
- Largest : >100 Hosts
Azure SLB: Scaling Virtual Network Functions
Key Idea: Decompose Load Balancing into Tiers to achieve
scale-out data plane and centralized control plane
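A rough sketch of what tiered load balancing could look like, assuming two tiers: a router that spreads flows over a pool of muxes, and muxes that map a virtual IP (VIP) to a backend address (DIP) using a table pushed by a central controller. The function names, the 5-tuple layout, and the modulo hashing are all assumptions made for illustration; this is not the Azure SLB implementation.

```python
import hashlib


def flow_hash(flow, salt):
    """Stable hash of a 5-tuple; `salt` keeps the two tiers' choices independent."""
    data = repr((salt, flow)).encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def pick_mux(flow, muxes):
    """Tier 1 (assumed): the router spreads flows over the muxes."""
    return muxes[flow_hash(flow, "ecmp") % len(muxes)]


def pick_dip(flow, vip_map):
    """Tier 2 (assumed): a mux maps the flow's VIP to one backend (DIP).

    `vip_map` stands in for the table a central controller would push
    to every mux, so every mux makes the same choice for the same flow.
    """
    dst_vip = flow[2]  # flow = (src_ip, src_port, dst_vip, dst_port, proto)
    dips = vip_map[dst_vip]
    return dips[flow_hash(flow, "mux") % len(dips)]


def route(flow, muxes, vip_map):
    """Return the (mux, DIP) pair that handles this flow."""
    return pick_mux(flow, muxes), pick_dip(flow, vip_map)
```

Because the DIP choice depends only on the flow and the controller-supplied table, the mux pool can grow or shrink without changing which backend a flow reaches. That is the scale-out property the key idea points at, under the assumptions stated here.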
- Data Center-Scale Distributed Router
- extreme scale: 10k customers, 10k routes each → 100M routes
Integration of enterprise & cloud, physical & virtual
- Virtual Filtering Platform (VFP)
- acts as a virtual switch inside Hyper-V VMSwitch
- Flow Tables
- a typed Match-Action-Table API to the controller
- the Right Abstraction for the Host
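To make the flow-table idea above concrete, here is a small sketch of a layered match-action pipeline. It is not the VFP API: `Rule`, `FlowTable`, `process_layers`, the dict-based packet, and the "lower priority number wins" convention are all hypothetical choices for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Packet = dict  # hypothetical packet model: header field -> value


@dataclass
class Rule:
    priority: int   # lower number wins (assumption)
    match: dict     # field -> required value; absent fields are wildcards
    action: Callable[[Packet], Optional[Packet]]  # returns a packet, or None to drop


class FlowTable:
    def __init__(self, name):
        self.name = name
        self.rules = []

    def add_rule(self, rule):
        self.rules.append(rule)
        self.rules.sort(key=lambda r: r.priority)

    def process(self, packet):
        # Apply the action of the first (highest-priority) matching rule.
        for rule in self.rules:
            if all(packet.get(k) == v for k, v in rule.match.items()):
                return rule.action(packet)
        return packet  # no match: pass through unchanged (assumption)


def drop(packet):
    return None


def rewrite(**fields):
    """Build an action that returns a copy of the packet with fields replaced."""
    def action(packet):
        return {**packet, **fields}
    return action


def process_layers(tables, packet):
    """Run the packet through each table in order; stop if a table drops it."""
    for table in tables:
        packet = table.process(packet)
        if packet is None:
            return None
    return packet
```

A controller in this model would program an ACL table and a NAT table independently, for example `acl.add_rule(Rule(10, {"dst_port": 22}, drop))`, and the host applies them in sequence to each packet.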
http://research.microsoft.com/en-us/um/cambridge/events/sigcomm2015/papers.html
Keynote: SDN for the Cloud
ABSTRACT Cloud computing is a new paradigm, which touches everything
from enterprise data centers, to wide area networks, to massive scale
data centers, with millions of servers. Coordinating such large
systems requires a new way to control the network — setting a
target state for each component and driving each to the goal
state, so that the network meets global objectives, in a sea
of growth and change: customers creating and adapting virtual networks
at amazing pace, while new features are rolled out, new data
centers are built out and older ones decommissioned. Software Defined
Networking (SDN) is a key enabling technology, built on principles of
delivering direct control and network-wide views. At Microsoft Azure,
we pioneered these principles, and built them into the fabric of the
Cloud.
cloud optimized, low cost, high performance, super reliable and
automated
- products
- Virtual Layer-2 (VL2), Virtual Networks (VNets), Load Balancing
(Ananta), Data Center TCP (DCTCP)