29 Sep 2015

Leveraging SDN Layering to Systematically Troubleshoot Networks

HotSDN’13 pdf

Today’s networks are maintained by “masters of complexity”: … high-level intent (policies) must correctly map to low-level forwarding behavior (hardware configuration). … The paper shows how recently developed troubleshooting tools fit into a coherent workflow that detects mistranslations between layers to precisely localize sources of errant control logic.

According to a recent survey of network operators, 35% of networks generate more than 100 tickets per month. … network troubleshooting today is still a largely ad hoc process.

The limitations of our current ad hoc troubleshooting tools are a direct consequence of the architecture in which these tools must operate.

The paper does not propose any new systems; instead, … it builds an overall picture of systematic troubleshooting in SDNs.

2 The SDN Stack

[Onix][15], [NOX][9]

The physical view has a one-to-one correspondence with the physical network, and it is the job of the network operating system to configure the corresponding network devices through a protocol such as OpenFlow.

4 tools for finding the code layer

Check whether the abstract configuration of each network device, e.g., a single match-action table, matches the low-level, hardware-specific configuration, e.g., registers that configure a forwarding pipeline. … [SOFT][16] tool
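To make the idea of comparing a device's abstract configuration with its actual behavior concrete, here is a minimal sketch of a differential check. It is not how SOFT works; it simply probes both sides with the same packets and reports disagreements. The `Rule` type, `abstract_lookup`, `find_mismatches`, and the `buggy_hardware` stand-in are all hypothetical names invented for this illustration, and the "hardware" is just a Python function with a deliberate priority bug.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    priority: int   # higher value wins
    match: dict     # header field -> required value; absent field = wildcard
    action: str


def rule_matches(rule, packet):
    """True if every field the rule constrains has the required value."""
    return all(packet.get(f) == v for f, v in rule.match.items())


def abstract_lookup(table, packet):
    """Action of the highest-priority matching rule, or None if no rule matches."""
    best = None
    for rule in table:
        if rule_matches(rule, packet):
            if best is None or rule.priority > best.priority:
                best = rule
    return best.action if best else None


def find_mismatches(table, hardware_lookup, probes):
    """Return (packet, expected, actual) for every probe where the two layers disagree."""
    mismatches = []
    for packet in probes:
        expected = abstract_lookup(table, packet)
        actual = hardware_lookup(packet)
        if expected != actual:
            mismatches.append((packet, expected, actual))
    return mismatches


# Abstract match-action table: a catch-all drop plus one specific forward rule.
table = [
    Rule(priority=1, match={}, action="drop"),
    Rule(priority=10, match={"dst": "10.0.0.1"}, action="port1"),
]


def buggy_hardware(packet):
    """Stand-in for the hardware: takes the first match and ignores priority."""
    for rule in table:
        if rule_matches(rule, packet):
            return rule.action
    return None


probes = [{"dst": "10.0.0.1"}, {"dst": "10.0.0.2"}]
mismatches = find_mismatches(table, buggy_hardware, probes)
```

A non-empty `mismatches` list points at the translation between device state and hardware, which is the kind of localization the workflow is after. Probing only finds bugs the probes happen to exercise, so this is weaker than an exhaustive or symbolic comparison.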

6

knowing the operator’s intent:

Intent is implicitly expressed as the combination of all protocol configurations over all nodes. To infer intent, one must gather … and understand the composition of each protocol config (the precedence rules between protocols operating at the same data-plane layer).
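As a rough illustration of what "composition with precedence rules" means on a single node, the sketch below merges routes from several protocols into one forwarding table. It assumes a simple model: for the same prefix, the protocol with the lowest precedence value wins, and lookups use longest-prefix match. The `PRECEDENCE` values, `compose`, and `lookup` are assumptions made for this example, not something the paper specifies.

```python
import ipaddress

# Hypothetical precedence values for illustration; a lower value wins.
PRECEDENCE = {"static": 1, "bgp": 20, "ospf": 110}


def compose(routes):
    """Merge (protocol, prefix, next_hop) routes into prefix -> (protocol, next_hop).

    When two protocols offer the same prefix, keep the one with lower precedence value.
    """
    fib = {}
    for proto, prefix, next_hop in routes:
        net = ipaddress.ip_network(prefix)
        current = fib.get(net)
        if current is None or PRECEDENCE[proto] < PRECEDENCE[current[0]]:
            fib[net] = (proto, next_hop)
    return fib


def lookup(fib, dst):
    """Longest-prefix match over the composed table; None if nothing covers dst."""
    addr = ipaddress.ip_address(dst)
    candidates = [net for net in fib if addr in net]
    if not candidates:
        return None
    best = max(candidates, key=lambda net: net.prefixlen)
    return fib[best]


routes = [
    ("ospf", "10.0.0.0/8", "r1"),
    ("static", "10.0.0.0/8", "r2"),   # same prefix as above; static takes precedence
    ("ospf", "10.1.0.0/16", "r3"),    # more specific prefix; wins by prefix length
]
fib = compose(routes)
```

Here `lookup(fib, "10.2.0.1")` resolves through the static route and `lookup(fib, "10.1.2.3")` through the more specific OSPF route. Inferring network-wide intent would require repeating this over every node and protocol, which is why the notes call intent implicit.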

checking network behavior against intent

7 unanswered questions

improve invariant checkers?

references

27 H. Zeng, P. Kazemian, G. Varghese, and N. McKeown. Automatic Test Packet Generation. In CoNEXT, 2012.

15 T. Koponen, M. Casado, N. Gude, J. Stribling, L. Poutievski, M. Zhu, R. Ramanathan, Y. Iwata, H. Inoue, T. Hama, and S. Shenker. Onix: A Distributed Control Platform for Large-scale Production Networks. In OSDI, 2010.

9 N. Gude, T. Koponen, J. Pettit, B. Pfaff, M. Casado, N. McKeown, and S. Shenker. NOX: Towards an Operating System for Networks. CCR, 38, 2008.

16 M. Kuzniar, P. Peresini, M. Canini, D. Venzano, and D. Kostic. A SOFT Way for OpenFlow Switch Interoperability Testing. In CoNEXT, 2012.