19 Oct 2015

Discovering Dependencies for Network Management

http://research.microsoft.com/en-us/groups/nrg/hotnets06.pdf

The paper introduces the Leslie Graph, a simple yet powerful abstraction describing the complex dependencies between network, host, and application components in modern networked systems.

Lamport [9]: “A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable”

The quote points at a gap: the absence of tools to identify the components that “can render your own computer unusable”. The implicit web of dependencies among these components exists only in the minds of the human experts running them. The complexity of these dependencies quickly adds up, requiring more help than traditional IT management software provides.

Leslie Graph
The Leslie Graph is the graph representing the dependencies between the system components, with subgraphs representing the dependencies pertaining to a particular application or activity. Nodes represent the computers, routers, and services on which user activities rely, and directed edges capture their inter-dependencies.
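The definition above maps naturally onto a directed graph with a reachability query. The following is a minimal sketch, not code from the paper: the class name `LeslieGraph`, its methods, and all component names are hypothetical, and the edge direction (an edge from u to v means "u depends on v") is an assumption made for illustration.

```python
from collections import defaultdict, deque


class LeslieGraph:
    """Directed dependency graph: an edge u -> v means u depends on v (assumed)."""

    def __init__(self):
        self.edges = defaultdict(set)

    def add_dependency(self, component, depends_on):
        self.edges[component].add(depends_on)
        self.edges[depends_on]  # register the target as a node as well

    def subgraph(self, root):
        """Return the edges reachable from root: the subgraph for one activity."""
        seen, queue, result = {root}, deque([root]), {}
        while queue:
            node = queue.popleft()
            result[node] = set(self.edges.get(node, ()))
            for dep in result[node]:
                if dep not in seen:
                    seen.add(dep)
                    queue.append(dep)
        return result


# Hypothetical components, for illustration only.
graph = LeslieGraph()
graph.add_dependency("client:browser", "dns-server")
graph.add_dependency("client:browser", "web-proxy")
graph.add_dependency("web-proxy", "web-server")
graph.add_dependency("web-server", "sql-backend")
graph.add_dependency("mail-client", "mail-server")

# Subgraph pertaining to the web-browsing activity only.
web_activity = graph.subgraph("client:browser")
```

Here `web_activity` contains the browser, DNS server, proxy, web server, and SQL backend, but not the mail components, which belong to a different activity.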

existing approaches

Hand-written dependency rules: the system could evolve faster than the rules, and deployment of various forms of middlebox (e.g., firewalls, proxies) can change the application’s dependencies without the rule writers even being aware.

Instrumented middleware: expose dependencies by requiring all applications to run on a middleware platform instrumented to track dependencies at run-time [1, 4, 7]. …

Heterogeneity defeats most such efforts in practice: while a single vendor might instrument their software, it is unlikely that all vendors will do so in a common fashion.

2 Leslie Graphs and Their Uses

Studies show that ~70% of enterprise IT budgets are spent on maintenance.

Leslie … enabling the following techniques for management and troubleshooting
Fault localization
Reconfiguration planning
Help desk optimization
Anomaly detection
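As an illustration of the first item, fault localization, here is one simple way a dependency graph could narrow down a culprit: keep the components that every failing activity depends on and that no working activity depends on. This is a hedged sketch of set intersection over dependency closures, not the paper's algorithm. The helper names `closure` and `suspects` and all component names are hypothetical, and the sketch assumes at least one failing activity.

```python
def closure(graph, root):
    """All components that root transitively depends on, including itself."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen


def suspects(graph, failing, working):
    """Components shared by every failing activity and used by no working one.

    Assumes `failing` is non-empty.
    """
    shared = set.intersection(*(closure(graph, f) for f in failing))
    healthy = set().union(*(closure(graph, w) for w in working))
    return shared - healthy


# Hypothetical dependency graph: each key depends on the listed components.
graph = {
    "alice:web": ["dns", "proxy"],
    "bob:web": ["dns", "proxy"],
    "carol:mail": ["dns", "mail-server"],
    "proxy": ["web-server"],
}

candidates = suspects(graph, failing=["alice:web", "bob:web"], working=["carol:mail"])
```

With these inputs, `dns` is cleared because the working mail activity also depends on it, leaving `proxy` and `web-server` as candidates.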

3 Implementation

The implementation approximates the Leslie Graph using low-level packet correlations.

3.1 Constellation

Local traffic correlations are inferred by passively monitoring packets and applying machine learning techniques.

The basic premise is that a typical pattern of messages is associated with accomplishing a given task.
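The premise can be illustrated with plain counting: if an outgoing message to some peer repeatedly follows an incoming message from another peer within a short time, the two are probably part of the same task. The sketch below is a simple co-occurrence counter standing in for the machine learning techniques the paper mentions; it is not Constellation's actual method. The function `correlate`, the packet tuple layout, the 1-second window, and the host names are all assumptions.

```python
from collections import Counter


def correlate(packets, window=1.0):
    """Count (incoming peer, outgoing peer) pairs seen within `window` seconds.

    packets: iterable of (timestamp, direction, peer) observed at one host,
    with direction either "in" or "out".
    """
    packets = sorted(packets)
    counts = Counter()
    for i, (t_in, direction, src) in enumerate(packets):
        if direction != "in":
            continue
        # Scan forward for outgoing messages that closely follow this input.
        for t_out, d, dst in packets[i + 1:]:
            if t_out - t_in > window:
                break
            if d == "out":
                counts[(src, dst)] += 1
    return counts


# Hypothetical trace captured at a web server.
trace = [
    (0.00, "in", "client-a"), (0.05, "out", "sql-backend"),
    (2.00, "in", "client-b"), (2.04, "out", "sql-backend"),
    (5.00, "in", "client-a"), (5.03, "out", "sql-backend"),
    (9.00, "out", "ntp-server"),
]
counts = correlate(trace)
```

In this trace every client request is followed by a message to `sql-backend`, so those pairs accumulate counts, while the isolated `ntp-server` message follows no input within the window and is never paired.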

references

[1] A. Brown, G. Kar, and A. Keller. An active approach to characterizing dynamic dependencies for problem determination in a distributed environment. In IFIP/IEEE IM, May 2001.

[3] W. Aiello, C. Kalmanek, P. McDaniel, S. Sen, O. Spatscheck, and J. V. der Merwe. Analysis of communities of interest in data networks. In PAM’05, Mar. 2005.

[7] M. Y. Chen, A. Accardi, E. Kıcıman, J. Lloyd, D. Patterson, A. Fox, and E. Brewer. Path-based failure and evolution management. In NSDI’04, pages 309–322, Mar. 2004.

[9] L. Lamport. Quarterly quote. ACM SIGACT News, 34, Mar. 2003.