A Probabilistic Approach to Distributed System Management

Author/Creator ORCID

Date

2010-06-15

Department

Program

Citation of Original Publication

Randy Schauer and Anupam Joshi, A Probabilistic Approach to Distributed System Management, Proceedings of the Seventh International Conference on Autonomic Computing, 2010, DOI -10.1145/1809049.1809063

Rights

This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.

Abstract

Large-scale distributed systems are playing an increasing role in computational research, production operations, information processing, and application hosting. The continuous management of such systems is a critical consideration when focusing on reliability, availability, and security. As the number of commodity components within these systems continue to grow, it becomes increasingly difficult to track the multitude of parameters required to ensure optimal performance from the system, especially in those systems that have been built through expansion and not as an initial purchase of identical nodes. In this paper, we discuss the use of statistical inference, specifically Markov Logic Networks, in a distributed multi-agent system to provide the most effective means of managing these parameters. We showcase an architecture that provides services to manage a system’s configuration throughout its life-cycle, and is capable of resolving differences after identifying potential mis-configurations using conflict discovery and resolution modules.