A Generic Model for Fault Isolation in Integrated Management Systems
Stefan Kaetker
IBM European Networking Center
Systems & Network Mgmt Dept.
Vangerowstr. 18
D-69115 Heidelberg
Germany
Email: kaetker_AT_de.ibm.com
Kurt Geihs
University of Frankfurt
Computer Science Dept.
P.O. Box 111932
D-60054 Frankfurt
Germany
Email: geihs_AT_informatik.uni-frankfurt.de
Abstract
Distributed systems in enterprise and telecommunication environments
call for more automated fault management.
A single fault in these complex systems can
cause a large number of symptomatic error messages and side effects.
The root faults common to these symptoms have to be identified so that
fault removal procedures can start as soon as possible and system
downtime is reduced.
This paper presents a methodology for fault isolation in integrated management
systems. A generic model is described that unifies the view of the
management system on the managed environment. It integrates the relevant
aspects of network, system, and service management layers in order to
perform integrated fault isolation. Our approach is based on a general
dependency graph model. It captures the information required to
determine both the root cause of a fault and the set of services and
customers affected by that fault.
The layered TMN architecture serves as an example for an integrated
management environment throughout this paper.
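
The two uses of a dependency graph named above (finding a root cause and
finding the affected services and customers) can be illustrated with a
minimal sketch. It is not the model defined in the paper: the class
DependencyGraph, its method names, and the example objects (router_1,
server_1, mail_service, and so on) are hypothetical, and the sketch assumes
that a root cause candidate is any object on which every symptomatic
object depends, directly or transitively.

```python
from collections import defaultdict


class DependencyGraph:
    """Hypothetical directed graph: an edge A -> B means 'A depends on B'."""

    def __init__(self):
        self.depends_on = defaultdict(set)   # object -> objects it depends on
        self.dependents = defaultdict(set)   # object -> objects depending on it

    def add_dependency(self, obj, dependency):
        self.depends_on[obj].add(dependency)
        self.dependents[dependency].add(obj)

    @staticmethod
    def _reachable(start, edges):
        # Iterative depth-first search; the result includes the start node.
        seen = {start}
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt in edges.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def root_cause_candidates(self, symptoms):
        # Assumption: a candidate is an object that all symptomatic
        # objects depend on (or a symptomatic object itself).
        closures = [self._reachable(s, self.depends_on) for s in symptoms]
        return set.intersection(*closures) if closures else set()

    def affected_by(self, fault):
        # Everything that transitively depends on the faulty object.
        return self._reachable(fault, self.dependents) - {fault}


# Hypothetical managed environment spanning network, system, and service layers.
graph = DependencyGraph()
graph.add_dependency("customer_A", "mail_service")
graph.add_dependency("customer_B", "web_service")
graph.add_dependency("mail_service", "server_1")
graph.add_dependency("web_service", "server_1")
graph.add_dependency("web_service", "database")
graph.add_dependency("server_1", "router_1")

candidates = graph.root_cause_candidates(["mail_service", "web_service"])
impact = graph.affected_by("server_1")
print(sorted(candidates))
print(sorted(impact))
```

In this sketch both symptomatic services share server_1 and router_1 as
dependencies, so those two objects are the candidates; a real system would
need further information (for example, status tests) to choose between them.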
Keywords: Fault Isolation, Event Correlation, Fault Management, Service
Management, TMN
JNSM: Vol. 5, No. 2, 1997
NOTE: only abstract of paper available on-line