Academic Paper

Hierarchical design and analysis of fault-tolerant multiprocessor systems using concurrent error detection
Document Type
Conference
Source
[1990] Digest of Papers. Fault-Tolerant Computing: 20th International Symposium (FTCS-20), pp. 130-137, 1990
Subject
Computing and Processing
Fault tolerant systems
Multiprocessing systems
Fault detection
Computer errors
Concurrent computing
Contracts
Algorithm design and analysis
Reliability engineering
Design engineering
Fault diagnosis
Abstract
A composition technique for building large fault-tolerant systems hierarchically, using checks at different levels of the hierarchy, is described. A small system of known fault detectability and locatability is replicated several times, and new checks are added at the next higher level. Such checks at different levels can be introduced into most existing multiprocessor systems. An analysis technique based on a matrix model is developed. Relationships between the fault detectability and locatability of a basic system are derived, and the corresponding values of the complete system are computed hierarchically. Finally, the techniques are extended to complex systems in which individual processors produce multiple sets of data elements.