This paper considers software systems consisting of fault-tolerant components. Each component is built from functionally equivalent but independently developed modules (versions) that differ in reliability and execution time. The components are designed using either the N-version programming method or the recovery block scheme. Our general model also allows the number of versions that can run simultaneously to be limited by hardware or computation-time constraints. An analytical algorithm and a numerical procedure for evaluating system execution-time distributions are presented. The algorithm accounts for the positive correlation among failures in different versions by introducing common-cause failures (mutually exclusive and independent common causes are considered). Illustrative examples are presented.
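To make the notion of an execution-time distribution concrete, the following is a minimal sketch for a single recovery block component. It is not the algorithm presented in the paper: it assumes versions run strictly one at a time in a fixed order, fail independently (no common-cause failures), have deterministic execution times, and are checked by a perfect acceptance test with zero overhead. The function name `recovery_block_time_distribution` and the `(reliability, execution_time)` input format are hypothetical, chosen only for illustration.

```python
def recovery_block_time_distribution(versions):
    """Execution-time distribution of one recovery block (illustrative only).

    versions: list of (reliability, execution_time) pairs, tried in order.
    Assumptions (not from the paper): sequential execution, independent
    failures, deterministic times, perfect acceptance test.

    Returns (pmf, p_fail), where pmf maps a completion time to the
    probability that the component succeeds exactly at that time, and
    p_fail is the probability that every version fails.
    """
    pmf = {}
    p_all_failed = 1.0  # probability that all versions tried so far failed
    elapsed = 0.0       # cumulative time spent when the current version ends
    for reliability, exec_time in versions:
        elapsed += exec_time
        # Success at this point requires all earlier versions to have failed.
        pmf[elapsed] = pmf.get(elapsed, 0.0) + p_all_failed * reliability
        p_all_failed *= 1.0 - reliability
    return pmf, p_all_failed


# Example: two versions with reliabilities 0.9 and 0.8,
# taking 2 and 3 time units respectively.
pmf, p_fail = recovery_block_time_distribution([(0.9, 2.0), (0.8, 3.0)])
```

Under these assumptions the component finishes at time 2 with probability 0.9, at time 5 with probability 0.08, and fails with probability 0.02 (up to floating-point rounding). Modelling common-cause failures or simultaneous execution of several versions, as the paper does, would require conditioning on the common cause and grouping versions into parallel batches, which this sketch deliberately leaves out.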
Acknowledgments
Contributed by the Reliability Engineering Department