Distributed design for active fault diagnosis

Abstract

The paper deals with active fault diagnosis of stochastic large-scale systems consisting of several subsystems with separate inputs and observations, which are coupled through the system state. The subsystems are described by multiple models expressing their fault-free and faulty behaviour. The transition between the models is governed by a Markov chain. The paper proposes a distributed design of an active fault diagnosis algorithm, which takes into account the coupling among the subsystems in all stages of the algorithm. This results in a higher quality of the excitation signal and consequently in better decisions. The numerical example shows the improved performance of the proposed algorithm in comparison with the algorithms based on the decentralised design.

1. Introduction

The complexity and degree of integration of large-scale systems (LSSs) increase their susceptibility to faults with possibly catastrophic consequences. Faults therefore have to be detected reliably and as quickly as possible by a fault diagnosis (FD) system. The literature recognises two fundamental approaches that differ in the interaction with the monitored system. In the passive approach (Blanke et al., 2016; Gustafsson, 2009; Isermann, 2011; Katipamula & Brambley, 2011; Yao et al., 2019), the decisions generated by an FD system are based on passive observations of the measurable quantities of the monitored system. In the active approach, the FD system generates, besides the decisions, an input signal that excites the monitored system (Ashari et al., 2012; Niemann & Poulsen, 2014; Punčochář et al., 2015; Raimondo et al., 2016). The purpose of the excitation is to obtain more information, which helps to detect faults that may pose a challenge for passive FD. The active FD (AFD) approach has gained in popularity in the last decade (Campbell & Nikoukhah, 2004; Heirung & Mesbah, 2019; Niemann, 2006; Paulson et al., 2017; Scott et al., 2014; Stoustrup & Niemann, 2010). Within the AFD for stochastic systems, the multiple-model framework is used almost exclusively to describe fault-free and faulty models of the system (Blackmore et al., 2008; Škach et al., 2016).

Limited communication bandwidth and available computational power are the two main reasons for developing special FD algorithms for LSSs (Ferrari et al., 2012; Raimondo et al., 2016). In Punčochář and Straka (2019) and Straka and Punčochář (2019), a new AFD framework for stochastic LSSs was introduced involving three architectures – centralised, decentralised, and distributed. In the centralised architecture, all calculations are performed by a single central node, while in the decentralised architecture, the calculations are performed by multiple isolated nodes, each tied to a single LSS subsystem. The distributed architecture is similar to the decentralised one, except that the nodes additionally communicate with each other. The AFD algorithms consist of (i) the off-line stage, dealing with the design of the excitation input and decision generators, and (ii) the on-line stage, dealing with the state estimation and utilisation of the designed generators. The LSS consists of several subsystems which are subject to certain dynamic interactions called coupling. For reasons of computational tractability, the input generator design cannot fully respect the coupling among the LSS subsystems, even for small-scale systems¹. To achieve reasonable computational costs of the AFD algorithm, the input generator design introduced in Punčochář and Straka (2019) and Straka and Punčochář (2019) rested on the decentralised architecture, which completely ignores the coupling. This, however, leads to a lower quality of the excitation and consequently to a lower quality of the FD.

This paper makes the following contribution: a novel distributed AFD algorithm is designed that takes into account the coupling among the LSS subsystems in both stages. In the AFD algorithm proposed by Straka and Punčochář (2019), the AFD node related to a subsystem uses the information received from other nodes in the on-line stage only. The novelty of the AFD algorithm proposed here lies in the off-line stage, where the input signal generator is designed. In the proposed distributed AFD algorithm, the input signal generator related to a subsystem uses the information about the state of other subsystems to take the subsystem coupling into account. The contribution of the paper lies in an aggregation of the effects of other subsystems (i.e. a compression of the information) that keeps the computational costs feasible. Note that using the information in the uncompressed form would lead to the extreme computational costs of the centralised architecture. The additional information available to the generator improves the quality of the excitation input and consequently the quality of the FD.

The paper is structured as follows: Section 2 provides the LSS specification, decomposition, and the AFD problem formulation. A general solution to the AFD problem is briefly summarised in Section 3. The distributed design for the AFD is proposed in Section 4. The performance of the proposed algorithm is illustrated using two numerical examples in Section 5 and Section 6 draws concluding remarks.

2. AFD problem formulation

2.1. LSS specification

Consider an LSS Σ described at time instant $k \in T = \{0, 1, 2, \dots\}$ by the following state-space model

$$Σ : x_{k + 1} = f (x_{k}, μ_{k}, u_{k}) + F (μ_{k}) w_{k}, \tag{1a}$$

$$y_{k} = h (x_{k}, μ_{k}) + H (μ_{k}) v_{k}, \tag{1b}$$

where $x_{k} \in X \subseteq R^{D_{x}}$ and $μ_{k} \in M$ are the continuous and discrete parts of the LSS state $s_{k} = [x_{k}^{T}, μ_{k}^{T}]^{T} \in S = X \times M$, $u_{k} \in U \subseteq R^{D_{u}}$ is the input, $w_{k} \in X$ is the state noise described by the known probability density function (PDF) $p_{w_{k}}$, $y_{k} \in Y \subseteq R^{D_{y}}$ is the output, and $v_{k} \in Y$ is the measurement noise described by the known PDF $p_{v_{k}}$. The functions $f : X \times M \times U \mapsto X$, $h : X \times M \mapsto Y$, $F : M \mapsto X \times X$, and $H : M \mapsto Y \times Y$ are known². Each element $μ_{k}$ of the discrete set $M$ represents a multi-index into a set of M possible models describing the behaviour of the LSS Σ in the fault-free and faulty conditions during a sampling period. The random process $\{μ_{k}\}$ is assumed to be Markov with known transition probability

$$P r (μ_{k + 1} | μ_{k}) . \tag{2}$$

The state and measurement noises are white, mutually independent and independent of the initial condition $s_{0}$ described by $p_{s_{0}}$ so that³

$$p (w_{0 : F}, v_{0 : F}, s_{0}) = p_{s_{0}} (s_{0}) \prod_{k = 0}^{F} p_{w_{k}} (w_{k}) p_{v_{k}} (v_{k}) \tag{3}$$

for any $F \in T$. Both the continuous part $x_{k}$ of the state $s_{k}$ and the discrete part $μ_{k}$ are unknown and can be inferred only indirectly through the available $y_{k}$ and $u_{k}$.

2.2. LSS decomposition

Since the AFD for the LSS Σ with a centralised architecture is not computationally tractable⁴, the decentralised or distributed architectures based on a decomposition of Σ were considered in Punčochář and Straka (2019) and Straka and Punčochář (2019). The LSS Σ (see Figure 1) consists of N subsystems⁵ $Σ^{n}, n \in N = \{1, 2, \dots, N\}$ that are coupled through the state⁶ $x_{k}$. Each subsystem $Σ^{n}$ has its own inputs $u_{k}^{n}$, outputs $y_{k}^{n}$, and a set of possible fault-free and faulty models $M^{n}$. It can be described by the following representation

$$Σ^{n} : x_{k + 1}^{n} = f^{n} (x_{k}, μ_{k}^{n}, u_{k}^{n}) + F^{n} (μ_{k}^{n}) w_{k}^{n}, \tag{4a}$$

$$P r (μ_{k + 1}^{n} | μ_{k}^{n}), \tag{4b}$$

$$y_{k}^{n} = h^{n} (x_{k}^{n}, μ_{k}^{n}) + H^{n} (μ_{k}^{n}) v_{k}^{n}, \tag{4c}$$

where $x_{k}^{n} \in X^{n} \subseteq R^{D_{x}^{n}}$ and $μ_{k}^{n} \in M^{n} = \{1, 2, \dots, M^{n}\}$ are the continuous and discrete parts, respectively, of the local state⁷ $s_{k}^{n} = [(x_{k}^{n})^{T}, μ_{k}^{n}]^{T} \in S^{n} = X^{n} \times M^{n}$ of the subsystem $Σ^{n}$, $u_{k}^{n} \in U^{n} \subseteq R^{D_{u}^{n}}$ is the local input, $w_{k}^{n} \in X^{n}$ is the local state noise described by $p_{w_{k}^{n}}$, $y_{k}^{n} \in Y^{n} \subseteq R^{D_{y}^{n}}$ is the local output, and $v_{k}^{n} \in Y^{n}$ is the local measurement noise described by $p_{v_{k}^{n}}$. The functions $f^{n} : X \times M^{n} \times U^{n} \mapsto X^{n}$, $h^{n} : X^{n} \times M^{n} \mapsto Y^{n}$, $F^{n} : M^{n} \mapsto X^{n} \times X^{n}$, and $H^{n} : M^{n} \mapsto Y^{n} \times Y^{n}$ are known.
The discrete part $μ_{k}^{n}$ represents an index into the set $M^{n}$, which includes one model representing the behaviour of subsystem $Σ^{n}$ in the fault-free condition, $μ_{k}^{n} = 1$, and $M^{n} - 1$ models that represent the behaviour of the subsystem in faulty conditions, $μ_{k}^{n} \in \{2, \dots, M^{n}\}$. The LSS Σ given by (1a) and (2) is assumed to satisfy the following independence conditions:

(IC-1)	The initial states $x_{0}^{n}$ and the initial model indices $μ_{0}^{n}$ are independent of each other and mutually independent across the subsystems⁸, i.e. $p (x_{0}, μ_{0}) = \prod_{n = 1}^{N} p_{x_{0}^{n}} (x_{0}^{n}) P r (μ_{0}^{n})$.
(IC-2)	The model indices $μ_{k}^{n}$ are conditionally independent, i.e. $P r (μ_{k + 1} | μ_{k}) = \prod_{n = 1}^{N} P r (μ_{k + 1}^{n} | μ_{k}^{n})$.
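
The product form in IC-2 can be illustrated numerically. The sketch below is not part of the original algorithm: it assumes two subsystems with two models each and made-up transition probabilities, and it orders the joint index $μ_{k}$ lexicographically. Under these assumptions, the joint transition matrix is the Kronecker product of the local transition matrices.

```python
from functools import reduce

import numpy as np


def joint_transition_matrix(local_matrices):
    """Joint transition matrix of mu_k under IC-2.

    Each local matrix has entries [i, j] = Pr(mu_{k+1}^n = j | mu_k^n = i).
    The joint index is ordered lexicographically (first subsystem slowest).
    """
    return reduce(np.kron, local_matrices)


# Hypothetical local transition matrices (index 0: fault-free, 1: faulty).
P1 = np.array([[0.98, 0.02],
               [0.05, 0.95]])
P2 = np.array([[0.90, 0.10],
               [0.20, 0.80]])

P = joint_transition_matrix([P1, P2])
# P[0, 3] is the probability that both subsystems become faulty at once,
# which factorises into the product of the two local probabilities.
```

The joint matrix has $2^{N}$ rows, which already hints at why the centralised treatment of the model index becomes expensive as N grows.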

Figure 1. The decomposition of an LSS into interconnected subsystems.

The independence conditions IC-1 and IC-2 mean that the occurrence of a fault in a subsystem does not influence the occurrence of faults in other subsystems. The condition IC-2 is considered for convenience only, to keep the exposition clear. Note that the AFD problem with coupled faults was treated in Straka and Punčochář (2020b) for conditionally dependent faults and in Straka and Punčochář (2020c) for dependent faults. Relaxation of IC-2 would lead to the introduction of a central node in the on-line stage, whereas the off-line stage would not be affected.

In this paper, the subsystems are coupled only through the continuous state $x_{k}$ that appears in the dynamics (4a) of all subsystems, i.e. besides $x_{k}^{n}$, the local continuous state $x_{k + 1}^{n}$ is affected by the local continuous states of other subsystems $x_{k}^{\bar{n}} = [(x_{k}^{1})^{T}, \dots, (x_{k}^{n - 1})^{T}, (x_{k}^{n + 1})^{T}, \dots, (x_{k}^{N})^{T}]^{T}$. The AFD algorithm proposed in Punčochář and Straka (2019) approximated the dynamics (4a) by neglecting the coupling through $x_{k}$ in both stages, and the AFD algorithm proposed in Straka and Punčochář (2019) neglected the coupling through $x_{k}$ in the off-line stage, where the input signal generator is designed. Such an approximation is acceptable if the coupling is weak, in which case the approximation has little effect on the FD quality. However, when the coupling is stronger⁹, it should be taken into account by both stages of the AFD design, at least to a certain extent. Hence, the paper focuses on the distributed design that considers the coupling.

2.3. AFD problem

For convenience, the AFD problem and its general solution are described for the centralised architecture only. The AFD strives to design a function that transforms the complete available information observed up to the time step k to a decision $d_{k}$ about the faults (subsystem models) and to an input signal $u_{k}$, whose role is to excite the system to improve the detection quality. The AFD system can be described at any time instant as

$$Δ : \begin{bmatrix} d_{k} \\ u_{k} \end{bmatrix} = \begin{bmatrix} σ_{k} (I_{k}) \\ γ_{k} (I_{k}) \end{bmatrix}, \tag{5}$$

where $I_{k} = [y_{0 : k}^{T}, u_{0 : k - 1}^{T}]^{T} \in I_{0 : k} = Y^{k + 1} \times U^{k}$. The vector $d_{k} = [d_{k}^{1}, d_{k}^{2}, \dots, d_{k}^{N}]^{T} \in M$ consists of the decisions $d_{k}^{n} \in M^{n}$ about the model indices $μ_{k}^{n}$, $σ_{k} : I_{0 : k} \mapsto M$ represents the AFD node decision generator at the time step k, $u_{k}$ is the excitation input and $γ_{k} : I_{0 : k} \mapsto U$ is a function describing the input signal generator. Note that by providing the decision $d_{k}$, the AFD performs the fault detection and the fault identification simultaneously.

The optimal AFD system should minimise the following additive discounted criterion¹⁰

$$J (σ_{0}^{\infty}, γ_{0}^{\infty}) = \lim_{F \to \infty} E \left\{ \sum_{k = 0}^{F} η^{k} L (μ_{k}, d_{k}) \right\}, \tag{6}$$

where $η \in (0, 1)$ is a chosen discount factor and $L : M \times M \mapsto R^{+}$ is a detection cost function that allows different costs to be assigned for selecting the vector of decisions $d_{k}$ while the LSS behaviour is currently governed by the vector of model indices $μ_{k}$. The cost function is versatile and may stress the cost of missed detections or false alerts or even the cost of incorrect fault identification. This paper assumes that the costs are not related across the subsystems, and thus the following additive detection cost function¹¹ is used

$$L (μ_{k}, d_{k}) = \sum_{n = 1}^{N} L^{n} (μ_{k}^{n}, d_{k}^{n}), \tag{7}$$

where $L^{n} : M^{n} \times M^{n} \mapsto R^{+}$ penalises discrepancy between the model index $μ_{k}^{n}$ and the decision $d_{k}^{n}$ generated by the AFD system.

As an example with two models, if a missed detection ($μ_{k}^{n} = 2$, $d_{k}^{n} = 1$) and a false alert ($μ_{k}^{n} = 1$, $d_{k}^{n} = 2$) are perceived as equally bad by a designer, it is common to choose $L^{n} (μ_{k}^{n}, d_{k}^{n}) = 1 - δ_{μ_{k}^{n}, d_{k}^{n}}$, where $δ_{i, j}$ is the Kronecker delta. On the other hand, if a missed detection has more serious consequences than a false alert, the choice

$$L^{n} (μ_{k}^{n}, d_{k}^{n}) = \begin{cases} 10^{2} & μ_{k}^{n} = 2, d_{k}^{n} = 1 \\ 1 & μ_{k}^{n} = 1, d_{k}^{n} = 2 \\ 0 & μ_{k}^{n} = d_{k}^{n} \end{cases}$$

sets the cost of a missed detection two orders of magnitude higher than the cost of a false alert (Nelles, 2014).
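
To make the criterion (6) and the two cost functions concrete, the following sketch evaluates the discounted sum for one short, made-up realisation of model indices and decisions of a single subsystem. The sequences, the horizon, and the value of η are illustrative assumptions, and the expectation in (6) is not taken: only a single sample path is scored.

```python
import numpy as np


def discounted_cost(mu_seq, d_seq, cost, eta):
    """Finite-horizon discounted sum of detection costs for one realisation.

    cost[mu - 1, d - 1] is L^n(mu, d) with the 1-based model indices of the text.
    """
    total = 0.0
    for k, (mu, d) in enumerate(zip(mu_seq, d_seq)):
        total += eta**k * cost[mu - 1, d - 1]
    return total


# Zero-one cost: missed detection and false alert are equally bad.
zero_one = 1.0 - np.eye(2)
# Asymmetric cost: missed detection (mu=2, d=1) costs 100, false alert costs 1.
asymmetric = np.array([[0.0, 1.0],
                       [100.0, 0.0]])

mu_seq = [1, 1, 2, 2]  # hypothetical true model indices
d_seq = [1, 2, 1, 2]   # hypothetical decisions: one false alert, one missed detection

j_zero_one = discounted_cost(mu_seq, d_seq, zero_one, eta=0.9)
j_asymmetric = discounted_cost(mu_seq, d_seq, asymmetric, eta=0.9)
```

With the zero-one cost, the two errors contribute comparably. With the asymmetric cost, the single missed detection dominates the total.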

3. General solution to AFD problem

The AFD problem formulation presented in the previous section belongs to the class of imperfect state information problems, as only the history $I_{k}$ is available for control or decision making instead of the state $s_{k}$ (Bertsekas, 2000). These problems are difficult to address directly for the infinite time horizon as the dimension of $I_{k}$ increases without limit. A solution is to assume that the optimal AFD node can be split into an optimal state estimator that uses the Bayesian recursive relations (Bar-Shalom et al., 2001) to compute a conditional PDF $p (s_{k} | I_{k})$ and a decision-making law that maps this conditional PDF into the input $u_{k}$ and decision $d_{k}$ (Bertsekas, 2000; Bertsekas & Shreve, 1996). Given the optimal state estimator, the original problem can be recast as a perfect state information problem where only the decision-making law is to be designed based on a new model consisting of the original model coupled with the optimal state estimator.

The conditional PDF $p (s_{k} | I_{k})$ calculated by the state estimator can be represented exactly or approximately using a finite number of statistics. The statistics collected into an information state $ξ_{k} \in K$ evolve in time as

$$ξ_{k + 1} = ϕ (ξ_{k}, u_{k}, y_{k + 1}), \tag{8}$$

where $ϕ : K \times U \times Y \mapsto K$ represents the state estimator associated with the LSS Σ model. Here, the future output $y_{k + 1}$ is regarded as a random disturbance described by the conditional PDF $p (y_{k + 1} | I_{k}, u_{k})$ and the initial condition $ξ_{0}$ contains statistics describing the conditional PDF $p (s_{0} | y_{0})$.

Given the information state $ξ_{k}$, it suffices to consider a time-invariant AFD node that is described at a time step $k \in T$ as

$$Δ : \begin{bmatrix} d_{k} \\ u_{k} \end{bmatrix} = \begin{bmatrix} \bar{σ} (ξ_{k}) \\ \bar{γ} (ξ_{k}) \end{bmatrix}, \tag{9}$$

where $\bar{σ} : K \mapsto M$ and $\bar{γ} : K \mapsto U$ are unknown functions to be sought. The detection cost function for the perfect state information model equivalent to L in (6) can be shown (Bertsekas, 2000) to satisfy

$$\bar{L} (ξ_{k}, d_{k}) = \sum_{n = 1}^{N} \sum_{μ_{k}^{n}} L^{n} (μ_{k}^{n}, d_{k}^{n}) P r (μ_{k}^{n} | I_{k}) . \tag{10}$$

Having the reformulated problem specification, the optimal AFD node is determined by the Bellman function, which can be computed off-line (Straka & Punčochář, 2019) because it depends only on the a priori known PDF $p (x_{k + 1} | x_{k}, μ_{k}, u_{k})$, transition probabilities $P r (μ_{k + 1} | μ_{k})$, measurement PDF $p (y_{k} | x_{k}, μ_{k})$, cost function $L$, and the discount factor η. Then, the optimal decisions and optimal inputs can be determined on-line by solving much simpler optimisation problems. More precisely, the Bellman function is required for the optimal input signal generator $u_{k} = {\bar{γ}}^{*} (ξ_{k})$, while the optimal decision generator $d_{k} = {\bar{σ}}^{*} (ξ_{k})$ works independently of the Bellman function (Straka & Punčochář, 2019). Thus, the AFD algorithm consists of two stages: the off-line stage, which designs the input signal generator using the Bellman function, and the on-line stage, which performs the state estimation, generates the decisions, and selects the optimal excitation according to the Bellman function.
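
The expected detection cost (10) is simple to evaluate once the model probabilities are available. The sketch below handles one subsystem and assumes that the decision is chosen as the minimiser of its term in (10), which is one natural reading of a decision generator that does not need the Bellman function. The function names and the posterior values are illustrative only.

```python
import numpy as np


def expected_cost(cost, posterior):
    """Expected cost of each decision d: sum over mu of L^n(mu, d) Pr(mu | I_k).

    cost[mu, d] uses 0-based indices (0: fault-free, 1: faulty).
    """
    return posterior @ cost


def decide(cost, posterior):
    """Decision minimising the expected detection cost (assumed decision rule)."""
    return int(np.argmin(expected_cost(cost, posterior)))


zero_one = np.array([[0.0, 1.0],
                     [1.0, 0.0]])
asymmetric = np.array([[0.0, 1.0],
                       [100.0, 0.0]])

# Hypothetical posterior: the subsystem is faulty with probability 0.1.
posterior = np.array([0.9, 0.1])

d_zero_one = decide(zero_one, posterior)      # picks the more probable model
d_asymmetric = decide(asymmetric, posterior)  # declares a fault despite low probability
```

The same posterior therefore yields different decisions under the two cost functions, which is the practical effect of penalising missed detections more heavily.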

The computational costs of the Bellman function calculation are extreme for LSSs, as the dimension of the information state $ξ_{k}$ can be very high even for a low-dimensional state $s_{k}$ due to the number of subsystems and their models. Consider, for example, an LSS with N subsystems, each described by $M^{n} = 2$ models (one model for fault-free behaviour and one model for faulty behaviour). This leads to an N-dimensional index $μ_{k}$, and the information state $ξ_{k}$ consists of $2^{N}$ statistics of $p (x_{k}^{n} | μ_{k}^{n}, I_{k})$ and $2^{N} - 1$ probabilities $P r (μ_{k}^{n} | I_{k})$.
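
The growth described above can be tabulated with a few lines of code. The helper below is only a counting aid with a hypothetical name. It returns the number of conditional state statistics and the number of free model probabilities for N subsystems with the same number of models each.

```python
def centralised_statistic_counts(n_subsystems, models_per_subsystem=2):
    """Number of state statistics and of free model probabilities.

    One state statistic is kept per joint model index, and the model
    probabilities sum to one, so one of them is redundant.
    """
    n_joint_models = models_per_subsystem ** n_subsystems
    return n_joint_models, n_joint_models - 1


counts = {n: centralised_statistic_counts(n) for n in (2, 5, 10, 20)}
```

For N = 10 this already gives 1024 state statistics and 1023 probabilities, whereas a single isolated two-model subsystem needs only 2 and 1.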

For this reason, the input signal generator design proposed in Punčochář and Straka (2019) and Straka and Punčochář (2019) uses solely the decentralised architecture¹², which ignores the coupling among the subsystems and calculates the Bellman function for each subsystem separately using approximate models that are mutually isolated. Thus, each subsystem has its own information state of a significantly smaller dimension than the LSS information state, and the computation of the Bellman function related to the subsystem is tractable.

The aim of the distributed design proposed in this paper is to take into account the coupling of the subsystems during the design of the input generator. The quality of the excitation signal, and subsequently of the detection, should thus be improved by using the additional information from other subsystems that is available in the distributed architecture. Such a design comes with lower computational costs than the centralised design.

4. AFD with distributed design

The main idea of the AFD with distributed design is that the information about the LSS state communicated among the AFD nodes will be used not only to generate the decision (Straka & Punčochář, 2019) but also to generate the excitation input.

Distributed state estimation: First, the perfect state information model is constructed for the distributed architecture. The estimation algorithm in each AFD node $Δ^{n}$ consists of four steps (Straka & Punčochář, 2019):

Prediction – calculation of $p (s_{k}^{n} | I_{k - 1}, u_{k - 1}^{n})$ ;
Filtering – calculation of $\bar{p} (s_{k}^{n} | I_{k}^{∙, n})$ ;
Merging – calculation of the approximation $p (s_{k}^{n} | I_{k}^{∙, n})$ of the filtering PDF $\bar{p} (s_{k}^{n} | I_{k}^{∙, n})$ to prevent growth of the computational and memory costs; and
Fusion – calculation of $p (s_{k} | I_{k})$ by fusing¹³ the filtering estimates received from other nodes.

The symbol $I_{k}^{∙, n}$ denotes a composition of the past global¹⁴ data $I_{k - 1}$ related to the whole system Σ and the present data $y_{k}^{n}$ and $u_{k - 1}^{n}$ related to the subsystem $Σ^{n}$ only, i.e.

$$I_{k}^{∙, n} = [(I_{k - 1})^{T}, (y_{k}^{n})^{T}, (u_{k - 1}^{n})^{T}]^{T} . \tag{11}$$

The estimation algorithm is illustrated in Figure 2 for two nodes $Δ^{1}$ and $Δ^{2}$. The dashed arrow loop represents the estimation algorithm, which can be expressed using the perfect state information model as

$$ξ_{k + 1} = ϕ^{'} (ξ_{k}, u_{k}, y_{k + 1}), \tag{12}$$

where $ϕ^{'} : K \times U \times Y \mapsto K$ differs from $ϕ$ in (8) because the filtering, merging, and prediction steps are performed locally at the AFD node level.

Figure 2. Scheme of distributed AFD algorithm.

Distributed AFD node: Two problems are associated with the model (12). First, the dimension of the information state $ξ_{k}$ is too large for the Bellman function calculation. Second, the Bellman function calculation for the nth node $Δ^{n}$ requires running the estimation algorithms for all subsystems. The first problem will be addressed by aggregating the effects of other subsystems to reduce the order of the statistic. The second problem will be dealt with by a suitable global model approximation.

4.1. Reducing model order by aggregation

The dimension of the statistic $ξ_{k}$ can be reduced if the dynamics function of each subsystem $Σ^{n}$ can be decomposed as

$$f^{n} (x_{k}, μ_{k}^{n}, u_{k}^{n}) = g^{n} (x_{k}^{n}, μ_{k}^{n}, u_{k}^{n}, a^{n} (x_{k}^{\bar{n}}, μ_{k}^{n})), \tag{13}$$

where $a^{n} : X^{\bar{n}} \times M^{n} \mapsto Z^{n} = R^{D_{z}^{n}}$ is a function that aggregates the effects of other subsystems $Σ^{1} \dots Σ^{n - 1}, Σ^{n + 1} \dots Σ^{N}$ on the subsystem $Σ^{n}$, and $g^{n} : X^{n} \times M^{n} \times U^{n} \times Z^{n} \mapsto X^{n}$ is a function that combines this aggregated effect with the local effects of $Σ^{n}$. A reduced order form of the model (4a) for the subsystem $Σ^{n}$ can be written as

$$x_{k + 1}^{n} = g^{n} (x_{k}^{n}, μ_{k}^{n}, u_{k}^{n}, z_{k}^{n}) + F^{n} (μ_{k}^{n}) w_{k}^{n}, \tag{14a}$$

$$P r (μ_{k + 1}^{n} | μ_{k}^{n}), \tag{14b}$$

$$y_{k}^{n} = h^{n} (x_{k}^{n}, μ_{k}^{n}) + H^{n} (μ_{k}^{n}) v_{k}^{n}, \tag{14c}$$

where $z_{k}^{n} \in Z^{n}$ is a new $D_{z}^{n}$-dimensional random variable defined as

$$z_{k}^{n} = a^{n} (x_{k}^{\bar{n}}, μ_{k}^{n}) . \tag{15}$$

Note that the total dimension of the arguments of the function $a^{n}$ is $D_{x}^{\bar{n}} + 1$ and a reduction of the dimension is actually achieved only if $D_{z}^{n}$ is less than $D_{x}^{\bar{n}} + 1$ for all subsystems.

Typical examples of subsystem dynamics that are particularly suitable in this regard are additive models. A nonlinear additive model has the following structure

$$f^{n} (x_{k}, μ_{k}^{n}, u_{k}^{n}) = g^{n} (x_{k}^{n}, μ_{k}^{n}, u_{k}^{n}, a^{n} (x_{k}^{\bar{n}}, μ_{k}^{n})) \tag{16}$$

$$= q^{n} (x_{k}^{n}, μ_{k}^{n}, u_{k}^{n}) + a^{n} (x_{k}^{\bar{n}}, μ_{k}^{n}), \tag{17}$$

where $q^{n} : X^{n} \times M^{n} \times U^{n} \mapsto X^{n}$ represents the local effect and $a^{n}$ with $D_{z}^{n} = D_{x}^{n}$ aggregates the effect of other subsystems. The additive linear model has an even simpler structure

$$f^{n} (x_{k}, μ_{k}^{n}, u_{k}^{n}) = \underbrace{A_{n}^{n} (μ_{k}^{n}) x_{k}^{n} + B^{n} (μ_{k}^{n}) u_{k}^{n}}_{q^{n} (x_{k}^{n}, μ_{k}^{n}, u_{k}^{n})} + \underbrace{\sum_{i \in N^{\bar{n}}} A_{i}^{n} (μ_{k}^{n}) x_{k}^{i}}_{a^{n} (x_{k}^{\bar{n}}, μ_{k}^{n})}, \tag{18}$$

where $N^{\bar{n}} = N \setminus \{n\}$, $A_{i}^{n} (μ_{k}^{n}) \in R^{D_{x}^{n} \times D_{x}^{i}}$, $i \in N$ and $B^{n} (μ_{k}^{n}) \in R^{D_{x}^{n} \times D_{u}^{n}}$ are matrices related to the subsystem $Σ^{n}$. The linear additive model is attractive especially from a computational point of view because, given a Gaussian conditional PDF of $x_{k}^{\bar{n}}$, the aggregated effect

$$a^{n} (x_{k}^{\bar{n}}, μ_{k}^{n}) = \sum_{i \in N^{\bar{n}}} A_{i}^{n} (μ_{k}^{n}) x_{k}^{i} \tag{19}$$

can be represented exactly by the mean and covariance matrix for each $μ_{k}^{n} \in M^{n}$.
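
For the linear additive model, the exact representation of the aggregated effect (19) follows from the standard rule for linear transformations of Gaussian vectors. The sketch below assumes that the conditional PDF of $x_{k}^{\bar{n}}$ is a single Gaussian with a given mean and covariance matrix, and that the coupling matrices for one fixed value of $μ_{k}^{n}$ are supplied as a list. The function name and all numbers are illustrative.

```python
import numpy as np


def aggregate_gaussian(coupling_blocks, mean_others, cov_others):
    """Mean and covariance of z = sum_i A_i x^i for Gaussian stacked x^i.

    coupling_blocks: list of matrices A_i (one per other subsystem), each of
        shape (D_x^n, D_x^i), for one fixed local model index.
    mean_others, cov_others: moments of the stacked state of other subsystems.
    """
    A = np.hstack(coupling_blocks)  # [A_1 ... A_{n-1} A_{n+1} ... A_N]
    z_mean = A @ mean_others
    z_cov = A @ cov_others @ A.T
    return z_mean, z_cov


# Hypothetical example: two other subsystems with two states each.
A_1 = 0.1 * np.eye(2)
A_3 = np.array([[0.0, 0.2],
                [0.0, 0.0]])
mean_others = np.array([1.0, 2.0, 3.0, 4.0])
cov_others = np.eye(4)

z_mean, z_cov = aggregate_gaussian([A_1, A_3], mean_others, cov_others)
```

In this example the four-dimensional state of the other subsystems is compressed into a two-dimensional variable, so the statistic passed to the local design shrinks accordingly. One such pair of moments is needed for each local model index.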

To complete the model (14a), the dynamics for the new random variable $z_{k}^{n}$ could be defined using (15) and the dynamics of other subsystems. Nevertheless, it is more convenient to consider the conditional PDF $p (x_{k}^{n}, μ_{k}^{n}, z_{k}^{n} | I_{k})$, which can be obtained from $p (x_{k}, μ_{k}^{n} | I_{k})$ using (15). The evolution of the conditional PDF $p (x_{k}^{n}, μ_{k}^{n}, z_{k}^{n} | I_{k})$ can be described by the following model

$$p (x_{k + 1}^{n}, μ_{k + 1}^{n}, z_{k + 1}^{n} | I_{k + 1}) = φ_{DIS} (p (x_{k}^{n}, μ_{k}^{n}, z_{k}^{n} | I_{k}), u_{k}, y_{k + 1}), \tag{20}$$

where $φ_{DIS} : P \times U \times Y \mapsto P$ is a mapping that encapsulates the distributed estimation algorithm and the model of the LSS Σ. The dimension of the statistic needed to describe this conditional PDF is less than the dimension of the statistic that would be needed without aggregating the effect of other subsystems. Nevertheless, the mapping $φ_{DIS}$ is not suitable for the distributed design of the AFD because the inputs $u_{k}$ and observations $y_{k + 1}$ of the whole LSS are involved. This issue is treated in the following section.

4.2. Local approximation of the reduced order model

Since the dependence on inputs and observations of other subsystems in the model (20) comes from the distributed estimation algorithm through $z_{k}^{n}$, the proposed approximation aims to neglect this influence by assuming that $z_{k}^{n}$ is independent of $I_{k}$ and that its stochastic properties are time-invariant. The particular consequences of this approximation for the distributed estimation algorithm are as follows. The conditional PDF $p (x_{k}^{n}, μ_{k}^{n}, z_{k}^{n} | I_{k})$ can be factorised using the conditional independence of $x_{k}^{n}$ and $z_{k}^{n}$ as

$$\begin{aligned} p (x_{k}^{n}, μ_{k}^{n}, z_{k}^{n} | I_{k}) & = p (x_{k}^{n}, z_{k}^{n} | μ_{k}^{n}, I_{k}) P r (μ_{k}^{n} | I_{k}) \\ & = p (x_{k}^{n} | μ_{k}^{n}, I_{k}) p (z_{k}^{n} | μ_{k}^{n}, I_{k}) P r (μ_{k}^{n} | I_{k}) \\ & \approx p (x_{k}^{n} | μ_{k}^{n}, I_{k}) p_{z^{n} | μ^{n}} (z_{k}^{n} | μ_{k}^{n}) P r (μ_{k}^{n} | I_{k}) \\ & = p (x_{k}^{n}, μ_{k}^{n} | I_{k}) p_{z^{n} | μ^{n}} (z_{k}^{n} | μ_{k}^{n}), \end{aligned} \tag{21}$$

where $p_{z^{n} | μ^{n}}$ is a known PDF of $z_{k}^{n}$ conditioned on $μ_{k}^{n}$. This conditional PDF approximates $p (z_{k}^{n} | μ_{k}^{n}, I_{k})$ at all time steps, i.e.

$$p (z_{k}^{n} | μ_{k}^{n}, I_{k}) \approx p_{z^{n} | μ^{n}} (z_{k}^{n} | μ_{k}^{n}) . \tag{22}$$

Note that the time index is left out intentionally to emphasise that this conditional PDF is time-invariant.
The prediction step of the distributed estimation algorithm uses this approximation to compute

$$\begin{aligned} p (x_{k + 1}^{n}, μ_{k + 1}^{n} | I_{k}, u_{k}^{n}) = {} & \sum_{μ_{k}^{n}} \iint p (x_{k + 1}^{n} | x_{k}^{n}, μ_{k}^{n}, u_{k}^{n}, z_{k}^{n}) P r (μ_{k + 1}^{n} | μ_{k}^{n}) \\ & \times p (x_{k}^{n}, μ_{k}^{n} | I_{k}) p_{z^{n} | μ^{n}} (z_{k}^{n} | μ_{k}^{n}) \, d x_{k}^{n} \, d z_{k}^{n}, \end{aligned} \tag{23}$$

where $p (x_{k + 1}^{n} | x_{k}^{n}, μ_{k}^{n}, u_{k}^{n}, z_{k}^{n})$ is given by the reduced order dynamics (14a) and the state noise PDF $p_{w_{k}^{n}}$. The filtering and merging steps of the distributed estimation algorithm are performed without any change and result in the conditional PDF $p (x_{k + 1}^{n}, μ_{k + 1}^{n} | I_{k + 1}^{∙, n})$. The fusion step does not need to be performed due to the independence induced by the approximation (22).
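
For intuition, the prediction step (23) has a closed form in the linear additive Gaussian case. The sketch below is written under assumptions that the text does not fix: the dynamics follow (18) with $D_{z}^{n} = D_{x}^{n}$, the state noise is zero-mean Gaussian with covariance `Q`, and both $p (x_{k}^{n} | μ_{k}^{n}, I_{k})$ and $p_{z^{n} | μ^{n}}$ are Gaussian for each model. All names and numbers are illustrative. Because the dynamics depend on $μ_{k}^{n}$ rather than on $μ_{k + 1}^{n}$, the result is a Gaussian mixture whose components are indexed by $μ_{k}^{n}$ and weighted by the joint model probabilities.

```python
import numpy as np


def predict(weights, means, covs, u, A, B, F, Q, z_means, z_covs, Pi):
    """Closed-form prediction for a linear additive Gaussian subsystem.

    weights[i] = Pr(mu_k = i | I_k); means[i], covs[i] describe p(x_k | mu_k = i, I_k).
    z_means[i], z_covs[i] describe the time-invariant PDF of z given mu = i.
    Pi[i, j] = Pr(mu_{k+1} = j | mu_k = i).
    Returns joint[i, j] = Pr(mu_k = i, mu_{k+1} = j | I_k) and, for each i,
    the mean and covariance of x_{k+1} given mu_k = i.
    """
    comp_means, comp_covs = [], []
    for i in range(len(weights)):
        # x and z are treated as independent, as in the factorisation (21).
        comp_means.append(A[i] @ means[i] + B[i] @ u + z_means[i])
        comp_covs.append(A[i] @ covs[i] @ A[i].T + z_covs[i] + F[i] @ Q @ F[i].T)
    joint = weights[:, None] * Pi
    return joint, comp_means, comp_covs


# Hypothetical scalar subsystem with a fault-free (0) and a faulty (1) model.
one = np.array([[1.0]])
joint, comp_means, comp_covs = predict(
    weights=np.array([0.8, 0.2]),
    means=[np.array([2.0]), np.array([0.0])],
    covs=[one, one],
    u=np.array([1.0]),
    A=[np.array([[0.5]]), np.array([[0.9]])],
    B=[one, one],
    F=[one, one],
    Q=np.array([[0.04]]),
    z_means=[np.array([0.3]), np.array([0.3])],
    z_covs=[np.array([[0.1]]), np.array([[0.1]])],
    Pi=np.array([[0.9, 0.1], [0.0, 1.0]]),
)
predicted_model_probs = joint.sum(axis=0)  # Pr(mu_{k+1} | I_k)
```

The covariance of the aggregated effect simply adds to the predicted covariance, which is how the coupling enters the local estimator in this special case.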

The final local approximate model of a reduced order can be written using the statistic computed by the estimation algorithm. If the conditional PDF $p (x_{k}^{n}, μ_{k}^{n} | I_{k}^{∙, n})$ is described by the statistic $ξ_{k}^{n} \in K^{n}$ and the conditional PDF $p_{z^{n} | μ^{n}} (z_{k}^{n} | μ_{k}^{n})$ is described by the statistic $ζ_{k}^{n} \in Z^{n}$, the local approximate model can be written as

$$\begin{bmatrix} ξ_{k + 1}^{n} \\ ζ_{k + 1}^{n} \end{bmatrix} = \begin{bmatrix} ϕ_{DIS}^{n} (ξ_{k}^{n}, ζ_{k}^{n}, u_{k}^{n}, y_{k + 1}^{n}) \\ ζ_{k}^{n} \end{bmatrix}, \tag{24}$$

where $ϕ_{DIS}^{n} : K^{n} \times Z^{n} \times U^{n} \times Y^{n} \mapsto K^{n}$ represents a modified distributed estimation algorithm with the reduced order model (14).

Since the overall detection cost function (7) is additive over individual subsystems, the detection cost function for the AFD node $Δ^{n}$ is given as

$${\bar{L}}^{n} (ξ_{k}^{n}, ζ_{k}^{n}, d_{k}^{n}) = \sum_{μ_{k}^{n}} L^{n} (μ_{k}^{n}, d_{k}^{n}) P r (μ_{k}^{n} | I_{k}) . \tag{25}$$

The distributed AFD node for the subsystem $Σ^{n}$ is a time-invariant system that is described at a time step k as

$$Δ^{n} : \begin{bmatrix} d_{k}^{n} \\ u_{k}^{n} \end{bmatrix} = \begin{bmatrix} {\bar{σ}}_{DIS}^{n} (ξ_{k}^{n}, ζ_{k}^{n}) \\ {\bar{γ}}_{DIS}^{n} (ξ_{k}^{n}, ζ_{k}^{n}) \end{bmatrix}, \tag{26}$$

where ${\bar{σ}}_{DIS}^{n} : K^{n} \times Z^{n} \mapsto M^{n}$ and ${\bar{γ}}_{DIS}^{n} : K^{n} \times Z^{n} \mapsto U^{n}$ are unknown functions. They can be designed in a way similar to Straka and Punčochář (2019), which considered the AFD description (9).

The proposed approximation can be interpreted as follows: in addition to the time-varying statistic $ξ_{k}^{n}$ , which represents an estimate of the local state, the Bellman function is also parametrised by the time-invariant statistic $ζ_{k}^{n}$ , which represents the influence of the other subsystems. As a result, the Bellman function takes into account the coupling among the LSS subsystems.

4.3. Algorithm of AFD with distributed design

Now, each step of the algorithm is specified for a node $Δ^{n}$ , with a special focus on the steps important for the distributed design. Details of other steps can be found in Straka and Punčochář (2020a).

Assume: The prediction PDF $p (s_{k}^{n} | I_{k - 1}, u_{k - 1}^{n})$ is available.

Filtering: Infer the filtering PDF $\bar{p} (s_{k}^{n} | I_{k}^{∙, n})$ using the Bayesian rule.

Merging: The generalised pseudo-Bayesian method of the second order (GPB2) (Watanabe & Tzafestas, 1993) is used to compute the approximation $p (s_{k}^{n} | I_{k}^{∙, n})$ .

Fusion: The AFD nodes communicate their filtering estimates. The estimates received by the AFD node $Δ^{n}$ are fused to obtain $p (s_{k} | I_{k})$ .

Statistic construction: The statistic $ξ_{k}^{n}$ corresponding to $p (s_{k}^{n} | I_{k})$ and the statistic $ζ_{k}^{n}$ are computed from the statistic $ξ_{k}$ for the fused PDF $p (x_{k}, μ_{k}^{n} | I_{k})$ using relation (15).

Decision generation: The decision $d_{k}^{n}$ is given by (26).

Input generation: The input $u_{k}^{n}$ is given by (26).

Prediction: The prediction PDF $p (s_{k + 1}^{n} | I_{k}, u_{k}^{n})$ is calculated using the Chapman–Kolmogorov equation.
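The prediction, filtering and merging steps above can be illustrated for a single scalar multiple-model subsystem. The following is only a minimal sketch under simplifying assumptions, not the authors' implementation: the coupling term $z_{k}^{n}$ is ignored, the fusion and statistic construction steps are omitted, the merge is a GPB2-style moment matching, and the function name `afd_node_estimation_step` as well as the transition matrix `T` used in the usage lines are hypothetical.

```python
import numpy as np

def afd_node_estimation_step(m, P, w, u, y, A, B, F, C, H, T):
    """One prediction/filtering/merging cycle for a scalar multiple-model subsystem.

    m, P, w : arrays (M,) with mean, variance and probability for each model mu_k
    u, y    : input applied at time k and measurement received at time k + 1
    A..H    : arrays (M,) with the model-dependent scalar parameters
    T       : array (M, M), T[j, i] = Pr(mu_{k+1} = j | mu_k = i)
    """
    # Prediction: x_{k+1} is driven by the model mu_k = i (column index i).
    m_pred = A * m + B * u
    P_pred = A**2 * P + F**2
    w_pred = T * w[None, :]          # joint Pr(mu_{k+1} = j, mu_k = i | I_k)

    # Filtering: y_{k+1} is generated by the model mu_{k+1} = j (row index j).
    Cj = C[:, None]
    S = Cj**2 * P_pred[None, :] + (H**2)[:, None]   # innovation variance
    gain = Cj * P_pred[None, :] / S
    innov = y - Cj * m_pred[None, :]
    m_filt = m_pred[None, :] + gain * innov
    P_filt = (1.0 - gain * Cj) * P_pred[None, :]
    lik = np.exp(-0.5 * innov**2 / S) / np.sqrt(2.0 * np.pi * S)
    w_filt = w_pred * lik
    w_filt /= w_filt.sum()

    # Merging: moment-match the M hypotheses that share the same mu_{k+1} = j.
    w_new = w_filt.sum(axis=1)
    cond = w_filt / np.maximum(w_new, 1e-300)[:, None]
    m_new = (cond * m_filt).sum(axis=1)
    P_new = (cond * (P_filt + (m_filt - m_new[:, None])**2)).sum(axis=1)
    return m_new, P_new, w_new

# Usage with scalar parameters loosely based on subsystem 1 of Example 5.1
# (own-state coefficient of A only; T is a hypothetical transition matrix).
A = np.array([0.3, 0.5]); B = np.array([1.0, 1.5]); F = np.array([0.1, 0.1])
C = np.array([-2.0, 1.5]); H = np.array([0.5, 0.5])
T = np.array([[0.95, 0.05], [0.05, 0.95]])
m1, P1, w1 = afd_node_estimation_step(
    np.zeros(2), np.full(2, 0.01), np.array([1.0, 0.0]), 0.5, -1.0, A, B, F, C, H, T)
print(w1)
```

The M x M intermediate arrays hold one Gaussian per pair of consecutive models, which is what makes the merge second order; after merging, only M means, variances and probabilities are carried to the next step.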

5. Numerical illustration

The performance of the proposed distributed AFD is illustrated by means of two simple numerical examples.

Example 5.1

Consider the system Σ that consists of two coupled multiple-model linear subsystems $\begin{aligned} Σ^{n} : x_{k + 1}^{n} & = A^{n} (μ_{k}^{n}) x_{k} + B^{n} (μ_{k}^{n}) u_{k}^{n} + F^{n} (μ_{k}^{n}) w_{k}^{n}, \\ y_{k}^{n} & = C^{n} (μ_{k}^{n}) x_{k}^{n} + H^{n} (μ_{k}^{n}) v_{k}^{n}, \end{aligned}$ where $n = 1, 2$ , $x_{k} = [x_{k}^{1}, x_{k}^{2}]^{T} \in X = \mathbb{R}^{2}$ , and both subsystems have two models ( $M^{n} = 2$ ) with $\begin{aligned} A^{1} (1) & = \begin{bmatrix} 0.3 & 0.15 \end{bmatrix}, \; B^{1} (1) = 1, \; F^{1} (1) = 0.1, \\ C^{1} (1) & = - 2, \; H^{1} (1) = 0.5, \\ A^{1} (2) & = \begin{bmatrix} 0.5 & 0.1 \end{bmatrix}, \; B^{1} (2) = 1.5, \; F^{1} (2) = 0.1, \\ C^{1} (2) & = 1.5, \; H^{1} (2) = 0.5, \\ A^{2} (1) & = \begin{bmatrix} 0.15 & 0.3 \end{bmatrix}, \; B^{2} (1) = 1, \; F^{2} (1) = 0.1, \\ C^{2} (1) & = - 2, \; H^{2} (1) = 0.5, \\ A^{2} (2) & = \begin{bmatrix} 0.15 & 0.5 \end{bmatrix}, \; B^{2} (2) = 1.5, \; F^{2} (2) = 0.1, \\ C^{2} (2) & = 1.5, \; H^{2} (2) = 0.5. \end{aligned}$ The models $μ_{k}^{1} = 1$ and $μ_{k}^{2} = 1$ represent the fault-free behaviour and the models $μ_{k}^{1} = 2$ and $μ_{k}^{2} = 2$ represent the faulty behaviour. The transition probabilities of the models for each subsystem are given in Table 1.

Table 1. Transition probabilities of the models.


The state noises $w_{k}^{n}$ and measurement noises $v_{k}^{n}$ all have the standard Gaussian PDF $p_{w_{k}^{n}} = p_{v_{k}^{n}} = \mathcal{N} \{0, 1\}$ . The initial condition $x_{0}$ has the Gaussian PDF $\mathcal{N} \{0, 0.01 \cdot I\}$ and the initial model $μ_{0}$ has probability $\Pr (μ_{0} = [1 \; 1]^{T}) = 1$ , which means that each subsystem is fault-free at the beginning. The admissible inputs of subsystems are $U^{1} = U^{2} = \{- 0.5, 0.5\}$ . The detection cost function $L^{n}$ penalises missed detections and false alerts equally using the zero-one function $L^{n} (μ_{k}^{n}, d_{k}^{n}) = 1 - δ_{μ_{k}^{n}, d_{k}^{n}}.$ (28) The discount factor is $η = 0.9$ .
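As a short worked step, substituting the zero-one cost (28) into the expected detection cost (25) gives a closed form. The derivation below uses only the symbols already defined and relies on the model probabilities summing to one.

```latex
\begin{aligned}
\bar{L}^{n}(\xi_{k}^{n}, \zeta_{k}^{n}, d_{k}^{n})
  &= \sum_{\mu_{k}^{n}} \bigl(1 - \delta_{\mu_{k}^{n}, d_{k}^{n}}\bigr)
     \Pr(\mu_{k}^{n} \mid I_{k}) \\
  &= 1 - \Pr(\mu_{k}^{n} = d_{k}^{n} \mid I_{k}).
\end{aligned}
```

For a fixed information state, the expected stage cost is therefore smallest for the decision with the largest conditional model probability.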

Performance of the following algorithms is analysed:

Passive FD (PFD) with a random input generator with $\Pr (u_{k}^{n} = - 0.5) = \Pr (u_{k}^{n} = 0.5) = 0.5$ with both decentralised and distributed estimation.
PFD with a switching input generator $u_{k}^{n} = 0.5 \operatorname{sign} (\sin (0.2 k))$ with both decentralised and distributed estimation.
AFD with decentralised design and decentralised estimation proposed in Straka and Punčochář (2019), which neglects the coupling in both stages.
AFD with decentralised design and distributed estimation proposed in Straka and Punčochář (2019), which neglects the coupling in the off-line stage only.
The proposed distributed AFD (see Section 4) that respects the coupling among the subsystems during both on-line and off-line stages.
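The two PFD input generators in the list above are simple enough to write down directly. The sketch below is illustrative only; the function names `switching_input` and `random_input` are hypothetical. Note that the sign function evaluates to zero at $k = 0$, which lies outside the admissible set; the text does not say how this instant is handled, so the sketch leaves the value as computed.

```python
import numpy as np

def switching_input(k: int, amplitude: float = 0.5) -> float:
    """Switching generator u_k = 0.5 * sign(sin(0.2 k)) with period 2*pi/0.2 steps."""
    return float(amplitude * np.sign(np.sin(0.2 * k)))

def random_input(rng: np.random.Generator, amplitude: float = 0.5) -> float:
    """Random generator drawing -0.5 or 0.5 with equal probability."""
    return float(rng.choice([-amplitude, amplitude]))

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
print([switching_input(k) for k in range(1, 20)])
print([random_input(rng) for _ in range(5)])
```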

The state estimation was carried out by a bank of Kalman filters. The dimension of the information state was $D_{ξ}^{n} = M^{n} \left( D_{x}^{n} + \frac{D_{x}^{n} (D_{x}^{n} + 1)}{2} \right) + M^{n} - 1,$ (29) where $D_{x}^{n} + \frac{D_{x}^{n} (D_{x}^{n} + 1)}{2}$ is the dimension of the sufficient statistics for $p (x^{n} | I_{k}, μ_{k}^{n})$ , consisting of the $D_{x}^{n}$ -dimensional mean and $\frac{D_{x}^{n} (D_{x}^{n} + 1)}{2}$ elements of the covariance matrix, and $M^{n} - 1$ is the number of probabilities $\Pr (μ_{k}^{n} | I_{k})$ in the information state. For the decentralised design, the information state of each AFD node has dimension $D_{ξ}^{n} = 5$ , while for the distributed design the information states were $D_{ξ}^{n} = 7$ dimensional, with the increase caused by the statistic $ζ_{k}^{n}$ aggregating the coupling effect. It should be noted that the centralised design of the AFD would require a single $D_{ξ} = 23$ dimensional information state, which means immense memory requirements even for such a simple system.
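The dimensions quoted above follow from (29) by direct substitution. The snippet below is a small check; the helper name `info_state_dim` is hypothetical, and the centralised case assumes that the joint model set has $2 \times 2 = 4$ elements and that the full state is two-dimensional.

```python
def info_state_dim(num_models: int, state_dim: int) -> int:
    """Dimension of the information state according to (29)."""
    # Mean plus the unique elements of the symmetric covariance matrix, per model.
    gaussian_stats = state_dim + state_dim * (state_dim + 1) // 2
    # Only num_models - 1 probabilities are free because they sum to one.
    return num_models * gaussian_stats + num_models - 1

local = info_state_dim(num_models=2, state_dim=1)     # one AFD node, decentralised
central = info_state_dim(num_models=4, state_dim=2)   # whole LSS, centralised
distributed = local + 2  # plus mean and variance of z, matching the grids used below
print(local, distributed, central)
```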

The Bellman function was calculated by the value iteration algorithm over a grid of discrete information states. The distributed design used the grid $H^{DIS} = A \times A \times B \times B \times C \times \bar{A} \times \bar{B}$ , where

$A = [- 1.7 : 0.5 : 1.7]$ is the grid for the conditional mean $E [x_{k}^{n} | I_{k}]$ ,
$B = \{0.01, 0.03, 0.05, 0.5, 1\}$ is the grid for the conditional variance $\operatorname{var} [x_{k}^{n} | I_{k}]$ ,
$C = [0 : 0.05 : 1]$ is the grid for the probability of the first model $\Pr (μ_{k}^{n} = 1)$ ,
$\bar{A} = A$ is the grid for the conditional mean of the effect of the other subsystem $E [z_{k}^{n} | I_{k}]$ , and
$\bar{B} = B$ is the grid for the conditional variance of the effect of the other subsystem $\operatorname{var} [z_{k}^{n} | I_{k}]$ .

Thus, the grids $\bar{A}$ and $\bar{B}$ are related to the aggregated effect of other subsystems $ζ_{k}^{n}$ . The decentralised design used the grid $H^{DEC} = A \times A \times B \times B \times C$ . The number of discrete states was 900375 for the grid $H^{DIS}$ and 25725 for the grid $H^{DEC}$ .
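The grid sizes can be reproduced by counting the points along each axis. The following sketch assumes that the bracket notation follows the Matlab colon convention `start : step : stop`, in which the end point is included only if a step lands on it; the helper `colon_range` is hypothetical.

```python
import math

def colon_range(start: float, step: float, stop: float) -> list:
    """Points of a Matlab-style range start:step:stop."""
    # The small tolerance guards against floating-point round-off in the division.
    count = int(math.floor((stop - start) / step + 1e-9)) + 1
    return [start + i * step for i in range(count)]

A = colon_range(-1.7, 0.5, 1.7)        # 7 points, the last one is 1.3
B = [0.01, 0.03, 0.05, 0.5, 1.0]       # 5 points
C = colon_range(0.0, 0.05, 1.0)        # 21 points

n_dec = len(A) ** 2 * len(B) ** 2 * len(C)   # A x A x B x B x C
n_dis = n_dec * len(A) * len(B)              # additionally A-bar x B-bar
print(n_dec, n_dis)
```

The distributed grid is larger by the factor $7 \cdot 5 = 35$, which is the memory price of the two extra coordinates describing $ζ_{k}^{n}$.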

The performance of the algorithms was evaluated using $10^{5}$ Monte Carlo (MC) simulations where each MC simulation was run over the finite time horizon $F = 400$ . The estimate $\hat{J}$ of the criterion (6) obtained by the MC simulations, the probabilities of missed detection ( $P_{MD}$ ) and false alerts ( $P_{FA}$ ), and time requirements¹⁵ $T_{\text{on-line}}$ of a single MC run are given in Table 2.
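The infinite-horizon criterion (6) is estimated from runs of finite length, so it is worth checking how much of the discounted sum is cut off. The sketch below bounds the neglected tail by a geometric series, assuming only that the stage cost lies between 0 and $N = 2$, which follows from (7) and (28); the function name is hypothetical.

```python
def discounted_tail_bound(eta: float, horizon: int, max_stage_cost: float) -> float:
    """Upper bound on sum_{k > horizon} eta**k * L_k for 0 <= L_k <= max_stage_cost."""
    return max_stage_cost * eta ** (horizon + 1) / (1.0 - eta)

# Example 5.1: eta = 0.9, F = 400, two subsystems with zero-one costs.
bound = discounted_tail_bound(eta=0.9, horizon=400, max_stage_cost=2.0)
print(bound)
```

Under these assumptions the truncation error is far below any Monte Carlo sampling error, so the horizon does not limit the accuracy of $\hat{J}$ in this example.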

Table 2. Performance of decentralised and distributed PFD and AFD architectures for Example 5.1.


By comparing the results, it is clear that AFD achieves performance superior to PFD. Within the PFD algorithms, the switching input generator excites the system better than the random input generator. Also, when using the distributed estimation, the detection quality is better than when using the decentralised estimation. Within the AFD algorithms, the best detection quality (in terms of the criterion, missed detections and false alerts) is achieved by the proposed distributed AFD algorithm. The values achieved by the decentralised and distributed estimation confirm that it is worth respecting the coupling during the on-line stage of the algorithm even if the Bellman function calculation ignores the coupling.

The computational times of the on-line stage of the algorithms indicate that using the decentralised estimation is computationally cheaper than using the distributed estimation because it does not execute the fusion step. When analysing the computational times of the proposed algorithm, it can be seen that the usage of the extended information state is computationally very cheap with respect to other steps of the on-line stage.

Example 5.2

This example is adapted from Harirchi et al. (2017). An apartment consisting of four rooms equipped with a radiant heating system is considered. The system can be described by a linear state-space continuous-time model $\begin{aligned} c_{1} {\dot{T}}_{1} (t) & = k_{r, 1} (T_{c, 1} (t) - T_{1} (t)) + k_{1} (T_{a} (t) - T_{1} (t)) + \sum_{j \in \{2, 3\}} k_{1 j} (T_{j} (t) - T_{1} (t)), \\ c_{2} {\dot{T}}_{2} (t) & = k_{r, 2} (T_{c, 2} (t) - T_{2} (t)) + k_{2} (T_{a} (t) - T_{2} (t)) + \sum_{j \in \{1, 4\}} k_{2 j} (T_{j} (t) - T_{2} (t)), \\ c_{3} {\dot{T}}_{3} (t) & = k_{3} (T_{a} (t) - T_{3} (t)) + \sum_{j \in \{1, 4\}} k_{3 j} (T_{j} (t) - T_{3} (t)), \\ c_{4} {\dot{T}}_{4} (t) & = k_{r, 4} (T_{c, 4} (t) - T_{4} (t)) + k_{4} (T_{a} (t) - T_{4} (t)) + \sum_{j \in \{2, 3\}} k_{4 j} (T_{j} (t) - T_{4} (t)), \end{aligned}$ where the list of parameters and variables is given in Table 3 and the parameter values are $k_{1} = k_{2} = k_{3} = \frac{1}{2.1}$ , $k_{4} = \frac{1}{1.9}$ , $k_{r, 1} = k_{r, 2} = k_{r, 3} = k_{r, 4} = \frac{1}{0.125}$ , $k_{12} = k_{21} = k_{13} = k_{31} = k_{34} = k_{43} = \frac{1}{0.16}$ , $k_{24} = k_{42} = \frac{1}{0.20}$ , $c_{1} = c_{2} = 1800$ , $c_{3} = 2000$ , $c_{4} = 2100$ .

Table 3. Building parameters and variables.


The continuous-time model was discretised with a sampling period of 5 min and an error-state model was set up with a separate input, an independent state noise, and direct measurement of the room air temperature for each room. The steady-state values were $T_{1} (\infty) = 22.93$ , $T_{2} (\infty) = 23.04$ , $T_{3} (\infty) = 22.44$ , $T_{4} (\infty) = 22.89$ , $T_{c, n} (\infty) = 24$ , $T_{a} (\infty) = 10$ . The fault-free model is then $\begin{aligned} Σ^{n} : x_{k + 1}^{n} & = A^{n} (μ_{k}^{n}) x_{k} + B^{n} (μ_{k}^{n}) u_{k}^{n} + F^{n} (μ_{k}^{n}) w_{k}^{n}, \\ y_{k}^{n} & = C^{n} (μ_{k}^{n}) x_{k}^{n} + H^{n} (μ_{k}^{n}) v_{k}^{n}, \end{aligned}$ with $x_{k}^{n} = T_{n} (t_{k}) - T_{n} (\infty)$ , $u_{k}^{n} = T_{c, n} (t_{k}) - T_{c, n} (\infty)$ , the noise $w_{k}^{n} = T_{a} (t_{k}) - T_{a} (\infty)$ being the deviation from a nominal ambient air temperature modelled as a Gaussian random variable with PDF $p_{w_{k}^{n}} = \mathcal{N} \{0, 1\}$ , and $\begin{aligned} A^{1} (1) & = \begin{bmatrix} 0.091 & 0.066 & 0.123 & 0.067 \end{bmatrix}, \\ B^{1} (1) & = 0.430, \; F^{1} (1) = 0.026, \\ C^{1} (1) & = 1, \; H^{1} (1) = 0.01, \\ A^{2} (1) & = \begin{bmatrix} 0.066 & 0.082 & 0.076 & 1.073 \end{bmatrix}, \\ B^{2} (1) & = 0.437, \; F^{2} (1) = 0.026, \\ C^{2} (1) & = 1, \; H^{2} (1) = 0.01, \\ A^{3} (1) & = \begin{bmatrix} 0.111 & 0.068 & 0.256 & 0.134 \end{bmatrix}, \\ B^{3} (1) & = 0, \; F^{3} (1) = 0.036, \\ C^{3} (1) & = 1, \; H^{3} (1) = 0.01, \\ A^{4} (1) & = \begin{bmatrix} 0.058 & 0.062 & 0.128 & 0.123 \end{bmatrix}, \\ B^{4} (1) & = 0.423, \; F^{4} (1) = 0.028, \\ C^{4} (1) & = 1, \; H^{4} (1) = 0.01. \end{aligned}$ The measurement noise has the standard Gaussian PDF $p_{v_{k}^{n}} = \mathcal{N} \{0, 1\}$ and the initial condition has the Gaussian PDF $\mathcal{N} \{0, 0.1 \cdot I\}$ . The faulty behaviour represented sensor faults $C^{n} (2) = 0.9$ , $A^{n} (1) = A^{n} (2)$ , $B^{n} (1) = B^{n} (2)$ , $F^{n} (1) = F^{n} (2)$ , $H^{n} (1) = H^{n} (2)$ , $n = 1, \dots, 4$ . The matrix of transition probabilities of the models was $\Pr (μ_{k + 1}^{n} | μ_{k}^{n}) = \begin{bmatrix} 0.99 & 0 \\ 0.01 & 1 \end{bmatrix}$ and the LSS behaviour was initially fault-free.
The admissible inputs of subsystems are $U^{n} = \{- 0.5, 0.5\}, n = 1, 2, 4$ . The detection cost function $L^{n}$ is the zero-one function (28). The discount factor is $η = 0.98$ .
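To see what the transition matrix implies over the simulated horizon, the model probabilities can be propagated through the Markov chain. The sketch below assumes that the matrix is read column-wise, i.e. entry $(j, i)$ is $\Pr (μ_{k + 1}^{n} = j | μ_{k}^{n} = i)$, which is consistent with its columns summing to one; the horizon of 400 steps is the one used in the simulations of this example.

```python
import numpy as np

# T[j, i] = Pr(mu_{k+1} = j + 1 | mu_k = i + 1); the faulty model is absorbing.
T = np.array([[0.99, 0.0],
              [0.01, 1.0]])

p = np.array([1.0, 0.0])   # initially fault-free with probability one
for _ in range(400):       # horizon F = 400 steps of 5 min each
    p = T @ p              # discrete Chapman-Kolmogorov prediction

print(p)  # p[0] equals 0.99**400, roughly 0.018
```

Under this reading, a fault occurs in each subsystem within the horizon with high probability, and once the faulty model is entered it is never left.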

The state estimation was again carried out by a bank of Kalman filters. The Bellman function was calculated by the value iteration algorithm over grids of discrete information states with $H^{DIS} = A \times A \times B \times B \times C \times \bar{A} \times \bar{B}$ and $H^{DEC} = A \times A \times B \times B \times C$ , where

$A = [- 0.4 : 0.1 : 0.4]$ is the grid for the conditional mean $E [x_{k}^{n} | I_{k}]$ ,
$B = \{0.85, 0.9, 0.95, 1.05, 1.1, 1.15\} \times 10^{- 4}$ is the grid for the conditional variance $\operatorname{var} [x_{k}^{n} | I_{k}]$ ,
$C = [0 : 0.05 : 1]$ is the grid for the probability of the first model $\Pr (μ_{k}^{n} = 1)$ ,
$\bar{A} = [- 0.04 : 0.01 : 0.04]$ is the grid for the conditional mean of the effect of the other subsystem $E [z_{k}^{n} | I_{k}]$ , and
$\bar{B} = B$ is the grid for the conditional variance of the effect of the other subsystem $\operatorname{var} [z_{k}^{n} | I_{k}]$ .

The number of discrete states was 3,306,744 for the grid $H^{DIS}$ and 61,236 for the grid $H^{DEC}$ . Here, it should be emphasised that the centralised architecture would require a single $D_{ξ} = 239$ dimensional information state, whereas the decentralised and distributed architectures require four $D_{ξ}^{n} = 5$ dimensional and four $D_{ξ}^{n} = 7$ dimensional information states, respectively.

The performance of the algorithms was evaluated using $10^{4}$ Monte Carlo (MC) simulations where each MC simulation was run over the finite time horizon $F = 400$ , i.e. 2000 min. The estimate $\hat{J}$ of the criterion (6) obtained by the MC simulations, the probabilities of missed detection ( $P_{MD}$ ) and false alerts ( $P_{FA}$ ), and time requirements $T_{\text{on-line}}$ of a single MC run are given in Table 4.

Table 4. Performance of decentralised and distributed PFD and AFD architectures for Example 5.2.


The results confirm the superior performance of the AFD algorithms. Among them, the best performance is achieved by the proposed distributed AFD. The reason lies again in the utilisation of the additional information from the other nodes. The results also indicate that using the distributed architecture at least in the on-line stage improves the detection quality. The computational costs are affected by (i) the architecture used by the on-line stage (the decentralised architecture neglecting the coupling is cheaper than the distributed architecture, in which the nodes must fuse the estimates obtained from other nodes) and (ii) the dimension of the Bellman function representation, as the proposed AFD with distributed design leads to a higher dimension and consequently to higher computational costs related to the manipulations with the Bellman function.

6. Conclusion

The paper dealt with active fault diagnosis for large-scale systems that can be decomposed into subsystems with separate measurements and inputs. The subsystem behaviour was described by stochastic multiple models representing the fault-free and faulty behaviour. The inputs were utilised by the AFD to excite the system to achieve better fault detection. An AFD algorithm was developed by means of the distributed design so that the coupling among the subsystems is taken into account in both the on-line and off-line stages. The excitation generated by the proposed distributed AFD algorithm leads to better detection of the faults appearing in the subsystems. The improved performance of the proposed AFD algorithm was confirmed by two numerical examples, which also showed that the computational costs of the on-line part of the algorithm are comparable with the AFD algorithms with decentralised design and the price for the improved performance is paid by the increased memory requirements of the Bellman function representation. Nevertheless, the memory requirements are still significantly lower than for the centralised design.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported by the Czech Science Foundation, project no. GA18-08531S.

Notes on contributors

Ondřej Straka

Ondřej Straka received the master's degree in cybernetics and control engineering and the Ph.D. degree in cybernetics from the University of West Bohemia, Pilsen, Czech Republic, in 1998 and 2004, respectively. Since 2015, he has been an Associate Professor with the Department of Cybernetics, University of West Bohemia. He is the Head of the Identification and Decision Making Research Group (IDM), NTIS–New Technologies for the Information Society. He has participated in a number of projects of fundamental research and in several projects of applied research (e.g. GNSS-based safe train localisation and attitude and heading reference system). He was involved in the development of several software frameworks for nonlinear state estimation and system identification. He has published over 70 journal and conference papers in journals, such as Automatica, the IEEE Transactions on Automatic Control, the IEEE Transactions on Aerospace and Electronic Systems, the IEEE Transactions on Cybernetics, and Signal Processing and at international conferences such as American Control Conference, World Congresses and Symposia of the IFAC, and FUSION Conferences. His current research interests include local and global nonlinear state estimation methods, system identification, performance evaluation, and fault detection. Dr. Straka was a recipient of the Werner von Siemens Excellence Award in 2014 for the most important result in the basic research.

Ivo Punčochář

Ivo Punčochář received the master's degree in cybernetics and control engineering and the Ph.D. degree in cybernetics from the University of West Bohemia, Pilsen, Czech Republic, in 2003 and 2008, respectively. Since 2014, he has been a senior researcher at the research centre New Technologies for the Information Society at the University of West Bohemia. He is also a member of the Identification and Decision Making Research Group established there. He has participated in several projects of fundamental and applied research that dealt with fault detection, state estimation, and GNSS-based safe positioning. He has published over 30 conference and journal papers. His primary research interests include active fault detection, optimal stochastic control and global navigation satellite systems. He was a member of the team that received the Werner von Siemens Excellence Award in 2014 for the most important outcome of basic research.

Notes

1 As will be shown in the numerical example.

2 The functions are assumed to be Borel measurable.

3 The variable denoted $x_{i : j} = [x_{i}^{T}, x_{i + 1}^{T}, \dots, x_{j}^{T}]^{T}$ with $j > i$ stands for the whole sequence of variables from time $i$ to time $j$ stacked into a column vector.

4 The centralised architecture is connected with a high-dimensional information state (see Section 3), which prohibits design of the input generator.

5 A symbol with the superscript $n$ pertains to the corresponding subsystem (e.g. $Σ^{n}$ ), whereas a symbol without the superscript relates to the whole LSS (e.g. Σ).

6 The quantities of the LSS are compositions of the quantities of its subsystems as $x_{k} = [(x_{k}^{1})^{T}, \dots, (x_{k}^{N})^{T}]^{T}$ , $μ_{k} = [μ_{k}^{1}, \dots, μ_{k}^{N}]^{T}$ , $y_{k} = [(y_{k}^{1})^{T}, \dots, (y_{k}^{N})^{T}]^{T}$ , $u_{k} = [(u_{k}^{1})^{T}, \dots, (u_{k}^{N})^{T}]^{T}$ , $v_{k} = [(v_{k}^{1})^{T}, \dots, (v_{k}^{N})^{T}]^{T}$ , and $w_{k} = [(w_{k}^{1})^{T}, \dots, (w_{k}^{N})^{T}]^{T}$ .

7 Note that $s_{k}^{n}$ is called local state for convenience, even though it is not technically a state of the subsystem due to the coupling.

8 With a slight abuse of terminology, the function $p (s_{k}) = p (x_{k}, μ_{k})$ will be called PDF although $μ_{k}$ is a discrete random variable. A more formal notation would require using the cumulative distribution function or the Dirac delta function instead of the PDF.

9 For example, neglecting the energy flow between the subsystems would dramatically worsen the model quality.

10 The operator $E \{\cdot\}$ denotes the expectation over all involved random variables.

11 The cost function is assumed to be lower semi-analytic.

12 The algorithms differ in the architecture used for the on-line stage.

13 If the information from other nodes is not available (e.g. due to communication problems), approximate models have to be used, which neglect the coupling. This corresponds to the (partially) decentralised architecture.

14 It should be recalled that this global information was obtained by the AFD nodes during the information communication and subsequent fusion at the previous time instant $k - 1$ .

15 All the numerical simulations in the paper were performed using the R2019a version of Matlab® software running on a PC equipped with an Intel® Core™ i7-4790 CPU (3.60 GHz). Note that although the on-line stages should run in parallel at each AFD node, they were executed sequentially for all nodes in the simulations.

References

Ashari, A. E., Nikoukhah, R., & Campbell, S. L. (2012). Active robust fault detection in closed-loop systems: Quadratic optimization approach. IEEE Transactions on Automatic Control, 57(10), 2532–2544. https://doi.org/10.1109/TAC.2012.2188430
Bar-Shalom, Y., Li, X. R., & Kirubarajan, T. (2001). Estimation with applications to tracking and navigation. John Wiley & Sons.
Bertsekas, D. P. (2000). Dynamic programming and optimal control (2nd ed., Vol. I). Athena Scientific.
Bertsekas, D. P., & Shreve, S. E. (1996). Stochastic optimal control: The discrete-time case. Athena Scientific.
Blackmore, L., Rajamanoharan, S., & Williams, B. C. (2008). Active estimation for jump Markov linear systems. IEEE Transactions on Automatic Control, 53(10), 2223–2236. https://doi.org/10.1109/TAC.2008.2006100
Blanke, M., Kinnaert, M., Lunze, J., & Staroswiecki, M. (2016). Diagnosis and fault-tolerant control (3rd ed.). Springer-Verlag.
Campbell, S. L., & Nikoukhah, R. (2004). Auxiliary signal design for failure detection. Princeton University Press.
Ferrari, R. M. G., Parisini, T., & Polycarpou, M. M. (2012). Distributed fault detection and isolation of large-scale discrete-time nonlinear systems: An adaptive approximation approach. IEEE Transactions on Automatic Control, 57(2), 275–290. https://doi.org/10.1109/TAC.2011.2164734
Gustafsson, F. (2009). Automotive safety systems. IEEE Signal Processing Magazine, 26(4), 32–47. https://doi.org/10.1109/MSP.2009.932618
Harirchi, F., Yong, S. Z., Jacobsen, E., & Ozay, N. (2017). Active model discrimination with applications to fraud detection in smart buildings. IFAC Papers Online, 50(1), 9527–9534. https://doi.org/10.1016/j.ifacol.2017.08.1616. (20th IFAC World Congress)
Heirung, T. A. N., & Mesbah, A. (2019). Input design for active fault diagnosis. Annual Reviews in Control, 47(9), 35–50. https://doi.org/10.1016/j.arcontrol.2019.03.002
Isermann, R. (2011). Fault-diagnosis applications. Springer.
Katipamula, S., & Brambley, M. R. (2011). Review article: Methods for fault detection, diagnostics, and prognostics for building systems – a review, part II. HVAC&R Research, 11(2), 169–187. https://doi.org/10.1080/10789669.2005.10391133
Nelles, O. (2014). Nonlinear system identification: From classical approaches to neural networks and fuzzy models. Springer.
Niemann, H., & Poulsen, N. K. (2014, June). Active fault detection in MIMO systems. In Proceedings of the 2014 American Control Conference (pp. 1975–1980). Institute of Electrical and Electronics Engineers Inc.
Niemann, H. H. (2006). A setup for active fault diagnosis. IEEE Transactions on Automatic Control, 51(9), 1572–1578. https://doi.org/10.1109/TAC.2006.878724
Paulson, J. A., Martin-Casas, M., & Mesbah, A. (2017). Input design for online fault diagnosis of nonlinear systems with stochastic uncertainty. Industrial & Engineering Chemistry Research, 56(34), 9593–9605. https://doi.org/10.1021/acs.iecr.7b00602
Punčochář, I., & Straka, O. (2019). Non-centralized active fault diagnosis for stochastic systems. In 2019 American control conference. Institute of Electrical and Electronics Engineers Inc.
Punčochář, I., Široký, J., & Šimandl, M. (2015). Constrained active fault detection and control. IEEE Transactions on Automatic Control, 60(1), 253–258. https://doi.org/10.1109/TAC.2014.2326274
Raimondo, D. M., Boem, F., Gallo, A., & Parisini, T. (2016). A decentralized fault-tolerant control scheme based on active fault diagnosis. In Proceedings of the 55th IEEE conference on decision and control (pp. 2164–2169). Institute of Electrical and Electronics Engineers Inc.
Raimondo, D. M., Marseglia, G. R., Braatz, R. D., & Scott, J. K. (2016). Closed-loop input design for guaranteed fault diagnosis using set-valued observers. Automatica, 74(2), 107–117. https://doi.org/10.1016/j.automatica.2016.07.033
Scott, J. K., Findeisen, R., Braatz, R. D., & Raimondo, D. M. (2014). Input design for guaranteed fault diagnosis using zonotopes. Automatica, 50(6), 1580–1589. https://doi.org/10.1016/j.automatica.2014.03.016
Škach, J., Punčochář, I., & Lewis, F. L. (2016). Optimal active fault diagnosis by temporal-difference learning. In Proceedings of the 55th IEEE conference on decision and control (pp. 2146–2151). Institute of Electrical and Electronics Engineers Inc.
Stoustrup, J., & Niemann, H. H. (2010). Active fault diagnosis by controller modification. International Journal of Systems Science, 41(8), 925–936. https://doi.org/10.1080/00207720903470197
Straka, O., & Punčochář, I. (2019). Decentralized and distributed active fault diagnosis for stochastic systems with indirect observations. In 22nd international conference on information fusion. Institute of Electrical and Electronics Engineers Inc.
Straka, O., & Punčochář, I. (2020a). Decentralized and distributed active fault diagnosis based on interactive multiple models. International Journal of Applied Mathematics and Computer Science, 30(2), 239–249. https://doi.org/10.34768/amcs-2020-0019
Straka, O., & Punčochář, I. (2020b). Distributed active faults diagnosis for systems with conditionally dependent faults. IFAC Papers Online, 53(2), 13613–13618. https://doi.org/10.1016/j.ifacol.2020.12.857
Straka, O., & Punčochář, I. (2020c). Hierarchical active fault diagnosis for stochastic large scale systems with coupled faults. In 2020 IEEE 23rd international conference on information fusion (pp. 1–8). Institute of Electrical and Electronics Engineers Inc.
Watanabe, K., & Tzafestas, S. G. (1993). Generalized pseudo-Bayes estimation and detection for abruptly changing systems. Journal of Intelligent and Robotic Systems, 7(1), 95–112. https://doi.org/10.1007/BF01258214
Yao, L., Li, L., Guan, Y., & Wang, H. (2019). Fault diagnosis and fault-tolerant control for non-Gaussian nonlinear stochastic systems via entropy optimisation. International Journal of Systems Science, 50(13), 2552–2564. https://doi.org/10.1080/00207721.2019.1671535
