Regular papers

Distributed design for active fault diagnosis

Pages 562-574 | Received 17 Mar 2021, Accepted 28 Jul 2021, Published online: 25 Aug 2021

Abstract

The paper deals with active fault diagnosis of stochastic large-scale systems consisting of several subsystems with separate inputs and observations, which are coupled through the system state. The subsystems are described by multiple models expressing their fault-free and faulty behaviour. The transition between the models is governed by a Markov chain. The paper proposes a distributed design of an active fault diagnosis algorithm, which takes into account the coupling among the subsystems in all stages of the algorithm. This results in a higher quality of the excitation signal and consequently in better decisions. The numerical example shows the improved performance of the proposed algorithm in comparison with the algorithms based on the decentralised design.

1. Introduction

The complexity and degree of integration of large-scale systems (LSSs) increase their liability to faults with possibly catastrophic consequences. Faults therefore have to be detected reliably and as quickly as possible by a fault diagnosis (FD) system. The literature recognises two fundamental approaches that differ in the interaction with the monitored system. In the passive approach (Blanke et al., 2016; Gustafsson, 2009; Isermann, 2011; Katipamula & Brambley, 2011; Yao et al., 2019), the decisions generated by an FD system are based on passive observations of the measurable quantities of the monitored system. In the active approach, the FD system generates, besides the decisions, an input signal that excites the monitored system (Ashari et al., 2012; Niemann & Poulsen, 2014; Punčochář et al., 2015; Raimondo et al., 2016). The purpose of this signal is to obtain more information, which helps to detect faults that may pose a challenge for passive FD. The active FD (AFD) approach has gained in popularity in the last decade (Campbell & Nikoukhah, 2004; Heirung & Mesbah, 2019; Niemann, 2006; Paulson et al., 2017; Scott et al., 2014; Stoustrup & Niemann, 2010). Within AFD for stochastic systems, the multiple-model framework is used almost exclusively to describe fault-free and faulty models of the system (Blackmore et al., 2008; Škach et al., 2016).

Limited communication bandwidth and limited computational power are the two main reasons for developing special FD algorithms for LSSs (Ferrari et al., 2012; Raimondo et al., 2016). In Punčochář and Straka (2019) and Straka and Punčochář (2019), a new AFD framework for stochastic LSSs was introduced involving three architectures: centralised, decentralised, and distributed. In the centralised architecture, all calculations are performed by a single central node, while in the decentralised architecture, the calculations are performed by multiple isolated nodes, each tied to a single LSS subsystem. The distributed architecture is similar to the decentralised one, except that the nodes also communicate with each other. The AFD algorithms consist of (i) the off-line stage, dealing with the design of the excitation input and decision generators, and (ii) the on-line stage, dealing with the state estimation and utilisation of the designed generators. The LSS consists of several subsystems which are subject to certain dynamic interactions called coupling. For reasons of computational tractability, the input generator design cannot fully respect the coupling among the LSS subsystems, even for small-scale systems (Footnote 1). To achieve reasonable computational costs of the AFD algorithm, the input generator design introduced in Punčochář and Straka (2019) and Straka and Punčochář (2019) rested on the decentralised architecture, which completely ignores the coupling. This, however, leads to lower quality of the excitation and consequently to lower quality of the FD.

This paper makes the following contribution: A novel distributed AFD algorithm is designed to take into account the coupling among the LSS subsystems in both stages. Compared to the AFD algorithm proposed by Straka and Punčochář (2019), where the AFD node related to a subsystem uses the information received from other nodes in the on-line stage only, the novelty of the AFD algorithm proposed here lies in the off-line stage, where the input signal generator is designed. In the proposed distributed AFD algorithm, the input signal generator related to a subsystem employs the information about the state of other subsystems to take into account the subsystem coupling. The contribution of the paper lies in a convenient aggregation of the effects of other subsystems (i.e. compressing the information) to achieve feasible computational costs. Note that using the information in the uncompressed form would lead to the extreme computational costs of the centralised architecture. The additional information available to the generator improves the quality of the excitation input and consequently the quality of the FD.

The paper is structured as follows: Section 2 provides the LSS specification, decomposition, and the AFD problem formulation. A general solution to the AFD problem is briefly summarised in Section 3. The distributed design for the AFD is proposed in Section 4. The performance of the proposed algorithm is illustrated using two numerical examples in Section 5, and Section 6 draws concluding remarks.

2. AFD problem formulation

2.1. LSS specification

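Consider an LSS $\Sigma$ described at time instant $k\in\mathcal{T}=\{0,1,2,\ldots\}$ by the following state-space model
$$\Sigma:\quad x_{k+1}=f(x_k,\mu_k,u_k)+F(\mu_k)w_k, \tag{1a}$$
$$y_k=h(x_k,\mu_k)+H(\mu_k)v_k, \tag{1b}$$
where $x_k\in\mathcal{X}\subseteq\mathbb{R}^{D_x}$ and $\mu_k\in\mathcal{M}$ are the continuous and discrete parts of the LSS state $s_k=[x_k^T,\mu_k^T]^T\in\mathcal{S}=\mathcal{X}\times\mathcal{M}$, $u_k\in\mathcal{U}\subseteq\mathbb{R}^{D_u}$ is the input, $w_k\in\mathcal{X}$ is the state noise described by the known probability density function (PDF) $p_{w_k}$, $y_k\in\mathcal{Y}\subseteq\mathbb{R}^{D_y}$ is the output, and $v_k\in\mathcal{Y}$ is the measurement noise described by the known PDF $p_{v_k}$. The functions $f:\mathcal{X}\times\mathcal{M}\times\mathcal{U}\to\mathcal{X}$, $h:\mathcal{X}\times\mathcal{M}\to\mathcal{Y}$, $F:\mathcal{M}\to\mathcal{X}\times\mathcal{X}$, and $H:\mathcal{M}\to\mathcal{Y}\times\mathcal{Y}$ are known (Footnote 2). Each element $\mu_k$ of the discrete set $\mathcal{M}$ represents a multi-index into a set of $M$ possible models describing the behaviour of the LSS $\Sigma$ in the fault-free and faulty conditions during a sampling period. The random process $\{\mu_k\}$ is assumed to be Markov with known transition probability
$$\Pr(\mu_{k+1}|\mu_k). \tag{2}$$
The state and measurement noises are white, mutually independent and independent of the initial condition $s_0$ described by $p_{s_0}$, so that (Footnote 3)
$$p(w_{0:F},v_{0:F},s_0)=p_{s_0}(s_0)\prod_{k=0}^{F}p_{w_k}(w_k)\,p_{v_k}(v_k) \tag{3}$$
for any $F\in\mathcal{T}$. Both the continuous part $x_k$ of the state $s_k$ and the discrete part $\mu_k$ are unknown and can be inferred only indirectly through the available $y_k$ and $u_k$.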

2.2. LSS decomposition

Since the AFD for the LSS $\Sigma$ with a centralised architecture is not computationally tractable (Footnote 4), the decentralised or distributed architectures based on a decomposition of $\Sigma$ were considered in Punčochář and Straka (2019) and Straka and Punčochář (2019). The LSS $\Sigma$ (see Figure 1) consists of $N$ subsystems (Footnote 5) $\Sigma^n$, $n\in\mathcal{N}=\{1,2,\ldots,N\}$, that are coupled through the state (Footnote 6) $x_k$. Each subsystem $\Sigma^n$ has its own inputs $u_k^n$, outputs $y_k^n$, and a set of possible fault-free and faulty models $\mathcal{M}^n$. It can be described by the following representation
$$\Sigma^n:\quad x_{k+1}^n=f^n(x_k,\mu_k^n,u_k^n)+F^n(\mu_k^n)w_k^n, \tag{4a}$$
$$\Pr(\mu_{k+1}^n|\mu_k^n), \tag{4b}$$
$$y_k^n=h^n(x_k^n,\mu_k^n)+H^n(\mu_k^n)v_k^n, \tag{4c}$$
where $x_k^n\in\mathcal{X}^n\subseteq\mathbb{R}^{D_{x^n}}$ and $\mu_k^n\in\mathcal{M}^n=\{1,2,\ldots,M^n\}$ are the continuous and discrete parts, respectively, of the local state (Footnote 7) $s_k^n=[(x_k^n)^T,\mu_k^n]^T\in\mathcal{S}^n=\mathcal{X}^n\times\mathcal{M}^n$ of the subsystem $\Sigma^n$, $u_k^n\in\mathcal{U}^n\subseteq\mathbb{R}^{D_{u^n}}$ is the local input, $w_k^n\in\mathcal{X}^n$ is the local state noise described by $p_{w_k^n}$, $y_k^n\in\mathcal{Y}^n\subseteq\mathbb{R}^{D_{y^n}}$ is the local output, and $v_k^n\in\mathcal{Y}^n$ is the local measurement noise described by $p_{v_k^n}$. The functions $f^n:\mathcal{X}\times\mathcal{M}^n\times\mathcal{U}^n\to\mathcal{X}^n$, $h^n:\mathcal{X}^n\times\mathcal{M}^n\to\mathcal{Y}^n$, $F^n:\mathcal{M}^n\to\mathcal{X}^n\times\mathcal{X}^n$, and $H^n:\mathcal{M}^n\to\mathcal{Y}^n\times\mathcal{Y}^n$ are known. The discrete part $\mu_k^n$ represents an index into the set $\mathcal{M}^n$, which includes one model representing the behaviour of subsystem $\Sigma^n$ in the fault-free condition, $\mu_k^n=1$, and $M^n-1$ models that represent the behaviour of the subsystem in faulty conditions, $\mu_k^n\in\{2,\ldots,M^n\}$. The LSS $\Sigma$ given by (1a) and (2) is assumed to satisfy the following independence conditions:

(IC-1)

The initial states $x_0^n$ and the initial model indices $\mu_0^n$ are independent and mutually independent (Footnote 8), i.e. $p(x_0,\mu_0)=\prod_{n=1}^{N}p_{x_0^n}(x_0^n)\Pr(\mu_0^n)$.

(IC-2)

The model indices $\mu_k^n$ are conditionally independent, i.e. $\Pr(\mu_{k+1}|\mu_k)=\prod_{n=1}^{N}\Pr(\mu_{k+1}^n|\mu_k^n)$.

Figure 1. The decomposition of an LSS into interconnected subsystems.


The independence conditions IC-1 and IC-2 mean that the occurrence of a fault in a subsystem does not influence the occurrence of faults in other subsystems. The condition IC-2 is considered for convenience only, to keep the exposition clear. Note that the AFD problem with coupled faults was treated in Straka and Punčochář (2020b) for conditionally dependent faults and in Straka and Punčochář (2020c) for dependent faults. Relaxation of IC-2 would lead to the introduction of a central node in the on-line stage, whereas the off-line stage would not be affected.

In this paper, the subsystems are coupled only through the continuous state $x_k$ that appears in the dynamics (4a) of all subsystems, i.e. besides $x_k^n$, the local continuous state $x_{k+1}^n$ is affected by the local continuous states of other subsystems $x_k^{\bar n}=[(x_k^1)^T,\ldots,(x_k^{n-1})^T,(x_k^{n+1})^T,\ldots,(x_k^N)^T]^T$. The AFD algorithm proposed in Punčochář and Straka (2019) approximated the dynamics (4a) by neglecting the coupling through $x_k$ in both stages, and the AFD algorithm proposed in Straka and Punčochář (2019) neglected the coupling through $x_k$ in the off-line stage, where the input signal generator is designed. Such an approximation is acceptable if the coupling is weak, in which case it has little effect on the FD quality. However, when the coupling is stronger (Footnote 9), it should be taken into account by both stages of the AFD design, at least to a certain extent. Hence, the paper focuses on the distributed design that considers the coupling.

2.3. AFD problem

For convenience, the AFD problem and its general solution are described for the centralised architecture only. The AFD strives to design a function that transforms the complete information available up to the time step $k$ into a decision $d_k$ about the faults (subsystem models) and into an input signal $u_k$, whose role is to excite the system to improve the detection quality. The AFD system can be described at any time instant as
$$\Delta:\quad \begin{bmatrix} d_k\\ u_k\end{bmatrix}=\begin{bmatrix}\sigma_k(I_k)\\ \gamma_k(I_k)\end{bmatrix}, \tag{5}$$
where $I_k=[y_{0:k}^T,u_{0:k-1}^T]^T\in\mathcal{I}_{0:k}=\mathcal{Y}^{k+1}\times\mathcal{U}^{k}$. The vector $d_k=[d_k^1,d_k^2,\ldots,d_k^N]^T\in\mathcal{M}$ consists of the decisions $d_k^n\in\mathcal{M}^n$ about the model indices $\mu_k^n$, $\sigma_k:\mathcal{I}_{0:k}\to\mathcal{M}$ represents the AFD node decision generator at the time step $k$, $u_k$ is the excitation input, and $\gamma_k:\mathcal{I}_{0:k}\to\mathcal{U}$ is a function describing the input signal generator. Note that by providing the decision $d_k$, the AFD performs the fault detection and the fault identification simultaneously.

The optimal AFD system should minimise the following additive discounted criterion (Footnote 10)
$$J(\sigma_0^\infty,\gamma_0^\infty)=\lim_{F\to\infty}\mathrm{E}\left\{\sum_{k=0}^{F}\eta^k L(\mu_k,d_k)\right\}, \tag{6}$$
where $\eta\in(0,1)$ is a chosen discount factor and $L:\mathcal{M}\times\mathcal{M}\to\mathbb{R}^+$ is a detection cost function that allows different costs to be assigned for selecting the vector of decisions $d_k$ while the LSS behaviour is currently governed by the vector of model indices $\mu_k$. The cost function is versatile and may stress the cost of missed detections or false alerts or even the cost of incorrect fault identification. This paper assumes that the costs are not related across the subsystems, and thus the following additive detection cost function (Footnote 11) is used
$$L(\mu_k,d_k)=\sum_{n=1}^{N}L^n(\mu_k^n,d_k^n), \tag{7}$$
where $L^n:\mathcal{M}^n\times\mathcal{M}^n\to\mathbb{R}^+$ penalises the discrepancy between the model index $\mu_k^n$ and the decision $d_k^n$ generated by the AFD system.

As an example with two models, if a missed detection ($\mu_k^n=2$, $d_k^n=1$) and a false alert ($\mu_k^n=1$, $d_k^n=2$) are perceived as equally bad by a designer, it is common to choose $L^n(\mu_k^n,d_k^n)=1-\delta_{\mu_k^n,d_k^n}$, where $\delta_{i,j}$ is the Kronecker delta. On the other hand, if a missed detection has more serious consequences than a false alert, the choice
$$L^n(\mu_k^n,d_k^n)=\begin{cases}10^2, & \mu_k^n=2,\ d_k^n=1,\\ 1, & \mu_k^n=1,\ d_k^n=2,\\ 0, & \mu_k^n=d_k^n,\end{cases}$$
sets the cost of the missed detection two orders of magnitude higher than the cost of the false alert (Nelles, 2014).

3. General solution to AFD problem

The AFD problem formulation presented in the previous section belongs to the class of imperfect state information problems, as only the history $I_k$ is available for control or decision making instead of the state $s_k$ (Bertsekas, 2000). These problems are difficult to address directly for the infinite time horizon as the dimension of $I_k$ increases without limit. A solution is to assume that the optimal AFD node can be split into an optimal state estimator that uses the Bayesian recursive relations (Bar-Shalom et al., 2001) to compute a conditional PDF $p(s_k|I_k)$ and a decision-making law that maps this conditional PDF into the input $u_k$ and decision $d_k$ (Bertsekas, 2000; Bertsekas & Shreve, 1996). Given the optimal state estimator, the original problem can be recast as a perfect state information problem where only the decision-making law is to be designed, based on a new model consisting of the original model coupled with the optimal state estimator.

The conditional PDF $p(s_k|I_k)$ calculated by the state estimator can be represented exactly or approximately using a finite number of statistics. The statistics collected into an information state $\xi_k\in\mathcal{K}$ evolve in time as
$$\xi_{k+1}=\phi(\xi_k,u_k,y_{k+1}), \tag{8}$$
where $\phi:\mathcal{K}\times\mathcal{U}\times\mathcal{Y}\to\mathcal{K}$ represents the state estimator associated with the LSS $\Sigma$ model. Here, the future output $y_{k+1}$ is regarded as a random disturbance described by the conditional PDF $p(y_{k+1}|I_k,u_k)$, and the initial condition $\xi_0$ contains statistics describing the conditional PDF $p(s_0|y_0)$.

Given the information state $\xi_k$, it suffices to consider a time-invariant AFD node that is described at a time step $k\in\mathcal{T}$ as
$$\Delta:\quad \begin{bmatrix} d_k\\ u_k\end{bmatrix}=\begin{bmatrix}\bar\sigma(\xi_k)\\ \bar\gamma(\xi_k)\end{bmatrix}, \tag{9}$$
where $\bar\sigma:\mathcal{K}\to\mathcal{M}$ and $\bar\gamma:\mathcal{K}\to\mathcal{U}$ are unknown functions to be sought. The detection cost function for the perfect state information model equivalent to $L$ in (6) can be shown (Bertsekas, 2000) to satisfy
$$\bar L(\xi_k,d_k)=\sum_{n=1}^{N}\sum_{\mu_k^n}L^n(\mu_k^n,d_k^n)\Pr(\mu_k^n|I_k). \tag{10}$$
Having the reformulated problem specification, the optimal AFD node is determined by the Bellman function, which can be computed off-line (Straka & Punčochář, 2019) because it depends only on the a priori known PDF $p(x_{k+1}|x_k,\mu_k,u_k)$, the transition probabilities $\Pr(\mu_{k+1}|\mu_k)$, the measurement PDF $p(y_k|x_k,\mu_k)$, the detection cost function $L$, and the discount factor $\eta$. Then, the optimal decisions and optimal inputs can be determined on-line by solving much simpler optimisation problems. More precisely, the Bellman function is required for the optimal input signal generator $u_k=\bar\gamma(\xi_k)$, while the optimal decision generator $d_k=\bar\sigma(\xi_k)$ works independently of the Bellman function (Straka & Punčochář, 2019). Thus, the AFD algorithm consists of two stages: the off-line stage, in which the Bellman function needed by the input signal generator is computed, and the on-line stage, in which the state is estimated, the decisions are generated, and the optimal excitation is selected according to the Bellman function.
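The expected detection cost in (10) can be illustrated for a single subsystem with two models. The following is a minimal sketch, not the authors' implementation: the posterior probabilities are hypothetical, and picking the decision that minimises the expected cost is an assumed rule used here only for illustration.

```python
import numpy as np

# Rows index the true model mu (1 = fault-free, 2 = faulty),
# columns index the decision d (1 or 2).
L_zero_one = np.array([[0.0, 1.0],
                       [1.0, 0.0]])
L_asymmetric = np.array([[0.0, 1.0],     # false alert costs 1
                         [100.0, 0.0]])  # missed detection costs 10^2


def expected_cost(cost, model_probs):
    """Return sum_mu L(mu, d) * Pr(mu | I_k) for every decision d."""
    return np.asarray(model_probs) @ cost


def min_cost_decision(cost, model_probs):
    """Assumed rule: choose the decision with the smallest expected cost."""
    return int(np.argmin(expected_cost(cost, model_probs))) + 1


# Hypothetical posterior: Pr(mu = 1 | I_k) = 0.9, Pr(mu = 2 | I_k) = 0.1.
probs = [0.9, 0.1]
decision_symmetric = min_cost_decision(L_zero_one, probs)
decision_asymmetric = min_cost_decision(L_asymmetric, probs)
```

With the zero-one cost the sketch keeps the fault-free decision, whereas the asymmetric cost flips it to "faulty" even at a 10% fault probability, which is the effect the asymmetric penalty is meant to have.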

The computational costs of the Bellman function calculation are extreme for LSSs, as the dimension of the information state $\xi_k$ can be very high even for a low-dimensional state $s_k$ due to the number of subsystems and their models. Consider, for example, an LSS with $N$ subsystems each described by $M^n=2$ models (one model for fault-free behaviour and one model for faulty behaviour). It leads to an $N$-dimensional index $\mu_k$, and the information state $\xi_k$ consists of $2^N$ statistics of $p(x_k^n|\mu_k^n,I_k)$ and $2^N-1$ probabilities $\Pr(\mu_k^n|I_k)$.

For this reason, the input signal generator design proposed in Punčochář and Straka (2019) and Straka and Punčochář (2019) uses solely the decentralised architecture (Footnote 12), which ignores the coupling among the subsystems and calculates the Bellman function for each subsystem separately using approximate models that are mutually isolated. Thus, each subsystem has its own information state of a significantly smaller dimension than the LSS information state, and the computation of the Bellman function related to the subsystem is tractable.

The aim of the distributed design proposed in this paper is to take into account the coupling of the subsystems during the design of the input generator. Thus, the quality of the excitation signal and subsequently of the detection should be improved by using more information from other subsystems that are available in the distributed architecture. Such design comes with lower computational costs than the centralised design.

4. AFD with distributed design

The main idea of the AFD with distributed design is that the information about the LSS state communicated among the AFD nodes will be used not only to generate the decision (Straka & Punčochář, 2019) but also to generate the excitation input.

Distributed state estimation: First, the perfect state information model is constructed for the distributed architecture. The estimation algorithm in each AFD node $\Delta^n$ consists of four steps (Straka & Punčochář, 2019):

  • Prediction – calculation of $p(s_k^n|I_{k-1},u_{k-1}^n)$;

  • Filtering – calculation of $\bar p(s_k^n|I_{k,n})$;

  • Merging – calculation of the approximation $p(s_k^n|I_{k,n})$ of the filtering PDF $\bar p(s_k^n|I_{k,n})$ to prevent the computational and memory costs from growing; and

  • Fusion – calculation of $p(s_k|I_k)$ by fusing (Footnote 13) the filtering estimates received from other nodes.

The symbol $I_{k,n}$ denotes a composition of the past global (Footnote 14) data $I_{k-1}$ related to the whole system $\Sigma$ and the present data $y_k^n$ and $u_{k-1}^n$ related to the subsystem $\Sigma^n$ only, i.e.
$$I_{k,n}=[(I_{k-1})^T,(y_k^n)^T,(u_{k-1}^n)^T]^T. \tag{11}$$
The estimation algorithm is illustrated in Figure 2 for two nodes $\Delta^1$ and $\Delta^2$. The dashed arrow loop represents the estimation algorithm, which can be expressed using the perfect state information model as
$$\xi_{k+1}=\phi(\xi_k,u_k,y_{k+1}), \tag{12}$$
where the mapping $\phi:\mathcal{K}\times\mathcal{U}\times\mathcal{Y}\to\mathcal{K}$ in (12) differs from $\phi$ in (8) because the filtering, merging, and prediction steps are performed locally at the AFD node level.

Figure 2. Scheme of distributed AFD algorithm.


Distributed AFD node: Two problems are associated with the model (12). First, the dimension of the information state $\xi_k$ is too large for the Bellman function calculation. Second, the Bellman function calculation for the $n$th node $\Delta^n$ requires running the estimation algorithms for all subsystems. The first problem will be addressed by aggregating the effects of other subsystems to reduce the order of the statistic. The second problem will be dealt with by a suitable global model approximation.

4.1. Reducing model order by aggregation

The dimension of the statistic $\xi_k$ can be reduced if the dynamics function of each subsystem $\Sigma^n$ can be decomposed as
$$f^n(x_k,\mu_k^n,u_k^n)=g^n\big(x_k^n,\mu_k^n,u_k^n,a^n(x_k^{\bar n},\mu_k^n)\big), \tag{13}$$
where $a^n:\mathcal{X}^{\bar n}\times\mathcal{M}^n\to\mathcal{Z}^n=\mathbb{R}^{D_{z^n}}$ is a function that aggregates the effects of other subsystems $\Sigma^1,\ldots,\Sigma^{n-1},\Sigma^{n+1},\ldots,\Sigma^N$ on the subsystem $\Sigma^n$, and $g^n:\mathcal{X}^n\times\mathcal{M}^n\times\mathcal{U}^n\times\mathcal{Z}^n\to\mathcal{X}^n$ is a function that combines this aggregated effect with the local effects of $\Sigma^n$. A reduced order form of the model (4) for the subsystem $\Sigma^n$ can be written as
$$x_{k+1}^n=g^n(x_k^n,\mu_k^n,u_k^n,z_k^n)+F^n(\mu_k^n)w_k^n, \tag{14a}$$
$$\Pr(\mu_{k+1}^n|\mu_k^n), \tag{14b}$$
$$y_k^n=h^n(x_k^n,\mu_k^n)+H^n(\mu_k^n)v_k^n, \tag{14c}$$
where $z_k^n\in\mathcal{Z}^n$ is a new $D_{z^n}$-dimensional random variable defined as
$$z_k^n=a^n(x_k^{\bar n},\mu_k^n). \tag{15}$$
Note that the dimension of all arguments of the function $a^n$ is $D_{x^{\bar n}}+1$, and a reduction of the dimension is actually achieved only if $D_{z^n}$ is less than $D_{x^{\bar n}}+1$ for all subsystems.

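Typical examples of subsystem dynamics that are particularly suitable in this regard are additive models. A nonlinear additive model has the following structure
$$f^n(x_k,\mu_k^n,u_k^n)=g^n\big(x_k^n,\mu_k^n,u_k^n,a^n(x_k^{\bar n},\mu_k^n)\big) \tag{16}$$
$$=q^n(x_k^n,\mu_k^n,u_k^n)+a^n(x_k^{\bar n},\mu_k^n), \tag{17}$$
where $q^n:\mathcal{X}^n\times\mathcal{M}^n\times\mathcal{U}^n\to\mathcal{X}^n$ represents the local effect and $a^n$ with $D_{z^n}=D_{x^n}$ aggregates the effect of other subsystems. The additive linear model has an even simpler structure
$$f^n(x_k,\mu_k^n,u_k^n)=\underbrace{A_n^n(\mu_k^n)x_k^n+B^n(\mu_k^n)u_k^n}_{q^n(x_k^n,\mu_k^n,u_k^n)}+\underbrace{\sum_{i\in\mathcal{N}^{\bar n}}A_i^n(\mu_k^n)x_k^i}_{a^n(x_k^{\bar n},\mu_k^n)}, \tag{18}$$
where $\mathcal{N}^{\bar n}=\mathcal{N}\setminus\{n\}$, and $A_i^n(\mu_k^n)\in\mathbb{R}^{D_{x^n}\times D_{x^i}}$, $i\in\mathcal{N}$, and $B^n(\mu_k^n)\in\mathbb{R}^{D_{x^n}\times D_{u^n}}$ are matrices related to the subsystem $\Sigma^n$. The linear additive model is attractive especially from a computational point of view because, given a Gaussian conditional PDF of $x_k^{\bar n}$, the aggregated effect
$$a^n(x_k^{\bar n},\mu_k^n)=\sum_{i\in\mathcal{N}^{\bar n}}A_i^n(\mu_k^n)x_k^i \tag{19}$$
can be represented exactly by the mean and covariance matrix for each $\mu_k^n\in\mathcal{M}^n$.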
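For the linear additive model, the mean and covariance of the aggregated effect follow from the standard rule for a linear map of a Gaussian vector. The sketch below assumes a Gaussian estimate of the stacked state of the other subsystems; the function name and the numerical values are hypothetical and only illustrate the computation for one fixed model index.

```python
import numpy as np


def aggregate_linear(coupling_blocks, mean_others, cov_others):
    """Moments of z = sum_i A_i x^i for a Gaussian stacked state of the others.

    coupling_blocks: list of coupling matrices A_i, one per other subsystem,
                     ordered as the blocks of the stacked state.
    mean_others, cov_others: mean and covariance of the stacked state.
    """
    a_bar = np.hstack(coupling_blocks)   # z = a_bar @ x_others
    z_mean = a_bar @ mean_others
    z_cov = a_bar @ cov_others @ a_bar.T
    return z_mean, z_cov


# Hypothetical case: two other scalar subsystems with independent estimates.
blocks = [np.array([[0.15]]), np.array([[0.2]])]
mean_others = np.array([1.0, 2.0])
cov_others = np.diag([0.04, 0.09])
z_mean, z_cov = aggregate_linear(blocks, mean_others, cov_others)
```

Whatever the number of other subsystems, the result has the dimension of the local state only, which is where the order reduction comes from.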

To complete the model (14), the dynamics for the new random variable $z_k^n$ could be defined using (15) and the dynamics of other subsystems. Nevertheless, it is more convenient to consider the conditional PDF $p(x_k^n,\mu_k^n,z_k^n|I_k)$, which can be obtained from $p(x_k,\mu_k^n|I_k)$ using (15). The evolution of the conditional PDF $p(x_k^n,\mu_k^n,z_k^n|I_k)$ can be described by the following model
$$p(x_{k+1}^n,\mu_{k+1}^n,z_{k+1}^n|I_{k+1})=\varphi_{\mathrm{DIS}}\big(p(x_k^n,\mu_k^n,z_k^n|I_k),u_k,y_{k+1}\big), \tag{20}$$
where $\varphi_{\mathrm{DIS}}:\mathcal{P}\times\mathcal{U}\times\mathcal{Y}\to\mathcal{P}$ is a mapping that encapsulates the distributed estimation algorithm and the model of the LSS $\Sigma$. The dimension of the statistic needed to describe this conditional PDF is less than the dimension of the statistic that would be needed without aggregating the effect of other subsystems. Nevertheless, the mapping $\varphi_{\mathrm{DIS}}$ is not suitable for the distributed design of the AFD because the inputs $u_k$ and observations $y_{k+1}$ of the whole LSS are involved. This issue is treated in the following section.

 

4.2. Local approximation of the reduced order model

Since the dependence on inputs and observations of other subsystems in the model (20) comes from the distributed estimation algorithm through $z_k^n$, the proposed approximation aims to neglect this influence by assuming that $z_k^n$ is independent of $I_k$ and that its stochastic properties are time-invariant. The particular consequences of this approximation for the distributed estimation algorithm are as follows. The conditional PDF $p(x_k^n,\mu_k^n,z_k^n|I_k)$ can be factorised using the conditional independence of $x_k^n$ and $z_k^n$ as
$$\begin{aligned} p(x_k^n,\mu_k^n,z_k^n|I_k)&=p(x_k^n,z_k^n|\mu_k^n,I_k)\Pr(\mu_k^n|I_k)\\ &=p(x_k^n|\mu_k^n,I_k)\,p(z_k^n|\mu_k^n,I_k)\Pr(\mu_k^n|I_k)\\ &\approx p(x_k^n|\mu_k^n,I_k)\,p_{z^n|\mu^n}(z_k^n|\mu_k^n)\Pr(\mu_k^n|I_k)\\ &=p(x_k^n,\mu_k^n|I_k)\,p_{z^n|\mu^n}(z_k^n|\mu_k^n), \end{aligned} \tag{21}$$
where $p_{z^n|\mu^n}$ is a known PDF of $z_k^n$ conditioned by $\mu_k^n$. This conditional PDF approximates $p(z_k^n|\mu_k^n,I_k)$ at all time steps, i.e.
$$p(z_k^n|\mu_k^n,I_k)\approx p_{z^n|\mu^n}(z_k^n|\mu_k^n). \tag{22}$$
Note that the time index is left out intentionally to emphasise that this conditional PDF is time-invariant. The prediction step of the distributed estimation algorithm uses this approximation to compute
$$p(x_{k+1}^n,\mu_{k+1}^n|I_k,u_k^n)=\sum_{\mu_k^n}\iint p(x_{k+1}^n|x_k^n,\mu_k^n,u_k^n,z_k^n)\Pr(\mu_{k+1}^n|\mu_k^n)\,p(x_k^n,\mu_k^n|I_k)\,p_{z^n|\mu^n}(z_k^n|\mu_k^n)\,\mathrm{d}x_k^n\,\mathrm{d}z_k^n, \tag{23}$$
where $p(x_{k+1}^n|x_k^n,\mu_k^n,u_k^n,z_k^n)$ is given by (14a) and the state noise PDF $p_{w_k^n}$. The filtering and merging steps of the distributed estimation algorithm are performed without any change and result in the conditional PDF $p(x_{k+1}^n,\mu_{k+1}^n|I_{k+1,n})$. The fusion step does not need to be performed due to the independence induced by the approximation (22).
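For a scalar linear Gaussian subsystem with an additive aggregated effect, the inner integrals of the prediction step reduce to simple moment propagation for each value of the current model index. The sketch below makes that assumption explicit; the function name and the numbers for the state estimate and for the moments of the aggregated effect are hypothetical, while the model coefficients are borrowed from the fault-free model of subsystem 1 in Example 5.1 as printed.

```python
import math


def predict_moments(mean, var, u, a, b, f, noise_var, z_mean, z_var):
    """Predicted mean and variance of x_next = a*x + b*u + z + f*w.

    Assumes x ~ N(mean, var), z ~ N(z_mean, z_var) and w ~ N(0, noise_var)
    are mutually independent, in line with the time-invariant approximation
    of the aggregated effect.
    """
    pred_mean = a * mean + b * u + z_mean
    pred_var = a ** 2 * var + z_var + f ** 2 * noise_var
    return pred_mean, pred_var


# Hypothetical estimate and aggregated-effect moments.
pred_mean, pred_var = predict_moments(
    mean=0.2, var=0.05, u=0.5,
    a=0.3, b=1.0, f=0.1, noise_var=1.0,
    z_mean=0.03, z_var=0.001,
)
```

The only difference from a decentralised prediction is the pair `z_mean`, `z_var`: setting both to zero recovers the case in which the coupling is ignored.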

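The final local approximate model of a reduced order can be written using the statistic computed by the estimation algorithm. If the conditional PDF $p(x_k^n,\mu_k^n|I_{k,n})$ is described by the statistic $\xi_k^n\in\mathcal{K}^n$ and the conditional PDF $p_{z^n|\mu^n}(z_k^n|\mu_k^n)$ is described by the statistic $\zeta_k^n\in\mathcal{Z}^n$, the local approximate model can be written as
$$\begin{bmatrix}\xi_{k+1}^n\\ \zeta_{k+1}^n\end{bmatrix}=\begin{bmatrix}\phi_{\mathrm{DIS}}^n(\xi_k^n,\zeta_k^n,u_k^n,y_{k+1}^n)\\ \zeta_k^n\end{bmatrix}, \tag{24}$$
where $\phi_{\mathrm{DIS}}^n:\mathcal{K}^n\times\mathcal{Z}^n\times\mathcal{U}^n\times\mathcal{Y}^n\to\mathcal{K}^n$ represents a modified distributed estimation algorithm with the reduced order model (14).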

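Since the overall detection cost function (7) is additive over individual subsystems, the detection cost function for the AFD node $\Delta^n$ is given as
$$\bar L^n(\xi_k^n,\zeta_k^n,d_k^n)=\sum_{\mu_k^n}L^n(\mu_k^n,d_k^n)\Pr(\mu_k^n|I_k). \tag{25}$$
The distributed AFD node for the subsystem $\Sigma^n$ is a time-invariant system that is described at a time step $k$ as
$$\Delta^n:\quad \begin{bmatrix} d_k^n\\ u_k^n\end{bmatrix}=\begin{bmatrix}\bar\sigma_{\mathrm{DIS}}^n(\xi_k^n,\zeta_k^n)\\ \bar\gamma_{\mathrm{DIS}}^n(\xi_k^n,\zeta_k^n)\end{bmatrix}, \tag{26}$$
where $\bar\sigma_{\mathrm{DIS}}^n:\mathcal{K}^n\times\mathcal{Z}^n\to\mathcal{M}^n$ and $\bar\gamma_{\mathrm{DIS}}^n:\mathcal{K}^n\times\mathcal{Z}^n\to\mathcal{U}^n$ are unknown functions. They can be designed in a way similar to Straka and Punčochář (2019), which considered the AFD description (9).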

The proposed approximation can be interpreted as follows: In addition to the time-varying statistic $\xi_k^n$, which represents an estimate of the local state, the Bellman function is also parametrised by the time-invariant statistic $\zeta_k^n$ that represents the influence of other subsystems. As a result, the Bellman function takes into account the coupling among the LSS subsystems.

4.3. Algorithm of AFD with distributed design

Now, each step of the algorithm is specified for a node $\Delta^n$, with a special focus on the steps important for the distributed design. Details of other steps can be found in Straka and Punčochář (2020a).

Assume: The prediction PDF $p(s_k^n|I_{k-1},u_{k-1}^n)$ is available.

Filtering: Infer the filtering PDF $\bar p(s_k^n|I_{k,n})$ using the Bayesian rule.

Merging: The generalised pseudo-Bayesian method of the second order (GPB2) (Watanabe & Tzafestas, 1993) is used to compute the approximation $p(s_k^n|I_{k,n})$.

Fusion: The AFD nodes communicate their filtering estimates. The estimates received by the AFD node $\Delta^n$ are fused to obtain $p(s_k|I_k)$.

Statistic construction: The statistic $\xi_k^n$ corresponding to $p(s_k^n|I_k)$ and the statistic $\zeta_k^n$ are computed from the statistic $\xi_k$ for the fused PDF $p(x_k,\mu_k^n|I_k)$ using relation (15).

Decision generation: The decision $d_k^n$ is given by (26).

Input generation: The input $u_k^n$ is given by (26).

Prediction: The prediction PDF $p(s_{k+1}^n|I_k,u_k^n)$ is calculated using the Chapman–Kolmogorov equation.
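The filtering step can be sketched for a scalar subsystem with a bank of Kalman filters, one per model. This is a simplified illustration under stated assumptions: it uses one predicted Gaussian per model, omits the GPB2 merging and the fusion with other nodes, and all function names and numerical values are hypothetical apart from the measurement coefficients, which follow Example 5.1 as printed.

```python
import math


def kalman_update(mean, var, y, c, meas_var):
    """Scalar Kalman measurement update; also returns the likelihood of y."""
    innov_var = c * c * var + meas_var
    gain = var * c / innov_var
    innov = y - c * mean
    lik = math.exp(-0.5 * innov ** 2 / innov_var) / math.sqrt(2 * math.pi * innov_var)
    return mean + gain * innov, (1.0 - gain * c) * var, lik


def filter_bank(predictions, model_probs, y, meas_models):
    """Update every model-conditioned estimate and the model probabilities.

    predictions: list of (mean, var), one per model.
    model_probs: predicted model probabilities.
    meas_models: list of (c, meas_var), one per model.
    """
    posteriors, weights = [], []
    for (mean, var), prob, (c, meas_var) in zip(predictions, model_probs, meas_models):
        post_mean, post_var, lik = kalman_update(mean, var, y, c, meas_var)
        posteriors.append((post_mean, post_var))
        weights.append(prob * lik)      # Bayes rule, unnormalised
    total = sum(weights)
    return posteriors, [w / total for w in weights]


# C(1) = 2, C(2) = 1.5 and H = 0.5, so the measurement noise variance is 0.25.
posteriors, probs = filter_bank(
    predictions=[(0.5, 0.1), (0.5, 0.1)],
    model_probs=[0.9, 0.1],
    y=1.0,
    meas_models=[(2.0, 0.25), (1.5, 0.25)],
)
```

The updated model probabilities returned here are the quantities that enter the detection cost (25).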

5. Numerical illustration

The performance of the proposed distributed AFD is illustrated by means of two simple numerical examples.

Example 5.1

Consider the system $\Sigma$ that consists of two coupled multiple-model linear subsystems
$$\Sigma^n:\quad x_{k+1}^n=A^n(\mu_k^n)x_k+B^n(\mu_k^n)u_k^n+F^n(\mu_k^n)w_k^n,\qquad y_k^n=C^n(\mu_k^n)x_k^n+H^n(\mu_k^n)v_k^n,$$
where $n=1,2$, $x_k=[x_k^1,x_k^2]^T\in\mathcal{X}=\mathbb{R}^2$, and both subsystems have two models ($M^n=2$) with
$$A^1(1)=[0.3\ \ 0.15],\quad B^1(1)=1,\quad F^1(1)=0.1,\quad C^1(1)=2,\quad H^1(1)=0.5,$$
$$A^1(2)=[0.5\ \ 0.1],\quad B^1(2)=1.5,\quad F^1(2)=0.1,\quad C^1(2)=1.5,\quad H^1(2)=0.5,$$
$$A^2(1)=[0.15\ \ 0.3],\quad B^2(1)=1,\quad F^2(1)=0.1,\quad C^2(1)=2,\quad H^2(1)=0.5,$$
$$A^2(2)=[0.15\ \ 0.5],\quad B^2(2)=1.5,\quad F^2(2)=0.1,\quad C^2(2)=1.5,\quad H^2(2)=0.5.$$
The models $\mu_k^1=1$ and $\mu_k^2=1$ represent the fault-free behaviour and the models $\mu_k^1=2$ and $\mu_k^2=2$ represent the faulty behaviour. The transition probabilities of the models for each subsystem are given in Table 1.

Table 1. Transition probabilities of the models.

The state noises $w_k^n$ and measurement noises $v_k^n$ all have the standard Gaussian PDF $p_{w_k^n}=p_{v_k^n}=\mathcal{N}\{0,1\}$. The initial condition $x_0$ has the Gaussian PDF $\mathcal{N}\{0,0.01I\}$ and the initial $\mu_0$ has probability $P(\mu_0=[1\ \ 1]^T)=1$, which means that each subsystem is fault-free at the beginning. The admissible inputs of the subsystems are $\mathcal{U}^1=\mathcal{U}^2=\{-0.5,0.5\}$. The detection cost function $L^n$ penalises missed detections and false alerts equally using the zero-one function
$$L^n(\mu_k^n,d_k^n)=1-\delta_{\mu_k^n,d_k^n}. \tag{28}$$
The discount factor is $\eta=0.9$.

Performance of the following algorithms is analysed:

  • Passive FD (PFD) with a random input generator with $\Pr(u_k^n=-0.5)=\Pr(u_k^n=0.5)=0.5$, with both decentralised and distributed estimation.

  • PFD with a switching input generator $u_k^n=0.5\,\mathrm{sign}(\sin(0.2k))$, with both decentralised and distributed estimation.

  • AFD with decentralised design and decentralised estimation proposed in Straka and Punčochář (2019), which neglects the coupling in both stages.

  • AFD with decentralised design and distributed estimation proposed in Straka and Punčochář (2019), which neglects the coupling in the off-line stage only.

  • The proposed distributed AFD (see Section 4) that respects the coupling among the subsystems during both on-line and off-line stages.
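The two input generators used with the passive FD can be sketched as follows, assuming that the admissible input set is {-0.5, 0.5}. The function names are hypothetical.

```python
import math
import random


def random_input(rng, amplitude=0.5):
    """Draw -amplitude or +amplitude with equal probability."""
    return rng.choice([-amplitude, amplitude])


def switching_input(k, amplitude=0.5):
    """Square wave u_k = amplitude * sign(sin(0.2 k)).

    Note that sign(0) = 0, so the value at k = 0 is 0 rather than an
    element of the admissible set.
    """
    s = math.sin(0.2 * k)
    sign = (s > 0) - (s < 0)
    return amplitude * sign


rng = random.Random(0)                       # fixed seed for repeatability
random_sequence = [random_input(rng) for _ in range(10)]
switching_sequence = [switching_input(k) for k in range(1, 33)]
```

With the angular frequency 0.2, the switching signal changes sign roughly every 16 samples, which gives a slow persistent excitation compared with the random generator.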

The state estimation was carried out by a bank of Kalman filters. The dimension of the information state was
$$D_{\xi^n}=M^n\left(D_{x^n}+\frac{D_{x^n}(D_{x^n}+1)}{2}\right)+M^n-1, \tag{29}$$
where $D_{x^n}+\frac{D_{x^n}(D_{x^n}+1)}{2}$ is the dimension of the sufficient statistics for $p(x_k^n|I_k,\mu_k^n)$, consisting of the $D_{x^n}$-dimensional mean and $\frac{D_{x^n}(D_{x^n}+1)}{2}$ elements of the covariance matrix, and $M^n-1$ is the number of probabilities $\Pr(\mu_k^n|I_k)$ in the information state. For the decentralised design, the information state of each AFD node has dimension $D_{\xi^n}=5$, while for the distributed design the information states were $D_{\xi^n}=7$ dimensional, with the increase caused by the statistic $\zeta_k^n$ aggregating the coupling effect. It should be noted that the centralised design of the AFD would require a single $D_\xi=23$ dimensional information state, which means immense memory requirements even for such a simple system.
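The quoted dimensions can be reproduced from (29). The short check below assumes that the centralised case is counted with four joint models (two models per subsystem) and a two-dimensional state, and that the distributed case adds a mean and a variance of the aggregated effect; both readings are interpretations that happen to match the stated numbers.

```python
def info_state_dim(num_models, state_dim):
    """Dimension from (29): per-model mean and covariance plus probabilities."""
    stats_per_model = state_dim + state_dim * (state_dim + 1) // 2
    return num_models * stats_per_model + num_models - 1


dim_decentralised = info_state_dim(num_models=2, state_dim=1)
dim_distributed = dim_decentralised + 2    # mean and variance of the aggregated effect
dim_centralised = info_state_dim(num_models=4, state_dim=2)
```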

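The Bellman function was calculated by the value iteration algorithm over a grid of discrete information states. The distributed design used the grid $\mathcal{H}_{\mathrm{DIS}}=\mathcal{A}\times\mathcal{A}\times\mathcal{B}\times\mathcal{B}\times\mathcal{C}\times\bar{\mathcal{A}}\times\bar{\mathcal{B}}$, where

  • $\mathcal{A}=[-1.7:0.5:1.7]$ is the grid for the conditional mean $\mathrm{E}[x_k^n|I_k]$,

  • $\mathcal{B}=\{0.01,0.03,0.05,0.5,1\}$ is the grid for the conditional variance $\mathrm{var}[x_k^n|I_k]$,

  • $\mathcal{C}=[0:0.05:1]$ is the grid for the probability of the first model $\Pr(\mu_k^n=1)$,

  • $\bar{\mathcal{A}}=\mathcal{A}$ is the grid for the conditional mean of the effect of the other subsystem $\mathrm{E}[z_k^n|I_k]$, and

  • $\bar{\mathcal{B}}=\mathcal{B}$ is the grid for the conditional variance of the effect of the other subsystem $\mathrm{var}[z_k^n|I_k]$.

Thus, the grids $\bar{\mathcal{A}}$ and $\bar{\mathcal{B}}$ are related to the aggregated effect of other subsystems $\zeta_k^n$. The decentralised design used the grid $\mathcal{H}_{\mathrm{DEC}}=\mathcal{A}\times\mathcal{A}\times\mathcal{B}\times\mathcal{B}\times\mathcal{C}$. The number of discrete states was 900375 for the grid $\mathcal{H}_{\mathrm{DIS}}$ and 25725 for the grid $\mathcal{H}_{\mathrm{DEC}}$.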
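The grid sizes can be checked by counting the points of each one-dimensional grid. The sketch assumes the MATLAB-style reading `start:step:stop` and a mean grid that starts at -1.7 (the sign is an assumption, chosen because it reproduces the stated totals).

```python
import math


def grid_points(start, step, stop):
    """Number of points in start:step:stop (last point not exceeding stop)."""
    return math.floor((stop - start) / step + 1e-9) + 1


n_mean = grid_points(-1.7, 0.5, 1.7)     # grid for the conditional mean
n_var = 5                                # {0.01, 0.03, 0.05, 0.5, 1}
n_prob = grid_points(0.0, 0.05, 1.0)     # grid for Pr(mu = 1)

# Decentralised grid: mean x mean x variance x variance x probability.
n_decentralised = n_mean ** 2 * n_var ** 2 * n_prob
# Distributed grid adds the mean and variance grids of the aggregated effect.
n_distributed = n_decentralised * n_mean * n_var
```

The distributed grid is 35 times larger than the decentralised one, which quantifies the off-line price of accounting for the coupling in this example.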

The performance of the algorithms was evaluated using $10^5$ Monte Carlo (MC) simulations, where each MC simulation was run over the finite time horizon $F=400$. The estimate $\hat J$ of the criterion (6) obtained by the MC simulations, the probabilities of missed detection (PMD) and false alerts (PFA), and the time requirements (Footnote 15) $T_{\mathrm{online}}$ of a single MC run are given in Table 2.
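A Monte Carlo estimate of the discounted criterion with the zero-one cost can be sketched as below. The function names and the two short toy runs are hypothetical; in practice each run would come from a simulation of the system together with the FD algorithm.

```python
def discounted_cost(true_models, decisions, eta):
    """Sum over k of eta^k times the zero-one cost 1 - delta(mu_k, d_k)."""
    return sum(
        eta ** k * (mu != d)
        for k, (mu, d) in enumerate(zip(true_models, decisions))
    )


def mc_estimate(runs, eta):
    """Average discounted cost over a list of (true_models, decisions) runs."""
    return sum(discounted_cost(mu, d, eta) for mu, d in runs) / len(runs)


# Two toy runs for one subsystem: one late detection at k = 2, one perfect run.
runs = [
    ([1, 1, 2, 2], [1, 1, 1, 2]),
    ([1, 2, 2, 2], [1, 2, 2, 2]),
]
j_hat = mc_estimate(runs, eta=0.9)
```

Because of the discount factor, an error made early in a run contributes more to the estimate than the same error made late.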

Table 2. Performance of decentralised and distributed PFD and AFD architectures for Example 5.1.

By comparing the results, it is clear that AFD achieves performance superior to PFD. Within the PFD algorithms, the switching input generator excites the system better than the random input generator. Also, when using the distributed estimation, the detection quality is better than when using the decentralised estimation. Within the AFD algorithms, the best detection quality (in terms of the criterion, missed detections and false alerts) is achieved by the proposed distributed AFD algorithm. The values achieved by the decentralised and distributed estimation confirm that it is worth respecting the coupling during the on-line stage of the algorithm even if the Bellman function calculation ignores the coupling.

The computational times of the on-line stage of the algorithms indicate that using the decentralised estimation is computationally cheaper than using the distributed estimation because it does not execute the fusion step. When analysing the computational times of the proposed algorithm, it can be seen that the usage of the extended information state is computationally very cheap with respect to other steps of the on-line stage.

Example 5.2

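This example is adapted from Harirchi et al. (2017). An apartment consisting of four rooms equipped with a radiant heating system is considered. The system can be described by a linear state-space continuous-time model
$$c_1\dot T_1(t)=k_{r,1}\big(T_{c,1}(t)-T_1(t)\big)+k_1\big(T_a(t)-T_1(t)\big)+\sum_{j\in\{2,3\}}k_{1j}\big(T_j(t)-T_1(t)\big),$$
$$c_2\dot T_2(t)=k_{r,2}\big(T_{c,2}(t)-T_2(t)\big)+k_2\big(T_a(t)-T_2(t)\big)+\sum_{j\in\{1,4\}}k_{2j}\big(T_j(t)-T_2(t)\big),$$
$$c_3\dot T_3(t)=k_3\big(T_a(t)-T_3(t)\big)+\sum_{j\in\{1,4\}}k_{3j}\big(T_j(t)-T_3(t)\big),$$
$$c_4\dot T_4(t)=k_{r,4}\big(T_{c,4}(t)-T_4(t)\big)+k_4\big(T_a(t)-T_4(t)\big)+\sum_{j\in\{2,3\}}k_{4j}\big(T_j(t)-T_4(t)\big),$$
where the list of parameters and variables is given in Table 3 and the parameter values are $k_1=k_2=k_3=\frac{1}{2.1}$, $k_4=\frac{1}{1.9}$, $k_{r,1}=k_{r,2}=k_{r,3}=k_{r,4}=\frac{1}{0.125}$, $k_{12}=k_{21}=k_{13}=k_{31}=k_{34}=k_{43}=\frac{1}{0.16}$, $k_{24}=k_{42}=\frac{1}{0.20}$, $c_1=c_2=1800$, $c_3=2000$, $c_4=2100$.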

Table 3. Building parameters and variables.
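The step from the continuous-time room model to a sampled model can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes a zero-order hold on the inputs, a time unit of seconds (so that 5 min corresponds to 300), and that the parameter values are read as reciprocals (e.g. $k_1 = 1/2.1$). The helper names `mat_mul`, `mat_exp` and `discretise` are hypothetical, and no claim is made that the printed values coincide with the matrices used in the paper.

```python
import math

# Assumed parameter values (reciprocal reading) and assumed time unit of seconds.
c = [1800.0, 1800.0, 2000.0, 2100.0]              # thermal capacities c_1..c_4
k_amb = [1 / 2.1, 1 / 2.1, 1 / 2.1, 1 / 1.9]      # k_1..k_4 (room <-> ambient air)
k_rad = [1 / 0.125, 1 / 0.125, 0.0, 1 / 0.125]    # radiator terms; room 3 has none
k_room = {(0, 1): 1 / 0.16, (0, 2): 1 / 0.16,     # k_12, k_13
          (2, 3): 1 / 0.16, (1, 3): 1 / 0.20}     # k_34, k_24 (0-based room indices)


def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_exp(m, terms=20):
    """Matrix exponential by scaling and squaring with a truncated Taylor series."""
    n = len(m)
    norm = max(sum(abs(v) for v in row) for row in m)
    squarings = max(0, math.ceil(math.log2(norm / 0.5))) if norm > 0 else 0
    scaled = [[v / 2 ** squarings for v in row] for row in m]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for order in range(1, terms + 1):
        term = [[v / order for v in row] for row in mat_mul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def discretise(sampling_period=300.0):
    """Zero-order-hold discretisation via the exponential of an augmented matrix."""
    n_rooms = 4
    heated = [n for n in range(n_rooms) if k_rad[n] > 0]
    size = n_rooms + len(heated) + 1               # states, radiator inputs, ambient
    m = [[0.0] * size for _ in range(size)]
    for (i, j), k in k_room.items():               # heat exchange between rooms
        m[i][j] += k / c[i]
        m[j][i] += k / c[j]
        m[i][i] -= k / c[i]
        m[j][j] -= k / c[j]
    for n in range(n_rooms):                       # exchange with ambient air
        m[n][n] -= (k_rad[n] + k_amb[n]) / c[n]
        m[n][size - 1] = k_amb[n] / c[n]
    for col, n in enumerate(heated):               # radiator input columns
        m[n][n_rooms + col] = k_rad[n] / c[n]
    phi = mat_exp([[v * sampling_period for v in row] for row in m])
    a_d = [row[:n_rooms] for row in phi[:n_rooms]]
    b_d = [row[n_rooms:size - 1] for row in phi[:n_rooms]]
    f_d = [row[size - 1] for row in phi[:n_rooms]]
    return a_d, b_d, f_d


A_d, B_d, F_d = discretise()
for n in range(4):
    print(f"room {n + 1}: A row", [round(v, 3) for v in A_d[n]],
          "B row", [round(v, 3) for v in B_d[n]], "F", round(F_d[n], 3))
```

Because every row of the continuous-time model balances to zero when all temperatures are equal, each row of the sampled matrices sums to one, which gives a simple sanity check. In practice, a library routine such as `scipy.signal.cont2discrete` with `method='zoh'` would normally replace the hand-written matrix exponential.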

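The continuous-time model was discretised with a sampling period of 5 min, and an error-state model was set up with a separate input, an independent state noise, and a direct measurement of the room air temperature for each room. The steady-state values were $T_1(\infty)=22.93$, $T_2(\infty)=23.04$, $T_3(\infty)=22.44$, $T_4(\infty)=22.89$, $T_{c,n}(\infty)=24$, and $T_a(\infty)=10$. The fault-free model is then

$$
\Sigma^n:\quad
\begin{aligned}
x_{k+1}^n &= A^n(\mu_k^n)\,x_k + B^n(\mu_k^n)\,u_k^n + F^n(\mu_k^n)\,w_k^n,\\
y_k^n &= C^n(\mu_k^n)\,x_k^n + H^n(\mu_k^n)\,v_k^n,
\end{aligned}
$$

with $x_k^n = T_n(t_k) - T_n(\infty)$, $u_k^n = T_{c,n}(t_k) - T_{c,n}(\infty)$, and the noise $w_k^n = T_a(t_k) - T_a(\infty)$ being the deviation from the nominal ambient air temperature, modelled as a Gaussian random variable with $p_{w_k^n} = \mathcal{N}\{0,1\}$. The model matrices are

$$
\begin{aligned}
A^1(1) &= [\,0.091\ \ 0.066\ \ 0.123\ \ 0.067\,], & B^1(1) &= 0.430, & F^1(1) &= 0.026, & C^1(1) &= 1, & H^1(1) &= 0.01,\\
A^2(1) &= [\,0.066\ \ 0.082\ \ 0.076\ \ 1.073\,], & B^2(1) &= 0.437, & F^2(1) &= 0.026, & C^2(1) &= 1, & H^2(1) &= 0.01,\\
A^3(1) &= [\,0.111\ \ 0.068\ \ 0.256\ \ 0.134\,], & B^3(1) &= 0, & F^3(1) &= 0.036, & C^3(1) &= 1, & H^3(1) &= 0.01,\\
A^4(1) &= [\,0.058\ \ 0.062\ \ 0.128\ \ 0.123\,], & B^4(1) &= 0.423, & F^4(1) &= 0.028, & C^4(1) &= 1, & H^4(1) &= 0.01.
\end{aligned}
$$

The measurement noise has the standard Gaussian PDF $p_{v_k^n} = \mathcal{N}\{0,1\}$, and the initial condition has the Gaussian PDF $\mathcal{N}\{0, 0.1I\}$. The faulty behaviour represented sensor faults with $C^n(2)=0.9$ and $A^n(2)=A^n(1)$, $B^n(2)=B^n(1)$, $F^n(2)=F^n(1)$, $H^n(2)=H^n(1)$, $n=1,\ldots,4$. The matrix of transition probabilities of the models was

$$
\Pr(\mu_{k+1}^n \mid \mu_k^n) = \begin{bmatrix} 0.99 & 0\\ 0.01 & 1 \end{bmatrix}
$$

and the LSS behaviour was initially fault-free. The admissible inputs of the subsystems are $\mathcal{U}^n = \{-0.5, 0.5\}$, $n=1,2,4$. The detection cost function $L^n$ is the zero-one function (28). The discount factor is $\eta = 0.98$.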
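To get a feeling for how often a fault occurs under the stated transition probabilities, the model probabilities of a single subsystem can be propagated over the horizon. The sketch below assumes that the matrix is column-stochastic (columns indexed by the current model, rows by the next one), so that the faulty model is absorbing and the fault-free model is left with probability 0.01 per step. The horizon of 400 steps and the 5 min sampling period are taken from the example, while the function name `propagate` is hypothetical.

```python
# Assumed layout: P[i][j] = Pr(mu_{k+1} = i + 1 | mu_k = j + 1), columns sum to one.
P = [[0.99, 0.0],
     [0.01, 1.0]]


def propagate(prob, steps):
    """Return the list of model probability vectors for k = 0, ..., steps."""
    history = [prob[:]]
    for _ in range(steps):
        prob = [sum(P[i][j] * prob[j] for j in range(2)) for i in range(2)]
        history.append(prob)
    return history


horizon = 400
history = propagate([1.0, 0.0], horizon)      # fault-free at k = 0
p_fault = history[horizon][1]                 # equals 1 - 0.99**400, about 0.98
mean_steps_to_fault = 1.0 / P[1][0]           # mean of the geometric onset time
print(f"Pr(fault by k = {horizon}) = {p_fault:.4f}")
print(f"mean onset: {mean_steps_to_fault:.0f} steps, "
      f"i.e. {5 * mean_steps_to_fault:.0f} min")
```

Under these assumptions a given subsystem is faulty by the end of almost every run, and the fault appears on average after 100 steps, so both missed detections and false alerts are exercised within a single simulation.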

The state estimation was again carried out by a bank of Kalman filters. The Bellman function was calculated by the value iteration algorithm over grids of discrete information states with $H_{\mathrm{DIS}} = A\times A\times B\times B\times C\times\bar{A}\times\bar{B}$ and $H_{\mathrm{DEC}} = A\times A\times B\times B\times C$, where

  • $A=[-0.4:0.1:0.4]$ is the grid for the conditional mean $E[x_k^n|I_k]$,

  • $B=\{0.85, 0.9, 0.95, 1.05, 1.1, 1.15\}\times 10^{-4}$ is the grid for the conditional variance $\mathrm{var}[x_k^n|I_k]$,

  • $C=[0:0.05:1]$ is the grid for the probability of the first model $\Pr(\mu_k^n=1)$,

  • $\bar{A}=[-0.04:0.01:0.04]$ is the grid for the conditional mean of the effect of the other subsystems $E[z_k^n|I_k]$, and

  • $\bar{B}=B$ is the grid for the conditional variance of the effect of the other subsystems $\mathrm{var}[z_k^n|I_k]$.

The number of discrete states was 3,306,744 for the grid $H_{\mathrm{DIS}}$ and 61,236 for the grid $H_{\mathrm{DEC}}$. It should be emphasised that the centralised architecture would require a single information state of dimension $D_\xi = 239$, whereas the decentralised and distributed architectures require four information states of dimension $D_{\xi^n} = 5$ and $D_{\xi^n} = 7$, respectively.
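The grid sizes quoted above follow directly from the Cartesian products, as the short check below illustrates. The `start:step:stop` ranges are read as inclusive equidistant grids with negative lower bounds for the two mean grids, which is an assumption about the notation. The breakdown of the centralised dimension (16 joint models, each with a 4-dimensional mean and a symmetric covariance, plus 15 free probabilities) is likewise an assumed decomposition that happens to reproduce the stated value. The helper `grid` is hypothetical.

```python
from math import prod


def grid(start, stop, step):
    """Inclusive equidistant grid, mimicking the start:step:stop notation."""
    count = round((stop - start) / step) + 1
    return [start + i * step for i in range(count)]


A = grid(-0.4, 0.4, 0.1)                                      # conditional mean
B = [v * 1e-4 for v in (0.85, 0.9, 0.95, 1.05, 1.1, 1.15)]    # conditional variance
C = grid(0.0, 1.0, 0.05)                                      # probability of model 1
A_bar = grid(-0.04, 0.04, 0.01)                               # mean of coupling effect
B_bar = B                                                     # variance of coupling effect

n_dec = prod(len(g) for g in (A, A, B, B, C))
n_dis = prod(len(g) for g in (A, A, B, B, C, A_bar, B_bar))

# Assumed decomposition of the centralised information state.
n_sub, n_models_per_sub, n_state = 4, 2, 4
n_joint_models = n_models_per_sub ** n_sub
per_model = n_state + n_state * (n_state + 1) // 2            # mean + covariance
dim_centralised = n_joint_models * per_model + (n_joint_models - 1)

print(n_dec, n_dis, dim_centralised)
```

The distributed grid is larger than the decentralised one by the factor $|\bar{A}|\,|\bar{B}| = 54$, which is the memory price of the two additional coordinates of the extended information state.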

The performance of the algorithms was evaluated using $10^4$ Monte Carlo (MC) simulations, each run over the finite time horizon $F = 400$, i.e. 2000 min. The estimate $\hat{J}$ of the criterion (6) obtained by the MC simulations, the probabilities of missed detection (PMD) and false alerts (PFA), and the time requirements $T_{\mathrm{online}}$ of a single MC run are given in Table 4.

Table 4. Performance of decentralised and distributed PFD and AFD architectures for Example 5.2.

The results confirm the superior performance of the AFD algorithms. Among them, the best performance is achieved by the proposed distributed AFD. The reason lies again in the utilisation of the additional information from the other nodes. The results also indicate that using the distributed architecture at least in the on-line stage improves the detection quality. The computational costs are affected by (i) the architecture used in the on-line stage (the decentralised architecture, which neglects the coupling, is cheaper than the distributed architecture, in which the nodes must fuse the estimates obtained from other nodes) and (ii) the dimension of the Bellman function representation (the proposed AFD with distributed design leads to a higher dimension and consequently to higher computational costs of manipulating the Bellman function).
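When comparing estimated probabilities such as PMD and PFA between architectures, the sampling uncertainty of an MC estimate is worth keeping in mind. The sketch below gives the standard error of a relative frequency under the simplifying assumption that each MC run contributes one independent Bernoulli outcome; the paper's exact estimator is not restated here and may differ. The run counts are taken as $10^5$ for Example 5.1 and $10^4$ for Example 5.2, and the function names are hypothetical.

```python
import math


def std_error(p_hat, runs):
    """Standard error of a relative frequency estimated from independent runs."""
    return math.sqrt(p_hat * (1.0 - p_hat) / runs)


def worst_case_std_error(runs):
    """Upper bound on the standard error, attained at p_hat = 0.5."""
    return 0.5 / math.sqrt(runs)


for label, runs in (("Example 5.1", 10 ** 5), ("Example 5.2", 10 ** 4)):
    print(f"{label}: {runs} runs, worst-case standard error "
          f"{worst_case_std_error(runs):.4f}")
```

Under this assumption, differences between two architectures that are well below a few multiples of the standard error should be interpreted with caution, in particular for the smaller number of runs.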

6. Conclusion

The paper dealt with active fault diagnosis for large-scale systems that can be decomposed into subsystems with separate measurements and inputs. The subsystem behaviour was described by stochastic multiple models representing the fault-free and faulty behaviour. The inputs were utilised by the AFD to excite the system to achieve better fault detection. An AFD algorithm was developed by means of the distributed design so that the coupling among the subsystems is taken into account in both the on-line and off-line stages. The excitation generated by the proposed distributed AFD algorithm leads to better detection of the faults appearing in the subsystems. The improved performance of the proposed AFD algorithm was confirmed by two numerical examples. They also showed that the computational costs of the on-line part of the algorithm are comparable with those of the AFD algorithms with decentralised design, and that the price for the improved performance is paid by the increased memory requirements of the Bellman function representation. Nevertheless, the memory requirements are still significantly lower than for the centralised design.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported by the Czech Science Foundation, project no. GA18-08531S.

Notes on contributors

Ondřej Straka

Ondřej Straka received the master's degree in cybernetics and control engineering and the Ph.D. degree in cybernetics from the University of West Bohemia, Pilsen, Czech Republic, in 1998 and 2004, respectively. Since 2015, he has been an Associate Professor with the Department of Cybernetics, University of West Bohemia. He is the Head of the Identification and Decision Making Research Group (IDM), NTIS–New Technologies for the Information Society. He has participated in a number of projects of fundamental research and in several projects of applied research (e.g. GNSS-based safe train localisation and attitude and heading reference system). He was involved in the development of several software frameworks for nonlinear state estimation and system identification. He has published over 70 journal and conference papers in journals such as Automatica, the IEEE Transactions on Automatic Control, the IEEE Transactions on Aerospace and Electronic Systems, the IEEE Transactions on Cybernetics, and Signal Processing, and at international conferences such as the American Control Conference, World Congresses and Symposia of the IFAC, and FUSION Conferences. His current research interests include local and global nonlinear state estimation methods, system identification, performance evaluation, and fault detection. Dr. Straka was a recipient of the Werner von Siemens Excellence Award in 2014 for the most important result in basic research.

Ivo Punčochář

Ivo Punčochář received the master's degree in cybernetics and control engineering and the Ph.D. degree in cybernetics from the University of West Bohemia, Pilsen, Czech Republic, in 2003 and 2008, respectively. Since 2014, he has been a senior researcher at the research centre New Technologies for the Information Society at the University of West Bohemia. He is also a member of the Identification and Decision Making Research Group established there. He has participated in several projects of fundamental and applied research that dealt with fault detection, state estimation, and GNSS-based safe positioning. He has published over 30 conference and journal papers. His primary research interests include active fault detection, optimal stochastic control and global navigation satellite systems. He was a member of the team that received the Werner von Siemens Excellence Award in 2014 for the most important outcome of basic research.

Notes

1 As will be shown in the numerical example.

2 The functions are assumed to be Borel measurable.

3 The variable denoted $x_{i:j}=[x_i^T, x_{i+1}^T, \ldots, x_j^T]^T$ with $j>i$ stands for the whole sequence of variables from time $i$ to time $j$ stacked into a column vector.

4 The centralised architecture is connected with a high-dimensional information state (see Section 3), which prohibits design of the input generator.

5 A symbol with the superscript $n$ pertains to the corresponding subsystem (e.g. $\Sigma^n$), whereas a symbol without the superscript relates to the whole LSS (e.g. $\Sigma$).

6 The quantities of the LSS are compositions of the quantities of its subsystems as $x_k=[(x_k^1)^T,\ldots,(x_k^N)^T]^T$, $\mu_k=[\mu_k^1,\ldots,\mu_k^N]^T$, $y_k=[(y_k^1)^T,\ldots,(y_k^N)^T]^T$, $u_k=[(u_k^1)^T,\ldots,(u_k^N)^T]^T$, $v_k=[(v_k^1)^T,\ldots,(v_k^N)^T]^T$, and $w_k=[(w_k^1)^T,\ldots,(w_k^N)^T]^T$.

7 Note that $s_k^n$ is called the local state for convenience, even though it is not technically a state of the subsystem due to the coupling.

8 With a slight abuse of terminology, the function $p(s_k)=p(x_k,\mu_k)$ will be called a PDF although $\mu_k$ is a discrete random variable. A more formal notation would require using the cumulative distribution function or the Dirac delta function instead of the PDF.

9 For example, neglecting the energy flow between the subsystems would dramatically worsen the model quality.

10 The operator $E\{\cdot\}$ denotes the expectation over all involved random variables.

11 The cost function is assumed to be lower semi-analytic.

12 The algorithms differ in the architecture used for the on-line stage.

13 If the information from other nodes is not available (e.g. due to communication problems), approximate models have to be used, which neglect the coupling. This corresponds to the (partially) decentralised architecture.

14 It should be recalled that this global information was obtained by the AFD nodes during the information communication and subsequent fusion at the previous time instant $k-1$.

15 All the numerical simulations in the paper were performed using the R2019a version of the Matlab® software running on a PC equipped with an Intel® Core™ i7-4790 CPU (3.60 GHz). Note that although the on-line stages should run in parallel at each AFD node, they were executed sequentially for all nodes in the simulations.

References

  • Ashari, A. E., Nikoukhah, R., & Campbell, S. L. (2012). Active robust fault detection in closed-loop systems: Quadratic optimization approach. IEEE Transactions on Automatic Control, 57(10), 2532–2544. https://doi.org/10.1109/TAC.2012.2188430
  • Bar-Shalom, Y., Li, X. R., & Kirubarajan, T. (2001). Estimation with applications to tracking and navigation. John Wiley & Sons.
  • Bertsekas, D. P. (2000). Dynamic programming and optimal control (2nd ed., Vol. I). Athena Scientific.
  • Bertsekas, D. P., & Shreve, S. E. (1996). Stochastic optimal control: The discrete-time case. Athena Scientific.
  • Blackmore, L., Rajamanoharan, S., & Williams, B. C. (2008). Active estimation for jump Markov linear systems. IEEE Transactions on Automatic Control, 53(10), 2223–2236. https://doi.org/10.1109/TAC.2008.2006100
  • Blanke, M., Kinnaert, M., Lunze, J., & Staroswiecki, M. (2016). Diagnosis and fault-tolerant control (3rd ed.). Springer-Verlag.
  • Campbell, S. L., & Nikoukhah, R. (2004). Auxiliary signal design for failure detection. Princeton University Press.
  • Ferrari, R. M. G., Parisini, T., & Polycarpou, M. M. (2012). Distributed fault detection and isolation of large-scale discrete-time nonlinear systems: An adaptive approximation approach. IEEE Transactions on Automatic Control, 57(2), 275–290. https://doi.org/10.1109/TAC.2011.2164734
  • Gustafsson, F. (2009). Automotive safety systems. IEEE Signal Processing Magazine, 26(4), 32–47. https://doi.org/10.1109/MSP.2009.932618
  • Harirchi, F., Yong, S. Z., Jacobsen, E., & Ozay, N. (2017). Active model discrimination with applications to fraud detection in smart buildings. IFAC Papers Online, 50(1), 9527–9534. https://doi.org/10.1016/j.ifacol.2017.08.1616. (20th IFAC World Congress)
  • Heirung, T. A. N., & Mesbah, A. (2019). Input design for active fault diagnosis. Annual Reviews in Control, 47(9), 35–50. https://doi.org/10.1016/j.arcontrol.2019.03.002
  • Isermann, R. (2011). Fault-diagnosis applications. Springer.
  • Katipamula, S., & Brambley, M. R. (2011). Review article: Methods for fault detection, diagnostics, and prognostics for building systems – a review, part II. HVAC&R Research, 11(2), 169–187. https://doi.org/10.1080/10789669.2005.10391133
  • Nelles, O. (2014). Nonlinear system identification: From classical approaches to neural networks and fuzzy models. Springer.
  • Niemann, H., & Poulsen, N. K. (2014, June). Active fault detection in MIMO systems. In Proceedings of the 2014 American Control Conference (pp. 1975–1980). Institute of Electrical and Electronics Engineers Inc.
  • Niemann, H. H. (2006). A setup for active fault diagnosis. IEEE Transactions on Automatic Control, 51(9), 1572–1578. https://doi.org/10.1109/TAC.2006.878724
  • Paulson, J. A., Martin-Casas, M., & Mesbah, A. (2017). Input design for online fault diagnosis of nonlinear systems with stochastic uncertainty. Industrial & Engineering Chemistry Research, 56(34), 9593–9605. https://doi.org/10.1021/acs.iecr.7b00602
  • Punčochář, I., & Straka, O. (2019). Non-centralized active fault diagnosis for stochastic systems. In 2019 American control conference. Institute of Electrical and Electronics Engineers Inc.
  • Punčochář, I., Široký, J., & Šimandl, M. (2015). Constrained active fault detection and control. IEEE Transactions on Automatic Control, 60(1), 253–258. https://doi.org/10.1109/TAC.2014.2326274
  • Raimondo, D. M., Boem, F., Gallo, A., & Parisini, T. (2016). A decentralized fault-tolerant control scheme based on active fault diagnosis. In Proceedings of the 55th IEEE conference on decision and control (pp. 2164–2169). Institute of Electrical and Electronics Engineers Inc.
  • Raimondo, D. M., Marseglia, G. R., Braatz, R. D., & Scott, J. K. (2016). Closed-loop input design for guaranteed fault diagnosis using set-valued observers. Automatica, 74(2), 107–117. https://doi.org/10.1016/j.automatica.2016.07.033
  • Scott, J. K., Findeisen, R., Braatz, R. D., & Raimondo, D. M. (2014). Input design for guaranteed fault diagnosis using zonotopes. Automatica, 50(6), 1580–1589. https://doi.org/10.1016/j.automatica.2014.03.016
  • Škach, J., Punčochář, I., & Lewis, F. L. (2016). Optimal active fault diagnosis by temporal-difference learning. In Proceedings of the 55th IEEE conference on decision and control (pp. 2146–2151). Institute of Electrical and Electronics Engineers Inc.
  • Stoustrup, J., & Niemann, H. H. (2010). Active fault diagnosis by controller modification. International Journal of Systems Science, 41(8), 925–936. https://doi.org/10.1080/00207720903470197
  • Straka, O., & Punčochář, I. (2019). Decentralized and distributed active fault diagnosis for stochastic systems with indirect observations. In 22nd international conference on information fusion. Institute of Electrical and Electronics Engineers Inc.
  • Straka, O., & Punčochář, I. (2020a). Decentralized and distributed active fault diagnosis based on interactive multiple models. International Journal of Applied Mathematics and Computer Science, 30(2), 239–249. https://doi.org/10.34768/amcs-2020-0019
  • Straka, O., & Punčochář, I. (2020b). Distributed active faults diagnosis for systems with conditionally dependent faults. IFAC Papers Online, 53(2), 13613–13618. https://doi.org/10.1016/j.ifacol.2020.12.857
  • Straka, O., & Punčochář, I. (2020c). Hierarchical active fault diagnosis for stochastic large scale systems with coupled faults. In 2020 IEEE 23rd international conference on information fusion (pp. 1–8). Institute of Electrical and Electronics Engineers Inc.
  • Watanabe, K., & Tzafestas, S. G. (1993). Generalized pseudo-Bayes estimation and detection for abruptly changing systems. Journal of Intelligent and Robotic Systems, 7(1), 95–112. https://doi.org/10.1007/BF01258214
  • Yao, L., Li, L., Guan, Y., & Wang, H. (2019). Fault diagnosis and fault-tolerant control for non-Gaussian nonlinear stochastic systems via entropy optimisation. International Journal of Systems Science, 50(13), 2552–2564. https://doi.org/10.1080/00207721.2019.1671535