Mathematical and Computer Modelling of Dynamical Systems
Methods, Tools and Applications in Engineering and Related Sciences
Volume 24, 2018 - Issue 3

A new framework for H2-optimal model reduction

Pages 236-257 | Received 21 Sep 2017, Accepted 09 Apr 2018, Published online: 25 Apr 2018

ABSTRACT

In this contribution, a new framework for H2-optimal reduction is presented, motivated by the local nature of both (tangential) interpolation and H2-optimal approximations. The main advantage is given by a decoupling of the cost of reduction from the cost of optimization, resulting in a significant speedup in H2-optimal reduction. In addition, a middle-sized surrogate model is produced at no additional cost and can be used e.g. for error estimation. Numerical examples illustrate the new framework, showing its effectiveness in producing H2-optimal reduced models at a far lower cost than conventional algorithms. Detailed discussions and optimality proofs are presented for applying this framework to the reduction of multiple-input, multiple-output linear dynamical systems. The paper ends with a brief discussion on how this framework could be extended to other system classes, thus indicating how this truly is a general framework for interpolatory H2 reduction.


1. Introduction

Recent advances in the development of complex technical systems are partly driven by the advent of software tools that allow their computerized modelling and analysis, drastically reducing the resources required during development. Using information about the system’s geometry, material properties and boundary conditions, a dynamical model can be derived in an almost automated way, allowing for design optimization and verifications using a virtual prototype. Depending on the complexity of the model at hand and the required accuracy during investigation, such models quite easily reach a high complexity. As a consequence, simulations and design optimizations may require excessive computational resources or even become unfeasible. In addition, in some applications the models are required to be evaluated in real-time, e.g. in embedded controllers or digital twins. In this latter case, the computational resources available are particularly limited.

Model reduction is an active field of research aimed at finding numerically efficient algorithms to construct low-order, high-fidelity approximations to high-order models. Amongst these, the most prominent and numerically tractable methods for linear systems include approximate balanced truncation [Citation1-Citation3] and rational interpolation [Citation4,Citation5]. As these methods generally require only the solution of large sparse systems of equations, they can be applied efficiently even to models of very high order. In a context where the admissible model order is low, e.g. in real-time applications, it is of particular interest to find the best possible approximation for a given order. This problem has been addressed in terms of different error norms (see [Citation6] for the Hankel norm, [Citation7] for the H∞ norm), but only for H2-optimal approximation do there exist algorithms [Citation8-Citation11] that are both numerically tractable and satisfy optimality conditions. H2-optimal reduction methods are iterative; their numerical efficiency is therefore diminished by slow convergence.

In this contribution, we describe a new reduction framework to increase the numerical efficiency of H2-optimal reduction methods. In this framework, first introduced in [Citation12] and further developed in [Citation13] for single-input, single-output (SISO) linear time-invariant (LTI) models, H2-driven optimization is performed in a reduced subspace, lowering the cost with respect to conventional methods. Through the update of this subspace, optimality of the resulting reduced-order model can be established. In this paper we generalize the framework, proving its validity for linear time-invariant systems with multiple inputs and multiple outputs (MIMO) and indicating possible extensions to further system classes. By applying this framework for H2-optimal reduction, substantial speedup can be achieved. In addition, this framework bears the advantage of producing a Model Function, i.e. a middle-sized surrogate model, at no additional cost. We will give first indications on how to exploit this Model Function in further applications and outline current research endeavours in this direction.

The remainder of the paper is structured as follows: Section 2 briefly reviews H2-optimal reduction for MIMO LTI systems, whereas in Section 3 we analyse the computational cost tied to these methods. Section 4 presents the main result of this contribution, i.e. a new framework for H2-optimal reduction. Section 5 will compare conventional H2-optimal reduction to the new framework in numerical examples and show the potential for significant speedup in reduction time. In Section 6, we indicate how to apply this framework to H2-optimal approaches for different system classes, motivating its general nature. Finally, Section 7 will summarize and conclude the discussion.

2. Preliminaries

2.1. Model reduction by tangential interpolation

Linear dynamical systems are often described by state-space models of the form

(1) $\Sigma:\quad E\dot{x}(t) = Ax(t) + Bu(t), \qquad y(t) = Cx(t)$

where $E\in\mathbb{R}^{N\times N}$ is the regular descriptor matrix, $A\in\mathbb{R}^{N\times N}$ is the system matrix and $x\in\mathbb{R}^N$, $u\in\mathbb{R}^m$, $y\in\mathbb{R}^p$ ($p, m \ll N$) represent the state, input and output of the system, respectively. Σ denotes the system (1) by its state-space representation. The input-output behaviour of a linear system (1) can be characterized in the frequency domain by $y(s) = G(s)\,u(s)$, with the rational transfer function matrix

(2) $G(s) := C\,(sE - A)^{-1}B.$

The construction of a reduced-order model (ROM) from the full-order model (FOM) (1) can be obtained by means of a Petrov-Galerkin projection

(3) $\Sigma_r:\quad \underbrace{W^TEV}_{E_r}\,\dot{x}_r(t) = \underbrace{W^TAV}_{A_r}\,x_r(t) + \underbrace{W^TB}_{B_r}\,u(t), \qquad y_r(t) = \underbrace{CV}_{C_r}\,x_r(t)$

with projection matrices $V, W \in \mathbb{C}^{N\times n}$ and reduced state vector $x_r\in\mathbb{R}^n$ ($n \ll N$). We will use the shorthand notation $\Sigma_r = W^T\Sigma V$ to specify the projection matrices used.

The primary goal of model reduction in the following will be the approximation of the output, $y(t) \approx y_r(t)$, for all admissible inputs u(t). This is equivalent to approximating the transfer function, $G(s) \approx G_r(s)$. To achieve this goal, the appropriate design of the projection matrices V, W becomes the primary task of model reduction. In this contribution, we will choose projection matrices that achieve bitangential Hermite interpolation according to the following result.
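As a minimal numerical illustration of the projection (3) and the transfer function (2), the following Python sketch uses random stand-in matrices (with E = I for simplicity; this is not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)
N, n, m, p = 50, 4, 2, 2

# full-order model (1) with random stand-in matrices (E = I for simplicity)
E = np.eye(N)
A = -np.diag(rng.uniform(1.0, 10.0, N))   # stable diagonal A
B = rng.standard_normal((N, m))
C = rng.standard_normal((p, N))

# arbitrary full-rank projection matrices V, W
V = rng.standard_normal((N, n))
W = rng.standard_normal((N, n))

# Petrov-Galerkin projection (3): Sigma_r = W^T Sigma V
Er, Ar = W.T @ E @ V, W.T @ A @ V
Br, Cr = W.T @ B, C @ V

def tf(s, E, A, B, C):
    """transfer function (2): G(s) = C (sE - A)^{-1} B"""
    return C @ np.linalg.solve(s * E - A, B)

print(tf(1j, Er, Ar, Br, Cr).shape)   # (2, 2): same I/O dimensions as the FOM
```

Note that the reduced model is only n-dimensional, yet retains the p x m input-output structure of the full model.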

Theorem 2.1 (Bitangential Hermite Interpolation [Citation5,Citation14]) Consider a full-order model Σ as in (1) with transfer function G(s) and let scalar frequencies $\sigma_i\in\mathbb{C}$ and vectors $r_i\in\mathbb{C}^m$, $l_i\in\mathbb{C}^p$ be given such that $\sigma_i E - A$ is nonsingular for $i=1,\dots,n$. Consider a reduced-order model $\Sigma_r$ as in (3) with transfer function $G_r(s)$, obtained through the projection $\Sigma_r = W^T\Sigma V$. Let $\mathcal{R}(V)$ denote the range of the matrix V.

(1) If

(4) $(A - \sigma_i E)^{-1} B r_i \in \mathcal{R}(V), \quad i=1,\dots,n$

then G(σi)ri=Gr(σi)ri.

(2) If

(5) $(A - \sigma_i E)^{-T} C^T l_i \in \mathcal{R}(W), \quad i=1,\dots,n$

then liTG(σi)=liTGr(σi).

(3) If both (4) and (5) hold, then

(6) $l_i^T G'(\sigma_i)\, r_i = l_i^T G_r'(\sigma_i)\, r_i, \quad i=1,\dots,n,$

where $G'(s)$ denotes the first derivative of G with respect to s.

In addition, note that it is possible to tangentially interpolate higher-order derivatives $G^{(k)}(s)$ at frequencies $\sigma_i\in\mathbb{C}$ by spanning appropriate Krylov subspaces [Citation4,Citation14]. In general, Krylov subspaces are defined through a matrix $M\in\mathbb{C}^{N\times N}$, a vector $v\in\mathbb{C}^N$ and a scalar $n\in\mathbb{N}$ as follows:

(7) $\mathcal{K}_n(M,v) = \mathcal{R}\!\left(\left[v,\; Mv,\; M^2v,\; \dots,\; M^{n-1}v\right]\right).$
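A basis of the Krylov subspace (7) can be generated column by column and then orthonormalized; the following short sketch (a naive construction without deflation handling, not the paper's implementation) illustrates this:

```python
import numpy as np

def krylov_basis(M, v, n):
    """orthonormal basis for the Krylov subspace K_n(M, v) of (7)"""
    cols = [v]
    for _ in range(n - 1):
        cols.append(M @ cols[-1])                     # v, Mv, M^2 v, ...
    Q, _ = np.linalg.qr(np.column_stack(cols))        # orthonormalize the span
    return Q

rng = np.random.default_rng(1)
M = rng.standard_normal((8, 8))
v = rng.standard_normal(8)
Q = krylov_basis(M, v, 3)
print(Q.shape)   # (8, 3)
```

In practice, an Arnoldi-type procedure with orthogonalization at every step is preferred for numerical robustness; the QR at the end here serves the same purpose for small n.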

Theorem 2.2 (Tangential Moment Matching [Citation4,Citation14]) Consider a full-order model Σ as in (1) with transfer function G(s) and let a scalar frequency $\sigma\in\mathbb{C}$ and nonzero vectors $r\in\mathbb{C}^m$, $l\in\mathbb{C}^p$ be given, such that $\sigma E - A$ is nonsingular. Consider a reduced-order model $\Sigma_r$ as in (3) with transfer function $G_r(s)$, obtained through the projection $\Sigma_r = W^T\Sigma V$.

(1) If

(8) $\mathcal{K}_n\!\left((A-\sigma E)^{-1}E,\; (A-\sigma E)^{-1}Br\right) \subseteq \mathcal{R}(V),$

then G(i)(σ)r=Gr(i)(σ)r for i=0,,n1.

(2) If

(9) $\mathcal{K}_n\!\left((A-\sigma E)^{-T}E^T,\; (A-\sigma E)^{-T}C^Tl\right) \subseteq \mathcal{R}(W),$

then lTG(i)(σ)=lTGr(i)(σ) for i=0,,n1.

(3) If both (8) and (9) hold, then

(10) $l^T G^{(i)}(\sigma)\, r = l^T G_r^{(i)}(\sigma)\, r, \quad i=n,\dots,2n-1.$

Any bases V and W satisfying (4) and (5) are valid; for theoretical considerations, primitive bases are of particular interest, as defined in the following.

Definition 2.3 (Primitive bases) Consider a full-order model Σ as in (1). Let interpolation frequencies $\sigma_i\in\mathbb{C}$ and tangential directions $r_i\in\mathbb{C}^m$ and $l_i\in\mathbb{C}^p$ be given such that $A - \sigma_i E$ is invertible for all $i=1,\dots,n$. Then the primitive projection bases $V_P$, $W_P$ are defined as

(11a) $V_P = \left[(A-\sigma_1E)^{-1}Br_1,\; \dots,\; (A-\sigma_nE)^{-1}Br_n\right]$
(11b) $W_P = \left[(A-\sigma_1E)^{-T}C^Tl_1,\; \dots,\; (A-\sigma_nE)^{-T}C^Tl_n\right]$

From a numerical standpoint, V and W are preferably (bi-)orthogonal, real bases, improving the conditioning and resulting in a $\Sigma_r$ with real matrices. Finally, note that the primitive bases (11) can be defined as solutions of particular generalized sparse-dense Sylvester equations.
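The construction of the primitive bases (11), their orthonormalization, and the resulting tangential interpolation property of Theorem 2.1 can be sketched as follows (random stand-in data, E = I; not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(2)
N, n, m, p = 30, 3, 2, 2
E = np.eye(N)
A = -np.diag(rng.uniform(1.0, 5.0, N))    # stable stand-in system
B = rng.standard_normal((N, m))
C = rng.standard_normal((p, N))

sigma = np.array([0.5, 1.0, 2.0])         # interpolation frequencies sigma_i
R = rng.standard_normal((m, n))           # right tangential directions r_i
L = rng.standard_normal((p, n))           # left tangential directions l_i

# primitive bases (11a)/(11b), one solve per column
VP = np.column_stack([np.linalg.solve(A - s * E, B @ R[:, i])
                      for i, s in enumerate(sigma)])
WP = np.column_stack([np.linalg.solve((A - s * E).T, C.T @ L[:, i])
                      for i, s in enumerate(sigma)])

# orthonormalize: the spans, and hence interpolation, are unchanged
V, _ = np.linalg.qr(VP)
W, _ = np.linalg.qr(WP)

Er, Ar, Br, Cr = W.T @ E @ V, W.T @ A @ V, W.T @ B, C @ V

i = 0   # right tangential interpolation: G(sigma_i) r_i = G_r(sigma_i) r_i
g  = C  @ np.linalg.solve(sigma[i] * E  - A,  B  @ R[:, i])
gr = Cr @ np.linalg.solve(sigma[i] * Er - Ar, Br @ R[:, i])
print(np.allclose(g, gr))   # True, by Theorem 2.1
```

The QR factorization replaces the primitive basis with an orthonormal one spanning the same subspace, which is exactly what Theorem 2.1 requires.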

Lemma 2.4 ([Citation14,Citation15]) Consider a full-order model Σ as in (1). Let interpolation frequencies $\sigma_i\in\mathbb{C}$ and tangential directions $r_i\in\mathbb{C}^m$ be given such that $A-\sigma_iE$ is invertible for all $i=1,\dots,n$. Define the matrices $S_P = \mathrm{diag}(\sigma_1,\dots,\sigma_n)$, $R_P = [r_1,\dots,r_n]$. Then the primitive basis $V_P$ satisfying (11a) solves the generalized sparse-dense Sylvester equation

(12) $AV_P - EV_PS_P - BR_P = 0.$

This relationship is particularly useful for theoretical considerations and will be exploited in Section 4 in the proofs. A dual result for WP holds as well.

Lemma 2.5 Consider a full-order model Σ as in (1). Let interpolation frequencies $\sigma_i\in\mathbb{C}$ and tangential directions $l_i\in\mathbb{C}^p$ be given such that $A-\sigma_iE$ is invertible for all $i=1,\dots,n$. Define the matrices $S_P = \mathrm{diag}(\sigma_1,\dots,\sigma_n)$, $L_P = [l_1,\dots,l_n]$. Then the primitive basis $W_P$ satisfying (11b) solves the generalized sparse-dense Sylvester equation

(13) $A^TW_P - E^TW_PS_P - C^TL_P = 0.$
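The equivalence stated in Lemma 2.4 is easy to verify numerically: a primitive basis built column by column from (11a) drives the Sylvester residual of (12) to machine precision (random stand-in data, E = I):

```python
import numpy as np

rng = np.random.default_rng(3)
N, n, m = 20, 3, 2
E = np.eye(N)
A = -np.diag(rng.uniform(1.0, 5.0, N))
B = rng.standard_normal((N, m))

sigma = np.array([0.5, 1.0, 2.0])
SP = np.diag(sigma)                       # S_P = diag(sigma_1, ..., sigma_n)
RP = rng.standard_normal((m, n))          # R_P = [r_1, ..., r_n]

# primitive basis per (11a): column i solves (A - sigma_i E) v_i = B r_i
VP = np.column_stack([np.linalg.solve(A - s * E, B @ RP[:, i])
                      for i, s in enumerate(sigma)])

# residual of the sparse-dense Sylvester equation (12)
res = A @ VP - E @ VP @ SP - B @ RP
print(np.max(np.abs(res)))   # ~ machine precision
```

Multiplying V_P by S_P scales column i by sigma_i, which is why the column-wise linear systems and the matrix-valued Sylvester equation carry the same information.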

2.2. H2-optimal reduction

For the design of V and W as of Section 2.1, an appropriate choice of the interpolation frequencies $\{\sigma_i\}_{i=1}^n$ and tangential directions $\{r_i\}_{i=1}^n$, $\{l_i\}_{i=1}^n$ needs to be made. This is a non-trivial task, as the inspection of the FOM is computationally challenging in the large-scale setting. For this reason, an automatic selection of reduction parameters minimizing the approximation error $\|G - G_r\|$ in some chosen norm is highly desirable.

In this contribution, we address the problem of finding an optimal ROM of prescribed order n that minimizes the error measured in the H2 norm, i.e.

(14) $G_r = \arg\min_{\deg(\hat{G}_r)=n} \|G - \hat{G}_r\|_{\mathcal{H}_2},$

where the H2 norm is defined as [Citation16]

(15) $\|G\|_{\mathcal{H}_2} := \left(\frac{1}{2\pi}\int_{-\infty}^{\infty}\mathrm{tr}\!\left(G^H(j\omega)\,G(j\omega)\right)\mathrm{d}\omega\right)^{1/2}.$

Note that there exists a direct relation between the approximation error in the frequency domain in terms of the H2 norm and a bound for the $L_\infty$ norm of the output error in the time domain, according to [Citation17]

(16) $\|y - y_r\|_{L_\infty} \le \|G - G_r\|_{\mathcal{H}_2}\,\|u\|_{L_2}.$
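For state-space models with E = I, the H2 norm in (15) can equivalently be computed from the controllability Gramian, $\|G\|_{\mathcal{H}_2}^2 = \mathrm{tr}(CPC^T)$ with $AP + PA^T + BB^T = 0$. This standard Gramian formula (not taken from the paper) is sketched below and checked against the known value for a first-order example:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def h2_norm(A, B, C):
    """H2 norm via the controllability Gramian (assumes E = I, A stable):
    ||G||_H2^2 = tr(C P C^T) with A P + P A^T + B B^T = 0"""
    P = solve_continuous_lyapunov(A, -B @ B.T)
    return np.sqrt(np.trace(C @ P @ C.T))

# sanity check: G(s) = 1/(s + a) has ||G||_H2 = 1/sqrt(2a)
a = 3.0
val = h2_norm(np.array([[-a]]), np.array([[1.0]]), np.array([[1.0]]))
print(np.isclose(val, 1.0 / np.sqrt(2 * a)))   # True
```

The Lyapunov solve makes this practical only for small or medium dimensions, which is precisely why surrogate-based strategies are attractive in the large-scale setting discussed below.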

The optimization problem (14) is non-convex, therefore in general only local optima can be found. Necessary conditions for local H2-optimality in terms of bitangential Hermite interpolation are available.

Theorem 2.6 ([Citation8,Citation9,Citation18]) Consider a full-order model (1) with transfer function G(s). Consider a reduced-order model with transfer function $G_r(s) = \sum_{i=1}^n \frac{\hat{c}_i \hat{b}_i^T}{s - \lambda_{r,i}}$ with reduced poles $\lambda_{r,i}\in\mathbb{C}$ and input resp. output residue directions $\hat{b}_i^T \in \mathbb{C}^m$, $\hat{c}_i \in \mathbb{C}^p$.

If Gr(s) satisfies (14) locally, then

(17a) $G(-\bar{\lambda}_{r,i})\,\hat{b}_i^T = G_r(-\bar{\lambda}_{r,i})\,\hat{b}_i^T$
(17b) $\hat{c}_i^T\, G(-\bar{\lambda}_{r,i}) = \hat{c}_i^T\, G_r(-\bar{\lambda}_{r,i})$
(17c) $\hat{c}_i^T\, G'(-\bar{\lambda}_{r,i})\,\hat{b}_i^T = \hat{c}_i^T\, G_r'(-\bar{\lambda}_{r,i})\,\hat{b}_i^T$

for i=1,,n.

Theorem 2.1 indicates how to construct bitangential Hermite interpolants for given interpolation data. However, in general, the eigenvalues and residue directions of the reduced-order model are not known a priori. For this reason, an iterative scheme known as the Iterative Rational Krylov Algorithm (IRKA) has been developed [Citation8,Citation9] to adapt the interpolation data iteratively until the conditions (17) are satisfied. A sketch is given in Algorithm 1.

Algorithm 1 MIMO H2-Optimal Tangential Interpolation (IRKA)

Input: FOM Σ; Initial interpolation data SP, RP, LP

Output:locally H2-optimal reduced model Σr

1: while not converged do

2:  $V_P \leftarrow$ solve $AV_P - EV_PS_P - BR_P = 0$          //projection matrix (11a)

3:  $W_P \leftarrow$ solve $A^TW_P - E^TW_PS_P - C^TL_P = 0$        //projection matrix (11b)

4:  $V \leftarrow \mathrm{qr}(V_P)$; $W \leftarrow \mathrm{qr}(W_P)$          //compute orthonormal bases

5:  $\Sigma_r \leftarrow W^T\Sigma V$            //compute reduced model by projection

6:  $[X, D, Y] \leftarrow \mathrm{eig}(\Sigma_r)$                 //eigendecomposition

7:  $S_P \leftarrow -D$; $R_P \leftarrow B_r^T Y$; $L_P \leftarrow C_r X$         //update interpolation data

8: end while

Note that the primitive bases in lines 2 and 3 are not computed by solving Sylvester equations but rather the sparse linear systems in (11). The equivalent representation using Sylvester equations (cf. Lemma 2.4) is used for brevity.
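For illustration, a minimal Python sketch of Algorithm 1 is given below, restricted to the SISO case with E = I, where the tangential directions drop out and only the shifts are iterated. This is a simplified stand-in (not the paper's sssMOR implementation), tested on a state-space-symmetric system so that all iterates stay real:

```python
import numpy as np
from scipy.linalg import eig

def irka_siso(A, b, c, sigma0, maxiter=100, tol=1e-6):
    """sketch of Algorithm 1 for m = p = 1 and E = I"""
    N = A.shape[0]
    I = np.eye(N)
    sigma = np.sort_complex(np.asarray(sigma0, dtype=complex))
    for _ in range(maxiter):
        # primitive bases (11), one sparse solve per shift in practice
        V = np.column_stack([np.linalg.solve(A - s * I, b) for s in sigma])
        W = np.column_stack([np.linalg.solve((A - s * I).T, c.T) for s in sigma])
        V, _ = np.linalg.qr(V)
        W, _ = np.linalg.qr(W)
        Er, Ar = W.T @ V, W.T @ A @ V            # projected pencil (E = I)
        lam = eig(Ar, Er, right=False)
        sigma_new = np.sort_complex(-lam.conj()) # mirror images of reduced poles
        if np.max(np.abs(sigma_new - sigma)) < tol:
            sigma = sigma_new
            break
        sigma = sigma_new
    return sigma, (Er, Ar, W.T @ b, c @ V)       # model from the last projection

rng = np.random.default_rng(0)
N = 30
A = -np.diag(rng.uniform(1.0, 10.0, N))          # stable, symmetric stand-in
b = rng.standard_normal((N, 1))
c = b.T                                          # state-space symmetric: V = W
sigma, (Er, Ar, br, cr) = irka_siso(A, b, c, [1.0, 2.0])
print(np.all(sigma.real > 0))                    # True: mirrors of stable poles
```

At a fixed point, the shifts equal the mirror images of the reduced poles, which in the SISO case is exactly the interpolation-based H2 optimality condition.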

Finally, note that Algorithm 1 is not the only H2-optimal method for linear systems present in the literature, but it is certainly the best known due to its simplicity and effectiveness. Other approaches worth mentioning include the trust-region algorithms in [Citation10] and [Citation11]. In addition, [Citation19] derives a residue correction algorithm to optimize the tangential directions for fixed poles, speeding up convergence for models with many inputs and outputs. Even though for brevity we will not treat all these algorithms individually, the proofs of Section 4 will make evident that the new framework presented in this paper applies to these algorithms as well, as all methods are targeted at satisfying the optimality conditions (17).

3. The cost of H2-optimal reduction

The computational cost of model reduction by IRKA (cf. Algorithm 1) is dominated by the large-scale linear systems of equations (LSE) involved in computing VP, WP according to

(18a) $\underbrace{(A - \sigma_iE)}_{=:A_{\sigma_i}} V_i^P = Br_i, \quad i=1,\dots,n,$
(18b) $A_{\sigma_i}^T W_i^P = C^Tl_i, \quad i=1,\dots,n.$

In fact, the orthogonalization, the matrix-matrix multiplications involved in the projection $\Sigma_r = W^T\Sigma V$, and the low-dimensional eigenvalue decomposition are in general of subordinate importance.[Footnote 1] As the matrix $A_{\sigma_i}$ of large-scale systems is generally sparse [Citation20], the actual cost involved in solving one LSE depends on a series of factors (including sparsity pattern, number of nonzero elements, conditioning, …) as well as the effective exploitation of available hardware resources. It is therefore not possible (or even meaningful) to perform asymptotic operation counts as in the dense case. Nonetheless, to demonstrate that the reduction cost is indeed dominated by the solution of the sparse LSEs in (18), Figure 1 compares the average execution times for the different computation steps of Algorithm 1 using MATLAB® R2016b on an Intel® Core™ i7-2640 CPU @ 2.80 GHz computer with 8GB RAM.

Figure 1. Cumulated execution times involved in each step of Algorithm 1 for different benchmark models (n = 10).


The comparison includes the sparse lu decomposition of the matrix $A - \sigma E$, the economy-sized qr decomposition of the projection matrix $V_P$, the matrix products involved in $W^T\Sigma V$ as well as the small-dimensional generalized eig decompositions. The reduced order is set to n = 10 for all cases, while the original model order is given on the x-axis. The times given are averaged amongst several executions and cumulated for each IRKA step: at each iteration of IRKA, n lu decompositions, two qr decompositions, one projection $W^T\Sigma V$ and one eig decomposition are performed. The models used are taken from the benchmark collections [Citation21,Citation22].

As can be seen, the execution time for the sparse lu decompositions grows more and more dominant as the problem size increases. This becomes even more evident in Figure 2, where the execution times are given as a percentage of the total time for one IRKA iteration. The two models shown represent the extreme cases of Figure 1, i.e. those where the execution time for the lu decompositions has the smallest and largest share.

Figure 2. Cumulated execution times as total share of each step of Algorithm 1.

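The dominance of the sparse factorization can be probed with a rough timing sketch, here using generic SciPy calls on a random sparse stand-in matrix (not the paper's MATLAB benchmark setup):

```python
import time
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

# random sparse, diagonally dominant stand-in for A - sigma*E
N, n = 20000, 10
rng = np.random.default_rng(0)
A = (sp.random(N, N, density=1e-4, random_state=0) + 10 * sp.eye(N)).tocsc()

t0 = time.perf_counter(); lu = splu(A);           t_lu = time.perf_counter() - t0

V = rng.standard_normal((N, n))
t0 = time.perf_counter(); Q, _ = np.linalg.qr(V); t_qr = time.perf_counter() - t0

Ar = Q.T @ (A @ Q)                                 # small projected n x n matrix
t0 = time.perf_counter(); np.linalg.eig(Ar);      t_eig = time.perf_counter() - t0

print(f"lu: {t_lu:.4f}s  qr: {t_qr:.4f}s  eig: {t_eig:.4f}s")
```

On typical hardware the sparse LU tends to dominate as N grows, mirroring the qualitative trend of the MATLAB measurements; the exact ratios of course depend on the sparsity pattern and the solver.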

IRKA (and alternative H2 reduction methods) require the repeated reduction of an Nth-order model until, after $k_{\mathcal{H}_2}$ steps, a set of H2-optimal parameters $\{\sigma_{*,i}\}_{i=1}^n$, $\{r_{*,i}\}_{i=1}^n$, and $\{l_{*,i}\}_{i=1}^n$ is found. Following the results of Figures 1 and 2, its cost can be approximated by

(19) $C_N^{\mathrm{IRKA}} \approx \underbrace{k_{\mathcal{H}_2}}_{\text{optimization}} \cdot \underbrace{2n\, C_N^{\mathrm{LSE}}}_{\text{reduction}},$

where the cost $C_N^{\mathrm{LSE}}$ of an N-dimensional LSE depends, amongst others, on the model at hand as well as the chosen LSE solver.[Footnote 2] While the second factor in (19) represents the cost of a single reduction, the factor $k_{\mathcal{H}_2}$ represents the cost introduced by the optimization. From this representation, it becomes evident that the cost of optimization is tied to – in fact weighted with – the cost of a full reduction. The efficiency of H2-optimal reduction methods can hence be significantly deteriorated by poor convergence, which can result from a poor initialization or the selection of an unsuitable reduced order. A very similar discussion applies to trust-region-based H2-optimal reduction methods [Citation10,Citation11], where the evaluation of gradient and Hessian in each step also involves a full reduction.

Clearly, a more desirable setting would be to obtain H2-optimal reduction parameters at a far lower cost than the cost of reduction, having to reduce the full-order model only once. In the following, we introduce a new framework that effectively decouples the cost of reduction from the cost of finding H2-optimal reduction parameters without compromising optimality.

4. A new framework for H2-optimal reduction

The framework discussed in this section was first introduced in [Citation12, p. 83] for SISO models, under the name of model function, a heuristic to reduce the cost of H2-optimal reduction within the SPARK algorithm. This heuristic was later applied to IRKA and proved in [Citation13] to yield H2-optimal reduced-order models under certain update conditions. In this contribution, we give a more extensive discussion of the framework, extending its validity to MIMO models. In Section 6, we will indicate how to apply this framework also to further system classes for which H2 approximation algorithms are available. The main motivation for the new framework arises from simple considerations on the locality of H2-optimal model reduction by tangential interpolation. In fact,

  1. tangential interpolation only guarantees a good approximation locally around the frequencies $\{\sigma_i\}_{i=1}^n$, tangentially along the directions $\{r_i\}_{i=1}^n$ and $\{l_i\}_{i=1}^n$,

  2. as $\|G - G_r\|_{\mathcal{H}_2}$ is a non-convex function, in general only local optima can be found.

This can be exploited during reduction as follows: suppose a full-order model Σ and initial tangential interpolation data $\{\sigma_i^0\}_{i=1}^n$, $\{r_i^0\}_{i=1}^n$, and $\{l_i^0\}_{i=1}^n$ are given. Conventional H2 reduction approaches would initialize e.g. IRKA and run for $k_{\mathcal{H}_2}$ steps until convergence. In contrast, as our goal is to find a local optimum close to the initialization, optimization with respect to a surrogate model – a good local approximation with respect to the initial interpolation data – may suffice. For this reason, the new framework starts by building an intermediate model $\Sigma_\mu$ – in the following denoted as Model Function, in accordance with its first introduction in [Citation12] – of order $n_\mu$, with $N \gg n_\mu > n$.

Definition 4.1 Consider a full-order model Σ as in (1). Let interpolation data $\sigma_{\mu,i}\in\mathbb{C}$, $r_{\mu,i}\in\mathbb{C}^m$, $l_{\mu,i}\in\mathbb{C}^p$, $i=1,\dots,n_\mu$, be given and define the matrices $S_\mu = \mathrm{diag}(\sigma_{\mu,1},\dots,\sigma_{\mu,n_\mu})$, $R_\mu = [r_{\mu,1},\dots,r_{\mu,n_\mu}]$, $L_\mu = [l_{\mu,1},\dots,l_{\mu,n_\mu}]$. Then the Model Function $\Sigma_\mu$ is defined as

(20) $\Sigma_\mu := W_\mu^T \Sigma V_\mu,$

where Vμ,Wμ solve the Sylvester equations

(21a) $AV_\mu - EV_\mu S_\mu - BR_\mu = 0,$
(21b) $A^TW_\mu - E^TW_\mu S_\mu - C^TL_\mu = 0.$

From Definition 4.1 and the results of Lemma 2.4, Lemma 2.5, and Theorem 2.1, it follows that $\Sigma_\mu$ is a bitangential Hermite interpolant of Σ with interpolation frequencies and tangential directions specified in $S_\mu$, $R_\mu$ and $L_\mu$.

4.1. The Model Function framework

Step 1: Initialization of $\Sigma_\mu$. The goal of the Model Function $\Sigma_\mu$ is to be a good approximation of Σ, locally with respect to the initial interpolation data $\{\sigma_i^0\}_{i=1}^n$, $\{r_i^0\}_{i=1}^n$, and $\{l_i^0\}_{i=1}^n$, so that it can be used as a surrogate of Σ during the H2 optimization. For obvious reasons, it must hold that $n_\mu > n$, so it is not possible to simply choose $S_\mu = S^0 := \mathrm{diag}(\sigma_1^0,\dots,\sigma_n^0)$, $R_\mu = R^0 := [r_1^0,\dots,r_n^0]$, and $L_\mu = L^0 := [l_1^0,\dots,l_n^0]$. In the following, we introduce two possible alternatives:

I.1 Include the initial interpolation data $S^0, R^0, L^0$ with additional interpolation of G(s) and its first $n_\mu - n - 1$ derivatives at the frequency σ = 0 for the sum of all input and output channels. This can be achieved through

(22) $S_\mu = \begin{bmatrix} S^0 & 0 \\ 0 & J_0 \end{bmatrix}, \qquad R_\mu = \left[R^0,\; e_1,\; 0,\; \dots,\; 0\right], \qquad L_\mu = \left[L^0,\; e_1,\; 0,\; \dots,\; 0\right],$

where $J_0$ is a Jordan block of size $n_\mu - n$ with eigenvalue 0 and $e_1 := [1\; 1\; \cdots\; 1]^T$ of appropriate dimensions. In this case, the choice of $n_\mu$ is free. Due to Theorem 2.2, we expect the approximation quality of $\Sigma_\mu$ to improve locally around σ = 0 as $n_\mu$ grows.

I.2 Tangentially interpolate higher-order derivatives with respect to the data $S^0, R^0, L^0$. This can be achieved through

(23) $S_\mu = \begin{bmatrix} S^0 & I \\ 0 & S^0 \end{bmatrix}, \qquad R_\mu = \left[R^0,\; 0\right], \qquad L_\mu = \left[L^0,\; 0\right]$

(cf. Theorem 2.2). In this case $n_\mu = 2n$ and we expect the approximation quality of $\Sigma_\mu$ to increase locally around the frequencies $\{\sigma_i^0\}_{i=1}^n$ with respect to the tangential directions $\{r_i^0\}_{i=1}^n$ and $\{l_i^0\}_{i=1}^n$.

Many more choices for $S_\mu$, $R_\mu$, and $L_\mu$ are conceivable and valid. Nonetheless, the approaches presented above bear the advantage of minimizing the additional cost tied to the initialization of $\Sigma_\mu$. In fact, note that for every repeated value $\sigma_{\mu,i} = \sigma_{\mu,j}$, the LSEs in (18) share the same left-hand side, allowing e.g. the recycling of LU decompositions or preconditioners. That said, one may choose to use the additional degrees of freedom to enforce tangential interpolation with respect to different data. As there is no rigorous rule for this – just like there is no rigorous rule on how to initialize H2-optimal reduction – our philosophy is to minimize the initialization cost and let the optimizer choose new frequencies of interest. Finally, note that the specific choice of initialization may affect both the speed of convergence and the resulting approximation quality.
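As a concrete sketch, the block matrices of initialization I.1 in (22) can be assembled as follows; the data here are random stand-ins, and the placement of the Jordan block follows (22):

```python
import numpy as np

n, n_mu, m, p = 3, 6, 2, 2
rng = np.random.default_rng(4)

S0 = np.diag(rng.uniform(0.5, 2.0, n))        # initial shifts S^0
R0 = rng.standard_normal((m, n))              # initial right directions R^0
L0 = rng.standard_normal((p, n))              # initial left directions L^0

k = n_mu - n
J0 = np.diag(np.ones(k - 1), 1)               # k x k Jordan block, eigenvalue 0
S_mu = np.block([[S0, np.zeros((n, k))],
                 [np.zeros((k, n)), J0]])

# e_1 = [1 ... 1]^T sums all input/output channels, cf. (22)
R_mu = np.column_stack([R0, np.ones(m)] + [np.zeros(m)] * (k - 1))
L_mu = np.column_stack([L0, np.ones(p)] + [np.zeros(p)] * (k - 1))

print(S_mu.shape, R_mu.shape, L_mu.shape)   # (6, 6) (2, 6) (2, 6)
```

Since all shifts attached to the Jordan block coincide (here at 0), the corresponding linear systems in (18) share one coefficient matrix, which is what makes this initialization cheap.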

Step 2: H2 optimization with respect to $\Sigma_\mu$. As the Model Function $\Sigma_\mu$ is a good approximation of Σ locally with respect to the initial interpolation data, we can run the H2 optimization with respect to $\Sigma_\mu$ with initialization $S^0$, $R^0$, $L^0$. The optimal interpolation data found at convergence is denoted by $S_* = \mathrm{diag}(\sigma_{*,1},\dots,\sigma_{*,n})$, $R_* = [r_{*,1},\dots,r_{*,n}]$, and $L_* = [l_{*,1},\dots,l_{*,n}]$. As typically $n_\mu \ll N$, we expect the cost of this optimization to be significantly lower than that of a full optimization over Σ. As we assume the validity of the resulting reduced-order model $\Sigma_{\mu,r}$ to be confined locally to the frequency region defined by $S_\mu$, we call the combination of IRKA with this Model Function framework Confined IRKA (CIRKA).

In fact, by approximating Σ through $\Sigma_\mu$, we have lost any optimality relation between Σ and $\Sigma_{\mu,r}$. Whether $\Sigma_{\mu,r}$ is an acceptable approximation of Σ highly depends on the approximation quality of $\Sigma_\mu$ itself. This loss of connection between Σ and $\Sigma_{\mu,r}$ is a typical drawback of so-called two-step model reduction approaches. Nonetheless, by means of surrogate optimization, a new set of interpolation data $S_*$, $R_*$, $L_*$ has been found in a cost-effective way. At this point, an update of $\Sigma_\mu$ with local information about these frequencies is required.

Step 3: Update of $\Sigma_\mu$ and fixed-point iteration. To restore the relationship between Σ and $\Sigma_{\mu,r}$, the Model Function $\Sigma_\mu^k$ from the current step k can be updated to a new $\Sigma_\mu^{k+1}$ by enforcing tangential interpolation with respect to the optimal data $S_*^k$, $R_*^k$, $L_*^k$. This can be achieved by updating the projection matrices

(24a) $V_\mu^{k+1} = \left[V_\mu^k,\; V_\mu^{k,+}\right],$
(24b) $W_\mu^{k+1} = \left[W_\mu^k,\; W_\mu^{k,+}\right],$

with new directions $V_\mu^{k,+}$ and $W_\mu^{k,+}$. As for the initialization of $\Sigma_\mu^0$, several strategies are possible. In the following, we present only a few of the most relevant ones:

U.1 Update $\Sigma_\mu^k$ with tangential interpolation with respect to all optimal frequencies and tangential directions $S_*^k$, $R_*^k$, $L_*^k$. This implies that $n_\mu^k$ increases by n in every step and that higher-order derivatives are tangentially interpolated in case specific combinations of optimal frequencies and tangential directions are repeated. In this approach, we expect the approximation quality of $\Sigma_\mu^{k+1}$ to increase around all optimal frequencies $\{\sigma_{*,i}\}_{i=1}^n$, with respect to the tangential directions $\{r_{*,i}\}_{i=1}^n$ and $\{l_{*,i}\}_{i=1}^n$.

U.2 Update $\Sigma_\mu^k$ only with tangential interpolation with respect to new interpolation data. This implies that the order of the Model Function increases by at most n in every step. In this approach, we expect the approximation quality of $\Sigma_\mu^{k+1}$ to increase only around the new optimal frequencies $\sigma_{*,i}$ with respect to the respective tangential directions $r_{*,i}$ and $l_{*,i}$, for $i = 1,\dots,n_\mu^{k,+}$, where $n_\mu^{k,+}$ denotes the number of new frequencies.

U.3 Reinitialize $\Sigma_\mu^k$ by including 'only' tangential interpolation about the new optimal data. In this approach, we expect the approximation quality of $\Sigma_\mu^{k+1}$ to increase in the regions around the new optimal frequencies $\{\sigma_{*,i}\}_{i=1}^n$, with respect to the tangential directions $\{r_{*,i}\}_{i=1}^n$ and $\{l_{*,i}\}_{i=1}^n$, but potentially decrease around previous interpolation data. However, the order $n_\mu^k$ can be kept constant through the iterations using this method, reducing the cost of the H2 optimization. As $n_\mu^k > n$, similar considerations apply as for the initialization in Step 1.
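A minimal sketch of update strategy U.2 is given below, using a hypothetical helper `update_model_function` and dense stand-in matrices; matching of tangential directions and re-orthogonalization of the enlarged bases are deliberately omitted:

```python
import numpy as np

def update_model_function(A, E, B, C, V_mu, W_mu, sigma_tot,
                          sigma_new, R_new, L_new, tol=1e-8):
    """enlarge the Model Function bases only for shifts not yet interpolated"""
    sigma_tot = np.asarray(sigma_tot, dtype=complex)
    for s, r, l in zip(sigma_new, R_new.T, L_new.T):
        if sigma_tot.size and np.min(np.abs(sigma_tot - s)) < tol:
            continue                                   # shift already present
        v = np.linalg.solve(A - s * E, B @ r)          # new column, cf. (11a)
        w = np.linalg.solve((A - s * E).T, C.T @ l)    # new column, cf. (11b)
        V_mu = np.column_stack([V_mu, v])
        W_mu = np.column_stack([W_mu, w])
        sigma_tot = np.append(sigma_tot, s)
    return V_mu, W_mu, sigma_tot

rng = np.random.default_rng(0)
N, m, p = 12, 2, 2
A, E = -np.diag(rng.uniform(1, 5, N)), np.eye(N)
B, C = rng.standard_normal((N, m)), rng.standard_normal((p, N))

V0, W0 = np.zeros((N, 0)), np.zeros((N, 0))
V1, W1, s1 = update_model_function(A, E, B, C, V0, W0, [], [1.0, 2.0],
                                   rng.standard_normal((m, 2)),
                                   rng.standard_normal((p, 2)))
V2, W2, s2 = update_model_function(A, E, B, C, V1, W1, s1, [2.0, 3.0],
                                   rng.standard_normal((m, 2)),
                                   rng.standard_normal((p, 2)))
print(V1.shape[1], V2.shape[1])   # 2 3 -> only the genuinely new shift was added
```

The second call skips the repeated shift 2.0, so the Model Function order grows by at most n per step, as described for U.2.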

The updated Model Function $\Sigma_\mu^{k+1}$ can be used again to perform a low-dimensional H2 optimization and potentially improve the optimal frequencies and directions $S_*^{k+1}$, $R_*^{k+1}$ and $L_*^{k+1}$, which in general may differ from those of the previous step. This leads to a fixed-point iteration until, after $k_\mu$ steps, an update of $\Sigma_\mu^{k_\mu-1}$ does not result in new optimal interpolation data.

The overall procedure is summarized in Algorithm 2 for the case where H2 optimization is performed with IRKA. Note however that Line 6 can be replaced by any H2-optimal reduction method, leading to the more general Model Function framework. Also note that Line 5 covers both the initialization and the update of the Model Function. For brevity, details on the implementation are omitted at this point. MATLAB® code for the proposed algorithms is provided with this paper as functions of the sssMOR toolbox [Citation24].

Algorithm 2 Confined IRKA (CIRKA)

Input: FOM Σ; Initial interpolation data S0, R0, L0

Output: reduced model $\Sigma_{\mu,r}$, Model Function $\Sigma_\mu^{k_\mu}$, error estimate $\tilde{\epsilon}_{\mathcal{H}_2}$

1: $k \leftarrow 0$; $\Sigma_\mu^k$, $S_{tot}$, $R_{tot}$, $L_{tot}$ $\leftarrow$ empty;            //initialization

2: $S^k \leftarrow S^0$, $R^k \leftarrow R^0$, $L^k \leftarrow L^0$;

3: while not converged do

4: kk+1

5: $\Sigma_\mu^k \leftarrow \mathrm{updateModelFunction}(\Sigma, \Sigma_\mu^{k-1}, S_{tot}, R_{tot}, L_{tot}, S^{k-1}, R^{k-1}, L^{k-1})$

6: $[\Sigma_{\mu,r}, S^k, R^k, L^k] \leftarrow$ IRKA$(\Sigma_\mu^k, S^{k-1}, R^{k-1}, L^{k-1})$

7: $S_{tot} \leftarrow \mathrm{blkdiag}(S_{tot}, S^k)$; $R_{tot} \leftarrow [R_{tot}, R^k]$; $L_{tot} \leftarrow [L_{tot}, L^k]$

8: end while

9: kμk

10: $\tilde{\epsilon}_{\mathcal{H}_2} \leftarrow \|G_\mu^{k_\mu} - G_{\mu,r}\|_{\mathcal{H}_2}$

4.2. An illustrative example

To make the explanation of the framework clearer, we include a simple numerical example to accompany the discussion. As a test model we consider the benchmark model beam [Citation21] of order N = 348 with m = p = 1. Our aim is to find a model of reduced order n = 4 satisfying H2-optimality conditions. We assume that no previous information about the relevant frequency region is given, hence the initialization is chosen with interpolation frequencies $\sigma_1^0 = \dots = \sigma_n^0 = 0$, corresponding to $S^0 = J_0$. As the model is SISO, there is no need to specify tangential directions.

Step 1: Initialization of $\Sigma_\mu$. As initial order for the Model Function we select $n_\mu = 2n = 8$. In this particular case, both initializations I.1 and I.2 of $\Sigma_\mu$ coincide. Figure 3 shows the Bode plot of Σ and the resulting $\Sigma_\mu$.[Footnote 3] The orange circles indicate the imaginary parts of the frequencies used to initialize the Model Function.

Figure 3. Illustrative Bode plot of Σ and its Model Function Σμ.


Step 2: H2 optimization with respect to $\Sigma_\mu$. Figure 4 shows the reduced-order model $\Sigma_{\mu,r}$ resulting from the optimization with respect to $\Sigma_\mu$.

Figure 4. Comparison of Σ, Σμ and Σμ,r after H2 optimization.


In this example, IRKA converged in $k_{\mathcal{H}_2} = 4$ steps at the optimal frequencies $S_* = \mathrm{diag}(0.005 \pm i\,0.104,\; 0.006 \pm i\,0.569)$, whose imaginary parts are depicted in Figure 4 by green diamonds. In this case, by inspection of the Bode plot, we see that $\Sigma_{\mu,r}$ appears to be an acceptable approximation also of the original model Σ for the given order n. This is a result of $\Sigma_\mu$ being a valid approximation of Σ also locally around the new frequency region represented by $S_*$. However, this need not hold true in general. For this reason, an update of the Model Function is performed.

Step 3: Update of $\Sigma_\mu$ and fixed-point iteration. Using $S_*^1$, we update $\Sigma_\mu^1$ to $\Sigma_\mu^2$ by including interpolation of the full-order model Σ with respect to the frequencies in $S_*^1$. As all optimal frequencies differ from those used for the initialization of $\Sigma_\mu$, update strategies U.1 and U.2 coincide in this case. Figure 5 shows the updated Model Function $\Sigma_\mu^2$ of order $n_\mu^2 = 12$ for the beam example, as well as the reduced-order model $\Sigma_{\mu,r}^2$ and the new optimal frequencies $S_*^2$ resulting from IRKA on $\Sigma_\mu^2$.

Figure 5. Comparison of Σ, Σμ2 and Σμ,r2 after repeated H2 optimization.


In this example, IRKA converges (within some tolerance) to the same optimal frequencies $S_*^2 \approx S_*^1$, and hence the whole framework required $k_\mu = 2$ iterations until convergence. The fact that an update of the Model Function does not yield a new optimum indicates that the Model Function $\Sigma_\mu^1$ was already accurate enough in the region around $S_*^1$.

Finally, note that while this framework required 3 full-sized LU decompositions to generate and update the Model Function, a direct application of IRKA to Σ requires $1 + (k_{\mathcal{H}_2}-1)\,n/2 = 1 + 3\cdot 2 = 7$ full-sized LU decompositions, i.e. more than double the amount. A deeper discussion and comparison of the complexities will be given in Section 4.5 and Section 5. It is also worth noting that both CIRKA and IRKA converge to the same local optimum.

4.3. Optimality of the new framework

So far, the new framework has been presented as a heuristic to perform H2 surrogate optimization and – hopefully – reduce the overall reduction cost. The update of $\Sigma_\mu^k$ has been introduced to increase the accuracy of the surrogate model in the frequency regions the optimizer deemed important. However, our original intent was to obtain a reduced model $\Sigma_r$ satisfying the H2-optimality conditions (17). The question remains as to whether this goal has been achieved. In this section, we prove that the update of $\Sigma_\mu^k$ is sufficient to satisfy the optimality conditions (17) at convergence.

Theorem 4.2. Consider a full-order model Σ as in (1) with transfer function G(s). Let $V_\mu, W_\mu \in \mathbb{C}^{N\times n_\mu}$ satisfy

(25a) $AV_\mu - EV_\mu S_\mu - BR_\mu = 0,$
(25b) $A^TW_\mu - E^TW_\mu S_\mu - C^TL_\mu = 0,$

with matrices $S_\mu = \mathrm{diag}(\sigma_{\mu,1},\dots,\sigma_{\mu,n_\mu})$, $R_\mu = [r_{\mu,1},\dots,r_{\mu,n_\mu}]$, $L_\mu = [l_{\mu,1},\dots,l_{\mu,n_\mu}]$ and $\sigma_{\mu,j}\in\mathbb{C}$, $r_{\mu,j}\in\mathbb{C}^m$, $l_{\mu,j}\in\mathbb{C}^p$. Consider the Model Function $\Sigma_\mu = W_\mu^T\Sigma V_\mu$ with transfer function $G_\mu(s)$. Let $V_{\mu,r}, W_{\mu,r} \in \mathbb{C}^{n_\mu\times n}$ satisfy

(26a) $A_\mu V_{\mu,r} - E_\mu V_{\mu,r}S_* - B_\mu R_* = 0,$
(26b) $A_\mu^T W_{\mu,r} - E_\mu^T W_{\mu,r}S_* - C_\mu^T L_* = 0,$

with matrices $S_* = \mathrm{diag}(\sigma_{*,1},\dots,\sigma_{*,n})$, $R_* = [r_{*,1},\dots,r_{*,n}]$, $L_* = [l_{*,1},\dots,l_{*,n}]$ and $\sigma_{*,i}\in\mathbb{C}$, $r_{*,i}\in\mathbb{C}^m$, $l_{*,i}\in\mathbb{C}^p$. Consider the reduced Model Function $\Sigma_{\mu,r} = W_{\mu,r}^T\Sigma_\mu V_{\mu,r}$ with transfer function $G_{\mu,r}(s)$. Further, assume that for every $i=1,\dots,n$ the triplets $(\sigma_{*,i}, r_{*,i}, l_{*,i})$ satisfy $(\sigma_{*,i}, r_{*,i}, l_{*,i}) = (-\bar{\lambda}_{r,i}, \hat{b}_i^T, \hat{c}_i)$, where $G_{\mu,r}(s) = \sum_{i=1}^n \frac{\hat{c}_i \hat{b}_i^T}{s - \lambda_{r,i}}$ is the pole/residue representation of the reduced Model Function.

If, for every $i=1,\dots,n$, there exists a $j \in \{1,\dots,n_\mu\}$ such that

(27) $(\sigma_{\mu,j},\, r_{\mu,j},\, l_{\mu,j}) = (\sigma_{*,i},\, r_{*,i},\, l_{*,i}),$

then $\Sigma_{\mu,r}$ satisfies the first-order H2-optimality conditions

$G(-\bar{\lambda}_{r,i})\,\hat{b}_i^T = G_{\mu,r}(-\bar{\lambda}_{r,i})\,\hat{b}_i^T$
$\hat{c}_i^T\, G(-\bar{\lambda}_{r,i}) = \hat{c}_i^T\, G_{\mu,r}(-\bar{\lambda}_{r,i})$
$\hat{c}_i^T\, G'(-\bar{\lambda}_{r,i})\,\hat{b}_i^T = \hat{c}_i^T\, G_{\mu,r}'(-\bar{\lambda}_{r,i})\,\hat{b}_i^T$

for all $i = 1,\dots,n$.

Proof. By Theorem 2.1 and Lemma 2.4, construction of $\Sigma_{\mu,r}$ with the projection matrices satisfying (26) results in a reduced model that is a bitangential Hermite interpolant of the Model Function $\Sigma_\mu$ with respect to the interpolation data $S_*$, $R_*$, and $L_*$. In combination with the assumption $(\sigma_{*,i}, r_{*,i}, l_{*,i}) = (-\bar{\lambda}_{r,i}, \hat{b}_i^T, \hat{c}_i)$, this yields the relationship

(29) $G_\mu(-\bar{\lambda}_{r,i})\,\hat{b}_i^T = G_{\mu,r}(-\bar{\lambda}_{r,i})\,\hat{b}_i^T, \quad \hat{c}_i^T\, G_\mu(-\bar{\lambda}_{r,i}) = \hat{c}_i^T\, G_{\mu,r}(-\bar{\lambda}_{r,i}), \quad \hat{c}_i^T\, G'_\mu(-\bar{\lambda}_{r,i})\,\hat{b}_i^T = \hat{c}_i^T\, G'_{\mu,r}(-\bar{\lambda}_{r,i})\,\hat{b}_i^T,$

for all $i = 1, \dots, n$. In addition, (27) and (25) imply

(30) $G(-\bar{\lambda}_{r,i})\,\hat{b}_i^T = G_\mu(-\bar{\lambda}_{r,i})\,\hat{b}_i^T, \quad \hat{c}_i^T\, G(-\bar{\lambda}_{r,i}) = \hat{c}_i^T\, G_\mu(-\bar{\lambda}_{r,i}), \quad \hat{c}_i^T\, G'(-\bar{\lambda}_{r,i})\,\hat{b}_i^T = \hat{c}_i^T\, G'_\mu(-\bar{\lambda}_{r,i})\,\hat{b}_i^T,$

for all $i = 1, \dots, n$. Equating (29) and (30) completes the proof. □

All assumptions of Theorem 4.2 are satisfied if $\Sigma_{\mu,r}$ results from an H2-optimal reduction of $\Sigma_\mu$ and the Model Function $\Sigma_\mu$ is properly updated during Algorithm 2. In fact, the assumption (27) can be seen as an update condition for the Model Function $\Sigma_\mu$. Theorem 4.2 proves that the new framework results – at convergence – in a reduced model $\Sigma_{\mu,r}$ locally satisfying the optimality conditions (17) and hence effectively solves the H2-optimal reduction problem (14). In addition, it is possible to show that the reduced model $\Sigma_{\mu,r}$ has the same state-space realization as one obtained through direct projection of the full model $\Sigma$.
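The bitangential Hermite interpolation property invoked in the proof can be checked numerically. The following Python sketch (our own illustration with a randomly generated test system, not part of the paper's MATLAB tooling) builds the projection bases column-wise according to (37) and verifies the three tangential matching conditions at the shifts:

```python
import numpy as np

rng = np.random.default_rng(0)
N, n, m, p = 40, 2, 2, 3                      # full order, reduced order, inputs, outputs
E = np.eye(N)
A = -np.diag(rng.uniform(1.0, 10.0, N)) + 0.1 * rng.standard_normal((N, N))
B = rng.standard_normal((N, m))
C = rng.standard_normal((p, N))

sigmas = [1.0, 2.5]                           # shifts sigma_i
R = rng.standard_normal((m, n))               # right tangential directions r_i (columns)
L = rng.standard_normal((p, n))               # left tangential directions l_i (columns)

# column-wise solutions of the Sylvester equations, cf. (37):
# V e_i = (A - sigma_i E)^{-1} B r_i,  W e_i = (A - sigma_i E)^{-T} C^T l_i
V = np.column_stack([np.linalg.solve(A - s * E, B @ R[:, i]) for i, s in enumerate(sigmas)])
W = np.column_stack([np.linalg.solve((A - s * E).T, C.T @ L[:, i]) for i, s in enumerate(sigmas)])

Er, Ar, Br, Cr = W.T @ E @ V, W.T @ A @ V, W.T @ B, C @ V

def G(s, E, A, B, C):
    # transfer function G(s) = C (sE - A)^{-1} B
    return C @ np.linalg.solve(s * E - A, B)

def dG(s, E, A, B, C):
    # derivative G'(s) = -C (sE - A)^{-1} E (sE - A)^{-1} B
    X = np.linalg.solve(s * E - A, B)
    return -C @ np.linalg.solve(s * E - A, E @ X)

for i, s in enumerate(sigmas):
    r, l = R[:, i], L[:, i]
    assert np.allclose(G(s, E, A, B, C) @ r, G(s, Er, Ar, Br, Cr) @ r)            # right tangential
    assert np.allclose(l @ G(s, E, A, B, C), l @ G(s, Er, Ar, Br, Cr))            # left tangential
    assert np.allclose(l @ dG(s, E, A, B, C) @ r, l @ dG(s, Er, Ar, Br, Cr) @ r)  # Hermite
print("bitangential Hermite interpolation verified")
```
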

Corollary 4.3. Consider a full-order model $\Sigma$ as in (1). Let all assumptions of Theorem 4.2 hold. Further, let $V, W \in \mathbb{C}^{N \times n}$ be projection matrices satisfying Sylvester equations of the form

(31a) $A V - E V S - B R = 0,$
(31b) $A^T W - E^T W S - C^T L = 0.$

Consider the reduced model $\Sigma_r$ resulting from the projection $\Sigma_r = W^T \Sigma V$. Then it holds

(32) $\Sigma_r = \Sigma_{\mu,r}.$

Proof. The proof amounts to showing that the projection matrices used to obtain Σr and Σμ,r from projection of Σ are equal, hence

(33a) $V = V_\mu V_{\mu,r},$
(33b) $W = W_\mu W_{\mu,r}.$

Consider the Sylvester equation (26a), which can be rewritten as

(34) $W_\mu^T \left( A V_\mu V_{\mu,r} - E V_\mu V_{\mu,r} S - B R \right) = 0,$

Obviously, by comparing this equation to (31a), the relation (33a) is sufficient for (34) to be satisfied. In order to show that (33a) is also necessary, we assume that the term in brackets does not vanish. By defining $\tilde{V} = V_\mu V_{\mu,r}$, the $(j,i)$ entry of the product (34) can be written as

(35) $l_{\mu,j}^T C \left( A - \sigma_{\mu,j} E \right)^{-1} \left( A \tilde{V} e_i - \sigma_i E \tilde{V} e_i - B r_i \right) = 0,$

which needs to hold true for all $j = 1, \dots, n_\mu$ and $i = 1, \dots, n$. Due to the update condition (27), for every $i$ there exists a $j$ such that $(\sigma_{\mu,j}, r_{\mu,j}, l_{\mu,j}) = (\sigma_i, r_i, l_i)$. For all such $(i, j)$ combinations, we can hence consider the inner product between the $j$th row of $W_\mu^T$ and the $i$th column in the bracket:

$l_i^T C \left( A - \sigma_i E \right)^{-1} \left( \left( A - \sigma_i E \right) \tilde{V} e_i - B r_i \right) = 0,$
(36) $l_i^T C \left( \tilde{V} e_i - \left( A - \sigma_i E \right)^{-1} B r_i \right) = 0,$

where $e_i \in \mathbb{R}^n$ denotes the $i$th unit vector. Since (36) needs to hold true for all possible $C$, it finally follows that

(37) $\tilde{V} e_i = \left( A - \sigma_i E \right)^{-1} B r_i,$

which by Theorem 2.4 and (31a) corresponds exactly to $V e_i$. As (37) holds true for all columns $i = 1, \dots, n$, we finally obtain $\tilde{V} = V_\mu V_{\mu,r} = V$. The proof for $W$ is analogous. □

The results of Corollary 4.3 require all projection matrices to be primitive bases (cf. Theorem 2.3). In practical applications, this is generally not the case. In general, the realizations $\Sigma_{\mu,r}$ and $\Sigma_r$ will be restricted system equivalent, i.e. they share the same order and transfer function.

4.4. Interpretations of the new framework

The new framework introduced in this section, which we refer to as the Model Function framework in general or Confined IRKA (CIRKA) when applied to IRKA, can be given different interpretations. On the one hand, it is a form of surrogate optimization [Citation25], in that the H2 optimization is not conducted on the actual cost functional $J = \|G - G_r\|_{H_2}$ but on an approximation $J \approx \hat{J} = \|G_\mu - G_r\|_{H_2}$. In fact, the framework itself is an application of reduced-model-based optimization [Citation26]. On the other hand, the framework can be seen as H2 optimization in a subspace, defined by the matrices $V_\mu$ and $W_\mu$, that is updated at every iteration. In fact, this technique could be interpreted as a subspace acceleration method [Citation27] to recycle information obtained in a previous optimization step. Traditionally, subspace acceleration methods in numerics are used to ameliorate the convergence of iterative methods; in our setting, this becomes the convergence to a set of optimal reduction parameters. Finally, one may think of it as a restarted H2 optimization, where e.g. IRKA is restarted in a higher-dimensional subspace after convergence.
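To make the surrogate-optimization viewpoint concrete, the following minimal SISO sketch in Python mimics the loop structure just described. Function names, the simple append-only update strategy, and the toy test model are our own assumptions; the sketch intentionally omits the safeguards (real-valued bases, deflation of duplicate shifts) of a production implementation such as sssMOR:

```python
import numpy as np

def interp_basis(A, B, shifts):
    # columns (s_i I - A)^{-1} B for all shifts (SISO: B is a single column)
    I = np.eye(A.shape[0])
    return np.column_stack([np.linalg.solve(s * I - A, B).ravel() for s in shifts])

def irka(A, B, C, s0, maxiter=100, tol=1e-8):
    # plain SISO IRKA: fixed-point iteration on the mirrored reduced poles
    s = np.array(s0, dtype=complex)
    for _ in range(maxiter):
        V = interp_basis(A, B, s)
        W = interp_basis(A.T, C.T, s)
        Ar = np.linalg.solve(W.T @ V, W.T @ A @ V)   # reduced A with E_r normalized to I
        s_new = -np.linalg.eigvals(Ar).conj()        # new shifts: mirrored reduced poles
        if np.linalg.norm(np.sort_complex(s_new) - np.sort_complex(s)) < tol * np.linalg.norm(s_new):
            return s_new
        s = s_new
    return s

def cirka(A, B, C, s0, outer=10, tol=1e-6):
    # Model Function framework (CIRKA sketch): optimize on a small surrogate,
    # then update the surrogate with the optimal shifts and repeat
    s = np.array(s0, dtype=complex)
    S = np.array(s0, dtype=complex)                  # accumulated shift set for the Model Function
    for _ in range(outer):
        Vm = interp_basis(A, B, S)                   # full-order solves happen only here
        Wm = interp_basis(A.T, C.T, S)
        Em = Wm.T @ Vm
        Am = np.linalg.solve(Em, Wm.T @ A @ Vm)      # Model Function (E normalized to I)
        Bm = np.linalg.solve(Em, Wm.T @ B)
        Cm = C @ Vm
        s_new = irka(Am, Bm, Cm, s)                  # cheap H2 optimization on the surrogate
        if np.linalg.norm(np.sort_complex(s_new) - np.sort_complex(s)) < tol * np.linalg.norm(s_new):
            return s_new
        s = s_new
        S = np.concatenate([S, s_new])               # update: append the new optimal shifts
    return s

# demo on a small state-space-symmetric toy system (illustrative only)
A = -np.diag(np.arange(1.0, 7.0))
B = np.ones((6, 1))
C = np.ones((1, 6))
s_opt = cirka(A, B, C, [1.0, 2.0])
print(np.sort_complex(s_opt))
```

Each outer iteration spends full-order solves only on extending the Model Function bases, while the IRKA fixed-point iteration runs entirely on the small surrogate.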

4.5. The cost of the new framework

The discussion so far has shown how the reduced-order model Σμ,r, resulting from the Model Function framework, indeed satisfies H2-optimality conditions. What still remains to be discussed is whether we can expect this framework to be computationally less demanding than simply applying H2-optimal reduction to the full-order model. To address this question, we compare the cost of IRKA, estimated in (19), to the cost of CIRKA. Following similar considerations as in Section 3, we obtain the relationship

(38) $C_N^{\mathrm{CIRKA}} \approx \underbrace{\sum_{k=1}^{k_\mu} n_\mu^{k,+}\, C_N^{\mathrm{LSE}}}_{\text{cost of reduction}} + \underbrace{\sum_{k=1}^{k_\mu} 2n\, k_{H_2}^{k}\, C_{n_\mu^k}^{\mathrm{LSE}}}_{\text{cost of optimization}},$

where $k_\mu$ is the number of iterations of the new framework, $n_\mu^{k,+}$ indicates the size of the Model Function update in each step, and $k_{H_2}^k$ is the number of IRKA iterations in each step of the new framework. The first summand represents the cost of updating the Model Function with information of the full-order model and can be interpreted as the cost of reduction. The second summand adds up the cost of IRKA in each iteration and depends on the number of optimization steps $k_{H_2}^k$ in IRKA, as well as on the solution of $n_\mu^k$-dimensional LSEs. As can be seen from (38), the cost involved in finding optimal parameters is in some sense decoupled from the cost of reduction, although a link through $k_\mu$ still remains. Obviously, as $N$ increases, we expect $C_N^{\mathrm{LSE}} \gg C_{n_\mu^k}^{\mathrm{LSE}}$. Hence the dominant cost is represented by the cost of reduction, whereas the number of optimization steps $k_{H_2}^k$ plays virtually no role. Therefore, comparing (19) and (38), we expect the new framework to be cost effective as long as

(39) $\sum_{k=1}^{k_\mu} n_\mu^{k,+} < 2n\, k_{H_2},$

where kH2 in this case represents the number of iterations required by IRKA to directly reduce the full order model. In Section 5, numerical results will illustrate the substantial speedup that can be achieved by using this framework.
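As a back-of-the-envelope illustration of criterion (39), consider the following hypothetical iteration counts (assumed values for illustration only, not measurements from the paper):

```python
# Hypothetical iteration counts (assumed for illustration, not measured values):
n = 20                      # reduced order
k_H2 = 50                   # IRKA iterations on the full-order model
n_mu_plus = [40, 40, 40]    # Model Function updates n_mu^{k,+} per outer step (k_mu = 3)

lu_irka = 2 * n * k_H2      # full-dimensional solves of IRKA, cf. (19)
lu_cirka = sum(n_mu_plus)   # first summand of (38): cost of reduction
print(lu_irka, lu_cirka)    # 2000 vs. 120 full-dimensional solves
assert lu_cirka < lu_irka   # criterion (39) satisfied
```

Under these assumed counts, the full-order solves drop by more than an order of magnitude, which is consistent with the speedups reported in Section 5.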

It becomes apparent from (39) that the number of iterations $k_\mu$ of the new framework may have an impact on its performance. Unfortunately, to date there are no results on the convergence speed or even a guarantee of convergence for the new framework. In practice, we observe that its convergence is affected by the convergence behaviour of the H2 optimization itself. Whenever IRKA has difficulties converging, e.g. due to inconvenient initialization or choice of reduced order, the frequencies and tangential directions returned after the maximum number of iterations is reached are likely to change after every iteration. In this respect, the new framework offers new opportunities to deal with IRKA convergence issues, in that it allows a high maximum number of iterations at low cost, with the possibility of reinitializing IRKA with different parameters if no convergence is reached. Conversely, for H2 descent methods such as [Citation11], no convergence issues have been observed.

4.6. Further advantages of the new framework

We conclude this section by addressing another advantage of this framework compared to conventional H2 optimization. In the process of generating an H2-optimal reduced-order model $\Sigma_r$, we naturally obtain an intermediate reduced-order model $\Sigma_\mu$ at no additional cost. This Model Function has a reduced order $n_\mu > n$ and – though not H2-optimal – is in general a better approximation to $\Sigma$ than $\Sigma_{\mu,r}$. This is especially true if update strategies U.1 or U.2 are used. For this reason, by assuming $G_\mu(s) \approx G(s)$, we can estimate the relative reduction error $\epsilon_{H_2}$ through

(40) $\epsilon_{H_2} \approx \tilde{\epsilon}_{H_2} = \frac{\|G_\mu - G_r\|_{H_2}}{\|G_\mu\|_{H_2}}.$

Obviously, the validity of the estimate (40) highly depends on the approximation quality of $G_\mu(s)$. Nonetheless, as rigorous error estimation is still a big open challenge in model reduction by tangential interpolation, this type of indicator may be of interest, even though no claim about its rigour can be made. Note that this error estimation is very similar to the one in [Citation7], where data-driven approaches are used to generate a surrogate model from data acquired during IRKA. As reduced models obtained by projection can become unstable, in general it is necessary to take the stable part of $\Sigma_\mu$ after convergence to be able to evaluate (40). As $n_\mu$ is small, this can be easily computed, for example in MATLAB using the stabsep command. Note that even though in theory no claim can be made about the stability of reduced-order models obtained by IRKA, in practice more often than not the resulting reduced models are stable. We make the same observation for CIRKA. Finally, by making the cost of optimization negligible, the new framework allows the introduction of globalized local optimization approaches [Citation28], where local reduction is performed from different starting points, increasing the chance of finding the global minimum [Citation29].
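The estimate (40) is cheap to evaluate because both norms involve only reduced-order quantities. The following Python sketch computes the H2 norms via Lyapunov equations; the small diagonal systems are toy stand-ins of our own for the Model Function and the reduced model (a MATLAB implementation would simply use norm(sys, 2)):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def h2_norm(A, B, C):
    # ||G||_H2 = sqrt(trace(C P C^T)), with A P + P A^T + B B^T = 0 (E = I, A stable)
    P = solve_continuous_lyapunov(A, -B @ B.T)
    return np.sqrt(np.trace(C @ P @ C.T))

# toy stand-ins for the (stable part of the) Model Function and the reduced model
Am, Bm, Cm = -np.diag([1.0, 2.0, 3.0, 4.0]), np.ones((4, 1)), np.ones((1, 4))
Ar, Br, Cr = -np.diag([1.0, 2.0]), np.ones((2, 1)), np.ones((1, 2))

# realize the error system G_mu - G_r and evaluate the estimate (40)
Ae = np.block([[Am, np.zeros((4, 2))], [np.zeros((2, 4)), Ar]])
Be = np.vstack([Bm, Br])
Ce = np.hstack([Cm, -Cr])
eps_tilde = h2_norm(Ae, Be, Ce) / h2_norm(Am, Bm, Cm)
print(f"estimated relative H2 error: {eps_tilde:.3f}")
```
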

5. Numerical results

In this section, we compare IRKA and CIRKA in the reduction of a benchmark model taken from the collection [Citation22]. The model (gyro) represents a micro-electro-mechanical butterfly gyroscope [Citation30] and has a full order of $N \approx 35{,}000$, $m = 1$ input and $p = 12$ outputs. The computations were conducted on an Intel® Xeon® CPU @ 2.27 GHz computer with four cores and 48 GB RAM. H2-optimal reduced models for the reduced orders $n = 10, 20, 30$ are constructed using both IRKA and CIRKA, run using the sssMOR toolbox implementations. The execution parameters were set to default; in particular, the convergence tolerance was set to Opts.tol = 1e-3, and convergence was determined by inspection of the optimal frequencies $\sigma_i$ (Opts.stopCrit = 's0') in IRKA and of the optimal frequencies $\sigma_i$ and tangential directions $r_i$, $l_i$ (Opts.stopCrit = 's0+tanDir') in CIRKA. The initial frequencies in $S_0$ were all set to 0, whereas all initial tangential directions in $R_0$ and $L_0$ were set to ones vectors $[1, \dots, 1]^T$ of appropriate dimensions. In CIRKA, strategies I.2 for initialization and U.2 for Model Function updates were used. The results are summarized in Table 1.

Table 1. Reduction results for the gyro model using zero initialization.

For all three cases considered, CIRKA required significantly fewer full-dimensional LU decompositions than IRKA. This is reflected also in the significant speedup in reduction time, ranging from four to over 20 times faster than IRKA. Note that for the case $n = 10$, IRKA did not converge within the maximum number of steps. On the other hand, CIRKA converges to a local optimum, hence reaching a lower relative approximation error. As the model represents an oscillatory system with weakly damped eigenfrequencies and many outputs, it is hard to approximate well with a low reduced order, as the relative approximation errors in Table 1 reflect. Nevertheless, a comparison of the reduced models in a Bode diagram shows a good approximation, as depicted in Figure 6, exemplarily for the case $n = 30$ and the 9th output.

Figure 6. Reduction of the gyro model using CIRKA and IRKA (n=30, 9th output).


To analyse the dependency of the results in Table 1 on the initialization, Table 2 summarizes the results for initial frequencies and tangential directions corresponding to the mirrored eigenvalues with smallest magnitude and the respective input/output residues of the full model $\Sigma$. This choice of initial parameters is known to be advantageous [Citation31,Citation32], and its computation can be performed efficiently for large-scale sparse models using iterative methods (cf. e.g. the MATLAB command eigs).

Table 2. Reduction results for the gyro model using eigs initialization.

Interestingly, for the case $n = 10$, IRKA converged to the default tolerance in one step, requiring only 5 LU decompositions. As CIRKA required two iterations until convergence, the number of full-dimensional LU decompositions required was 6. In this special case, IRKA was faster than CIRKA. In addition, comparison of the relative H2 errors indicates that IRKA converged to a better local minimum. A comparison with the respective result in Table 1 demonstrates that the initialization of H2-optimal reduction can have a large impact on the optimization and the results. For the other two cases, the pattern is similar to what has been discussed so far. In particular, the speedup obtained through CIRKA is significant.

Comparing the values of $\epsilon_{H_2}$ in Tables 1 and 2, we see that the value for CIRKA may be below, above, or the same as the one for IRKA. In general, it is not possible to say which method is expected to yield a better approximation quality. Given identical initializations, IRKA and CIRKA will often converge to the same local optimum, especially in the case of a small number of inputs and outputs. However, as the optimization in CIRKA is based on a surrogate of the actual cost functional, it may converge to a different (better or worse) local optimum. Note that neither in IRKA nor in CIRKA do we have control over which local optimum will be found – we can only hope to find a local optimum. This is why in [Citation29], an approach to find the global optimum exploiting the Model Function framework is presented.

Finally, note that even when IRKA converges in few steps (e.g. 9), a reduction using CIRKA can already result in a significant speedup. Also note that the relative error estimate $\tilde{\epsilon}_{H_2}$ tends to underestimate the approximation error, which has been observed especially for models with high dynamics. In general, it is not possible to state how far the estimate is from the real error. Nonetheless, given that no rigorous error bounds or estimators are yet available for H2-optimal reduction of general systems, we believe a non-rigorous estimate at basically no additional cost may be of interest. To further investigate the validity of the error estimation based on the Model Function, Figure 7 compares the error estimates to the true error after reduction with CIRKA for the benchmark model CDplayer [Citation21] and different reduced orders. Figure 7(a) shows the estimation of the H2 error as defined in (40), whereas Figure 7(b) gives an analogous estimation of the H∞ error.

Figure 7. Comparison of relative error and its estimation for the CDplayer model.


6. Application to different system classes

The discussion of this contribution has been limited to the model reduction of linear time-invariant systems as in (1) with regular E matrix. However, H2 approximation approaches exist for other system classes as well, including systems of Differential Algebraic Equations (DAEs) [Citation33], systems for which only transfer function evaluations or measurements are available [Citation19], linear systems with time delays [Citation34], linear systems in port-Hamiltonian form [Citation35], as well as non-linear systems in bilinear [Citation36,Citation37] or quadratic-bilinear form [Citation38]. Obviously, going into the details of all approximation methods above would exceed the scope of this paper. Nonetheless, as we feel that H2 algorithms for different system classes could also benefit from the framework presented in this contribution, we will give brief indications of the steps and considerations required to extend it. Generally speaking, the idea of surrogate optimization is certainly not new. However, as most methods above are based on interpolation, an update of the surrogate model as presented here can ensure interpolation of the full model with respect to optimal interpolation data, which would otherwise be lost. Once more, we emphasize that the idea presented in this contribution truly is a general framework and not just an additional reduction algorithm by the name of CIRKA. In the following, we briefly indicate for some system classes how the principle of this new framework could be applied. The goal is to reduce the computational cost while still satisfying the same conditions as the original algorithms. Due to the brevity of the following subsections, readers who are not familiar with the system classes mentioned may prefer to skip these discussions.

6.1. DAE systems

DAE systems are state-space representations as in (1) with a singular matrix E. Their transfer function can generally be decomposed into the sum of a strictly proper part $G_{sp}(s)$, satisfying $\lim_{s \to \infty} G_{sp}(s) = 0$, and a polynomial part $P(s)$, a polynomial in $s$ of order at most $\nu$, the so-called index of the DAE [Citation39]. Due to this polynomial contribution, approximation by tangential interpolation is not sufficient to prevent the reduction error from becoming unbounded as a result of a mismatch in the polynomial part. For this reason, the polynomial part needs to be matched exactly in the reduction, while the strictly proper part can be approximated through tangential interpolation. To achieve tangential interpolation of the strictly proper part while preserving the polynomial part, the subspace conditions in (4), (5) can be adapted by including spectral projectors onto deflating subspaces [Citation33, Theorem 3.1]. Based on this, Gugercin, Stykel and Wyatt show in [Citation33, Theorem 4.1] that interpolatory H2-optimal reduction of DAEs can be performed by using the modified subspace conditions to find an H2 approximation to the strictly proper part $G_{sp}(s)$.

For this system class, an extension of the Model Function framework is quite straightforward: By computing a Model Function following the modified interpolatory subspace conditions of [Citation33, Theorem 3.1], H2 optimization can be performed on the surrogate instead of on the original DAE. An update of the Model Function finally leads to a reduced-order model that tangentially interpolates the strictly proper part of the original DAE at the optimal frequencies, along the optimal right and left tangential directions.

6.2. Port-Hamiltonian systems

Port-Hamiltonian (pH) formulations of linear dynamical systems are a very effective way to model systems that result from the interconnection of different subsystems, especially in a multi-physics domain. Their representation is similar to (1), where the system matrices have the special structure $E = I$, $A = (J - R)Q$, and $C = B^T Q$, with $J$ a skew-symmetric interconnection matrix, $R$ a positive semi-definite dissipation matrix and $Q$ a positive definite energy matrix [Citation40]. Models in pH form bear several advantages, such as passivity, and it is therefore of great relevance to preserve the pH structure after reduction. In a setting of tangential interpolation, this can be achieved by computing a projection matrix V according to (4) and choosing W in order to retain the pH structure (cf. [Citation35,Citation41]).
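The structure-preserving choice of W can be illustrated numerically. The following Python sketch uses a random pH test system of our own; the specific left basis $W = QV(V^T Q V)^{-1}$ follows the construction in [Citation35] and is stated here as an assumption, not as the only admissible choice. It verifies that the reduced matrices again form a pH system:

```python
import numpy as np

rng = np.random.default_rng(1)
N, n, m = 12, 4, 2
Jm = rng.standard_normal((N, N)); J = Jm - Jm.T                # skew-symmetric interconnection
Rm = rng.standard_normal((N, N)); R = Rm @ Rm.T                # positive semi-definite dissipation
Qm = rng.standard_normal((N, N)); Q = Qm @ Qm.T + N * np.eye(N)  # positive definite energy matrix
B = rng.standard_normal((N, m))
A, C = (J - R) @ Q, B.T @ Q                                    # pH realization: A = (J-R)Q, C = B^T Q

V, _ = np.linalg.qr(rng.standard_normal((N, n)))               # any full-rank right basis, e.g. from (4)
W = Q @ V @ np.linalg.inv(V.T @ Q @ V)                         # structure-preserving left basis [35]

Jr, Rr, Qr = W.T @ J @ W, W.T @ R @ W, V.T @ Q @ V
Ar, Br = W.T @ A @ V, W.T @ B

# the reduced model is again port-Hamiltonian: A_r = (J_r - R_r) Q_r, C_r = B_r^T Q_r
assert np.allclose(Ar, (Jr - Rr) @ Qr)
assert np.allclose(C @ V, Br.T @ Qr)
assert np.allclose(Jr, -Jr.T)                                  # J_r remains skew-symmetric
assert np.min(np.linalg.eigvalsh(Rr)) > -1e-8                  # R_r remains positive semi-definite
print("reduced model retains the pH structure")
```
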

Gugercin, Polyuga, Beattie and van der Schaft present in [Citation35] an iterative algorithm by the name of IRKA-PH that adaptively chooses interpolation frequencies and input tangential directions, inspired by the IRKA iteration. In general, the resulting reduced model will not satisfy the first-order optimality conditions (17), as W is chosen not to enforce bitangential Hermite interpolation but rather structure preservation. Nonetheless, the algorithm has been shown to produce good approximations in practice.

To reduce the cost involved in repeatedly computing the projection matrices (4) in IRKA-PH, a Model Function in pH form could be introduced and updated. At convergence, the reduced model satisfies the condition (18) while being of pH-form as in the IRKA-PH case. In addition, the resulting Model Function (also in pH form) could be used e.g. for error estimation.

6.3. TF-IRKA

Gugercin and Beattie demonstrate in [Citation19] how to construct a reduced-order model Gr(s) satisfying first-order H2 optimality conditions (17) only from evaluations of the transfer function G(s). The algorithm by the name of TF-IRKA exploits the Loewner-Framework, a data-driven approximation approach by Mayo and Antoulas [Citation42] that produces a reduced model that tangentially interpolates the transfer function of a given system, whose transfer behaviour can be measured or evaluated at selected frequencies. Therefore, this algorithm can be used to obtain a reduced-order model satisfying optimality conditions for irrational, infinite dimensional dynamical systems for which an expression for G(s) can be obtained by direct Laplace transformation of the partial differential equation, i.e. without prior discretization. In addition, it can be used for optimal approximation of systems where no model is available but the transfer behaviour G(s) can be measured in experiments. Similar approaches have been derived to obtain reduced-order models for systems with delays [Citation43].
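For reference, the SISO case of the Loewner construction can be sketched in a few lines. The example below uses a toy transfer function of our own choosing (in the SISO case the tangential directions are trivial); it builds the Loewner pencil from samples of $G(s)$ and recovers the underlying order-4 rational function exactly:

```python
import numpy as np

# "measurements" of an order-4 SISO transfer function (toy example, assumed here)
poles = np.array([1.0, 2.0, 3.0, 4.0])
def G(s):
    return np.sum(1.0 / (s + poles))        # G(s) = sum_k 1/(s + p_k)

lam = np.array([0.1, 0.3, 0.7, 1.3])        # right interpolation points
mu = np.array([0.2, 0.5, 1.0, 2.0])         # left interpolation points
w = np.array([G(s) for s in lam])           # right data
v = np.array([G(s) for s in mu])            # left data

# Loewner and shifted Loewner matrices (SISO case of the Loewner framework)
L = (v[:, None] - w[None, :]) / (mu[:, None] - lam[None, :])
Ls = (mu[:, None] * v[:, None] - lam[None, :] * w[None, :]) / (mu[:, None] - lam[None, :])

# Loewner interpolant H(s) = w^T (Ls - s L)^{-1} v, i.e. E = -L, A = -Ls, B = v, C = w^T
def H(s):
    return w @ np.linalg.solve(Ls - s * L, v)

# the data stems from an order-4 system, so the order-4 interpolant recovers G exactly
for s in [0.1, 1.0, 5.0]:
    assert np.isclose(H(s), G(s))
print("Loewner interpolant matches G at and beyond the data points")
```

In TF-IRKA, such a surrogate is rebuilt at the current shift set in every iteration; in the Model Function variant sketched in the text, new samples would instead be appended to the data set.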

The cost of TF-IRKA is dominated by the evaluation of the transfer function G(s). In general, if an analytic expression is given, this cost is minimal compared to evaluating a discretized high-order model. However, if the evaluation is obtained through costly measurements, an approach to reduce the number of evaluations would be highly beneficial. In [Citation44], Beattie, Drmac, and Gugercin introduce a quadrature-based version of TF-IRKA for SISO models, called Q-IRKA, that only requires the evaluation of G(s) once for some frequencies (quadrature nodes) and returns a reduced-order model that satisfies the optimality conditions (17) within the quadrature error. In [Citation45], the same authors propose a similar approach by the name of QuadVF, based on vector fitting rational approximations using frequency sampling points.

By exploiting the new framework presented in this paper, it is possible to reduce the number of evaluations of G(s) and satisfy the optimality conditions (17) exactly. By creating a surrogate using the Loewner framework, standard IRKA can be applied to obtain a new set of frequencies and tangential directions. Evaluating (or measuring) G(s) at the new optimal frequencies makes it possible to update the surrogate and repeat the process. Compared to Q-IRKA, this approach bears the additional advantage of automatically finding the number and position of suitable frequencies for evaluating G(s). In contrast, Q-IRKA and QuadVF require the initial evaluation of G(s) at some pre-specified quadrature nodes. While a judicious spacing of the sampling frequencies in QuadVF was presented in [Citation45], the frequency range and number of sampling points remain tunable parameters within this approach.

6.4. Nonlinear systems

Model reduction approaches for general non-linear systems typically differ from what has been discussed in this contribution. However, in the case of weak nonlinearities, a bilinear formulation can be obtained by means of Carleman bilinearization [Citation46]. For this class of non-linear systems, the dynamic equations are similar to (1) and are complemented by an additional term that sums the weighted products between the state and the inputs. Recently, Benner and Breiten have presented in [Citation36] a Bilinear IRKA (B-IRKA) algorithm that produces a reduced bilinear model satisfying, at convergence, first-order optimality conditions with respect to the bilinear H2 norm of the approximation error. This algorithm requires the repeated solution of two bilinear Sylvester equations, which can be obtained either by vectorization, solving linear systems of dimension $N \cdot n$, or, equivalently, by computing the limit of an infinite series of linear Sylvester equations. Flagg and Gugercin have demonstrated in [Citation37,Citation47] for SISO models that the optimality conditions in [Citation36] can be interpreted as interpolatory conditions on the underlying Volterra series ([Citation37, Theorem 4.2]). In addition, they present an algorithm to construct reduced bilinear models that achieve multipoint Volterra series interpolation by construction (cf. [Citation37, Theorem 3.1]), requiring the solution of two bilinear Sylvester equations. Exploiting the often fast convergence of the Volterra integrals, they propose a Truncated B-IRKA (TB-IRKA) that satisfies the first-order optimality conditions asymptotically as the truncation index tends to infinity. Note that extensions to bilinear DAEs were presented by Benner and Goyal in [Citation48].
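The series-of-linear-equations viewpoint mentioned above can be sketched as a simple fixed-point iteration. In the following Python illustration (a random, weakly coupled bilinear test system of our own choosing), the bilinear reachability Gramian is obtained by repeatedly solving linear Lyapunov equations; convergence relies on the bilinear term being weak relative to the decay of A:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

rng = np.random.default_rng(2)
N = 8
A = -np.diag(rng.uniform(2.0, 6.0, N))       # stable linear part
Nb = 0.1 * rng.standard_normal((N, N))       # weak bilinear coupling term
B = rng.standard_normal((N, 1))

# bilinear reachability Gramian: A P + P A^T + Nb P Nb^T + B B^T = 0,
# solved as the limit of a series of *linear* Lyapunov equations
P = np.zeros((N, N))
for _ in range(50):
    P = solve_continuous_lyapunov(A, -(B @ B.T + Nb @ P @ Nb.T))

residual = A @ P + P @ A.T + Nb @ P @ Nb.T + B @ B.T
assert np.linalg.norm(residual) < 1e-8
print("bilinear Sylvester/Lyapunov equation solved by fixed-point iteration")
```
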

The interpolatory nature of this H2 approach makes it possible to extend the new framework presented in this paper to bilinear systems, hopefully speeding up the reduction with B-IRKA significantly. In fact, given a set of initial frequencies, a bilinear Model Function can be constructed that interpolates the Volterra series at these frequencies. This would require the solution of the full bilinear Sylvester equations for a few selected frequencies. Using this bilinear surrogate model, B-IRKA can be run efficiently on a lower dimension to find a new set of optimal frequencies and update the Model Function. Note that the cost of solving a full-dimensional bilinear Sylvester equation plays the same role in the complexity of B-IRKA as the cost of a full-dimensional LU decomposition for linear systems. Assuming the convergence behaviour of the new framework is similar to the linear case, this could lead to an algorithm that produces reduced bilinear models that satisfy the H2 optimality conditions exactly – i.e. without truncation error – at a far lower cost.

In the presence of strong nonlinearities, the dynamics can be captured exactly in a quadratic-bilinear model where, in addition, a quadratic term in the state variable is present. An H2 approach for this system class has recently been presented by Benner, Goyal and Gugercin in [Citation38], and it is not evident at this point how the new framework could be applied to this system class while preserving the rigour of the results.

7. Conclusions and outlook

In this paper, we have presented a new general framework for H2-optimal reduction, based on the local nature of (tangential) interpolation and H2 optimality. The motivation to develop this framework, which can be interpreted e.g. as a surrogate optimization or subspace acceleration approach, has resulted from theoretical and numerical considerations on the cost of H2-optimal reduction. By means of updated surrogate optimization, the cost of reduction can be decoupled from the cost of optimization. Through a numerical example, we have illustrated the effectiveness of the new framework in reducing the cost of H2-optimal reduction. Theoretical considerations have demonstrated how the reduced-order models resulting from this new framework still satisfy first-order optimality conditions at convergence. The system class discussed in detail in this contribution has been MIMO LTI models. In addition, first indications have been given on how to extend this new framework for H2 reduction of different system classes, e.g. DAEs and bilinear systems. The new framework not only produces optimal reduced models at a far lower cost than conventional methods, but also provides – at no additional cost – a middle-sized surrogate model that can be used e.g. for error estimation, as demonstrated in numerical examples.

Recent research endeavours exploit the new framework to obtain the global optimum amongst all local H2-optimal reduced models of prescribed order [Citation29]. In current research, the Model Function is used for error estimation in the cumulative reduction framework CURE by Panzer, Wolf and Lohmann [Citation11,Citation12,Citation49] to adaptively choose the reduced order. Finally, the Model Function framework is currently being used in parametric model reduction to recycle the Model Function from one parameter sample point to another and reduce the cost tied to the repeated reduction at several points in the parameter space. Preliminary work is available in [Citation50]. These advances will be topics of future investigations and publications.

Supplemental material

Supplementary_materials_MATLAB_toolboxes_sss___sssMOR.zip

Download Zip (647 KB)

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

The work related to this contribution is supported by the German Research Foundation (DFG), Grant [LO408/19-1].

Notes

1. For dense matrices, this can be motivated by simple asymptotic operation counts. The QR decomposition of a matrix $V \in \mathbb{R}^{N \times n}$ via Householder reflections requires $2n^2(N - n/3)$ flops and is hence linear in N. The flops involved in the product $W^T E V$ amount to $nN(2N - 1) + n^2(2N - 1)$ for a dense E and are hence quadratic in N. Note however that for a diagonal E matrix – an ideally sparse invertible matrix – the flop count becomes at most $nN + n^2(2N - 1)$, hence being linear in N [Citation23].

2. For the results of this contribution, we have used sparse LU decompositions, which are still feasible for models of relatively large size even on standard machines. This approach bears the advantage of allowing the recycling of LU factors, e.g. to solve LSEs with the same left-hand sides. If other methods are used instead, some of the values and statements in this contribution may vary. Note, in addition, that complex conjugate pairs of shifts $\sigma_i = \bar{\sigma}_j$ yield complex conjugate tangential directions, so the corresponding basis columns are complex conjugates of each other. Therefore, the actual number of LSEs to be solved may be lower, making 2n a worst-case estimate.

3. All numerical examples presented in this contribution were generated using the sss and sssMOR toolboxes in MATLAB® [Citation24].

References

  • T. Penzl, A cyclic low-rank Smith method for large sparse Lyapunov equations, SIAM J. Sci. Comput. 21 (4) (2000), pp. 1401–1418. doi:10.1137/S1064827598347666
  • V. Druskin, V. Simoncini, and M. Zaslavsky, Adaptive tangential interpolation in rational Krylov subspaces for MIMO dynamical systems, SIAM J. Matrix Anal. Appl. 35 (2) (2014), pp. 476–498. doi:10.1137/120898784
  • P. Kürschner, Efficient low-rank solution of large-scale matrix equations, Ph.D. thesis, Max Planck Institute for Dynamics of Complex Technical Systems, 2016.
  • E.J. Grimme, Krylov projection methods for model reduction, Ph.D. Thesis, University of Illinois at Urbana-Champaign, USA, 1997.
  • C.A. Beattie and S. Gugercin, Model reduction by rational interpolation, in Model Reduction and Approximation: Theory and Algorithms, P. Benner, A.M. Cohen, M. Ohlberger, eds., et al., math.NA. SIAM, Sep. 2017, Available at http://arxiv.org/abs/1409.2140v1.
  • K. Glover, All optimal Hankel-norm approximations of linear multivariable systems and their L∞-error norms, Int. J. Control 39 (6) (1984), pp. 1115–1193. doi:10.1080/00207178408933239
  • A. Castagnotto, C.A. Beattie, and S. Gugercin, Interpolatory methods for H∞ model reduction of multi-input/multi-output systems, in Model Reduction of Parametrized Systems, P. Benner, M. Ohlberger, A. Patera, eds., et al., Vol. 17, Modeling, Simulation & Applications, Springer, Cham, 2017, Chap. 22, pp. 349–365. doi:10.1007/978-3-319-58786-8_22
  • S. Gugercin, A.C. Antoulas, and C.A. Beattie, H2 model reduction for large-scale dynamical systems, SIAM J. Matrix Anal. Appl. 30 (2) (2008), pp. 609–638. doi:10.1137/060666123
  • P. Van Dooren, K. Gallivan, and P.-A. Absil, H2-optimal model reduction of MIMO systems, Appl. Math. Lett. 21 (2008), pp. 1267–1273. doi:10.1016/j.aml.2007.09.015
  • C.A. Beattie and S. Gugercin, A trust region method for optimal H2 model reduction, in IEEE Conference on Decision and Control, IEEE, 2009, pp. 5370–5375. doi:10.1109/CDC.2009.5400605
  • H.K.F. Panzer, S. Jaensch, T. Wolf, et al. A Greedy Rational Krylov Method for H2-pseudooptimal model order reduction with preservation of stability, in Proceedings of the American Control Conference, 2013, pp. 5512–5517.
  • H.K.F. Panzer, Model order reduction by Krylov subspace methods with global error bounds and automatic choice of parameters, Ph.D. thesis, Technische Universität München, 2014.
  • A. Castagnotto, H.K.F. Panzer, and B. Lohmann, Fast H2-optimal model order reduction exploiting the local nature of Krylov-subspace methods, in European Control Conference 2016, Aalborg, Denmark, 2016, pp. 1958–1963. doi:10.1109/ECC.2016.7810578
  • K. Gallivan, A. Vandendorpe, and P. Van Dooren, Model reduction of MIMO systems via tangential interpolation, SIAM J. Matrix Anal. Appl. 26 (2) (2004), pp. 328–349. doi:10.1137/S0895479803423925
  • K. Gallivan, A. Vandendorpe, and P. Van Dooren, Sylvester equations and projection-based model reduction, J. Comput. Appl. Math. 162 (1) (2004), pp. 213–229. doi:10.1016/j.cam.2003.08.026
  • A.C. Antoulas, Approximation of Large-Scale Dynamical Systems, Vol. 6, Advances in Design and Control, SIAM Publications, Philadelphia, PA, 2005, ISBN 9780898715293. doi:10.1137/1.9780898718713
  • A.C. Antoulas, C.A. Beattie, and S. Gugercin, Interpolatory model reduction of large-scale dynamical systems, in Efficient Modeling and Control of Large-Scale Systems, Springer US, 2010, pp. 3–58. doi:10.1007/978-1-4419-57573_1
  • L. Meier and D.G. Luenberger, Approximation of linear constant systems, IEEE Trans. Autom. Control 12 (5) (Oct., 1967), pp. 585–588. doi:10.1109/TAC.1967.1098680
  • C.A. Beattie and S. Gugercin, Realization-independent H22-approximation, in 51st IEEE Conference on Decision and Control, IEEE, 2012, pp. 4953–4958. doi:10.1109/CDC.2012.6426344
  • Y. Saad, Iterative Methods for Sparse Linear Systems, SIAM, 2003.
  • Y. Chahlaoui and P. Van Dooren, A collection of benchmark examples for model reduction of linear time invariant dynamical systems, Tech. rep. 2002-2, SLICOT Working Note, 2002. Available at www.slicot.org.
  • J.G. Korvink and E.B. Rudnyi, Oberwolfach benchmark collection, in Dimension Reduction of Large-Scale Systems, P. Benner, D.C. Sorensen, and V. Mehrmann, eds., Vol. 45, Lecture Notes in Computational Science and Engineering, Springer Berlin Heidelberg, 2005, pp. 311–315. doi:10.1007/3-54027909-1_11
  • G.H. Golub and C.F. Van Loan, Matrix Computations, Johns Hopkins University Press, Baltimore, 1996.
  • A. Castagnotto, M. Cruz Varona, and L. Jeschek, sss & sssMOR: analysis and reduction of large-scale dynamic systems in MATLAB, atAutomatisierungstechnik, 65 (2) (2017), pp. 134–150. doi:10.1515/auto-20160137
  • N.V. Queipo, R.T. Haftka, W. Shyy, T. Goel, R. Vaidyanathan, and P. Kevin Tucker, Surrogate-based analysis and optimization, Prog. Aerosp. Sci. 41 (1) (2005), pp. 1–28. doi:10.1016/j.paerosci.2005.02.001
  • P. Benner, Z. Tomljanović, and N. Truhar, Optimal damping of selected eigenfrequencies using dimension reduction, Numer. Lin. Alg. Appl. 20 (1) (2013), pp. 1–17, ISSN . doi:10.1002/nla.833
  • J. Rommes and N. Martins, Effcient computation of transfer function dominant poles using subspace acceleration, IEEE Trans. Power Syst. 21 (4) (2006), pp. 1218–1226. doi:10.1109/TPWRS.2006.876671
  • J.D. Pintér, Global Optimization in Action: Continuous and Lipschitz Optimization: Algorithms, Implementations and Applications, Vol. 6, Springer Science & Business Media, 2013.
  • A. Castagnotto, S. Hu, and B. Lohmann, An approach for globalized H2-optimal model reduction, in Proceedings of the 9th Vienna International Conference on Mathematical Modelling, 2018.
  • J. Lienemann, D. Billger, E.B. Rudnyi, et al., MEMS compact modeling meets model order reduction: Examples of the application of Arnoldi methods to microsystem devices, in The Technical Proceedings of the 2004 Nanotechnology Conference and Trade Show, Nanotech, Vol. 4, 2004.
  • S. Gugercin, Projection methods for model reduction of large-scale dynamical systems, Ph.D. thesis, Rice University, 2003. Available at http://hdl.%20handle.%20net/1911/18536.
  • S. Gugercin and A.C. Antoulas, An H2 error expression for the Lanczos procedure, in 42nd IEEE Conference on Decision and Control, Vol. 2, IEEE, 2003, pp. 1869–1872.
  • S. Gugercin, T. Stykel, and S. Wyatt, Model reduction of descriptor systems by interpolatory projection methods, SIAM J. Sci. Comput. 35 (5) (2013), pp. B1010–B1033. doi:10.1137/130906635
  • I. Pontes Duff, S. Gugercin, C. Beattie, C. Poussot-Vassal, and C. Seren, H2-optimality conditions for reduced time-delay systems of dimension one, IFAC Pap. OnLine 49 (10) (2016), 13th IFAC Workshop on Time Delay Systems TDS 2016, pp. 7–12. doi:10.1016/j.ifacol.2016.07.464
  • S. Gugercin, R.V. Polyuga, C. Beattie, and A. van der Schaft, Structure-preserving tangential interpolation for model reduction of port-Hamiltonian systems, Autom. 48 (9) (2012), pp. 1963–1974. doi:10.1016/j.automatica.2012.05.052.
  • P. Benner and T. Breiten, Interpolation-based H2-model reduction of bilinear control systems, SIAM J. Matrix Anal. Appl. 33 (3) (2012), pp. 859–885. doi:10.1137/110836742
  • G.M. Flagg and S. Gugercin, Multipoint volterra series interpolation and H2Optimal model reduction of bilinear systems, SIAM J. Numer. Anal. 36 (2) (2015), pp. 549–579. doi:10.1137/130947830
  • P. Benner, P. Goyal, and S. Gugercin, H2-quasi-optimal model order reduction for quadratic-bilinear control systems, arXiv Preprint arXiv:1610.03279 (2016).
  • P. Benner and T. Stykel, Model Order Reduction for Differential-Algebraic Equations: A Survey, Springer, 2017, pp. 107–160.
  • A. van der Schaft, L2-Gain and Passivity Techniques in Nonlinear Control, 3rd ed., Springer, 2017.
  • T. Wolf, B. Lohmann, R. Eid, and P. Kotyczka, Passivity and structure preserving order reduction of linear port-Hamiltonian systems using Krylov subspaces, Eur. J. Control 16 (4) (2010), pp. 401–406. doi:10.3166/ejc.16.401-406
  • A.J. Mayo and A.C. Antoulas, A framework for the solution of the generalized realization problem, Linear Algebra Appl. 425 (2–3) (2007), Special Issue in honor of P. A. Fuhrmann, Edited by A. C. Antoulas, U. Helmke, J. Rosenthal, V. Vinnikov, and E. Zerz, pp. 634–662, ISSN: . doi:10.1016/j.laa.2007.03.008
  • I. Pontes Duff, C. Poussot-Vassal, and C. Seren, Realization independent single time-delay dynamical model interpolation and H2-optimal approximation, in 2015 54th IEEE Conference on Decision and Control (CDC), Dec. 2015, pp. 4662–4667. doi:10.1109/CDC.2015.7402946
  • C. Beattie, Z. Drmač, and S. Gugercin, Quadrature-based IRKA for optimal H2 model reduction, IFAC Pap. OnLine 48 (1) (2015), 8th Vienna International Conferenceon Mathematical Modelling, pp. 5–6. doi:10.1016/j.ifacol.2015.05.196.
  • Z. Drmač, S. Gugercin, and C. Beattie, Quadrature-based vector fitting for discretized H2 approximation, SIAM J. Sci. Comput. 37 (2) (2015), pp. A625–A652. doi:10.1137/140961511
  • W.J. Rugh, Nonlinear System Theory, Johns Hopkins University Press Baltimore, 1981.
  • G.M. Flagg, Interpolation methods for the model reduction of bilinear systems, Ph.D. Thesis, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, Apr. 2012.
  • P. Benner and P. Goyal, Multipoint interpolation of volterra series and H2Model reduction for a family of bilinear descriptor systems, Syst. Control Lett. 97 (2016), pp. 1–11. doi:10.1016/j.sysconle.2016.08.008
  • T. Wolf, Pseudo-optimal model order reduction, Dissertation, Technische Universität München, Munich, Germany, 2015.
  • J. Peters, Parametrische modellordnungsreduktion mit dem H2-optimalen Modellfunktions-Framework, Bachelor Thesis, Technical University of Munich, Oct. 2017.
