
Numerical strategies for recursive least squares solutions to the matrix equation AX = B

Pages 497-510 | Received 31 Jan 2022, Accepted 02 Sep 2022, Published online: 21 Oct 2022

Abstract

The recursive solution to the Procrustes problem, with or without constraints, is thoroughly investigated. Given known matrices A and B, the proposed solution minimizes the square of the Frobenius norm of the difference AX − B when rows or columns are added to A and B. The proposed method is based on efficient strategies which reduce the computational cost by utilizing previous computations when new data are acquired. This is particularly useful in the iterative solution of an unbalanced orthogonal Procrustes problem. The results show that the computational efficiency of the proposed recursive algorithms is more significant when the dimensions of the matrices are large. This demonstrates the usefulness of the proposed algorithms in the presence of high-dimensional data sets. The practicality of the new method is demonstrated through an application in machine learning, namely feature extraction for image processing.


1. Introduction

In practice, one may be interested in finding the matrix X such that AX = B, where the matrices A and B come from experiments. However, A and B often do not satisfy the solvability conditions and hence the least squares solution which minimizes the difference AX − B is required [Citation15]. Specifically, the problem of transforming a given matrix A by a matrix X so that it approximates another given matrix B, that is, so that the square of the norm of the difference AX − B is minimized, is known as the Procrustes problem [Citation13,Citation15,Citation33]. Often, depending on the application, it is assumed that X belongs to a specific class of matrices, thus setting a set of constraints on the optimization problem. The most frequent constraints imposed on X are orthogonality and symmetry, and variants thereof; see, for example, [Citation5,Citation11,Citation15,Citation19,Citation21,Citation29,Citation31,Citation33]. In many cases, orthogonal factorizations such as the singular value decomposition (SVD), the eigenvalue decomposition (EVD) and the CS decomposition have been used to solve the Procrustes problem and variants thereof [Citation8,Citation15,Citation19,Citation22,Citation31,Citation33, pp. 327–328].

The application of the Procrustes problem in factor analysis has a long history [Citation10,Citation13,Citation15,Citation28]. It also appears in numerical analysis problems for the solution of partial differential equations, in multidimensional scaling, in growth curve modelling, in scientific computing, in computer vision, in image processing, in system and control theory, in the analysis of space structures and in aerospace engineering for spacecraft attitude determination [Citation12,Citation23,Citation24,Citation26,Citation27,Citation32,Citation38,Citation41,Citation43]. Moreover, in the case where A is the identity matrix the problem becomes a matrix nearness problem with many applications in statistical and financial modelling and in theoretical computer science; see, for example, [Citation20,Citation36,Citation39].

The recursive solution to a least squares problem is needed when the experiment is conducted repeatedly and, as a result, the given matrices are updated regularly with newly arriving data. Also, in high-dimensional settings, the matrices A and B are very large and it may not be possible to treat all data at once, or the computational cost of processing them may be prohibitively high. In this case, a sequential procedure which splits A and B into sub-matrices of smaller dimensions and then proceeds by gradually incorporating the sub-matrices into the least squares solution of AX = B is essential. Recursive least squares is often needed in many problems across different areas such as engineering, statistics, econometrics and finance [Citation7,Citation9,Citation17,Citation18,Citation42]. A recursive algorithm reduces the computational cost and also the storage requirements for large matrices.

Herein, the recursive least squares solution to the matrix equation AX = B, when A and B are known matrices, is investigated in depth. Namely, the recursive solution to the Procrustes problem is examined. The use of the QR decomposition is examined when there are no constraints on X. The problems of minimizing the difference AX − B when X is orthogonal and when X is symmetric are also considered, and their recursive least squares solutions using the eigenvalue and singular value decompositions are explored in depth. When constraints are imposed on X, the method of Lagrange multipliers is used to solve the optimization problem. The proposed solution, in each case, is the matrix which minimizes the square of the Frobenius norm of AX − B. It is an exact solution in the sense that X is explicitly determined and does not comprise any arbitrary elements. The proposed recursive numerical solution does not require the matrices to be of full rank.

Throughout this paper, $\|\cdot\|_F$ denotes the Frobenius norm. Also, for known matrices $S$ and $P$, when computing partial derivatives [Citation25, p. 201] the following properties are used:
$$\frac{\partial (SXP)}{\partial X} = S^T P^T, \qquad \frac{\partial (S X^T P)}{\partial X} = P S. \qquad (1)$$

The paper is organized as follows. Section 2 introduces the general Procrustes problem, where no assumption is made for the solution matrix $X$. The problem is solved using the QR decomposition and then the recursive solution is presented. Section 3 considers the orthogonal Procrustes problem, where the solution matrix $X$ is orthogonal, and Section 4 derives the solution to the symmetric Procrustes problem, when $X$ is assumed to be symmetric. Section 5 presents computational results and finally, in Section 6, we conclude and discuss future work.

2. Numerical solution to the general Procrustes problem

Consider the problem of finding a matrix $X \in \mathbb{R}^{n \times n}$ so that the known matrix $B \in \mathbb{R}^{m \times n}$ is approximated by the matrix $AX$, where $A \in \mathbb{R}^{m \times n}$ is also known. That is, a solution to the matrix equation
$$AX = B \qquad (2)$$
is required. The least squares approximation problem to be solved is given by
$$\operatorname*{argmin}_{X} \|AX - B\|_F^2, \qquad (3)$$
where
$$f(X) = \|AX - B\|_F^2 = \operatorname{trace}\bigl((AX - B)^T (AX - B)\bigr) = \operatorname{trace}\bigl(X^T A^T A X - X^T A^T B - B^T A X + B^T B\bigr).$$
On using (1), partial differentiation yields
$$\frac{\partial f(X)}{\partial X} = 2 A^T A X - 2 A^T B = 0, \qquad (4)$$
whence
$$A^T A X = A^T B. \qquad (5)$$
When $A^T A$ is non-singular, the solution to (5) is given by $X = (A^T A)^{-1} A^T B$ [Citation14]. However, $A^T A$ may be ill-conditioned and in this case inverting the matrix may give inaccurate results. In the case where $A^T A$ is singular, the latter fails to give a solution for $X$. A numerically stable method to obtain $X$ is to use the QR decomposition (QRD) of $A$, namely
$$Q_A^T \begin{pmatrix} A & B \end{pmatrix} = \begin{pmatrix} R_A & R_{B_1} \\ 0 & R_{B_2} \end{pmatrix}, \quad \text{where} \quad Q_A = \begin{pmatrix} Q_{A_1} & Q_{A_2} \end{pmatrix}. \qquad (6)$$
Then $A = Q_{A_1} R_A$ and hence $A^T A = R_A^T R_A$ and $A^T B = R_A^T R_{B_1}$. Therefore, $X = R_A^{-1} R_{B_1}$. When $A$ is rank deficient, the procedure to obtain $X$ is similar to the above, namely (6). In this case, a complete QRD can be computed to triangularize $A$ [Citation1].
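As an illustration of the QR-based solution above, a minimal sketch in Python/NumPy follows (the function name and test data are illustrative and full column rank of $A$ is assumed, so the rank-deficient case handled by the complete QRD is not covered):

```python
import numpy as np

def procrustes_qr(A, B):
    """Solve argmin_X ||AX - B||_F via the QRD of the compound matrix (A B), as in (6).

    A minimal sketch assuming A (m x n) has full column rank."""
    n = A.shape[1]
    # Q_A^T (A  B) = (R_A  R_B1; 0  R_B2)
    _, R = np.linalg.qr(np.hstack((A, B)), mode="reduced")
    R_A, R_B1 = R[:n, :n], R[:n, n:]
    # X = R_A^{-1} R_B1, obtained by a triangular solve rather than explicit inversion
    return np.linalg.solve(R_A, R_B1)

# Usage: the computed X satisfies the normal equations (5)
rng = np.random.default_rng(0)
A, B = rng.standard_normal((50, 8)), rng.standard_normal((50, 8))
X = procrustes_qr(A, B)
print(np.allclose(A.T @ A @ X, A.T @ B))
```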

2.1. Recursive solution

We next consider the recursive solution of the Procrustes problem when the matrices $A$ and $B$ are updated. Without loss of generality, suppose that $A$ and $B$ are augmented with the addition of a single row, namely
$$\tilde{A} = \begin{pmatrix} A \\ a \end{pmatrix}, \qquad \tilde{B} = \begin{pmatrix} B \\ b \end{pmatrix}, \qquad (7)$$
where $A, B$ are as in (2) and $a, b \in \mathbb{R}^{1 \times n}$ represent new data points. Then, the updated Procrustes problem, based on $\tilde{A}$ and $\tilde{B}$, requires the solution of the least squares problem
$$\operatorname*{argmin}_{\tilde{X}} \|\tilde{A}\tilde{X} - \tilde{B}\|_F^2 \qquad (8)$$
when (3) has already been solved (see (4)–(6)). The efficient solution of (8) requires that previous computations from the solution of (3) be utilized. Namely,
$$\operatorname*{argmin}_{\tilde{X}} \|\tilde{A}\tilde{X} - \tilde{B}\|_F^2 = \operatorname*{argmin}_{\tilde{X}} \left\| \begin{pmatrix} Q_{A_1}^T & 0 \\ 0 & 1 \end{pmatrix} \left[ \begin{pmatrix} A \\ a \end{pmatrix} \tilde{X} - \begin{pmatrix} B \\ b \end{pmatrix} \right] \right\|_F^2 = \operatorname*{argmin}_{\tilde{X}} \left\| \begin{pmatrix} R_A \\ a \end{pmatrix} \tilde{X} - \begin{pmatrix} R_{B_1} \\ b \end{pmatrix} \right\|_F^2,$$
where $Q_{A_1}$, $R_A$ and $R_{B_1}$ are as in (6) and the lower-right block of the transformation is the scalar 1, since a single row is appended. Consider now the updating QRD
$$Q_{Au}^T \begin{pmatrix} R_A & R_{B_1} \\ a & b \end{pmatrix} = \begin{pmatrix} \tilde{R}_A & \tilde{R}_{B_1} \\ 0 & \tilde{R}_{B_2} \end{pmatrix}. \qquad (9)$$
We then have that
$$\operatorname*{argmin}_{\tilde{X}} \|\tilde{A}\tilde{X} - \tilde{B}\|_F^2 = \operatorname*{argmin}_{\tilde{X}} \bigl( \|\tilde{R}_A \tilde{X} - \tilde{R}_{B_1}\|_F^2 + \|\tilde{R}_{B_2}\|_F^2 \bigr)$$
and therefore it follows that $\tilde{X} = \tilde{R}_A^{-1} \tilde{R}_{B_1}$.
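A minimal sketch of this row-updating recursion follows (illustrative names; for simplicity the small $(n+1) \times 2n$ matrix is re-triangularized with a dense QRD, whereas an efficient implementation would apply Givens rotations to exploit the triangular structure, cf. the flop counts in Section 5):

```python
import numpy as np

def procrustes_row_update(R_A, R_B1, a, b):
    """Row-updating step (9): refresh the triangular factors after the row
    pair (a, b) is appended to (A, B), without revisiting A or B.

    A minimal sketch assuming R_A and R_B1 come from the QRD (6)."""
    n = R_A.shape[0]
    stacked = np.vstack((np.hstack((R_A, R_B1)),
                         np.hstack((a.reshape(1, -1), b.reshape(1, -1)))))
    R = np.linalg.qr(stacked, mode="r")          # triangular factor only
    R_A_new, R_B1_new = R[:n, :n], R[:n, n:]
    X_new = np.linalg.solve(R_A_new, R_B1_new)   # updated solution X~ = R~_A^{-1} R~_B1
    return R_A_new, R_B1_new, X_new
```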

In many cases, it is possible that the matrices $A$ and $B$ are updated with a single column, namely
$$\check{A} = \begin{pmatrix} A & A_{n+1} \end{pmatrix}, \qquad \check{B} = \begin{pmatrix} B & B_{n+1} \end{pmatrix},$$
where $A_{n+1}, B_{n+1} \in \mathbb{R}^{m \times 1}$ denote new variables which become available after (3) has been solved. As a result, the solution $\check{X} \in \mathbb{R}^{(n+1) \times (n+1)}$ to the updated Procrustes problem
$$\operatorname*{argmin}_{\check{X}} \|\check{A}\check{X} - \check{B}\|_F^2 \qquad (10)$$
needs to be computed. By efficiently utilizing previous computations from the solution of (3), (10) is written as
$$\operatorname*{argmin}_{\check{X}} \|\check{A}\check{X} - \check{B}\|_F^2 = \operatorname*{argmin}_{\check{X}} \left\| Q_A^T \left( \begin{pmatrix} A & A_{n+1} \end{pmatrix} \check{X} - \begin{pmatrix} B & B_{n+1} \end{pmatrix} \right) \right\|_F^2 = \operatorname*{argmin}_{\check{X}} \left\| \begin{pmatrix} R_A & \tilde{A}_{n+1} \\ 0 & \hat{A}_{n+1} \end{pmatrix} \check{X} - \begin{pmatrix} R_{B_1} & \tilde{B}_{n+1} \\ R_{B_2} & \hat{B}_{n+1} \end{pmatrix} \right\|_F^2,$$
where $Q_A$ is from the QRD of $A$ in (6). The column-updating QRD that needs to be computed is then given by
$$\begin{pmatrix} I_n & 0 \\ 0 & \check{q}^T \end{pmatrix} \begin{pmatrix} R_A & \tilde{A}_{n+1} \\ 0 & \hat{A}_{n+1} \end{pmatrix} = \begin{pmatrix} R_A & \tilde{A}_{n+1} \\ 0 & \check{a} \\ 0 & 0 \end{pmatrix},$$
where $\check{q}$ is an orthogonal transformation that eliminates all but the first element of $\hat{A}_{n+1}$ and $\check{a}$ is a scalar. Hence, the updated Procrustes problem (10) becomes
$$\operatorname*{argmin}_{\check{X}} \|\check{A}\check{X} - \check{B}\|_F^2 = \operatorname*{argmin}_{\check{X}} \bigl( \|\check{R}_A \check{X} - \check{R}_B\|_F^2 + \|\check{R}_{B_2}\|_F^2 \bigr), \qquad (11)$$
where
$$\check{R}_A = \begin{pmatrix} R_A & \tilde{A}_{n+1} \\ 0 & \check{a} \end{pmatrix}, \qquad \check{R}_B = \begin{pmatrix} R_{B_1} & \tilde{B}_{n+1} \\ R_{B_2}^{(1)} & \hat{B}_{n+1}^{(1)} \end{pmatrix} \qquad \text{and} \qquad \check{R}_{B_2} = \begin{pmatrix} R_{B_2}^{(2)} & \hat{B}_{n+1}^{(2)} \end{pmatrix},$$
with the superscripts $(1)$ and $(2)$ denoting, respectively, the first row and the remaining rows of the blocks after the transformation $\check{q}^T$ has been applied. The solution to problem (11) is given by $\check{X} = \check{R}_A^{-1} \check{R}_B$.
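The column-updating step can be sketched as follows (illustrative names; it assumes that the full orthogonal factor $Q_A$ and the block $R_{B_2}$ of the QRD (6) have been retained, and that $m > n$):

```python
import numpy as np

def procrustes_column_update(Q_A, R_A, R_B1, R_B2, A_new, B_new):
    """Sketch of the column-updating step of Section 2.1: a single column pair
    (A_new, B_new) is appended to (A, B).  Q_A (m x m), R_A, R_B1, R_B2 are the
    factors of the original QRD (6); names are illustrative."""
    n = R_A.shape[0]
    a_rot, b_rot = Q_A.T @ A_new, Q_A.T @ B_new          # Q_A^T A_{n+1}, Q_A^T B_{n+1}
    a_top, a_bot = a_rot[:n], a_rot[n:]                  # A~_{n+1}, A^_{n+1}
    b_top, b_bot = b_rot[:n], b_rot[n:]                  # B~_{n+1}, B^_{n+1}
    # orthogonal q eliminating all but the first element of A^_{n+1}
    q, r = np.linalg.qr(a_bot.reshape(-1, 1), mode="complete")
    a_check = r[0, 0]                                    # the scalar in (11)
    Rb2_rot, b_bot_rot = q.T @ R_B2, q.T @ b_bot
    # assemble the (n+1) x (n+1) factors of (11) and solve
    R_A_new = np.block([[R_A, a_top.reshape(-1, 1)],
                        [np.zeros((1, n)), np.array([[a_check]])]])
    R_B_new = np.block([[R_B1, b_top.reshape(-1, 1)],
                        [Rb2_rot[:1, :], b_bot_rot[:1].reshape(1, 1)]])
    return np.linalg.solve(R_A_new, R_B_new)             # updated X = R_A^{-1} R_B
```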

3. The orthogonal Procrustes problem

The orthogonal Procrustes problem (OPP) is that of minimizing the sum of the squared error of the difference matrix AXB when the unknown matrix X is orthogonal. The constraint is imposed by using the method of Lagrange multipliers; the matrices A and B need not be full rank [Citation33]. The constrained optimization problem is then given by (12) argminXAXBF2subject to XXT=XTX=I,(12) where A,BRm×n and XRn×n is orthogonal. Herein, we are most interested in cases where m<n. To find the solution to (Equation12), we consider the Lagrangian function L(X)=trace((AXB)T(AXB))+trace(Λ(XXTI)),which is equivalently written as (13) L(X)=j=1ni=1m(q=1naiqxqjbij)2+q,r=1nλqr(k=1nxqsxrsδqr),(13) where Λ=[λqr]q,r=1n is the symmetric matrix of Lagrange multipliers [Citation15]. Partial differentiation of (Equation13) yields (14) L(X)xpj=2i=1m(q=1naiqxqjbij)aip+2r=1nλqrxrj.(14) On setting (Equation14) equal to zero and using matrix notation, (Equation14) can be written as AqTAqXj+ΛqXj=AqTBj,q,j=1,,n,or equivalently, as (15) (ATA+Λ)X=ATB,(15) where Aq is the qth column of A, Xj, Bj are the jth columns of X and B, respectively, and Λq is the qth row of Λ. From (Equation15), (ATA+Λ)XXT(ATA+Λ)T=ATB(ATB)T, therefore Λ=(ATBBTA)1/2ATA and, as observed earlier, Λ is symmetric. Now on post-multiplying (Equation15) by XT gives ATA+Λ=ATBXT which implies that Λ=ATBXTATA, whence ATBXT=XBTA. Therefore, (16) ATB=XBTAX.(16) Furthermore, let H=ATB and consider the following two matrices F=HHT=ATBBTAand G=HTH=BTAATB.Matrices F, GRn×n are symmetric and thus they are diagonalizable and their eigenvalue decomposition (EVD) exists, that is, (17a) F=UDUT(17a) and (17b) G=VDVT,(17b) where U,VRn×n are orthogonal. Additionally, since they are of the form HHT and HTH they have the same eigenvalues. Now on using (Equation16) it follows that F=ATBBTA=(XBTAX)(XBTAX)T=XBTAATBXT=XGXT,and from (Equation17a) and (Equation17b) we now have that F=UDUT=XVDVTXT,which implies that U=XV, where U is as in (Equation17a). Therefore, the solution to the least squares problem (Equation12) is given by X=UVT.Finally, a sufficient condition which guaranties the uniqueness of the solution X and that the argument in (Equation12) is minimized requires that all the diagonal elements of the matrix D1/2 in (Equation17a) and (Equation17b) are non-negative [Citation15,Citation33]. In essence, this condition determines the orientation of the orthogonal matrices U and V and is specified by the Eckart–Young decomposition (see [Citation10]) of ATB, namely ATB=UD1/2VT.Notice that in the case of symmetric orthogonality the procedure is the same but at the last step the symmetry of X needs to be taken into account. Namely, the sum of UVT is not explicitly computed since only the upper or lower part of X needs to be determined. This results in a dimensional reduction of the solution matrix from n2 to n(n+1)/2 [Citation40].

3.1. Recursive solution to the orthogonal Procrustes problem

Suppose that an updated orthogonal Procrustes problem needs to be solved when new data become available. Without loss of generality, it is assumed that the original OPP (12) needs to be re-solved after appending a new single row of data to the matrices $A$ and $B$. That is, let the row-updated matrices $\tilde{A}$ and $\tilde{B}$ be as in (7), where $A, B$ are as in (12) and $a, b \in \mathbb{R}^{1 \times n}$. The updated orthogonal Procrustes problem based on $\tilde{A}$ and $\tilde{B}$ requires that the solution of the following least squares problem be derived:
$$\operatorname*{argmin}_{\tilde{X}} \|\tilde{A}\tilde{X} - \tilde{B}\|_F^2 \quad \text{subject to} \quad \tilde{X}\tilde{X}^T = \tilde{X}^T\tilde{X} = I, \qquad (18)$$
where $\tilde{X} \in \mathbb{R}^{n \times n}$. The solution to the least squares problem (18) is obtained by using the method of Lagrange multipliers, namely
$$L(\tilde{X}) = \operatorname{trace}\bigl((\tilde{A}\tilde{X} - \tilde{B})^T (\tilde{A}\tilde{X} - \tilde{B})\bigr) + \operatorname{trace}\bigl(\tilde{\Lambda}(\tilde{X}\tilde{X}^T - I)\bigr), \qquad (19)$$
which, on using (1), yields the first order condition
$$\frac{\partial L(\tilde{X})}{\partial \tilde{X}} = 2\tilde{A}^T\tilde{A}\tilde{X} - 2\tilde{A}^T\tilde{B} + (\tilde{\Lambda}^T + \tilde{\Lambda})\tilde{X}.$$
In a way similar to (12), (19) has the solution
$$\tilde{X} = \tilde{U}\tilde{V}^T, \qquad (20)$$
where $\tilde{U}$ and $\tilde{V}$ are the orthogonal matrices of the EVDs of $\tilde{A}^T\tilde{B}\tilde{B}^T\tilde{A}$ and $\tilde{B}^T\tilde{A}\tilde{A}^T\tilde{B}$, respectively.

The recursive solution of the OPP presumes that previous computations in (17a) from the solution of the original Procrustes problem (12) are efficiently utilized. Consider the recursive computation of the EVD of $\tilde{A}^T\tilde{B}\tilde{B}^T\tilde{A}$, namely
$$\tilde{A}^T\tilde{B}\tilde{B}^T\tilde{A} = (A^T B + a^T b)(A^T B + a^T b)^T = A^T B B^T A + A^T B b^T a + a^T b B^T A + a^T b b^T a = U D U^T + A^T B b^T a + a^T b B^T A + a^T b b^T a \qquad (21)$$
on using (17a). Therefore, the recursive solution of an orthogonal Procrustes problem becomes a modified symmetric matrix eigenvalue problem [Citation3,Citation4,Citation16]. In particular, (21) implies three rank-1 modifications of the EVD $U D U^T$ of the matrix $A^T B B^T A$ in (17a). That is, the EVD of $\tilde{A}^T\tilde{B}\tilde{B}^T\tilde{A}$ is obtained recursively in three main steps. First, (21) is written as
$$\tilde{A}^T\tilde{B}\tilde{B}^T\tilde{A} = U D U^T + \begin{pmatrix} \tilde{b} \\ a \end{pmatrix}^T \begin{pmatrix} 0 & 1 \\ 1 & \beta \end{pmatrix} \begin{pmatrix} \tilde{b} \\ a \end{pmatrix}, \qquad (22)$$
where $\tilde{b} = b B^T A$ and $\beta = b b^T$ is a scalar, since $b$ and $a$ are row vectors. Second, the following EVD is derived:
$$\Delta = \begin{pmatrix} 0 & 1 \\ 1 & \beta \end{pmatrix} = Q \begin{pmatrix} \beta_1 & 0 \\ 0 & \beta_2 \end{pmatrix} Q^T \qquad (23)$$
and, since $\Delta$ is a symmetric matrix, it is diagonalizable. Third, on using (23), (22) becomes
$$\tilde{A}^T\tilde{B}\tilde{B}^T\tilde{A} = U D U^T + \begin{pmatrix} \tilde{b} \\ a \end{pmatrix}^T Q \begin{pmatrix} \beta_1 & 0 \\ 0 & \beta_2 \end{pmatrix} Q^T \begin{pmatrix} \tilde{b} \\ a \end{pmatrix} = U D U^T + \begin{pmatrix} \tilde{b}_1 \\ \tilde{b}_2 \end{pmatrix}^T \begin{pmatrix} \beta_1 & 0 \\ 0 & \beta_2 \end{pmatrix} \begin{pmatrix} \tilde{b}_1 \\ \tilde{b}_2 \end{pmatrix} = U D U^T + \beta_1 \tilde{b}_1^T \tilde{b}_1 + \beta_2 \tilde{b}_2^T \tilde{b}_2 = U (D + \beta_1 \tilde{\tilde{b}}_1^T \tilde{\tilde{b}}_1) U^T + \beta_2 \tilde{b}_2^T \tilde{b}_2, \qquad (24)$$
where $\bigl(\begin{smallmatrix} \tilde{b}_1 \\ \tilde{b}_2 \end{smallmatrix}\bigr) = Q^T \bigl(\begin{smallmatrix} \tilde{b} \\ a \end{smallmatrix}\bigr)$, $\tilde{\tilde{b}}_1 = \tilde{b}_1 U$ and $\beta_1$, $\beta_2$ are the eigenvalues of $\Delta$. Consider now the sequential updating of the diagonal matrix $D$ in two steps. The first step computes the EVD $D + \beta_1 \tilde{\tilde{b}}_1^T \tilde{\tilde{b}}_1 = U_1 D_1 U_1^T$, where $U_1 \in \mathbb{R}^{n \times n}$ is orthogonal and $D_1 \in \mathbb{R}^{n \times n}$ is diagonal. The second step computes the EVD $D_1 + \beta_2 \tilde{\tilde{b}}_2^T \tilde{\tilde{b}}_2 = U_2 \tilde{D} U_2^T$, where $\tilde{\tilde{b}}_2 = \tilde{b}_2 U U_1$, $U_2 \in \mathbb{R}^{n \times n}$ is orthogonal and $\tilde{D} \in \mathbb{R}^{n \times n}$ is the diagonal matrix whose elements are the eigenvalues of $\tilde{A}^T\tilde{B}\tilde{B}^T\tilde{A}$. That is, $\tilde{A}^T\tilde{B}\tilde{B}^T\tilde{A} = \tilde{U}\tilde{D}\tilde{U}^T$, where $\tilde{U} = U U_1 U_2$ and $U$ is as in (17a). Repeating the procedure of steps (21)–(24) for $\tilde{B}^T\tilde{A}\tilde{A}^T\tilde{B}$ derives its EVD recursively and therefore gives $\tilde{B}^T\tilde{A}\tilde{A}^T\tilde{B} = \tilde{V}\tilde{D}\tilde{V}^T$, where $\tilde{V} = V V_1 V_2$, $V$ is defined in (17b) and $V_1, V_2$ are computed in a way similar to that for (24). Therefore, the updated solution (20) to the Procrustes problem (18) has been derived.
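The rank-1 modifications of steps (21)–(24) can be sketched structurally as follows (illustrative names; $D$ is passed as the vector of eigenvalues in (17a), and the two dense eigh calls stand in for proper rank-one eigenvalue updates based on the secular equation [Citation3,Citation4,Citation16], so the sketch reproduces the algebra but not the reduced flop count):

```python
import numpy as np

def opp_evd_row_update(U, D, AtB, a, b):
    """Structural sketch of (21)-(24): update the EVD of F = A^T B B^T A when
    the row pair (a, b) is appended.  U, D satisfy F = U diag(D) U^T and
    AtB = A^T B; a and b are 1-D arrays holding the new rows."""
    b_t = b @ AtB.T                      # b~ = b B^T A
    beta = float(b @ b)                  # beta = b b^T
    # EVD of the 2 x 2 core Delta = [[0, 1], [1, beta]] as in (23)
    betas, Q = np.linalg.eigh(np.array([[0.0, 1.0], [1.0, beta]]))
    W = Q.T @ np.vstack((b_t, a))        # rows b~_1 and b~_2 of (24)
    # first rank-one correction: D + beta_1 (b~_1 U)^T (b~_1 U) = U_1 D_1 U_1^T
    w1 = W[0] @ U
    D1, U1 = np.linalg.eigh(np.diag(D) + betas[0] * np.outer(w1, w1))
    # second rank-one correction: D_1 + beta_2 (b~_2 U U_1)^T (b~_2 U U_1) = U_2 D~ U_2^T
    w2 = W[1] @ U @ U1
    D2, U2 = np.linalg.eigh(np.diag(D1) + betas[1] * np.outer(w2, w2))
    return U @ U1 @ U2, D2               # U~ = U U_1 U_2 and the eigenvalues of F~
```

The companion call opp_evd_row_update(V, D, AtB.T, b, a) updates the factor V of G in the same way, and the updated OPP solution is then $\tilde{X} = \tilde{U}\tilde{V}^T$ as in (20).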

When extra columns are added to $A$ and $B$, that is, when
$$\breve{A} = \begin{pmatrix} A & A_{n+1} \end{pmatrix} \qquad \text{and} \qquad \breve{B} = \begin{pmatrix} B & B_{n+1} \end{pmatrix}, \qquad (25)$$
where the blocks $A_{n+1}$ and $B_{n+1}$ hold the $k$ new columns, an updated orthogonal Procrustes problem of larger dimensions needs to be solved. Namely,
$$\operatorname*{argmin}_{\breve{X}} \|\breve{A}\breve{X} - \breve{B}\|_F^2 \quad \text{subject to} \quad \breve{X}\breve{X}^T = \breve{X}^T\breve{X} = I, \qquad (26)$$
where $\breve{X} \in \mathbb{R}^{(n+k) \times (n+k)}$ has been augmented by $k$ columns and $k$ rows. In a similar way as for (18), the solution of (26) is given by $\breve{X} = \breve{U}\breve{V}^T$. It is obtained recursively, by updating the original orthogonal decompositions as in (24). Notice that the addition of extra columns implies a rank-$k$ updating of the EVD.

4. The symmetric Procrustes problem

Consider the problem of minimizing the sum of the squared errors of the difference matrix $A X_S - B$ when the unknown matrix $X_S$ is symmetric. The constrained optimization problem is given by
$$\operatorname*{argmin}_{X_S} \|A X_S - B\|_F^2 \quad \text{subject to} \quad X_S^T = X_S,$$
where $X_S \in \mathbb{R}^{n \times n}$ is a symmetric matrix. As in the case of the OPP (12), the matrices $A$ and $B$ are not necessarily of full rank. Using the method of Lagrange multipliers, the problem becomes that of finding the matrix $X_S$ which minimizes
$$L(X_S) = \operatorname{trace}\bigl((A X_S - B)^T (A X_S - B)\bigr) + \operatorname{trace}\bigl(\Lambda (X_S - X_S^T)\bigr).$$
On using (1), partial differentiation yields
$$\frac{\partial L(X_S)}{\partial X_S} = 2 A^T A X_S - 2 A^T B + \Lambda^T - \Lambda,$$
which is set to zero. Now let $M = \Lambda - \Lambda^T$, which is skew-symmetric, that is, $M = -M^T$. It follows that $2 A^T A X_S - 2 A^T B = M$ and, on transposing and using the skew-symmetry of $M$,
$$2 A^T A X_S - 2 A^T B = -(2 X_S A^T A - 2 B^T A),$$
which yields the Lyapunov equation
$$A^T A X_S + X_S A^T A = A^T B + B^T A. \qquad (27)$$
Since $A^T A$ is symmetric, it is a diagonalizable matrix. That is, there is an orthogonal matrix $P$ and a diagonal matrix $D_A$ such that
$$A^T A = P D_A P^T, \qquad (28)$$
where the columns of $P \in \mathbb{R}^{n \times n}$ are the eigenvectors of $A^T A$ and $D_A = \operatorname{diag}(\mu_1, \ldots, \mu_n)$ has as diagonal elements the eigenvalues of $A^T A$. On using (28), (27) becomes
$$D_A X_S^{(P)} + X_S^{(P)} D_A = S, \qquad (29)$$
where $X_S^{(P)} = P^T X_S P$ and $S = P^T (A^T B + B^T A) P$. By utilizing the diagonal structure of $D_A$, it follows that
$$x_{i,j} = \frac{s_{ij}}{\mu_i + \mu_j},$$
where $X_S^{(P)} = [x_{i,j}]_{i,j=1}^{n}$. The solution is then given by
$$X_S = P X_S^{(P)} P^T.$$
A necessary and sufficient condition for the uniqueness of $X_S$, since $S$ is positive definite, is that all the eigenvalues of $A^T A$ have a negative real part, that is, $A^T A$ is a stable matrix [Citation2,Citation34,Citation35].
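A minimal sketch of this eigendecomposition-based Lyapunov solution follows (illustrative names; it assumes that all sums $\mu_i + \mu_j$ are nonzero, so that (29) can be solved elementwise):

```python
import numpy as np

def symmetric_procrustes(A, B):
    """Sketch of the symmetric Procrustes solution of Section 4: solve the
    Lyapunov equation (27) A^T A X_S + X_S A^T A = A^T B + B^T A through the
    EVD of A^T A."""
    mu, P = np.linalg.eigh(A.T @ A)            # (28): A^T A = P D_A P^T
    S = P.T @ (A.T @ B + B.T @ A) @ P          # transformed right-hand side of (29)
    Xp = S / (mu[:, None] + mu[None, :])       # (29): x_ij = s_ij / (mu_i + mu_j)
    return P @ Xp @ P.T                        # back-transform: X_S = P X_S^(P) P^T

# Usage: the result is symmetric and satisfies the Lyapunov equation (27)
rng = np.random.default_rng(2)
A, B = rng.standard_normal((30, 5)), rng.standard_normal((30, 5))
X_S = symmetric_procrustes(A, B)
print(np.allclose(X_S, X_S.T),
      np.allclose(A.T @ A @ X_S + X_S @ A.T @ A, A.T @ B + B.T @ A))
```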

4.1. Recursive solution to the symmetric Procrustes problem

Consider now the case where the matrices $A$ and $B$ are augmented by the addition of an extra row, as in (7). The updated symmetric Procrustes problem requires the solution of the optimization problem
$$\operatorname*{argmin}_{\tilde{X}_S} \|\tilde{A}\tilde{X}_S - \tilde{B}\|_F^2 \quad \text{subject to} \quad \tilde{X}_S = \tilde{X}_S^T,$$
where $\tilde{A}$ and $\tilde{B}$ are defined as in (7). Using the Lagrange multipliers and the analysis of the symmetric Procrustes problem in the previous section, the solution is obtained from the Lyapunov equation
$$\tilde{A}^T\tilde{A}\tilde{X}_S + \tilde{X}_S\tilde{A}^T\tilde{A} = \tilde{A}^T\tilde{B} + \tilde{B}^T\tilde{A} \qquad (30)$$
by computing the EVD of $\tilde{A}^T\tilde{A}$. To solve (30) recursively, observe that using (28) yields
$$\tilde{A}^T\tilde{A} = A^T A + a^T a = P D_A P^T + a^T a = P (D_A + \tilde{a}^T \tilde{a}) P^T,$$
where $\tilde{a} = a P$. Therefore, the sequential updating of $D_A$ requires one rank-1 update, namely $D_A + \tilde{a}^T \tilde{a} = P_1 \tilde{D}_A P_1^T$, whence
$$\tilde{A}^T\tilde{A} = \tilde{P}\tilde{D}_A\tilde{P}^T$$
with $\tilde{P} = P P_1$. The Lyapunov equation (30) then becomes
$$\tilde{D}_A \tilde{X}_S^{(P)} + \tilde{X}_S^{(P)} \tilde{D}_A = \tilde{S}, \qquad (31)$$
where $\tilde{X}_S^{(P)} = \tilde{P}^T \tilde{X}_S \tilde{P}$ and $\tilde{S} = \tilde{P}^T (\tilde{A}^T\tilde{B} + \tilde{B}^T\tilde{A}) \tilde{P}$. The solution to (31) is obtained in a way similar to that for (29).
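A structural sketch of this recursive update follows (illustrative names; $a$ and $b$ are taken as one-dimensional arrays holding the new row, and the dense eigh call again stands in for a proper rank-one EVD update):

```python
import numpy as np

def symmetric_procrustes_row_update(P, mu, C, a, b):
    """Sketch of Section 4.1: update the symmetric Procrustes solution after
    appending the row pair (a, b) to (A, B).  mu, P is the EVD (28) of A^T A
    (mu the eigenvalues, P the eigenvectors) and C = A^T B + B^T A."""
    a_rot = a @ P                                              # a~ = a P
    mu_new, P1 = np.linalg.eigh(np.diag(mu) + np.outer(a_rot, a_rot))  # D_A + a~^T a~
    P_new = P @ P1                                             # P~ = P P_1
    C_new = C + np.outer(a, b) + np.outer(b, a)                # A~^T B~ + B~^T A~
    S = P_new.T @ C_new @ P_new                                # S~ in (31)
    Xp = S / (mu_new[:, None] + mu_new[None, :])               # solve (31) elementwise
    return P_new @ Xp @ P_new.T, P_new, mu_new                 # X~_S and updated factors
```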

5. Computational remarks

To assess the practical usability of the proposed method, the theoretical complexity of the new algorithms has been studied. Consider the general Procrustes problem when no constraint is imposed on the least squares solution matrix $X$. Let the matrices $\tilde{A}, \tilde{B} \in \mathbb{R}^{(m+1) \times n}$ be as in (7) and assume that the QRD of $A$ in (6) is available. The complexity of computing the QRD of $\tilde{A}$ afresh is $2n^2(m + 1 - n/3)$ floating point operations (flops) and that of applying these orthogonal transformations to the matrix $\tilde{B}$ is $2n^2(2m + n + 3)$ flops. Computing the updating QRD in (9) in order to update $R_A$ and $R_{B_1}$ requires $4n^2$ and $8n^2$ flops, respectively. As a result, computing the QRD of $\tilde{A}$ and applying the orthogonal transformations to $\tilde{B}$ will always be computationally more demanding than computing the solution recursively by utilizing previous calculations. The computational efficiency of the recursive method compared to computing the QRD from scratch is, approximately, $(9m + 2n)/18$; for example, with $m = 500$ and $n = 1000$ the recursive strategy is roughly $(9 \cdot 500 + 2 \cdot 1000)/18 \approx 361$ times cheaper.

The solution of the orthogonal Procrustes problem is derived from the EVDs of the square matrices $F = A^T B (A^T B)^T$ and $G = B^T A A^T B$. However, in practice the SVD $A^T B = U \Sigma V^T$ is computed, since $F = U \Sigma^2 U^T = U D U^T$ and $G = V \Sigma^2 V^T = V D V^T$. When $A$ and $B$ are modified, the computationally efficient solution to the orthogonal Procrustes problem requires that the SVD of $A^T B$ be computed recursively. Therefore, in the recursive solution of the orthogonal and symmetric Procrustes problems, updating SVD decompositions are computed which provide results equivalent to those obtained if the EVD were computed. The SVD of an $m \times n$ matrix requires $n^3 + mn^2 + O(mn)$ flops in the first stage and $kmn^2 + kn^2 + O(mn)$ flops in the second stage, where $k$ is the number of iterations required to compute each singular value. Herein, it is assumed that $m \geq n$ and that $\operatorname{rank}(A^T B) = r < \min(m, n)$.

The algorithms employ a low-rank modification strategy for the recursive updating of the SVD, as in [Citation3]. Algorithms 5.1–5.3 summarize the main computational steps for the solution of the orthogonal Procrustes problem and for the proposed recursive solutions when new rows or columns of data are added, respectively.

To evaluate further the new algorithms, experiments based on synthetic and real data have been conducted. The computational times of the new algorithms have been compared with the algorithm that solves the same problem afresh in order to obtain the efficiency ratio.

Table 1 presents the execution times (in CPU seconds) of Algorithm 5.1 compared with Algorithm 5.3 when m = 50, 100, 250, 500 rows, n = 1000, 5000, and a single column of data is added to A and B, that is, k = 1. The results show that keeping n fixed and increasing m reduces the computational efficiency. However, comparing Panels A and B shows that the efficiency gain is more significant when n increases. Table 2 presents the execution times (in CPU seconds) of Algorithms 5.1 and 5.2 when m = 100, n = 1000 and when m = 500, n = 10,000, with a variable number k of new columns added to A and B. All the times presented are the average times after solving the same OPP (afresh or recursively) 100 times. The computational results in Table 2 show that as the dimensions increase the computational efficiency also increases. Furthermore, the results suggest that the proposed algorithm for the addition of new columns is more efficient when a small number of extra columns is included. Comparing the results in Panels A and B of Table 2, we see that the efficiency of the proposed recursive method increases when the dimensions of the matrices in the original OPP also increase.

Table 1. Execution times in seconds of the recursive solution of the OPP when a single column is added.

Table 2. Execution times in seconds of the recursive solution of the OPP when new columns are added.

The solution of an OPP is often applied in the development of algorithms for face recognition; see, for example, [Citation6,Citation30,Citation37,Citation43]. The algorithm proposed in [Citation43] for feature extraction requires solving, until convergence, a series of updated OPPs as in (12), where A is the data matrix and B is the class indicator matrix, after they have both been centred. The processing of an image starts by considering its pixels as the elements of a matrix and then converting each matrix (or each image) to a column vector. Each row in A has length equal to the number of features of the images. When a new image becomes available and is added to the database, a row is added to A. The face recognition algorithm then has to process the extra image, that is, the additional row of the data matrix A. This is equivalent to row updating as in (18). Herein, Algorithm 5.3 is employed to show the computational efficiency of utilizing previous computations when a series of updating OPPs is solved after augmenting the data matrix.

In Table 3 the algorithm that solves the problems afresh and the recursive algorithm are compared. It is assumed that there are m = 2000, 5000, 10,000, 15,000, 20,000 images available, which have been cropped and re-sized to 16×16 and 32×32 pixels, that is, n = 256, 1024. The run times (in CPU seconds) when an extra image, or alternatively an extra row of data, is included 100 times are presented. The results show that, when new rows of data are added to the problem, the efficiency increases as the number of rows increases. The reason for this is that the recursive algorithm does not depend on m, the number of rows of the matrices A and B. Instead, it depends on the number of columns n of A and B and on the number of rows l added to the model. This is also why the timings of Algorithm 5.3 are the same when n and l are fixed.

Table 3. Total execution times in seconds of the recursive solution of the OPP when a new single row of data is added 100 times.

Overall, the results show that the computational efficiency of the proposed recursive algorithm increases as the dimensions of the matrices (data) increase. The computational gains are most significant when the number of rows or columns appended is small compared to the dimensions of the original matrices, that is, when the modification is of low rank. This demonstrates the practicality of the proposed method when sequentially solving OPPs with small modifications in the underlying dataset. The proposed methods are particularly useful when the matrices involved are large-scale and the data are high-dimensional. All the reported computational results were obtained on a 64-bit 1.80 GHz Core(TM) i7-8550U processor with 16.0 GB RAM using R (version 3.6.1).

6. Conclusions and future work

The recursive least squares solution to the matrix equation AX = B when new rows or columns of data are added to the data matrices A and B has been thoroughly investigated. A computationally efficient algorithm based on the singular value decomposition is proposed. Computational results are presented for synthetic data and also for a machine learning application based on feature extraction for face recognition. The recursive solution to the symmetric Procrustes problem when extra data are included is also investigated. The experimental results suggest that the proposed algorithms are computationally more efficient when the matrices are high-dimensional and when they are augmented with a small number of rows or columns.

The extension of the proposed method to other special classes of matrices, such as reflexive and anti-reflexive matrices, Stiefel matrices or Toeplitz matrices, merits further investigation. Future work will also consider the solution to the Procrustes problem with regularization constraints after modifying the matrices with the addition or deletion of data.

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

  • E. Anderson, Z. Bai, and J. Dongarra, Generalized QR factorization and its applications, Linear Algebra Appl. 162–164(0) (1992), pp. 243–271.
  • R.H. Bartels and G.W. Stewart, Solution of the matrix equation AX + XB = C [F4], Commun. ACM 15(9) (1972), pp. 820–826.
  • M. Brand, Fast low-rank modifications of the thin singular value decomposition, Linear Algebra Appl. 415(1) (2006), pp. 20–30.
  • J.R. Bunch and C.P. Nielsen, Updating the singular value decomposition, Numer. Math. 31(2) (1978), pp. 111–129.
  • J.R. Cardoso and K. Ziętak, On a sub-Stiefel Procrustes problem arising in computer vision, Numer. Linear Algebra Appl. 22(3) (2015), pp. 523–547.
  • D. Chen, X. Cao, F. Wen, and J. Sun. Blessing of dimensionality: High-dimensional feature and its efficient compression for face verification, in 2013 Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Portland, OR, USA, 2013, pp. 3025–3032.
  • R. Christensen, L.M. Pearson, and W. Johnson, Case-deletion diagnostics for mixed models, Technometrics 34(1) (1992), pp. 38–45.
  • K. Chu, Singular value and generalized singular value decompositions and the solution of linear matrix equations, Linear Algebra Appl. 88 (1987), pp. 83–98.
  • R.D. Cook, Influential observations in linear regression, J. Am. Stat. Assoc. 74(365) (1979), pp. 169–174.
  • C. Eckart and G. Young, The approximation of one matrix by another of lower rank, Psychometrika 1(3) (1936), pp. 211–218.
  • L. Eldén and H. Park, A procrustes problem on the Stiefel manifold, Numer. Math. 82(4) (1999), pp. 599–619.
  • Y. Gong, S. Lazebnik, A. Gordo, and F. Perronnin, Iterative quantization: A procrustean approach to learning binary codes for large-scale image retrieval, IEEE Trans. Pattern Anal. Mach. Intell. 35(12) (2013), pp. 2916–2929.
  • J.C. Gower, Generalized procrustes analysis, Psychometrika 40(1) (1975), pp. 33–51.
  • J.C. Gower, Procrustes methods, Interdiscip. Rev. Comput. Stat. 2(4) (2010), pp. 503–508.
  • B.F. Green, The orthogonal approximation of an oblique structure in factor analysis, Psychometrika 17(4) (1952), pp. 429–440.
  • M. Gu and S.C. Eisenstat, Downdating the singular value decomposition, SIAM J. Matrix Anal. Appl. 16(3) (1995), pp. 793–810.
  • S. Hadjiantoni and E.J. Kontoghiorghes, Estimating large-scale general linear and seemingly unrelated regressions models after deleting observations, Stat. Comput. 27(2) (2017), pp. 349–361.
  • S. Hadjiantoni and E.J. Kontoghiorghes, A recursive three-stage least squares method for large-scale systems of simultaneous equations, Linear Algebra Appl. 536 (2018), pp. 210–227.
  • N.J. Higham, The symmetric procrustes problem, BIT Numer. Math. 28(1) (1988), pp. 133–143.
  • N.J. Higham and N. Strabić, Bounds for the distance to the nearest correlation matrix, SIAM J. Matrix Anal. Appl. 37(3) (2016), pp. 1088–1102.
  • D. Hua, On the symmetric solutions of linear matrix equations, Linear Algebra Appl. 131 (1990), pp. 1–7.
  • A.-P. Liao and Y. Lei, Least-squares solution with the minimum-norm for the matrix equation (AXB, GXH) = (C, D), Comput. Math. Appl. 50(3) (2005), pp. 539–549.
  • X. Liu, Hermitian and non-negative definite reflexive and anti-reflexive solutions to AX=B, Int. J. Comput. Math. 95(8) (2018), pp. 1666–1671.
  • Y.H. Liu, Ranks of least squares solutions of the matrix equation AXB=C, Comput. Math. Appl. 55(6) (2008), pp. 1270–1278.
  • J.R. Magnus and H. Neudecker, Matrix Differential Calculus with Applications in Statistics and Econometrics, 3rd ed., Wiley, Chichester, UK, 2007.
  • F.L. Markley, Attitude determination using vector observations and the singular value decomposition, J. Astronaut. Sci. 36(3) (1988), pp. 245–258.
  • C. Meng, X. Hu, and L. Zhang, The skew-symmetric orthogonal solutions of the matrix equation AX=B, Linear Algebra Appl. 402 (2005), pp. 303–318.
  • C.I. Mosier, Determining a simple structure when loadings for certain tests are known, Psychometrika 4(2) (1939), pp. 149–162.
  • X.Y. Peng, X.Y. Hu, and L. Zhang, The reflexive and anti-reflexive solutions of the matrix equation AHXB=C, J. Comput. Appl. Math. 200(2) (2007), pp. 749–760.
  • Y. Peng, W. Kong, and B. Yang, Orthogonal extreme learning machine for image classification, Neurocomputing 266 (2017), pp. 458–464.
  • Y. Qiu and A. Wang, Solving balanced procrustes problem with some constraints by eigenvalue decomposition, J. Comput. Appl. Math. 233(11) (2010), pp. 2916–2924.
  • M.B. Rubin and D. Solav, Unphysical properties of the rotation tensor estimated by least squares optimization with specific application to biomechanics, Int. J. Eng. Sci. 103 (2016), pp. 11–18.
  • P.H. Schönemann, A generalized solution of the orthogonal procrustes problem, Psychometrika 31(1) (1966), pp. 1–10.
  • V. Simoncini, Computational methods for linear matrix equations, SIAM Rev. 58(3) (2016), pp. 377–441.
  • J. Snyders and M. Zakai, On nonnegative solutions of the equation AD+DA′=−C, SIAM J. Appl. Math. 18(3) (1970), pp. 704–714.
  • Y. Sun and L. Vandenberghe, Decomposition methods for sparse matrix nearness problems, SIAM J. Matrix Anal. Appl. 36(4) (2015), pp. 1691–1717.
  • Y. Tai, J. Yang, Y. Zhang, L. Luo, J. Qian, and Y. Chen, Face recognition with pose variations and misalignment via orthogonal procrustes regression, IEEE Trans. Image Process. 25(6) (2016), pp. 2673–2683.
  • Y. Tian and Y. Takane, On consistency, natural restrictions and estimability under classical and extended growth curve models, J. Stat. Plan. Inference 139(7) (2009), pp. 2445–2458.
  • J.A. Tropp, A. Yurtsever, M. Udell, and V. Cevher, Practical sketching algorithms for low-rank matrix approximation, SIAM J. Matrix Anal. Appl. 38(4) (2017), pp. 1454–1485.
  • W.J. Vetter, Vector structures and solutions of linear matrix equations, Linear Algebra Appl. 10(2) (1975), pp. 181–188.
  • G. Wahba, A least squares estimate of satellite attitude, SIAM Rev. 7(3) (1965), pp. 409–409.
  • P.C. Young, Recursive Estimation and Time-Series Analysis, 2nd ed., Springer-Verlag, Berlin Heidelberg, 2011.
  • H. Zhao, Z. Wang, and F. Nie, Orthogonal least squares regression for feature extraction, Neurocomputing 216 (2016), pp. 200–207.