Original Articles

ADMM for multiaffine constrained optimization

Pages 257-303 | Received 28 May 2019, Accepted 18 Oct 2019, Published online: 06 Nov 2019
 

ABSTRACT

We expand the scope of the alternating direction method of multipliers (ADMM). Specifically, we show that ADMM, when employed to solve problems with multiaffine constraints that satisfy certain verifiable assumptions, converges to the set of constrained stationary points if the penalty parameter in the augmented Lagrangian is sufficiently large. When the Kurdyka–Łojasiewicz (K–Ł) property holds, this is strengthened to convergence to a single constrained stationary point. Our analysis applies under assumptions that we have endeavoured to make as weak as possible. It applies to problems that involve nonconvex and/or nonsmooth objective terms, in addition to the multiaffine constraints that can involve multiple (three or more) blocks of variables. To illustrate the applicability of our results, we describe examples including nonnegative matrix factorization, sparse learning, risk parity portfolio selection, nonconvex formulations of convex problems and neural network training. In each case, our ADMM approach encounters only subproblems that have closed-form solutions.
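The paper's scheme handles multiaffine constraints coupling three or more blocks; as background for the mechanics the abstract refers to, the following is only a minimal sketch of classical two-block ADMM on a toy problem (minimize x² + z² subject to x + z = 1), not the paper's multiaffine method. The penalty value and iteration count here are arbitrary illustrative choices.

```python
# Toy two-block ADMM sketch (illustrative only, not the paper's scheme):
#   minimize x^2 + z^2  subject to  x + z = 1.
# Augmented Lagrangian: x^2 + z^2 + lam*(x + z - 1) + (rho/2)*(x + z - 1)^2.
rho = 10.0          # penalty parameter (must be "sufficiently large" in general)
x = z = lam = 0.0
for _ in range(200):
    # x-update: closed-form minimizer of the augmented Lagrangian in x
    x = (rho * (1 - z) - lam) / (2 + rho)
    # z-update: same closed form, using the fresh x
    z = (rho * (1 - x) - lam) / (2 + rho)
    # dual ascent step on the multiplier
    lam += rho * (x + z - 1)

print(x, z, lam)  # both primal variables approach 0.5; lam approaches -1
```

Here each subproblem has a closed-form solution, mirroring the property the abstract highlights for the paper's applications; the analysis in the paper addresses the far harder setting where the constraint couples blocks multiaffinely and the objective may be nonconvex and nonsmooth.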

2010 MATHEMATICS SUBJECT CLASSIFICATIONS:

Acknowledgements

We thank Qing Qu, Yuqian Zhang and John Wright for helpful discussions about applications of multiaffine ADMM. We thank Wotao Yin for his feedback, and for bringing the paper [Citation23] to our attention. We thank Yenson Lau for discussions about numerical experiments.

Disclosure statement

No potential conflict of interest was reported by the authors.

Notes

1 [Citation23] shows that every limit point of ADMM for the problem (NMF) is a constrained stationary point, but does not show that such limit points necessarily exist.

2 See Section 5.4 for the definitions of λ_min and λ_++.

3 Note that we have deliberately excluded the index 0: Assumption 4.2.2 is not required to hold for X0.

4 That is, either (1a) and (1b) hold, or (2a) and (2b) hold.

5 As an illustrative example, a problem may be formulated with the constraints X0X1 + Z1 = 0, X0 + P1(X1) + Z2 = 0, X0X2 + Z3 = 0, P2(X2) + Z4 = 0, where P1, P2 are injective linear maps. The notation A(X, Z0) + Q(Z>) denotes the concatenation of these equations, which can also be seen naturally as a system of four constraints. In this case, the indices r(·) ∈ {1, 2, 3, 4}, and 4.2.2(2a) is satisfied by the second constraint X0 + P1(X1) + Z2 = 0 for the variables X0, X1 (i.e. r(0) = r(1) = 2 and R0 = I, R1 = P1), and by the fourth constraint P2(X2) + Z4 = 0 for X2.
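To make "multiaffine" concrete: each constraint in the footnote above is affine in every block of variables when the other blocks are held fixed, though not jointly affine. A small hedged check of this for the bilinear term X0X1 (the matrices and sizes below are arbitrary test data, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
X0, X0b = rng.standard_normal((2, 3, 3))   # two test values for the X0 block
X1 = rng.standard_normal((3, 3))           # a fixed value for the X1 block
a, b = 2.0, -0.5

def A(X0, X1):
    """The bilinear term X0 @ X1 appearing in the first constraint."""
    return X0 @ X1

# Linear in X0 when X1 is fixed ...
assert np.allclose(A(a * X0 + b * X0b, X1), a * A(X0, X1) + b * A(X0b, X1))
# ... but not jointly linear: scaling both blocks scales the output by a^2.
assert np.allclose(A(a * X0, a * X1), a**2 * A(X0, X1))
print("multiaffine check passed")
```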

6 Note that X0 is included here, unlike in Assumption 4.2.

7 For linear constraints A1x1 + ⋯ + Anxn = b, the equivalent statement is that Im(An) ⊇ ⋃_{i=1}^{n−1} Im(Ai), i.e. Im(Ai) ⊆ Im(An) for each i &lt; n.
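This image-containment condition can be checked numerically: Im(Ai) ⊆ Im(An) holds exactly when appending the columns of Ai to An leaves the rank unchanged. A hedged sketch with made-up matrices (the helper name is ours, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)
An = rng.standard_normal((4, 4))   # generically full rank: Im(An) is all of R^4
Ai = rng.standard_normal((4, 2))   # its image is then contained in Im(An)

def image_contained(Ai, An):
    """True iff Im(Ai) is a subset of Im(An), via a rank comparison."""
    return np.linalg.matrix_rank(np.hstack([An, Ai])) == np.linalg.matrix_rank(An)

print(image_contained(Ai, An))     # True for the full-rank An above

# A case where the condition fails: Im(An_thin) is only a line in R^4.
An_thin = np.outer(np.arange(1.0, 5.0), [1.0])   # rank 1
print(image_contained(Ai, An_thin))              # False for generic Ai
```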

8 In the sense that this exact assumption is always necessary and cannot be replaced.

9 When j > m1, θj(X) is constant as a function of Y.

10 To see this, define the prox-Lagrangian LP(U, Y, W, O) = L(U, Y, W) + ‖Y − O‖²_S. By definition, Y+ decreases the prox-Lagrangian with O = Yk, so LP(U, Y+, W, Yk) ≤ LP(U, Yk, W, Yk) = L(U, Yk, W), and the desired result follows.

11 To motivate the sub-blocks (Y0, Y1) in 6.13, one should look to the decomposition of ψ(Z) in Assumption 4.1, where we can take Y0 = {Z0} and Y1 = Z>. Intuitively, Y1 is a sub-block such that ψ is a smooth function of Y1, and which is ‘absorbing’ in the sense that for any U+ and Y0+, there exists Y1 making the solution feasible.

12 (2) is assumed to hold for the iterates U+ and Y+ generated by ADMM as the minimal required condition, but one should not, in general, think of this property as being specifically related to the iterates of the algorithm. In the cases we consider, it will be a property of the function f and the constraint C that for any point (Ũ, Ỹ), there exists Ŷ1 ∈ argmin_{Y1} {f_Ũ(Ỹ0, Y1) : C_Ũ(Ỹ0, Y1) = b_Ũ} such that ‖Ŷ1 − Ỹ1‖² ≤ ζ‖C_Ũ(Ỹ) − b_Ũ‖².

13 To clarify the definition of Yˆ1, the sub-block for Y0 is fixed to the value of Y0+ on the given iteration, and then Yˆ1 is obtained by minimizing fU+(Y0+,Y1) for the Y1 sub-block over the feasible region CU+(Y0+,Y1)=bU+.

Additional information

Funding

The research of Wenbo Gao and Donald Goldfarb was supported in part by the National Science Foundation under NSF Grants CCF-1527809 and CCF-1838061. The research of Frank E. Curtis was supported in part by the National Science Foundation under NSF Grant CCF-1618717 and by the U.S. Department of Energy under DOE Grant DE-SC0010615.

Notes on contributors

Wenbo Gao

Wenbo Gao is a PhD student in the Department of Industrial Engineering and Operations Research at Columbia University.

Donald Goldfarb

Donald Goldfarb is the Alexander and Hermine Avanessians Professor of Industrial Engineering and Operations Research at Columbia University. He has been a faculty member at Columbia Engineering since 1982. He served as interim dean of the School in 2012–2013, executive vice dean in 2011–2012, acting dean in 1994–1995, and chair of the IEOR department from 1984 to 2002.

His research interests include algorithms for linear, quadratic, semidefinite, convex and general nonlinear programming, network flows, large sparse systems, and applications in robust optimization, imaging, machine learning, and finance. He has published more than 100 technical papers and served on the editorial boards of several journals, including editor in chief of Mathematical Programming, editor of the SIAM (Society for Industrial and Applied Mathematics) Journal on Optimization and the SIAM Journal on Numerical Analysis, and associate editor of Operations Research and Mathematics of Computation. He has been a member of the councils of the Mathematical Programming Society and the American Mathematical Society, numerous technical society program and award committees, and advisory committees to various universities and government research agencies.

Before coming to Columbia, he held positions as professor and acting chair in the Department of Computer Science at the City College of New York, visiting professor in the Department of Computer Science and at the School of Operations Research and Industrial Engineering at Cornell University, and assistant research scientist at the Courant Institute of Mathematical Sciences of New York University. He earned a BChE from Cornell in 1963 and MA and PhD from Princeton in 1965 and 1966, respectively.

Frank E. Curtis

Frank E. Curtis is an Associate Professor in the Department of Industrial and Systems Engineering at Lehigh University, where he has been employed since 2009. He received his Bachelor's degree from the College of William and Mary in 2003 with a double major in Mathematics and Computer Science, received his Master's in 2004 and PhD in 2007 from the Department of Industrial Engineering and Management Science at Northwestern University, and spent two years as a Postdoctoral Researcher in the Courant Institute of Mathematical Sciences at New York University from 2007 until 2009.

His research focuses on the design, analysis, and implementation of numerical methods for solving large-scale nonlinear optimization problems. He received an Early Career Award from the Advanced Scientific Computing Research program of the US Department of Energy, and has received funding from various programs of the US National Science Foundation, including through a TRIPODS Institute grant awarded to him and his collaborators at Lehigh, Northwestern, and Boston University.

He currently serves as an Associate Editor for Mathematical Programming, SIAM Journal on Optimization, Mathematics of Operations Research, and Mathematical Programming Computation. He served as the Vice Chair for Nonlinear Programming for the INFORMS Optimization Society from 2010 until 2012, and is currently very active in professional societies and groups related to mathematical optimization, including INFORMS, the Mathematical Optimization Society, and the SIAM Activity Group on Optimization.
