Original Articles

Analysis of the gradient method with an Armijo–Wolfe line search on a class of non-smooth convex functions

Azam Asl & Michael L. Overton
Pages 223-242 | Received 03 Jun 2019, Accepted 24 Sep 2019, Published online: 09 Oct 2019
 

ABSTRACT

It has long been known that the gradient (steepest descent) method may fail on non-smooth problems, but the examples that have appeared in the literature are either devised specifically to defeat a gradient or subgradient method with an exact line search or are unstable with respect to perturbation of the initial point. We give an analysis of the gradient method with steplengths satisfying the Armijo and Wolfe inexact line search conditions on the non-smooth convex function $f(x) = a\,|x^{(1)}| + \sum_{i=2}^{n} x^{(i)}$. We show that if a is sufficiently large, satisfying a condition that depends only on the Armijo parameter, then, when the method is initiated at any point $x_0 \in \mathbb{R}^n$ with $x_0^{(1)} \neq 0$, the iterates converge to a point $\bar{x}$ with $\bar{x}^{(1)} = 0$, although f is unbounded below. We also give conditions under which the iterates satisfy $f(x_k) \to -\infty$, using a specific Armijo–Wolfe bracketing line search. Our experimental results demonstrate that our analysis is reasonably tight.
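To fix ideas, here is a minimal Python sketch of the method analysed in the paper, applied to $f(x) = a|x^{(1)}| + \sum_{i=2}^{n} x^{(i)}$. It is not the authors' implementation; the parameter values (c1, c2, a = 10, the iteration limits) and the doubling/bisection bracketing scheme are illustrative assumptions.

```python
# Minimal sketch (not the authors' implementation) of the gradient method
# with an Armijo-Wolfe bracketing line search on f(x) = a|x^(1)| + sum x^(i).
# Parameter values (c1, c2, a, iteration limits) are illustrative assumptions.
import numpy as np

def f(x, a):
    return a * abs(x[0]) + np.sum(x[1:])

def grad(x, a):
    # Gradient of f, defined wherever x[0] != 0.
    g = np.ones_like(x)
    g[0] = a * np.sign(x[0])
    return g

def armijo_wolfe_step(x, d, a, c1=0.1, c2=0.5, max_bracket=60):
    """Doubling/bisection bracketing for a step t satisfying
    Armijo: f(x + t d) <= f(x) + c1 t grad(x)'d, and
    Wolfe:  grad(x + t d)'d >= c2 grad(x)'d."""
    g0d = grad(x, a) @ d
    lo, hi, t = 0.0, np.inf, 1.0
    for _ in range(max_bracket):
        if f(x + t * d, a) > f(x, a) + c1 * t * g0d:      # Armijo fails: step too long
            hi = t
        elif grad(x + t * d, a) @ d < c2 * g0d:           # Wolfe fails: step too short
            lo = t
        else:
            return t                                      # both conditions hold
        t = 2.0 * t if np.isinf(hi) else 0.5 * (lo + hi)  # expand or bisect
    return t

def gradient_method(x0, a, iters=50):
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(iters):
        d = -grad(x, a)                                   # steepest descent direction
        x = x + armijo_wolfe_step(x, d, a) * d
    return x

# For sufficiently large a (the paper quantifies this via the Armijo
# parameter), the iterates converge to a point with first component 0,
# although f is unbounded below (send x^(2), ..., x^(n) to -infinity).
x_bar = gradient_method([1.0, 0.0, 0.0], a=10.0)
print(x_bar, f(x_bar, a=10.0))
```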

2010 MATHEMATICS SUBJECT CLASSIFICATIONS:

Disclosure statement

No potential conflict of interest was reported by the authors.

Notes

1. There is a subtle distinction between the Wolfe condition given here and that given in [Citation16], since here the Wolfe condition is understood to fail if the gradient of f does not exist at $x_k + t d_k$, while in [Citation16] it is understood to fail if the function of one variable $s \mapsto f(x_k + s d_k)$ is not differentiable at $s = t$. For the example analysed here, these conditions are equivalent.
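To spell out the distinction, the two readings of the Wolfe condition can be written as follows (an illustrative restatement, using the standard line-search shorthand $\phi(s) = f(x_k + s d_k)$ and Wolfe parameter $c_2$, which are not defined in this excerpt):

```latex
% Two readings of the Wolfe (curvature) condition, with \phi(s) = f(x_k + s d_k):
% (i) gradient form, which fails whenever \nabla f does not exist at x_k + t d_k:
\nabla f(x_k + t d_k)^T d_k \;\ge\; c_2\, \nabla f(x_k)^T d_k
% (ii) one-variable form, which fails only when \phi is not differentiable at s = t:
\phi'(t) \;\ge\; c_2\, \phi'(0)
```

For the f analysed here, the set where $\nabla f$ does not exist is the hyperplane $x^{(1)} = 0$, on which f is affine; since $d_k = -\nabla f(x_k)$ has first component $\mp a \neq 0$, the line $x_k + s d_k$ crosses that hyperplane at a single value of s, at which $\phi$ also has a kink, so (i) and (ii) fail at exactly the same steps.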

2. The same oscillatory behaviour occurs if we replace the Wolfe condition by the Goldstein condition $f(x_k + t d_k) \geq f(x_k) + c_2\, t\, \nabla f(x_k)^T d_k$.
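As an illustration only (not code from the paper; f and grad are assumed callables, x and d are NumPy arrays, and the parameter values are placeholder choices with 0 < c1 < c2 < 1), an acceptance test using this Goldstein condition in place of the Wolfe condition might read:

```python
# Sketch of a line-search acceptance test pairing the Armijo condition with
# the Goldstein lower bound of note 2 (in place of the Wolfe condition).
def goldstein_accepts(f, grad, x, d, t, c1=0.1, c2=0.9):
    gd = grad(x) @ d                                 # directional derivative (< 0 for descent)
    decrease = f(x + t * d) <= f(x) + c1 * t * gd    # Armijo: sufficient decrease
    not_short = f(x + t * d) >= f(x) + c2 * t * gd   # Goldstein: step not too short
    return decrease and not_short
```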

4. In our implementation, we made no attempt to determine whether $\hat{f}$ is differentiable at a given point or not. This is essentially impossible in floating point arithmetic, but as noted earlier, the gradient is defined at randomly generated points with probability one; there is no reason to suppose that any of the methods tested will generate points where $\hat{f}$ is not differentiable, except in the limit, and hence the 'subgradient' method actually reduces to the gradient method with $t_k = 1/k$. See [Citation16] for further discussion.
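For concreteness, a minimal sketch of the 'subgradient' method described in this note (not the tested implementation; fhat_grad and the iteration count are assumed placeholders):

```python
# Sketch of the 'subgradient' method of note 4 with steplength t_k = 1/k.
# At randomly generated points f-hat is differentiable with probability one,
# so the oracle returns the gradient and the method coincides with the
# gradient method. fhat_grad is an assumed gradient oracle.
import numpy as np

def subgradient_method(fhat_grad, x0, iters=1000):
    x = np.asarray(x0, dtype=float).copy()
    for k in range(1, iters + 1):
        g = fhat_grad(x)         # gradient (= subgradient almost everywhere)
        x = x - (1.0 / k) * g    # predetermined steplength t_k = 1/k
    return x
```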

Additional information

Funding

Michael L. Overton was supported in part by the National Science Foundation [grant number DMS-1620083].

Notes on contributors

Azam Asl

Azam Asl obtained her B.Sc. in Computer Engineering at Sharif University of Technology in Tehran in 2008. She will receive her Ph.D. in Computer Science from New York University in 2020.

Michael L. Overton

Michael L. Overton obtained his Ph.D. in Computer Science from Stanford University in 1979. He is a Silver Professor of Computer Science and Mathematics at the Courant Institute of Mathematical Sciences, New York University.
