
A non-Secant quasi-Newton Method for Unconstrained Nonlinear Optimization

Article: 2018929 | Received 13 Sep 2020, Accepted 18 Nov 2021, Published online: 04 Jan 2022

Abstract

The Secant equation has long been the foundation of quasi-Newton methods, as updated Hessian approximations are required to satisfy the equation at each iteration. Several publications have lately focused on modified versions of the Secant relation, with promising results. This study builds on that idea by deriving a Secant-like modification that uses nonlinear quantities to construct Hessian (or inverse Hessian) approximation updates. The method uses data from the two most recent iterations to provide an alternative to the Secant equation, with the goal of producing improved Hessian approximations that induce faster convergence to the objective function's optimal solution. The reported results provide strong evidence that the proposed method is promising and warrants further investigation.

Public interest statement

Unconstrained optimization problems are common in industrial and engineering applications. Much research has been dedicated to devising efficient methods to solve such problems. Secant quasi-Newton methods constitute the basis for such methods due to their attractive properties. This paper derives a new nonlinear variant of the classical Secant methods that efficiently computes solutions to unconstrained optimization problems.

1. Introduction

The techniques examined in this paper are those employed in the solution of problems of the form

$$\min f(x), \quad \text{where } f: \mathbb{R}^n \to \mathbb{R}.$$

Such problems are common in all types of engineering and manufacturing projects. The above problem is solved iteratively using quasi-Newton methods. At the current iteration $i$, the gradient of $f$ at $x_i$ is denoted $g_i$, and the matrix $B_i$ approximates $G(x_i)$, the actual Hessian of $f$. Traditionally, the Hessian approximation for the next iteration must satisfy the so-called Secant equation

$$B_{i+1} s_i = y_i, \tag{1}$$

for

$$y_i = g_{i+1} - g_i \quad \text{and} \quad s_i = x_{i+1} - x_i.$$

The new solution approximation is defined as

$$x_{i+1} = x_i + \alpha_i p_i, \tag{2}$$

where $p_i$ is the direction vector, obtained by solving

$$B_i p_i = -g_i \quad \text{or} \quad p_i = -H_i g_i, \quad \text{with } H_i = B_i^{-1}. \tag{3}$$

The step length $\alpha_i$ in (2) is computed as the solution to

$$\alpha_i = \arg\min_{\alpha \in \mathbb{R}^+} f(x_i + \alpha p_i),$$

by means of some line search methodology, such that the following Wolfe-Powell conditions are satisfied (Fletcher, 1987):

$$f(x_{i+1}) \le f(x_i) + 10^{-4}\, s_i^T g_i \tag{4}$$

and

$$s_i^T g_{i+1} \ge 0.9\, p_i^T g_i. \tag{5}$$

Satisfying conditions of the type in (4) and (5) is necessary to guarantee convergence.
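As a concrete illustration, the following is a minimal Python sketch of testing a trial step against (4) and (5); the function name and argument layout are illustrative rather than taken from the paper's implementation.

```python
import numpy as np

def satisfies_wolfe(f, grad, x, p, alpha, c1=1e-4, c2=0.9):
    """Check the Wolfe-Powell conditions (4) and (5) for a trial step alpha."""
    g = grad(x)
    s = alpha * p                                  # s_i = alpha_i * p_i
    decrease_ok = f(x + s) <= f(x) + c1 * s.dot(g) # (4): sufficient decrease
    curvature_ok = s.dot(grad(x + s)) >= c2 * p.dot(g)  # (5): curvature
    return decrease_ok and curvature_ok
```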

The derivation and coding of the actual Hessian matrix is susceptible to human error and demands considerable storage for large problems. The identity matrix (or some weighted version of it) is usually chosen as the initial approximation to the Hessian, and it is updated step-by-step using the most recent available computed data. It has been demonstrated that such methods exhibit superlinear convergence under reasonable assumptions on the objective functions (Fletcher, 1970, 1994; Dennis & Schnabel, 1979). When the line searches are exact, the methods converge in at most n iterations on quadratic functions (Broyden, 1970).

The most numerically successful update formula used to approximate the Hessian on a step-wise basis is the BFGS update, given as

$$B_{i+1} = B_i + \frac{y_i y_i^T}{s_i^T y_i} - \frac{B_i s_i s_i^T B_i}{s_i^T B_i s_i}.$$
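In code, this update is a one-liner; the following is a minimal numpy sketch (the guard on $s_i^T y_i$ reflects the positive-definiteness requirement discussed below), not the paper's C++ implementation.

```python
import numpy as np

def bfgs_update(B, s, y):
    """One BFGS update of the Hessian approximation B with the pair (s, y)."""
    if s.dot(y) <= 0:      # curvature safeguard: skip the update otherwise
        return B
    Bs = B @ s
    return B + np.outer(y, y) / y.dot(s) - np.outer(Bs, Bs) / s.dot(Bs)
```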

The global convergence of the BFGS method for convex objective functions has been established by several authors (see, Dai et al., 2002; Dennis & Schnabel, 1979; Fletcher, 1987; Shanno, 1978; Shanno & Phua, 1978; Xiao et al., 2006; Yuan et al., 2017, 2018). Dai et al. (2002) demonstrate that when the line searches are inexact, the standard BFGS method may fail to converge on non-convex functions. Wei et al. (2004, 2006) show that even with exact line searches, the classical BFGS method can fail to converge. Li and Fukushima (2001) modified the classical BFGS formula to create a variant of the method that exhibits global convergence even when the objective function f lacks the convexity condition.

2. Variants of the Secant equation

Much research has been conducted in order to develop methods that outperform the classical BFGS update numerically and are globally convergent under reasonable assumptions. In this section, we provide a brief overview of some of the successful methods that have outperformed the classical BFGS. The algorithms' development is primarily based on the introduction of modifications to the Secant equation (1). One particularly successful class of approaches documented in the literature is motivated by the concept of using more of the data available at each iteration in the update of the Hessian approximation that would otherwise be disregarded. These data include, though are not limited to, the readily computed step vectors, as well as the gradient difference vectors in (1), from the m latest iterations (m > 1). The standard quasi-Newton methods utilize data just from the most recent iteration. We next present a particularly successful class of methods known as the multi-step methods (Ford & Moghrabi, 1993, 1994, 1996).

Applying the Chain Rule to $g(x(\tau))$, for a differentiable path $X = \{x(\tau)\}$ in $\mathbb{R}^n$ (for $\tau \in \mathbb{R}$), and differentiating with respect to $\tau$, the result is

$$G(x(\tau_m))\, x'(\tau_m) = g'(x(\tau_m)). \tag{6}$$

In particular, if the path is chosen to pass through the most recent point $x_{i+1}$ (so that $x(\tau_m) = x_{i+1}$, say), then equation (6), known as the "Newton Equation" (see, Al-Baali, 1985; Broyden, 1970; Dennis & Schnabel, 1979), presents an alternative relation that the Hessian matrix $G(x_{i+1})$ generally satisfies. Equation (6) may be regarded as a basis for the Secant equation when $m$ is chosen to be 1 (Broyden, 1970). In Ford and Moghrabi (1994), the path $X$ is constructed as the polynomial interpolating the $m+1$ most recent points $\{x_{i-m+k+1}\}_{k=0}^m$. The vector $g'(x(\tau_m))$ is approximated by differentiating the polynomial ($\hat{g}(\tau)$, say) that interpolates the corresponding available gradient values $\{g(x_{i-m+k+1})\}_{k=0}^m$.

The scalar values $\{\tau_k\}_{k=0}^m$ correspond to the points $\{x_{i-m+k+1}\}_{k=0}^m$ on the curve $X = \{x(\tau)\}$ as follows:

$$x(\tau_k) = x_{i-m+k+1}, \quad \text{for } k = 0, 1, \ldots, m.$$

If $B_{i+1}$ is a given approximation to $G(x_{i+1})$ in (6) and we make the following definitions

$$r_i \equiv x'(\tau_m), \tag{7}$$
$$w_i \equiv \hat{g}'(\tau_m) \tag{8}$$

(where the vectors $s_i$ and $y_i$ are as defined in (1), and $w_i$ represents an approximation to $g'(x(\tau_m))$), it is not unreasonable (by (6)) to force the matrix $B_{i+1}$ to satisfy a relation similar to the one in (6). We, thus, have

$$B_{i+1} r_i = w_i. \tag{9}$$

There remains the issue of choosing values for the $\tau$ parameters. One possible choice is (Ford & Moghrabi, 1993)

$$\tau_0 = -\left(\|s_i\|_2 + \|s_{i-1}\|_2\right), \quad \tau_1 = -\|s_i\|_2, \quad \tau_2 = 0. \tag{10}$$

This choice has the merit of reflecting the distances among the iterates in the space of the variables, and it is updated at each new point.

The multi-step BFGS Hessian update is given by

$$B_{i+1}^{MS} = B_i + \frac{w_i w_i^T}{w_i^T r_i} - \frac{B_i r_i r_i^T B_i}{r_i^T B_i r_i}. \tag{11}$$

The main advantage of (11) is in exploiting several recent steps, as well as the available gradient vectors, in contrast to the standard BFGS method, which utilizes only the latest step and the corresponding gradient difference vector.
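For concreteness, a sketch of one two-step (m = 2) update of the form (11) follows, using the $\tau$-choice (10); the closed forms $r_i = s_i - \rho s_{i-1}$ and $w_i = y_i - \rho y_{i-1}$, with $\rho = \delta^2/(1+2\delta)$, anticipate the derivation in section 3, and the fallback guard is an assumption of this sketch.

```python
import numpy as np

def msbfgs_update(B, s, y, s_prev, y_prev, mu=1.0):
    """Two-step (m = 2) BFGS-type update (11) under the tau-choice (10)."""
    delta = mu * np.linalg.norm(s) / np.linalg.norm(s_prev)  # scaled ||s_i||/||s_{i-1}||
    rho = delta**2 / (1.0 + 2.0 * delta)
    r = s - rho * s_prev           # r_i of (7), two-step closed form
    w = y - rho * y_prev           # w_i of (8), two-step closed form
    if r.dot(w) <= 0:              # safeguard: fall back to the classical pair of (1)
        r, w = s, y
    Br = B @ r
    return B + np.outer(w, w) / w.dot(r) - np.outer(Br, Br) / r.dot(Br)
```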

Other non-Secant methods are motivated by the fact that the classical BFGS formula is confined to using single gradient and step vectors in the construction of the Hessian approximation, while neglecting other potentially useful, readily computed data. For example, function values might prove valuable in the updating process. So, focusing on building better 'quality' updates, variants of the classical Secant relation (1) have been explored (see, Ford & Moghrabi, 1996; Wei et al., 2004; Ortiz et al., 2004; G. Yuan & Wei, 2010; Yuan et al., 2010, 2017, 2018; Zhang et al., 1999). Wei et al. (2006), using Taylor's series, derived a modification to (1) as follows:

$$B_{i+1} s_i = z_i,$$

where $z_i = y_i + \dfrac{\theta_i}{\|s_i\|^2}\, s_i$ and $\theta_i = 2\left(f_i - f_{i+1}\right) + \left(g_{i+1} + g_i\right)^T s_i$.

G. Yuan and Wei (2010) used another alternative, given as:

$$z_i = y_i + \frac{\max\{0,\, \theta_i\}}{\|s_i\|^2}\, s_i.$$
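The following short sketch computes both modified difference vectors; the inputs are the quantities already available at the end of iteration $i$, and the names are illustrative.

```python
import numpy as np

def z_wei(s, y, f_cur, f_new, g_cur, g_new):
    """z_i of Wei et al. (2006): y_i + (theta_i/||s_i||^2) s_i."""
    theta = 2.0 * (f_cur - f_new) + (g_new + g_cur).dot(s)
    return y + (theta / s.dot(s)) * s

def z_yuan_wei(s, y, f_cur, f_new, g_cur, g_new):
    """z_i of G. Yuan and Wei (2010): the non-negative-theta variant."""
    theta = 2.0 * (f_cur - f_new) + (g_new + g_cur).dot(s)
    return y + (max(0.0, theta) / s.dot(s)) * s
```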

Some of the recent and well-known Secant-like methods, motivated by the need to derive accelerated-convergence variants of the classical BFGS, are summarized in Table 1.

Table 1. Some modifications of Secant equation

The methods listed in Table 1 represent some of the well-known published Secant-variant formulae, which, in addition to incorporating gradient and step vectors, exploit function values obtained from the latest iterate in the update of the Hessian approximation. In contrast, the methods derived in Moghrabi (2017) and Ford and Moghrabi (1993) utilize nonlinear interpolating polynomials and introduce a completely different Secant-like equation utilizing available data from the two or three most recent steps. The methods of Moghrabi (2017) are based on the idea of performing multiple updates on each iteration, so that multiple Secant and Secant-variant conditions are observed. The updates are performed in a manner that ensures the first update obeys (1), while subsequent updates satisfy

$$B_{i+1}^{t+1} v_t = u_t, \quad t = 1, 2, \ldots, m-1 \ (m > 1),$$

where $v_t = r_{i-m+t}$ and $u_t = w_{i-m+t}$, for $r$ and $w$ as in (7) and (8).

A sample of the work done on the derivation of modified Secant equations is summarized in Table 1. For interested readers, there are many other similar methods available in the literature (Wei et al., 2006; Xiao et al., 2006; Yuan et al., 2010, 2017, 2018; G. Yuan & Wei, 2010; Zhang et al., 1999). Several of the approaches mentioned here have had their convergence qualities investigated. Yuan et al. (2018), for example, demonstrate global convergence using a less stringent variant of the Wolfe-Powell line search criteria (4) and (5).

Such methods may prove useful in various domains in which the classical BFGS has been a serious contender. One possible domain is sustainable transportation planning, for dealing with the issue of minimizing air pollution, congestion, and noise (Sayyadi & Awasthi, 2018). Another possible application could be the bi-objective inventory model created by Gharaei et al. (2020), where the goal is to optimize the lot-sizing of the replenishments while meeting stochastic limitations and calculating the optimal number of lots and the volume of each lot. Gharaei et al. (2015) aim to optimize incentives in single-machine scheduling in reward-driven systems such that the total reward is maximized while limitations such as the total reward limitation for earliness and learning, independence from earliness and learning, and others are met; the goal is to maximize overall rewards in such systems by using quadratic rewards for both learning and earliness, where non-Secant methods may prove useful. Gharaei et al. (2019) present a mathematical model that combines the buyers' and vendors' total costs (TC) in a supply chain (SC) under penalty, green, and quality control (QC) policies, as well as a VMI-CS agreement. The goal is to find the optimal batch-sizing policy in the integrated SC with the lowest TC, determining both the number of the vendor's batches for each of the shipped product lines and the size of the batches shipped to the buyers, so as to minimize the integrated SC's TC while satisfying the stochastic constraints. Due to the complexity of that optimization model, the methods proposed in this paper could be employed for solving the integrated supply chain problem. Another attractive application might be in areas where quality control and green production policies are considered when designing a bi-objective, multi-product, limited and integrated economic production quantity model with stochastic constraints; such research aims to find the best total inventory cost and total profit while keeping stochastic restrictions in mind (see, Gharaei et al. (2021)). Awasthi and Omrani (2019) propose a goal-oriented strategy for prioritizing sustainable mobility projects based on fuzzy axiomatic design. The affinity diagram technique is used to establish sustainability evaluation attributes, the criteria and alternatives are rated using the fuzzy Delphi technique, and fuzzy axiomatic design provides final rankings for urban transportation initiatives, with the best project(s) picked for implementation. Sensitivity analysis is used to examine the impact of changes in input parameters on the model's stability, and this is a possible avenue for the method proposed in this paper to prove its usefulness.

Next, a new method, motivated by the same concept of incorporating data available from several of the latest iterations, is developed to derive a new variant of the Secant equation (1).

3. A new non-Secant equation

We will use data from the three most recent iterations to derive the new Secant-like relationship that the updated Hessian (or its inverse) must fulfil, just as was done for the nonlinear multi-step quasi-Newton methods (equations (6)-(11)) discussed earlier, thus choosing $m = 2$ in (6). In particular, an interpolation of the points $x_{i-1}$, $x_i$ and $x_{i+1}$ is constructed using a differentiable curve $x(\tau)$ in $\mathbb{R}^n$, such that $x(\tau_0) = x_{i-1}$, $x(\tau_1) = x_i$ and $x(\tau_2) = x_{i+1}$.

The objective function along the curve is modelled as $\omega(\tau) \equiv f(x(\tau))$, using Taylor's expansion around the point $\tau_2 = 0$ (which corresponds to $x_{i+1}$) as follows:

$$\omega(\tau) \approx \omega(0) + \tau\, \omega'(0) + \tfrac{1}{2}\tau^2\, \omega''(0), \tag{12}$$

for the τ-values chosen in (10).

Using (12), we have

$$\omega'(0) = x'(0)^T g_{i+1}, \quad \omega''(0) \approx x'(0)^T B_{i+1}\, x'(0) + x''(0)^T g_{i+1}, \tag{13}$$

where $x'(\tau) \equiv \frac{dx}{d\tau}$ and $x''(\tau) \equiv \frac{d^2x}{d\tau^2}$.

Choosing the Lagrange representation for $x(\tau)$ (obtained by setting $m = 2$ in (7) and (8)), we obtain

$$x'(\tau) = L_2'(\tau)\left(s_i - \rho\, s_{i-1}\right) \tag{14}$$

and

$$x''(\tau) = L_2''(\tau)\left(s_i - \gamma\, s_{i-1}\right), \tag{15}$$

for

$$\rho = \frac{\delta^2}{1 + 2\delta}, \quad \gamma = \frac{\tau_2 - \tau_1}{\tau_1 - \tau_0},$$

and, for all $\tau$, the following hold:

$$L_2'(\tau) = \frac{2\tau - \tau_0 - \tau_1}{(\tau_2 - \tau_1)(\tau_2 - \tau_0)}, \quad L_0'(\tau) = \frac{2\tau - \tau_1 - \tau_2}{(\tau_0 - \tau_1)(\tau_0 - \tau_2)}, \tag{16}$$
$$L_2''(\tau) = \frac{2}{(\tau_2 - \tau_1)(\tau_2 - \tau_0)} \quad \text{and} \quad L_0''(\tau) = \frac{2}{(\tau_0 - \tau_1)(\tau_0 - \tau_2)}. \tag{17}$$

For the $\tau$-values selected in (10), we have $\gamma = \|s_i\|/\|s_{i-1}\|$. It is not unreasonable, then, to introduce a generalization of $\gamma$ obtained by plugging in a scaling constant $\mu \ge 0$ (see, Moghrabi, 2017). Such scaling gives an easier, more convenient mechanism for switching to the standard one-step Secant update method, by setting $\mu = 0$. Therefore,

$$\delta = \mu\, \frac{\|s_i\|}{\|s_{i-1}\|}.$$

If $\tau = \tau_0$ is substituted in (12) and the true Hessian at $x_{i+1}$ is replaced by its approximation $B_{i+1}$, as is the case in the implementation of the standard quasi-Newton methods, it is reasonable to require that

$$\tau_0^2\, \omega''(0) \approx 2\left(f_{i-1} - f_{i+1} - \tau_0\, \omega'(0)\right), \tag{18}$$

or, alternatively,

$$f_{i-1} = f_{i+1} + \tau_0\, x'(0)^T g_{i+1} + \tfrac{1}{2}\tau_0^2\left[x'(0)^T B_{i+1}\, x'(0) + x''(0)^T g_{i+1}\right], \tag{19}$$

for the quantities $x'(0)$ and $x''(0)$ as in (14) and (15), respectively. Substituting (14) and (15) into (19), with $L_2'(0)$ and $L_2''(0)$ evaluated from (16) and (17), it follows that

$$f_{i-1} = f_{i+1} + \tau_0 L_2'(0)\left(s_i - \rho s_{i-1}\right)^T g_{i+1} + \tfrac{1}{2}\tau_0^2\left[L_2'(0)^2\left(s_i - \rho s_{i-1}\right)^T B_{i+1}\left(s_i - \rho s_{i-1}\right) + L_2''(0)\left(s_i - \gamma s_{i-1}\right)^T g_{i+1}\right].$$

Now, from (13) and (19), we obtain

$$\tau_0^2\left(\frac{\tau_0 + \tau_1}{\tau_0\tau_1}\right)^2\left(s_i - \rho s_{i-1}\right)^T B_{i+1}\left(s_i - \rho s_{i-1}\right) + \frac{2\tau_0}{\tau_1}\left(s_i - \gamma s_{i-1}\right)^T g_{i+1} = 2\left(f_{i-1} - f_{i+1}\right) + \frac{\tau_0 + \tau_1}{\tau_1}\left(s_i - \rho s_{i-1}\right)^T g_{i+1}, \tag{20}$$

for $\rho$ and $\gamma$ as defined in (14) and (15), respectively.

Upon defining the quantities

$$r_i \equiv s_i - \rho\, s_{i-1}, \quad u_i \equiv s_i - \gamma\, s_{i-1} \quad \text{and} \quad \vartheta \equiv \frac{\tau_0 + \tau_1}{\tau_1},$$

then from (20) we have

$$\vartheta^2\, r_i^T B_{i+1}\, r_i = 2\left(f_{i-1} - f_{i+1}\right) + \vartheta\, r_i^T g_{i+1} - \tau_0\tau_1\, g_{i+1}^T u_i. \tag{21}$$

Equation (21) can be rewritten as

$$z_i^T B_{i+1}\, z_i = z_i^T w_i + \sigma_i \tag{22}$$

for

$$z_i \equiv \vartheta\, r_i, \quad w_i \equiv y_i - \rho\, y_{i-1}$$

and

$$\sigma_i = 2\left(f_{i-1} - f_{i+1}\right) + z_i^T g_{i+1} - \tau_0\tau_1\, g_{i+1}^T u_i - z_i^T w_i.$$

The relation obtained in (22) suggests a new Secant-like equation of the form

$$B_{i+1} z_i = v_i \tag{23}$$

for $z_i$ as in (22) and $v_i \equiv w_i + \dfrac{\sigma_i}{\|z_i\|^2}\, z_i$.

The computed search direction is downhill if $B_{i+1}$ is positive definite. By analogy with the standard Secant equation (1), $B_{i+1}$ is positive definite if and only if $B_i$ is already so and $z_i^T v_i > 0$. As this cannot be guaranteed in this new formulation, (23) is replaced with

$$B_{i+1} z_i = \bar{v}_i, \tag{24}$$

where

$$\bar{v}_i = v_i + \varepsilon_i z_i, \quad \varepsilon_i = \gamma\, \|g_i\|^2 + \max\left\{-\frac{v_i^T z_i}{\|z_i\|^2},\, 0\right\},$$

for some positive constant $\gamma$. Relation (24) is an appropriate replacement for (23), since

$$\bar{v}_i^T z_i \ge \gamma\, \|g_i\|^2\, \|z_i\|^2 > 0, \tag{25}$$

which ascertains the positive definiteness of $B_{i+1}$.
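Gathering (10), (14)-(15) and (21)-(24), the following is a sketch of the construction of the safeguarded pair $(z_i, \bar{v}_i)$; `gamma_c` stands for the positive constant $\gamma$ of (24), `mu` for the scaling constant in $\delta$, and the function and argument names are illustrative.

```python
import numpy as np

def nonsecant_pair(s, y, s_prev, y_prev, f_prev, f_new, g_new, g, mu=1.0, gamma_c=1e-8):
    """Build (z_i, v_bar_i) of (22)-(24) from the two most recent steps.

    f_prev = f(x_{i-1}), f_new = f(x_{i+1}), g = g_i, g_new = g_{i+1}.
    """
    ns, nsp = np.linalg.norm(s), np.linalg.norm(s_prev)
    tau1, tau0 = -ns, -(ns + nsp)            # tau-values of (10), with tau2 = 0
    delta = mu * ns / nsp                    # scaled ratio, as in section 3
    rho = delta**2 / (1.0 + 2.0 * delta)
    gam = ns / nsp                           # gamma = (tau2 - tau1)/(tau1 - tau0)
    r = s - rho * s_prev                     # r_i
    u = s - gam * s_prev                     # u_i
    vt = (tau0 + tau1) / tau1                # vartheta of (21)
    z = vt * r                               # z_i
    w = y - rho * y_prev                     # w_i
    sigma = 2.0 * (f_prev - f_new) + z.dot(g_new) - tau0 * tau1 * g_new.dot(u) - z.dot(w)
    v = w + (sigma / z.dot(z)) * z           # v_i of (23)
    eps_i = gamma_c * g.dot(g) + max(-v.dot(z) / z.dot(z), 0.0)
    return z, v + eps_i * z                  # v_bar_i of (24), so that (25) holds
```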

The outline of the new method is as follows

Algorithm NMBFGS

Input: $x_0 \in \mathbb{R}^n$; set $H_0 = I$. Let the iteration count be $i = 0$.

Output: (near) optimal solution.

  1. Stop if the gradient norm satisfies $\|g_i\| \le \varepsilon$ (convergence threshold).

  2. Compute $p_i = -H_i g_i$.

  3. Minimize $f(x_i + \alpha p_i)$ over $\alpha \in \mathbb{R}^+$ to determine $\alpha_i$, using cubic interpolation, such that conditions (4) and (5) are satisfied (an $O(n)$ operation).

  4. Compute the new iterate $x_{i+1} = x_i + \alpha_i p_i$. If (25) is satisfied, update $H_i$ using (11) with $w_i$ replaced by $\bar{v}_i$ and $r_i$ replaced by $z_i$ from (23)-(24); otherwise, set $z_i = s_i$ and $\bar{v}_i = y_i$, so that the update satisfies condition (1) (an $O(n^2)$ operation).

  5. Set $i \leftarrow i + 1$ and go to 1.

Asymptotically, the overall cost of the above method is the same as that of the standard quasi-Newton algorithm, though, in practice, the recorded savings in both iteration count and function/gradient evaluations differentiate the new method from others in the same category.
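A compact end-to-end sketch of the algorithm in inverse (H) form is given below. It reuses the `nonsecant_pair` helper sketched after (25) and substitutes scipy's Wolfe line search for the paper's safeguarded cubic interpolation, so it should be read as an illustration of the control flow rather than the author's implementation.

```python
import numpy as np
from scipy.optimize import line_search

def nmbfgs(f, grad, x0, eps=1e-6, max_iter=1000):
    x, g = x0.astype(float), grad(x0)
    H = np.eye(x.size)                                # H_0 = I
    s_prev = y_prev = f_prev = None                   # data from iteration i-1
    for _ in range(max_iter):
        if np.linalg.norm(g) <= eps:                  # step 1: convergence test
            break
        p = -H @ g                                    # step 2: search direction
        alpha = line_search(f, grad, x, p, gfk=g)[0]  # step 3: Wolfe line search
        if alpha is None:
            alpha = 1e-4                              # crude fallback step
        x_new = x + alpha * p                         # step 4: new iterate
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        z, v = s, y                                   # default: classical pair of (1)
        if s_prev is not None:
            z2, v2 = nonsecant_pair(s, y, s_prev, y_prev, f_prev, f(x_new), g_new, g)
            if z2.dot(v2) > 0:                        # positivity check, cf. (25)
                z, v = z2, v2
        Hv, zv = H @ v, z.dot(v)                      # inverse BFGS-form update with (z, v)
        H += ((zv + v.dot(Hv)) / zv**2) * np.outer(z, z) \
             - (np.outer(Hv, z) + np.outer(z, Hv)) / zv
        s_prev, y_prev, f_prev = s, y, f(x)           # f(x_i) becomes f_{i-1} next round
        x, g = x_new, g_new                           # step 5
    return x
```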

4. Convergence properties

The convergence analysis presented next rests on the following assumptions:

A1. The level set $D = \{x \mid f(x) \le f(x_0)\}$ is bounded, for a starting point $x_0$.

A2. The objective function $f$ is twice continuously differentiable on $D$, and, in an open set $M$ containing $D$, there exists a constant $\zeta > 0$ such that

$$\|g(x) - g(y)\| \le \zeta\, \|x - y\|, \quad \text{for all } x, y \in M.$$

Given that $\{f_i\}$ is a decreasing sequence, the sequence $\{x_i\}$ generated by the new method belongs to $D$, and there exists a constant $f^*$ such that:

$$\lim_{i \to \infty} f_i = f^*. \tag{26}$$

A3. The objective function $f$ is uniformly convex, in that there are positive constants $l_1$ and $l_2$ such that

$$l_1\|d\|^2 \le d^T \nabla^2 f(x)\, d \le l_2\|d\|^2$$

holds for all $x \in D$ and $d \in \mathbb{R}^n$.

Theorem 1. Let $f$ satisfy Assumptions A1 and A2, let $\{x_i\}$ be generated by algorithm NMBFGS, and suppose there exist constants $m_1$ and $m_2$ such that

$$\|B_i s_i\| \le m_1\|s_i\| \quad \text{and} \quad s_i^T B_i s_i \ge m_2\|s_i\|^2, \quad \forall i; \tag{27}$$

then the following holds:

$$\liminf_{i \to \infty}\, \|g_i\| = 0. \tag{28}$$

Proof: Since $s_i = \alpha_i p_i$ and $B_i p_i = -g_i$, we have $B_i s_i = -\alpha_i g_i$, and it is straightforward to show that (27) holds with $p_i$ replacing $s_i$. Using (27) and $g_i = -B_i p_i$, we get

$$p_i^T B_i p_i \ge m_2\|p_i\|^2$$

and

$$m_2\|p_i\| \le \|g_i\| \le m_1\|p_i\|.$$

Let us define the set $\Delta$ of indices $i$ for which (27) holds. By the Wolfe-Powell condition (5) and Assumption A2, we obtain

$$\zeta \alpha_i \|p_i\|^2 \ge p_i^T y_i \ge -0.1\, p_i^T g_i.$$

It follows that, for any $i \in \Delta$,

$$\alpha_i \ge \frac{0.1\, p_i^T B_i p_i}{\zeta\, \|p_i\|^2} \ge 0.1\, \zeta^{-1} m_2.$$

Using (26), one gets

$$\sum_{i=1}^{\infty}\left(f_i - f_{i+1}\right) = \lim_{k \to \infty} \sum_{i=1}^{k}\left(f_i - f_{i+1}\right) = \lim_{k \to \infty}\left(f_1 - f_{k+1}\right) = f_1 - f^*,$$

and, thus,

$$\sum_{i=1}^{\infty}\left(f_i - f_{i+1}\right) < \infty.$$

By the Wolfe-Powell condition (4), it follows that

$$-\sum_{i=1}^{\infty} s_i^T g_i < \infty,$$

and, consequently,

$$\lim_{i \to \infty} \alpha_i\, p_i^T g_i = 0.$$

Hence,

$$\lim_{i \in \Delta,\, i \to \infty} p_i^T B_i p_i = -\lim_{i \in \Delta,\, i \to \infty} p_i^T g_i = 0,$$

and (28) follows, given (27). □

We now proceed to prove that algorithm NMBFGS is globally convergent.

Theorem 2. Let $f$ satisfy Assumptions A1 and A2 above, and let $\{x_i\}$ be generated by algorithm NMBFGS. Then, the following holds:

$$\liminf_{i \to \infty}\, \|g_i\| = 0. \tag{29}$$

Proof. As per Theorem 1, it suffices to prove that (27) holds for sufficiently many indices $i$. By contradiction, assume (29) does not hold. Thus, there exists a positive constant $\xi$ such that, for all $i$,

$$\|g_i\| > \xi.$$

It is then easy to show that

$$\|\bar{v}_i\| \le \theta\, \|s_i\|, \quad \text{for some positive constant } \theta. \tag{30}$$

From (25) and the fact that

$$\|u_i\|^2 = \|s_i - \gamma s_{i-1}\|^2 \ge \tfrac{1}{2}\sigma\, \|s_i\|^2,$$

for some positive constant $\sigma$, it follows that

$$u_i^T \bar{v}_i \ge \gamma \xi^2\, u_i^T u_i \ge \frac{\gamma \sigma \xi^2}{2}\, s_i^T s_i. \tag{31}$$

Thus, from (30) and (31), we obtain

$$\frac{\bar{v}_i^T \bar{v}_i}{u_i^T \bar{v}_i} \le \varrho,$$

for $\varrho \equiv \theta^2 / \left(\gamma \xi^2 \sigma / 2\right)$.

By the above, the sequence $\{B_i\}$ satisfies the conditions of Theorem 1; that is, there exist constants $m_1$ and $m_2$ such that (27) applies. The proof completes on the basis of the following result (see Theorem 2.1 in Dai et al. (2002)):

if there are two positive constants $l_1$ and $l_2$ such that, for all $i$,

$$\frac{u_i^T \bar{v}_i}{u_i^T u_i} \ge l_1 \quad \text{and} \quad \frac{\bar{v}_i^T \bar{v}_i}{u_i^T \bar{v}_i} \le l_2,$$

then, for any positive integer $j$, (27) holds for no fewer than $\lceil j/2 \rceil$ of the iterations $i \in \{1, \ldots, j\}$. □

4.1. Non-asymptotic superlinear convergence of the new method

The analysis presented here is quite similar to that of Jin and Mokhtari (2021). The convergence results presented by those authors apply primarily to the standard BFGS method; we follow a similar approach to prove the superlinear convergence of NMBFGS. To do so, the following assumptions are needed:

A4. The objective function $f$ is twice continuously differentiable. It is also strongly convex with convexity parameter $\sigma > 0$, and its gradient $g(x)$ is Lipschitz continuous with parameter $\rho > 0$. Thus, we have

$$\sigma\|x - v\| \le \|g(x) - g(v)\| \le \rho\|x - v\|, \quad \forall x, v \in \mathbb{R}^n,$$
$$\sigma I \preceq G(x) \preceq \rho I, \quad \forall x \in \mathbb{R}^n.$$

A5. The Hessian matrix $G(x)$ of $f$ is Lipschitz continuous, i.e.,

$$\|G(x) - G(\hat{x})\| \le \delta\, \|x - \hat{x}\|, \quad \forall x, \hat{x} \in \mathbb{R}^n,$$

for some constant $\delta > 0$.

Theorem 3. Consider the method described in Algorithm NMBFGS, and suppose the objective function $f$ satisfies the conditions in Assumptions A4 and A5. For any arbitrary $\alpha$ in (0, 1), choose two real parameters $\epsilon > 0$ and $\theta > 0$ such that

$$2\alpha_1\theta + \alpha_2 \le (1 - \alpha)\theta, \quad \left(2\rho\theta + 2\theta + 1\right)\delta\epsilon \le \alpha,$$

where $\alpha_1 = (15/4)\, \delta\rho\, \sigma^{-1.5}$ and $\alpha_2 = 2\left(1 + 2n^{1/2}\right)\delta\, \rho^{1/2}\, \sigma^{-2}\, \|G(x^*)^{1/2}\|_F$. Should the start point $x_0$ and the initial inverse Hessian approximation $H_0$ satisfy the relations

$$\|x_0 - x^*\| < \min\left\{\epsilon,\ \frac{\sigma^{1.5}}{3\delta\rho}\right\}, \quad \left\|H_0 - G(x^*)^{-1}\right\|_{G(x^*)^{1/2}} < \theta,$$

then the sequence $\{x_i\}_{i=0}^{\infty}$ generated by the new method converges to the optimal point $x^*$ superlinearly, at a rate of

$$\|x_k - x^*\| \le \left(\frac{8\rho^2\theta(1+\alpha)}{2\sigma^2(1-\alpha)\,k} + \frac{\delta\rho(1+\alpha)}{\sigma^2(1-\alpha)^2\,k}\right)^{k/2} \|x_0 - x^*\|. \tag{32}$$

Proof. Similar to the proof in Jin and Mokhtari (2021). □

Theorem 3 suggests that the sequence of points generated by NMBFGS advances to the optimal solution at a superlinear rate of $\left(\epsilon_1/k + \epsilon_2/k\right)^{k/2}$ in the local neighbourhood of the optimal solution. The constants $\epsilon_1$ and $\epsilon_2$ depend on the parameters of the problem being solved. For sufficiently large $k$, the convergence rate is $O\left((1/k)^{k/2}\right)$.

There is a compromise between the speed at which the method converges superlinearly and the size of the local neighbourhood: increasing the sizes of $\epsilon$ and $\theta$ enlarges the vicinity within which the method converges at a superlinear rate. However, this lowers the speed of convergence, since the bound in (32) increases. Therefore, by selecting different values for the parameters $\alpha$, $\epsilon$ and $\theta$, both the speed and the region of superlinear convergence change.

5. Numerical test results and discussions

Summaries of the numerical results are tabulated in this section. Table 2 summarizes the list of test problems. The chosen test set consists of problems of varying dimensions, allowing for the examination of various test problem sizes. The functions that were tested, and hence the results that were reported, relate to problems categorized by size/dimension. The problem sizes fall into four categories: low (2 ≤ n ≤ 20), medium (21 ≤ n ≤ 40), moderately high (41 ≤ n ≤ 1000) and high (n > 1000). The findings result from experiments on 17 different test problems with dimensions ranging from 2 to 100,000. Each of the problems in the list has been tested from four different perspectives, giving a total of 950 test items. The test problems come from Moré et al. (1981), Fletcher (1987), Tajadod et al. (2016), and Xiao et al. (2006).

Table 2. Test problems and dimensions

The new algorithm NMBFGS is compared to the method of Yuan et al. (2017) and to the multi-step BFGS (MSBFGS; see, Ford & Moghrabi, 1993, 1994, 1996). The tests are run using Yuan's technique as a benchmark. Table 3 presents a summary of the overall findings, while Tables 4-7 show the results of the large, moderately large, medium, and small problems, respectively. Because there is not enough room to present the figures for each dimension separately, the findings are summarized by dimension category. Iteration counts, function/gradient evaluations, and total execution times are all included in the scores. The average optimality error is also reported for each dimension category. Figures 1-3 provide a diagrammatic summary of function/gradient evaluations, iteration count and timing, respectively. The implementation is coded in C++ and run on a 64-bit machine with an i7-3770, 3.4 GHz CPU.

Figure 1. Overall evaluations.

Figure 2. Overall iterations.

Figure 3. Overall timings.

Table 3. Overall iteration, function evaluations count and timing

Table 4. Iteration, function evaluation count and timing-large problems

Table 5. Iteration, function evaluation count and timing-moderately large problems

Table 6. Iteration, function evaluations count and timing-medium problems

Table 7. Iteration, function evaluation count and timing-small problems

For all the methods tested here, the next point $x_{i+1}$ is obtained from $x_i$ by employing a line search algorithm that applies safeguarded cubic interpolation with step-doubling, such that the conditions (4) and (5) are satisfied.

For the standard BFGS method, it is well known that a necessary and sufficient condition for ensuring that the generated updated matrices $H_i$ are positive definite, and hence that the computed search direction is downhill, is that $s_i^T y_i > 0$ (Broyden, 1970). By analogy, for the new method NMBFGS, condition (25) is imposed in order to guarantee that $\bar{v}_i^T z_i$ is "sufficiently" positive, to circumvent any potential numerical instability in the construction of $H_{i+1}$. Should condition (25) fail to hold, the algorithm reverts to using an MSBFGS iteration on that particular instance. An initial scaling is applied to the starting inverse Hessian approximation using the methods introduced in Shanno and Phua (1978).


Figure 4 shows the graphical results for the test problems in terms of the most expensive component of the minimization process, the overall number of function/gradient evaluations. That is, we depict the proportion $p(\tau)$ of problems for which each technique is within a factor $\tau$ of the best number of evaluations. The algorithm whose curve lies on top performs best, in the sense of being within a factor $\tau$ of the best number of function evaluations on the largest proportion of problems. Figure 4 shows that NMBFGS outperforms the other two methods in terms of number of iterations and evaluations, thus making it the fastest and best performer.
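The profile itself is straightforward to compute; a minimal sketch follows, where `results[m][j]` is assumed to hold, e.g., the evaluation count of method m on problem j.

```python
import numpy as np

def performance_profile(results, taus):
    """Return p(tau) per method: the proportion of problems on which the
    method's score is within a factor tau of the best score recorded."""
    scores = np.asarray(results, dtype=float)   # rows = methods, cols = problems
    ratios = scores / scores.min(axis=0)        # factor of the best, per problem
    return {m: [(ratios[m] <= t).mean() for t in taus] for m in range(scores.shape[0])}
```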

Figure 4. Overall evaluations performance profile.

6. Conclusions

In this study, Secant-like approaches based on modifications to the classical Secant relation (1) are investigated, and a new non-Secant method is devised. To increase the quality of the computed Hessian approximations and, as a result, expedite convergence to a solution of a given problem, the approach employs nonlinear quantities that integrate data readily available from the three most recent iterations. The primary motivation for exploring alternatives to the standard Secant equation has been to obtain possibly better Hessian approximations that expedite the convergence of quasi-Newton methods. The proposed algorithm's convergence behaviour has been investigated, and its superior performance has been justified both theoretically and practically. The numerical results obtained have been promising, drawing attention to the method's potential practical significance. The findings demonstrate that past iteration data, such as function values, gradient points, and step vectors, yield performance gains in practice, and hence provide incentives to explore them further.

7. Future work

Future research ought to concentrate on modified Secant methods for various τ-parameter choices in order to investigate the numerical behaviour of these methods’ sensitivity to such choices. It would also be interesting to study these methods’ ability to offer equivalent improvements when used to solve systems of non-linear equations, as well as their convergence characteristics.

Similar approaches are also being investigated, in which the computed search direction di is adjusted before the line search is performed. This strategy looks to be effective. A variation of the technique is also being considered, in which the update of the Hessian approximation is skipped every other iteration (or more frequently). Also, the sensitivity of the algorithmic behaviour to the accuracy of the line search deserves investigation. The numerical performance of such methods when replacing the line search with Trust Region methods is also worth exploring.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

The author received no direct funding for this research.

Notes on contributors

Issam A.R. Moghrabi

Dr. Issam Moghrabi is a Professor of M.I.S./C.S. He laid the foundations of the M.B.A. Program at GUST. He is a Fulbright Scholar and did postdoctoral research in Geographic Information Systems. His main research interests are in mathematical optimization, management science, information retrieval and database systems. He serves as a referee for many well-known journals in his subjects of interest.

References

  • Al-Baali, M. (1985). New property and global convergence of the Fletcher–Reeves method with inexact line searches. IMA Journal of Numerical Analysis, 5(1), 121–124. https://doi.org/10.1093/imanum/5.1.121
  • Awasthi, A., & Omrani, H. (2019). A goal-oriented approach based on fuzzy axiomatic design for sustainable mobility project selection. International Journal of Systems Science: Operations & Logistics, 6(1), 86–98. https://doi.org/10.1080/23302674.2018.1435834
  • Broyden, C. G. (1970). The convergence of a class of double-rank minimization algorithms - Part 2: The new algorithm. IMA Journal of Applied Mathematics, 6(3), 222–231. https://doi.org/10.1093/imamat/6.3.222
  • Dai, Y., Yuan, J., & Yuan, Y. (2002). Modified two-point step size Gradient methods for unconstrained optimization. Computational Optimization and Applications, 22(1), 103–109. https://doi.org/10.1023/A:1014838419611
  • Dennis, J. E., & Schnabel, R. B. (1979). Least change Secant updates for quasi-Newton methods. SIAM Review, 21(4), 443–459. https://doi.org/10.1137/1021091
  • Fletcher, R. (1970). A new approach to variable metric algorithms. The Computer Journal, 13(3), 317–322. https://doi.org/10.1093/comjnl/13.3.317
  • Fletcher, R. (1987). Practical Methods of Optimization (second ed.). Wiley.
  • Fletcher, R. (1994). An Overview of Unconstrained Optimization. In E. Spedicato (Ed.), Algorithms for Continuous Optimization. NATO ASI Series (Series C: Mathematical and Physical Sciences) (Vol. 434, pp. 109–143). Springer.
  • Ford, J. A., & Moghrabi, I. A. R. (1993). Alternative parameter choices for multi-step quasi-Newton methods. Optimization Methods & Software, 2(3–4), 357–370. https://doi.org/10.1080/10556789308805550
  • Ford, J. A., & Moghrabi, I. A. R. (1994). Multi-step quasi-Newton methods for optimization. Journal of Computational and Applied Mathematics, 50(1–3), 305–323. https://doi.org/10.1016/0377-0427(94)90309-3
  • Ford, J. A., & Moghrabi, I. A. R. (1996). Using function-values in multi-step quasi-Newton methods. Journal of Computational and Applied Mathematics, 66(1–2), 201–211. https://doi.org/10.1016/0377-0427(95)00178-6
  • Gharaei, A., Hoseini, S. S., & Karimi, M. (2020). Modelling And optimal lot-sizing of the replenishments in constrained, multi-product and bi-objective EPQ models with defective products: Generalised Cross Decomposition. International Journal of Systems Science: Operations & Logistics, 7(3), 262–274. https://doi.org/10.1080/23302674.2019.1574364
  • Gharaei, A., Naderi, B., & Mohammadi, M. (2015). Optimization of rewards in single machine scheduling in the rewards-driven systems. Management Science Letters, 5(6), 629–638.
  • Gharaei, A., Naderi, B., & Mohammadi, M. (2019). An integrated multi-product, multi-buyer supply chain under penalty, green, and quality control policies and a vendor managed inventory with consignment stock agreement: The outer approximation with equality relaxation and augmented penalty algorithm. Applied Mathematical Modelling, 69, 223–254. https://doi.org/10.1016/j.apm.2018.11.035
  • Gharaei, A., Shekarabi, S. A., Karimi, M., Pourjavad, E., & Amjadian, A. (2021). An integrated stochastic EPQ model under quality and green policies: Generalised cross decomposition under the separability approach. International Journal of Systems Science: Operations & Logistics, 8(2), 119–131. https://doi.org/10.1080/23302674.2019.1656296
  • Hassan, B., & Ghada, M. (2020). A new quasi-newton equation on the gradient methods for optimization minimization problem. Indonesian Journal of Electrical Engineering and Computer Science, 19(2), 737–744. https://doi.org/10.11591/ijeecs.v19.i2.pp737-744
  • Hassan, B., & Mohammed, W. T. (2020). A new variants of quasi-newton equation based on the quadratic function for unconstrained optimization. Indonesian Journal of Electrical Engineering and Computer Science, 19(2), 701–708. https://doi.org/10.11591/ijeecs.v19.i2.pp701-708
  • Hassan, B. (2019). A new Quasi-Newton Updating formula based on the new quasi-Newton equation. Numerical Algebra Control and Optimization, 4(1), 63–72.
  • Hassan, B. (2020). A new type of quasi-Newton updating formulas based on the new quasi-Newton equation. Numerical Algebra, Control and Optimization, 10(2), 227–235. https://doi.org/10.3934/naco.2019049
  • Jin, Q., & Mokhtari, A. (2021). Non-asymptotic superlinear convergence of standard quasi-Newton methods. arXiv preprint. https://arxiv.org/pdf/2003.13607v1.pdf
  • Li, D., & Fukushima, M. (2001). A modified BFGS method and its global convergence in nonconvex minimization. Journal of Computational and Applied Mathematics, 129(1–2), 15–35. https://doi.org/10.1016/S0377-0427(00)00540-9
  • Moghrabi, I. A. R. (2017). Implicit extra-update multi-step quasi-newton methods. International Journal of Operational Research, 28(2), 69–81. https://doi.org/10.1504/IJOR.2017.081472
  • Moré, J. J., Garbow, B. S., & Hillstrom, K. E. (1981). Testing unconstrained optimization software. ACM Transactions on Mathematical Software, 7(1), 17–41. https://doi.org/10.1145/355934.355936
  • Ortiz, F., Simpson, J. R., Pignatiello, J. J., & Heredia-Langner, A. (2004). A genetic algorithm approach to multiple-response optimization. Journal of Quality Technology, 36(4), 432–450. https://doi.org/10.1080/00224065.2004.11980289
  • Sayyadi, R., & Awasthi, A. (2018). A simulation-based optimisation approach for identifying key determinants for sustainable transportation planning. International Journal of Systems Science: Operations & Logistics, 5(2), 161–174. https://doi.org/10.1080/23302674.2016.1244301
  • Shanno, D. F., & Phua, K. H. (1978). Matrix conditioning and nonlinear optimization. Mathematical Programming, 14(1), 149–160. https://doi.org/10.1007/BF01588962
  • Shanno, D. F. (1978). On the convergence of a new conjugate gradient algorithm. SIAM Journal on Numerical Analysis, 15(6), 1247–1257. https://doi.org/10.1137/0715085
  • Tajadod, M., Abedini, M., Rategari, A., & Mobin, M. (2016). A comparison of multi-criteria decision making approaches for maintenance strategy selection (a case study). International Journal of Strategic Decision Sciences, 7(3), 51–56. https://doi.org/10.4018/IJSDS.2016070103
  • Wei, Z., Li, G., & Qi, L. (2004). The superlinear convergence of a modified BFGS- type method for unconstrained optimization. Computational Optimization and Applications, 29(3), 315–332. https://doi.org/10.1023/B:COAP.0000044184.25410.39
  • Wei, Z., Li, G., & Qi, L. (2006). New quasi-Newton methods for unconstrained optimization problems. Applied Mathematics and Computation, 175(2), 1156–1188. https://doi.org/10.1016/j.amc.2005.08.027
  • Xiao, Y. H., Wei, Z. X., & Zhang, L. (2006). A modified BFGS method without line searches for nonconvex unconstrained optimization. Advances in Theoretical and Applied Mathematics, 1(2), 149–162. https://www.ripublication.com/Volume/atamv1n2.htm
  • Yuan, G., Sheng, Z., Wang, B., Hu, W., & Li, C. (2018). The global convergence of a modified BFGS method for nonconvex functions. Journal of Computational and Applied Mathematics, 327(1), 274–294. https://doi.org/10.1016/j.cam.2017.05.030
  • Yuan, G., Wei, Z., & Lu, X. (2017). Global convergence of BFGS and PRP methods under a modified weak Wolfe–Powell line search. Applied Mathematical Modelling, 47(2), 811–825. https://doi.org/10.1016/j.apm.2017.02.008
  • Yuan, G., Wei, Z., & Wu, Y. (2010). Modified limited memory BFGS method with non-monotone line search for unconstrained optimization. Journal of the Korean Mathematical Society, 47(4), 767–788. https://doi.org/10.4134/JKMS.2010.47.4.767
  • Yuan, G., & Wei, Z. (2010). Convergence analysis of a modified BFGS method on convex minimizations. Computational Optimization and Applications, 47(1), 237–255. https://doi.org/10.1007/s10589-008-9219-0
  • Zhang, J. Z., Deng, N. Y., & Chen, L. H. (1999). Quasi-Newton equation and related methods for unconstrained optimization. Journal of Optimization Theory and Applications, 102(1), 147–167. https://doi.org/10.1023/A:1021898630001