
Research on three-step accelerated gradient algorithm in deep learning

Pages 40-57 | Received 04 Jul 2020, Accepted 02 Nov 2020, Published online: 23 Nov 2020

Figures & data

Figure 1. The performance comparison of deep learning and machine learning.

Figure 2. Illustration of GD iteration trajectory with learning rate 0.05.

Figure 3. Illustration of GD iteration trajectory with learning rate 0.1.

Figure 4. Illustration of GD iteration trajectory with learning rate 0.12.

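Figures 2-4 vary only the learning rate of plain gradient descent (GD). A minimal sketch of the iteration on a two-dimensional quadratic (the test matrix, starting point, and stopping tolerance below are illustrative assumptions, not the paper's exact problem) shows the same qualitative trend: within the stable range, a larger learning rate reaches the tolerance in fewer steps.

```python
import numpy as np

def gradient_descent(A, x0, lr, tol=1e-6, max_iter=10000):
    """Minimize f(x) = 0.5 * x^T A x by plain gradient descent."""
    x = np.asarray(x0, dtype=float)
    for k in range(1, max_iter + 1):
        g = A @ x                    # gradient of the quadratic
        if np.linalg.norm(g) < tol:  # stop once the gradient is small
            return x, k
        x = x - lr * g               # plain GD step
    return x, max_iter

# An ill-conditioned 2-D quadratic (assumed; not the paper's test problem).
A = np.diag([1.0, 10.0])
for lr in (0.05, 0.1, 0.12):
    _, steps = gradient_descent(A, [2.0, 2.0], lr)
    print(f"lr={lr}: {steps} steps")
```

For this quadratic the iteration is stable as long as lr < 2/L, where L = 10 is the largest eigenvalue, so all three rates above converge.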
Figure 5. Illustration of OGD iteration trajectory.

Figure 6. Illustration of PM iteration trajectory with α=0.1 and μ=0.25.

Figure 7. Illustration of PM iteration trajectory with α=0.1 and μ=0.5.

Figure 8. Illustration of PM iteration trajectory with α=0.1 and μ=0.75.

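Figures 6-8 use Polyak's momentum method (PM, the heavy-ball method), whose update adds a fraction μ of the previous step: x_{k+1} = x_k - α∇f(x_k) + μ(x_k - x_{k-1}). A hedged sketch on the same assumed two-dimensional quadratic as above:

```python
import numpy as np

def polyak_momentum(grad, x0, alpha, mu, tol=1e-6, max_iter=10000):
    # Heavy-ball update: v <- mu*v - alpha*grad(x); x <- x + v,
    # equivalent to x_{k+1} = x_k - alpha*grad(x_k) + mu*(x_k - x_{k-1}).
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    for k in range(1, max_iter + 1):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            return x, k
        v = mu * v - alpha * g       # velocity keeps a fraction mu of the last step
        x = x + v
    return x, max_iter

A = np.diag([1.0, 10.0])             # assumed ill-conditioned quadratic
for mu in (0.25, 0.5, 0.75):
    _, steps = polyak_momentum(lambda x: A @ x, [2.0, 2.0], 0.1, mu)
    print(f"mu={mu}: {steps} steps")
```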
Table 1. Iteration steps of the PM, NAG, and TAG algorithms with learning rate α=0.05; the GD algorithm takes 117 steps.

Table 2. Iteration steps of the PM, NAG, and TAG algorithms with learning rate α=0.1; the GD algorithm takes 54 steps.

Table 3. Iteration steps of the PM, NAG, and TAG algorithms with learning rate α=0.12; the GD algorithm takes 44 steps.

Figure 9. Illustration of NAG iteration trajectory with α=0.1 and μ=0.25.

Figure 10. Illustration of NAG iteration trajectory with α=0.1 and μ=0.5.

Figure 11. Illustration of NAG iteration trajectory with α=0.1 and μ=0.75.

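Figures 9-11 use Nesterov's accelerated gradient (NAG), which differs from PM only in evaluating the gradient at the look-ahead point x_k + μv_k rather than at x_k. A sketch under the same assumed quadratic:

```python
import numpy as np

def nag(grad, x0, alpha, mu, tol=1e-6, max_iter=10000):
    # Nesterov's accelerated gradient: the gradient is taken at the
    # look-ahead point x + mu*v instead of at x itself.
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    for k in range(1, max_iter + 1):
        if np.linalg.norm(grad(x)) < tol:
            return x, k
        g = grad(x + mu * v)         # look-ahead gradient
        v = mu * v - alpha * g
        x = x + v
    return x, max_iter

A = np.diag([1.0, 10.0])             # assumed ill-conditioned quadratic
for mu in (0.25, 0.5, 0.75):
    _, steps = nag(lambda x: A @ x, [2.0, 2.0], 0.1, mu)
    print(f"mu={mu}: {steps} steps")
```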
Figure 12. Illustration of TAG iteration trajectory for two-dimensional quadratic function with line search.

Figure 13. Illustration of TAG iteration trajectory with α=0.1 and μ=0.25.

Figure 14. Illustration of TAG iteration trajectory with α=0.1 and μ=0.5.

Figure 15. Illustration of TAG iteration trajectory with α=0.1 and μ=0.75.

Figure 16. Illustration of TAG iteration trajectory with α=0.1 and μ=1.

Table 4. Iteration steps and runtime (in parentheses) for the 10-dimensional quadratic function.

Table 5. Iteration steps and runtime (in parentheses) for the 100-dimensional quadratic function.

Table 6. Iteration steps and runtime (in parentheses) for the 500-dimensional quadratic function.

Table 7. Iteration steps, runtime (in parentheses), and optimal μ with its range (in brackets) for the 10-dimensional FLETCHCR function.

Table 8. Iteration steps, runtime (in parentheses), and optimal μ with its range (in brackets) for the 100-dimensional FLETCHCR function.

Table 9. Iteration steps, runtime (in parentheses), and optimal μ with its range (in brackets) for the 500-dimensional FLETCHCR function.
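Tables 7-9 use the FLETCHCR test function from the CUTE(st) collection, commonly stated as f(x) = 100 Σ_{i=1}^{n-1} (x_{i+1} - x_i + 1 - x_i²)². The sketch below assumes this form; the paper's exact variant may differ in scaling.

```python
import numpy as np

def fletchcr(x):
    """Chained Rosenbrock-type FLETCHCR function (assumed CUTE form)."""
    x = np.asarray(x, dtype=float)
    r = x[1:] - x[:-1] + 1.0 - x[:-1] ** 2   # chained residuals r_i
    return 100.0 * np.sum(r ** 2)

def fletchcr_grad(x):
    """Analytic gradient of the assumed FLETCHCR form."""
    x = np.asarray(x, dtype=float)
    r = x[1:] - x[:-1] + 1.0 - x[:-1] ** 2
    g = np.zeros_like(x)
    g[:-1] += 200.0 * r * (-1.0 - 2.0 * x[:-1])  # d r_i / d x_i
    g[1:] += 200.0 * r                           # d r_i / d x_{i+1}
    return g
```

Under this form, the all-zeros point in 10 dimensions gives f = 100 × 9 = 900, since each of the nine residuals equals 1.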

Figure 17. The scatter plot of spiral data.

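A spiral data set of the kind shown in Figure 17 can be generated along the following lines; the number of points per class, the noise level, and the number of turns are illustrative assumptions.

```python
import numpy as np

def make_spiral(n_per_class=200, classes=2, noise=0.1, seed=0):
    """Generate interleaved 2-D spiral arms, one per class."""
    rng = np.random.default_rng(seed)
    X, y = [], []
    for c in range(classes):
        t = np.linspace(0.0, 3 * np.pi, n_per_class)  # angle along the spiral
        r = t / (3 * np.pi)                           # radius grows with angle
        theta = t + c * 2 * np.pi / classes           # rotate each class's arm
        pts = np.stack([r * np.cos(theta), r * np.sin(theta)], axis=1)
        X.append(pts + noise * rng.standard_normal(pts.shape))
        y.append(np.full(n_per_class, c))
    return np.concatenate(X), np.concatenate(y)

X, y = make_spiral()
```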
Figure 18. Feedforward neural network with 3 hidden layers.

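The architecture in Figure 18, a feedforward network with 3 hidden layers, can be sketched as follows; the layer widths and the tanh/softmax activation choices are assumptions, since the caption fixes only the depth.

```python
import numpy as np

def init_mlp(sizes, seed=0):
    """Random weights and zero biases for each consecutive layer pair."""
    rng = np.random.default_rng(seed)
    return [(0.1 * rng.standard_normal((m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, X):
    """Forward pass: tanh hidden layers, softmax output."""
    a = X
    for W, b in params[:-1]:
        a = np.tanh(a @ W + b)              # hidden layers
    W, b = params[-1]
    z = a @ W + b
    z = z - z.max(axis=1, keepdims=True)    # numerically stable softmax
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# 2 inputs, 3 hidden layers of width 16, 3 output classes (assumed widths).
params = init_mlp([2, 16, 16, 16, 3])
probs = forward(params, np.zeros((5, 2)))
```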
Figure 19. The scatter plot of iris petal.

Figure 20. BPTAG algorithm training results of iris data set with α=0.05 and μ=0.1.

Table 10. Iteration steps and runtime (in parentheses) of the BPPM, BPNAG, and BPTAG algorithms with learning rate α=0.025 on the iris data set; the BPGD algorithm takes 64822 steps and 17.88 s.

Table 11. Iteration steps and runtime (in parentheses) of the BPPM, BPNAG, and BPTAG algorithms with learning rate α=0.05 on the iris data set; the BPGD algorithm takes 48849 steps and 11.77 s.

Table 12. Iteration steps and runtime (in parentheses) of the BPPM, BPNAG, and BPTAG algorithms with learning rate α=0.075 on the iris data set; the BPGD algorithm takes 33503 steps and 8.09 s.

Table 13. Training accuracy of the BPPM, BPNAG, and BPTAG algorithms with learning rate α=0.25 on the spiral data set; the BPGD algorithm reaches an accuracy of 0.8775.

Table 14. Training accuracy of the BPPM, BPNAG, and BPTAG algorithms with learning rate α=0.5 on the spiral data set; the BPGD algorithm reaches an accuracy of 0.8950.

Table 15. Training accuracy of the BPPM, BPNAG, and BPTAG algorithms with learning rate α=0.75 on the spiral data set; the BPGD algorithm reaches an accuracy of 0.8975.

Figure 21. BPTASG algorithm training results of iris data set with α=0.05 and μ=0.1.

Table 16. Iteration steps and runtime (in parentheses) of the BPPMSGD, BPNASG, and BPTASG algorithms with learning rate α=0.05 on the iris data set; the BPSGD algorithm takes 123859 steps and 22.13 s.

Table 17. Training accuracy of the BPPMSGD, BPNASG, and BPTASG algorithms with learning rate α=0.5 on the spiral data set; the BPSGD algorithm reaches an accuracy of 0.8887.