Molecular Physics
An International Journal at the Interface Between Chemistry and Physics
Volume 116, 2018 - Issue 21-22: Daan Frenkel – An entropic career
Frenkel Special Issue

Energy–entropy competition and the effectiveness of stochastic gradient descent in machine learning

, , &
Pages 3214-3223 | Received 06 Mar 2018, Accepted 23 May 2018, Published online: 22 Jun 2018

References

  • H. Robbins and S. Monro, Ann. Math. Statist. 22 (3), 400 (1951). doi: 10.1214/aoms/1177729586
  • L. Bottou, in Proceedings of COMPSTAT'10 (2010), pp. 177–186.
  • L. Bottou, F.E. Curtis and J. Nocedal, SIAM Rev. 60 (2), 223 (2018).
  • A. Krizhevsky, I. Sutskever and G.E. Hinton, in Advances in Neural Information Processing Systems (2012), pp. 1097–1105.
  • O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla and M. Bernstein, Int. J. Comput. Vis. 115 (3), 211 (2015). doi: 10.1007/s11263-015-0816-y
  • Y. LeCun, Y. Bengio and G. Hinton, Nature 521 (7553), 436 (2015). doi: 10.1038/nature14539
  • C. Szegedy, S. Ioffe, V. Vanhoucke and A. Alemi, Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (2017).
  • D. Silver, A. Huang, C.J. Maddison, A. Guez, L. Sifre, G. Van Den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam and M. Lanctot, Nature 529 (7587), 484 (2016). doi: 10.1038/nature16961
  • Y.V. Fyodorov and I. Williams, J. Stat. Phys. 129 (5–6), 1081 (2007). doi: 10.1007/s10955-007-9386-x
  • A.J. Bray and D.S. Dean, Phys. Rev. Lett. 98 (15), 150201 (2007). doi: 10.1103/PhysRevLett.98.150201
  • Y.N. Dauphin, R. Pascanu, C. Gulcehre, K. Cho, S. Ganguli and Y. Bengio, in Advances in Neural Information Processing Systems (2014), pp. 2933–2941.
  • A. Choromanska, M. Henaff, M. Mathieu, G. Ben Arous and Y. LeCun, in AISTATS (2015).
  • P. Baldi and K. Hornik, Neural Netw. 2 (1), 53 (1989). doi: 10.1016/0893-6080(89)90014-2
  • K. Kawaguchi, in Advances in Neural Information Processing Systems (2016), pp. 586–594.
  • H. Lu and K. Kawaguchi, preprint, arXiv:1702.08580 (2017).
  • C. Zhang, S. Bengio, M. Hardt, B. Recht and O. Vinyals, in International Conference on Learning Representations (2017).
  • L. Prechelt, in Neural Networks: Tricks of the Trade (Springer, Berlin, 2012), pp. 53–67.
  • D. Duvenaud, D. Maclaurin and R. Adams, in Artificial Intelligence and Statistics (Springer, 2016), pp. 1070–1077.
  • G.E. Hinton and D. Van Camp, in Proceedings of the Sixth Annual Conference on Computational Learning Theory (1993), pp. 5–13.
  • S. Hochreiter and J. Schmidhuber, in Advances in Neural Information Processing Systems (1995), pp. 529–536.
  • S. Hochreiter and J. Schmidhuber, Neural Comput. 9 (1), 1 (1997). doi: 10.1162/neco.1997.9.1.1
  • N.S. Keskar, D. Mudigere, J. Nocedal, M. Smelyanskiy and P.T.P. Tang, in International Conference on Learning Representations (2017).
  • S. Jastrzebski, Z. Kenton, D. Arpit, N. Ballas, A. Fischer, Y. Bengio and A. Storkey, preprint, arXiv:1711.04623 (2017).
  • A. Neelakantan, L. Vilnis, Q.V. Le, I. Sutskever, L. Kaiser, K. Kurach and J. Martens, preprint, arXiv:1511.06807 (2015).
  • C. Baldassi, C. Borgs, J.T. Chayes, A. Ingrosso, C. Lucibello, L. Saglietti and R. Zecchina, Proc. Natl Acad. Sci. 113 (48), E7655 (2016). doi: 10.1073/pnas.1608103113
  • P. Chaudhari, A. Choromanska, S. Soatto and Y. LeCun, in International Conference on Learning Representations (2017).
  • L. Dinh, R. Pascanu, S. Bengio and Y. Bengio, preprint, arXiv:1703.04933 (2017).
  • S.L. Smith and Q.V. Le, in International Conference on Learning Representations (2018).
  • C.J. Li, L. Li, J. Qian and J.G. Liu, preprint, arXiv:1705.07562 (2017).
  • J.J. Waterfall, F.P. Casey, R.N. Gutenkunst, K.S. Brown, C.R. Myers, P.W. Brouwer, V. Elser and J.P. Sethna, Phys. Rev. Lett. 97 (15), 150601 (2006). doi: 10.1103/PhysRevLett.97.150601
  • M. Advani, S. Lahiri and S. Ganguli, J. Stat. Mech. 2013 (03), P03014 (2013). doi: 10.1088/1742-5468/2013/03/P03014
  • A.J. Ballard, J.D. Stevenson, R. Das and D.J. Wales, J. Chem. Phys. 144 (12), 124119 (2016). doi: 10.1063/1.4944672
  • R. Das and D.J. Wales, Phys. Rev. E 93 (6), 063310 (2016).
  • L. Sagun, U. Evci, V.U. Guney, Y. Dauphin and L. Bottou, preprint, arXiv:1706.04454 (2018).
  • S. Martiniani, K.J. Schrenk, J.D. Stevenson, D.J. Wales and D. Frenkel, Phys. Rev. E 94 (3), 031301 (2016). doi: 10.1103/PhysRevE.94.031301
  • A. Krizhevsky and G. Hinton, Learning multiple layers of features from tiny images (2009).
  • K. Simonyan and A. Zisserman, in International Conference on Learning Representations (2015).
  • A.M. Saxe, J.L. McClelland and S. Ganguli, in International Conference on Learning Representations (2014).
  • H. Seung, H. Sompolinsky and N. Tishby, Phys. Rev. A 45 (8), 6056 (1992). doi: 10.1103/PhysRevA.45.6056
  • M.S. Advani and A.M. Saxe, preprint, arXiv:1710.03667 (2017).
  • P. Liang, https://web.stanford.edu/class/cs229t/Lectures/percy-notes.pdf.
  • D. Wales, Energy landscapes: applications to clusters, biomolecules and glasses (Cambridge University Press, Cambridge, 2003).
  • A.J. Ballard, R. Das, S. Martiniani, D. Mehta, L. Sagun, J.D. Stevenson and D.J. Wales, Phys. Chem. Chem. Phys. 19 (20), 12585 (2017). doi: 10.1039/C7CP01108C
