References
- Kelley, Henry J. (1960). Gradient theory of optimal flight paths. ARS Journal. 30 (10): 947–954. Bibcode:1960ARSJ...30.1127B. doi:10.2514/8.5282.
- Dreyfus, Stuart. (1962). The numerical solution of variational problems. Journal of Mathematical Analysis and Applications. 5 (1): 30–45. doi:10.1016/0022-247x(62)90004-5.
- Werbos, P. (1974). Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. PhD thesis, Harvard University.
- Rumelhart, David E.; Hinton, Geoffrey E.; Williams, Ronald J. (1986-10-09). Learning representations by back-propagating errors. Nature. 323 (6088): 533–536. Bibcode:1986Natur.323..533R. doi:10.1038/323533a0.
- LeCun, Y. (1987). Modèles Connexionnistes de l’apprentissage (Connectionist Learning Models), Ph.D. thesis, Université P. et M. Curie.
- Herbert Robbins and Sutton Monro. (1951). A Stochastic Approximation Method. The Annals of Mathematical Statistics...