Review Article

Probable networks and plausible predictions — a review of practical Bayesian methods for supervised neural networks

Pages 469-505 | Received 09 Feb 1995, Published online: 09 Jul 2009

References

  • Abu-Mostafa Y S. The Vapnik–Chervonenkis dimension: information versus complexity in learning. Neural Comput. 1990; 1(3)312–7
  • Berger J. Statistical Decision Theory and Bayesian Analysis. Springer, Berlin 1985
  • Bishop C M. Exact calculation of the Hessian matrix for the multilayer perceptron. Neural Comput. 1992; 4(4)494–501
  • Box G E P, Tiao G C. Bayesian Inference in Statistical Analysis. Addison-Wesley, Reading, MA 1973
  • Breiman L. Stacked regressions. Technical Report 367. Department of Statistics, University of California, Berkeley 1992
  • Bretthorst G. Bayesian Spectrum Analysis and Parameter Estimation. Springer, Berlin 1988
  • Bridle J S. Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition. Neurocomputing: Algorithms, Architectures and Applications, F Fogelman-Soulié, J Hérault. Springer, Berlin 1989
  • Buntine W, Weigend A. Bayesian back-propagation. Complex Systems 1991; 5: 603–43
  • Copas J B. Regression, prediction and shrinkage (with discussion). J. R. Statist. Soc. B 1983; 45(3)311–54
  • Cox R. Probability, frequency, and reasonable expectation. Am. J. Phys. 1946; 14: 1–13
  • Gull S F. Bayesian inductive inference and maximum entropy. Maximum Entropy and Bayesian Methods in Science and Engineering, Vol. 1: Foundations, G Erickson, C Smith. Kluwer, Dordrecht 1988; 53–74
  • Gull S F. Developments in maximum entropy data analysis. Maximum Entropy and Bayesian Methods Cambridge 1988, J Skilling. Kluwer, Dordrecht 1989; 53–71
  • Guyon I, Vapnik V N, Boser B E, Bottou L Y, Solla S A. Structural risk minimization for character recognition. Advances in Neural Information Processing Systems 4, J E Moody, S J Hanson, R P Lippmann. Morgan Kaufmann, San Mateo, CA 1992; 471–9
  • Hanson R, Stutz J, Cheeseman P. Bayesian classification with correlation and inheritance. Proc. 12th Int. Joint Conf. on Artificial Intelligence, Sydney, Australia. Morgan Kaufmann, San Mateo, CA 1991; 2: 692–8
  • Hassibi B, Stork D G. Second order derivatives for network pruning: Optimal brain surgeon. Advances in Neural Information Processing Systems 5, C L Giles, S J Hanson, J D Cowan. Morgan Kaufmann, San Mateo, CA 1993; 164–71
  • Hinton G E, Sejnowski T J. Learning and relearning in Boltzmann machines. Parallel Distributed Processing, D E Rumelhart, J E McClelland. MIT Press, Cambridge, MA 1986; 282–317
  • Hinton G E, van Camp D. Keeping neural networks simple by minimizing the description length of the weights. Proc. 6th Ann. Workshop on Computer Learning Theory. ACM Press, New York 1993; 5–13
  • Hinton G E, Zemel R S. Autoencoders, minimum description length and Helmholtz free energy. Advances in Neural Information Processing Systems 6, J D Cowan, G Tesauro, J Alspector. Morgan Kaufmann, San Mateo, CA 1994
  • Jaynes E T. Bayesian intervals versus confidence intervals. E T Jaynes: Papers on Probability, Statistics and Statistical Physics, R D Rosenkrantz. Kluwer, Dordrecht 1983; 151
  • Jeffreys H. Theory of Probability. Oxford University Press, Oxford 1939
  • LeCun Y, Denker J, Solla S A. Optimal brain damage. Advances in Neural Information Processing Systems 2, D Touretzky. Morgan Kaufmann, San Mateo, CA 1990; 598–605
  • Loredo T J. From Laplace to supernova SN 1987A: Bayesian inference in astrophysics. Maximum Entropy and Bayesian Methods, Dartmouth, USA, 1989, P Fougere. Kluwer, Dordrecht 1990; 81–142
  • MacKay D J C. Bayesian methods for adaptive models. California Institute of Technology. 1991, PhD Thesis
  • MacKay D J C. Bayesian interpolation. Neural Comput. 1992a; 4(3)415–47
  • MacKay D J C. A practical Bayesian framework for backpropagation networks. Neural Comput. 1992b; 4(3)448–72
  • MacKay D J C. The evidence framework applied to classification networks. Neural Comput. 1992c; 4(5)698–714
  • MacKay D J C. Bayesian non-linear modelling for the prediction competition. ASHRAE Trans. Vol 100, part 2. ASHRAE, Atlanta, GA 1994
  • MacKay D J C. Bayesian neural networks and density networks. Nucl. Instrum. Methods Phys. Res. A 1995a, in press
  • MacKay D J C. Hyperparameters: optimize, or integrate out? Maximum Entropy and Bayesian Methods, Santa Barbara, CA, 1993, G Heidbreder. Kluwer, Dordrecht 1995b
  • Moody J E. The effective number of parameters: An analysis of generalization and regularization in nonlinear learning systems. Advances in Neural Information Processing Systems 4, J E Moody, S J Hanson, R P Lippmann. Morgan Kaufmann, San Mateo, CA 1992; 847–54
  • Neal R M. Bayesian learning via stochastic dynamics. Advances in Neural Information Processing Systems 5, C L Giles, S J Hanson, J D Cowan. Morgan Kaufmann, San Mateo, CA 1993; 475–82
  • Neal R M. Bayesian learning for neural networks. Department of Computer Science, University of Toronto. 1995, PhD Thesis
  • Patrick J D, Wallace C S. Stone circle geometries: an information theory approach. Archaeoastronomy in the Old World, D C Heggie. Cambridge University Press, Cambridge 1982
  • Pearlmutter B A. Fast exact multiplication by the Hessian. Neural Comput. 1994; 6(1)147–60
  • Rumelhart D E, Hinton G E, Williams R J. Learning representations by back-propagating errors. Nature 1986; 323: 533–6
  • Skilling J. Bayesian numerical analysis. Physics and Probability, W T Grandy, Jr, P Milonni. Cambridge University Press, Cambridge 1993
  • Skilling J, Robinson D R T, Gull S F. Probabilistic displays. Maximum Entropy and Bayesian Methods, Laramie, 1990, W T Grandy, L Schick. Kluwer, Dordrecht 1991; 365–8
  • Spiegelhalter D J, Lauritzen S L. Sequential updating of conditional probabilities on directed graphical structures. Networks 1990; 20: 579–605
  • Thodberg H H. Ace of Bayes: application of neural networks with pruning. Technical Report 1132 E. Danish Meat Research Institute. 1993
  • Wallace C, Boulton D. An information measure for classification. Comput. J. 1968; 11(2)185–94
  • Wallace C S, Freeman P R. Estimation and inference by compact coding. J. R. Statist. Soc. B 1987; 49(3)240–65
  • Weir N. Applications of maximum entropy techniques to HST data. Proc. ESO/ST-ECF Data Analysis Workshop, Garching, April, 1991, P J Grosbol, R H Warmels. European Southern Observatory/Space Telescope-European Coordinating Facility 1991; 115–29
  • Witten I H, Neal R M, Cleary J G. Arithmetic coding for data compression. Commun. ACM 1987; 30(6)520–40
  • Wolpert D H. On the use of evidence in neural networks. Advances in Neural Information Processing Systems 5, C L Giles, S J Hanson, J D Cowan. Morgan Kaufmann, San Mateo, CA 1993; 539–46
