Original Articles

Information Content in Neural Net Optimization

Pages 91-103 | Received 13 Nov 1992, Accepted 08 Oct 1993, Published online: 29 Oct 2007
 

Abstract

Reducing the size and complexity of neural networks is essential for improving generalization, reducing training error, and increasing network speed. Most known optimization methods rely heavily on weight-sharing concepts for pattern separation and recognition. In weight-sharing methods, redundant weights in specific areas of the input layer are pruned, and the values of the weights and their information content play only a minimal role in the pruning process. The method presented here focuses instead on network topology and information content. We have studied how the network topology changes, and how those changes affect information content, dynamically during optimization. The primary optimization uses a scaled conjugate gradient method and the secondary optimization is a Boltzmann method: the conjugate gradient optimization serves as a connection-creation operator, and the Boltzmann method serves as a competitive connection-annihilation operator. By combining these two methods, it is possible to generate small networks that have similar testing and training accuracy, i.e. good generalization, from small training sets. In this paper, we have also focused on network topology. Topological separation is achieved by changing the number of connections in the network. This method should be used when the network is large enough to tackle real-life problems such as fingerprint classification. Our findings indicate that, for large networks, topological separation yields a smaller network, which is more suitable for VLSI implementation. Topological separation is based on the error surface and the information content of the network, and is therefore an economical way of reducing network size, leading to overall optimization. The differential pruning of connections is based on weight content rather than on the number of connections.
The training error may vary with the topological dynamics, but the correlation between the error surface and the recognition rate decreases to a minimum. Topological separation reduces the size of the network by changing its architecture without degrading its performance.
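The abstract describes a Boltzmann method acting as a competitive connection-annihilation operator, with pruning driven by weight content rather than connection count. A minimal sketch of one such operator, assuming a connection survives with probability 1 - exp(-|w|/T) so that low-magnitude (low-information) weights are annihilated preferentially; the paper's exact information-content criterion is not given in the abstract, so this acceptance rule is an illustrative assumption:

```python
import numpy as np

def boltzmann_prune(weights, temperature, rng):
    """Stochastically annihilate connections in a weight matrix.

    Each connection is pruned (set to zero) with Boltzmann probability
    exp(-|w| / T): small-magnitude weights are pruned almost surely,
    large-magnitude weights almost always survive. Returns the pruned
    matrix and the boolean survival mask.
    """
    p_prune = np.exp(-np.abs(weights) / temperature)
    mask = rng.random(weights.shape) >= p_prune  # True = connection survives
    return weights * mask, mask

# Example: prune a random 8x8 weight matrix at a moderate temperature.
rng = np.random.default_rng(0)
w = rng.normal(0.0, 1.0, size=(8, 8))
pruned, mask = boltzmann_prune(w, temperature=0.5, rng=rng)
```

In a full training loop this operator would alternate with the gradient-based (connection-creating) phase, with the temperature annealed so that pruning becomes increasingly selective as optimization proceeds.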
