Abstract
The authors address the precision required by a threshold element to implement a linearly separable mapping of N binary inputs. In contrast to previous work, they require only that the element correctly implement, with high probability, the mapping of P randomly chosen training examples, rather than a perfect mapping for every possible input. Their results are obtained within the statistical mechanics approach and are therefore average-case results, as opposed to the worst-case analyses found in the computational learning theory literature. They show that as long as the ratio P/N is finite, then with probability close to 1 as N→∞, a finite number of bits suffices to implement the mapping; the worst-case analysis, by contrast, requires at least N bits. They also calculate the ability of the constrained network to predict novel examples and compare it with that of an unconstrained network, showing that as long as the training set is not too large, the learning and generalization abilities of the constrained network differ little from those of the unconstrained one.
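The setting summarized above can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the abstract does not specify a learning rule or a quantization scheme, so this sketch uses the standard perceptron rule on P = αN random ±1 examples labeled by a random linearly separable teacher, with the student's weights naively quantized to a fixed number of uniform levels (a stand-in for "a finite number of bits") after each training pass.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 100        # input dimension (illustrative size, not from the paper)
alpha = 0.5    # the ratio P/N, kept finite as in the abstract
P = int(alpha * N)
bits = 4       # hypothetical finite precision per weight

# Teacher: a random linearly separable rule with continuous weights.
teacher = rng.standard_normal(N)
X = rng.choice([-1.0, 1.0], size=(P, N))  # P random binary training examples
y = np.sign(X @ teacher)

def quantize(w, bits):
    """Round each weight to one of ~2**bits uniform levels spanning its range."""
    half_levels = 2 ** bits / 2
    scale = float(np.max(np.abs(w))) or 1.0
    return np.round(w / scale * half_levels) / half_levels * scale

# Perceptron learning, re-quantizing the weights after each pass.
w = np.zeros(N)
for _ in range(200):
    errors = 0
    for x, t in zip(X, y):
        if np.sign(x @ w) != t:
            w += t * x
            errors += 1
    w = quantize(w, bits)
    if errors == 0:
        break

train_acc = np.mean(np.sign(X @ w) == y)
print(f"training accuracy with {bits}-bit weights: {train_acc:.2f}")
```

The sketch only mimics the scenario (high-probability implementation of P random examples by a finite-precision element); the paper's actual results come from a statistical mechanics calculation, not from running a learning algorithm.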