Review Article

Could information theory provide an ecological theory of sensory processing?

Pages 4-44 | Received 31 Dec 1991, Published online: 07 Dec 2011
 

Abstract

The sensory pathways of animals are well adapted to processing a special class of signals, namely stimuli from the animal’s environment. An important fact about natural stimuli is that they are typically very redundant, and hence the sampled representation of these signals formed by the array of sensory cells is inefficient. One could argue for some animals and pathways, as we do in this review, that efficiency of information representation in the nervous system has several evolutionary advantages. Consequently, one might expect that much of the processing in the early levels of these sensory pathways could be dedicated to recoding incoming signals into a more efficient form. In this review, we explore the principle of efficiency of information representation as a design principle for sensory processing. We give a preliminary discussion of how this principle could be applied in general to predict neural processing and then discuss concretely some neural systems where it has recently been shown to be successful. In particular, we examine the fly’s LMC coding strategy and the mammalian retinal coding in the spatial, temporal and chromatic domains.

Reprinted article originally published in Network: Computation in Neural Systems Jan 1992, Vol. 3, No. 2: 213–251.

Notes

1. Physicists might find these bounds reminiscent of the bounds set by the laws of thermodynamics on the performance of heat engines.

2. This can always be achieved by an appropriate choice of discretization of source outputs.

3. Since we use log2 the units of I are bits (or binary digits)/message.

4. To see how the proof goes, consider the simple case of two symbols. Define the matrix Dij = P(mi)P(mj) − P(mi, mj); then, applying the fundamental inequality x ⩾ ln(1 + x) to x = Dij/P(mi, mj), we have Dij/P(mi, mj) ⩾ ln(1 + Dij/P(mi, mj)). Multiplying both sides by P(mi, mj), summing over i and j, and remembering that P(mi) = ΣjP(mi, mj) and ΣiP(mi) = 1, one arrives at H(1) + H(2) ⩾ H(1, 2). Generalizing this proof to an arbitrary number of symbols is straightforward.
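The inequality H(1) + H(2) ⩾ H(1, 2) of note 4 can be checked numerically. The following is a minimal sketch, not taken from the article: the joint distribution `joint` is a hypothetical example, and the helper `entropy` is an illustrative name. It also checks that equality holds when the joint distribution is the product of its marginals.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of an iterable of probabilities (zero terms skipped)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical joint distribution P(m_i, m_j) over two binary symbols.
joint = [[0.4, 0.1],
         [0.1, 0.4]]

# Marginals: P(m_i) = sum_j P(m_i, m_j), and likewise for the second symbol.
p_first = [sum(row) for row in joint]
p_second = [sum(col) for col in zip(*joint)]

h1 = entropy(p_first)
h2 = entropy(p_second)
h12 = entropy(p for row in joint for p in row)

# Product of the marginals: the statistically independent case.
indep = [[a * b for b in p_second] for a in p_first]
h12_indep = entropy(p for row in indep for p in row)

print(f"H(1) + H(2) = {h1 + h2:.4f} bits, H(1, 2) = {h12:.4f} bits")
```

With correlated symbols the joint entropy falls below the sum of the individual entropies; the gap vanishes only in the independent case.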

5. Never mind the fact that they violate Chargaff’s rule.

6. For estimates of redundancy in other western languages, see Barnard (1955).

7. Elegant examples of factorial codes can be found in Barlow et al. (1989) and Hentschel and Barlow (1991); see also Watanabe (1981, 1985).

8. At this stage we cannot tell which of the two strategies, redundancy reduction or minimum entropy, is more fundamental in the nervous system. However, since they are closely related we will continue to treat both on an equal footing under the banner of efficiency.

9. Actually it is very unlikely that the bottleneck is abrupt. It is most likely happening through a gradual constriction of data flow.

10. Of course, if the animal’s needs are very specific then it could develop specialized feature detectors (bug detectors) very early on in its pathways that are tuned for objects and patterns that are critical for its survival. Such detectors will cut down on the data rate since they discard almost everything they do not detect. In higher animals, where the needs are not very specific and where flexibility in a changing environment is critical, a better strategy is one which recodes to improve efficiency without discarding a lot of information early on. In reality, a combination of the two mechanisms is in place. For example, an animal chooses a sensory sampling unit (acuity limit or resolution) below which it discards all data.

11. In Pavlovian conditioning m1 is the conditional stimulus while m2 is the unconditional one.

12. To be more precise, it needs knowledge of the conditional probability P(m2 | m1) which is related to the joint probability through P(m2 | m1) = P(m2, m1)/P(m1). A high conditional probability P(m2 | m1) means that m1 is a good predictor of m2.
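The relation P(m2 | m1) = P(m2, m1)/P(m1) in note 12 can be illustrated with a small table. This is a sketch under assumed numbers: the joint probabilities in `joint` and the helper name `conditional` are hypothetical, chosen so that m1 is a good predictor of m2.

```python
# Hypothetical joint probabilities; keys are (m1, m2), with 1 meaning the
# event occurred and 0 meaning it did not.
joint = {(1, 1): 0.18, (1, 0): 0.02, (0, 1): 0.08, (0, 0): 0.72}

def conditional(joint, m2, m1):
    """P(m2 | m1) = P(m2, m1) / P(m1), with P(m1) obtained by summing over m2."""
    p_m1 = sum(p for (a, _), p in joint.items() if a == m1)
    return joint[(m1, m2)] / p_m1

# Probability of m2 once m1 has been observed.
p_m2_given_m1 = conditional(joint, m2=1, m1=1)

# Unconditional probability of m2, for comparison.
p_m2 = sum(p for (_, b), p in joint.items() if b == 1)

print(f"P(m2 | m1) = {p_m2_given_m1:.2f}, P(m2) = {p_m2:.2f}")
```

Here observing m1 raises the probability of m2 well above its baseline, which is the sense in which m1 predicts m2.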

13. To take an example, imagine the situation where the visual pathway recodes images into a factorial representation. Then the probability of any scene can be computed easily from the product of probabilities of the individual elements that it activates. This scene probability can be thought of in two ways, one as the probability of some complex stimulus and two as the joint probability of the features that make up the stimulus. Thus factorial codes in vision provide the visual pathway with a simple way to compute joint probabilities of visual features.
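The product rule described in note 13 can be sketched as follows. The firing probabilities in `p_active` are hypothetical, and two conventions are assumptions of this sketch rather than statements from the article: each element is treated as binary, and a silent element contributes its probability of being silent.

```python
import math

# Hypothetical firing probabilities of three independent binary elements
# in a factorial code (illustrative values, not measured ones).
p_active = [0.1, 0.3, 0.05]

def scene_probability(pattern, p_active):
    """Probability of an activity pattern (tuple of 0/1) under a factorial code."""
    return math.prod(p if on else 1.0 - p for on, p in zip(pattern, p_active))

# Joint probability that elements 0 and 1 fire while element 2 stays silent.
p_scene = scene_probability((1, 1, 0), p_active)

# Sanity check: the probabilities of all 2**3 patterns sum to one.
patterns = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
total = sum(scene_probability(s, p_active) for s in patterns)

print(f"P(scene) = {p_scene:.5f}, sum over all patterns = {total:.5f}")
```

No table of joint probabilities is stored; only one number per element is needed, which is the simplification the note points to.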

14. Since by a simple transformation we can also achieve minimum redundancy, the results of this section are equally relevant to minimum redundancy coding.

15. To make the inverse Fourier transform of (13) well defined one has to use low- and high-frequency cutoffs, which physically correspond to 1/(size of the visual field) and 1/(resolution scale) respectively. These cutoffs violate the scale invariance of R(x), which holds only as an approximate symmetry.

16. Here we are working at high luminosity, so we can ignore the role of noise and treat the problem with the tools of noiseless information theory.

17. The contrast signal is defined as (I − I0)/I0, where I is the intensity of a given pixel while I0 is the average intensity within some visual window. This definition gives a contrast that cannot be smaller than −1.
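Taking the contrast of note 17 to be (I − I0)/I0, with I0 the mean intensity of the window, the lower bound of −1 follows because intensities are non-negative. A minimal sketch with hypothetical pixel values (the function name `contrast` and the sample `window` are illustrative):

```python
def contrast(intensities):
    """Contrast (I - I0) / I0 for each pixel, with I0 the mean of the window."""
    i0 = sum(intensities) / len(intensities)
    return [(i - i0) / i0 for i in intensities]

# Hypothetical non-negative pixel intensities in one window (arbitrary units).
window = [0.0, 50.0, 100.0, 250.0]

# The window mean is I0 = 100, so a completely dark pixel maps to -1.
c = contrast(window)
print(c)  # [-1.0, -0.5, 0.0, 1.5]
```

The bound is reached only by a pixel of zero intensity; there is no corresponding upper bound.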

18. For further information about retinal organization the reader should consult reviews on the subject (e.g. Davson 1980; Shapley and Enroth-Cugell 1984; Sterling 1990).

19. The linear cells in cat are often referred to as the X cells, while in monkey they are known as the parvocellular cells, which constitute about 80% of the ganglion cells in the retina. In monkey, they are considered to be part of a pathway that extends into the deep layers and is believed to be concerned with detailed form recognition (see e.g. Van Essen and Anderson 1988).

20. There are other opponent cell types that involve blue cones. However, since blue cones are rare in the retina (non-existent in fovea) these cells are also rare and hence will not be discussed here (De Monasterio et al. 1985).

21. Since we will be assuming Gaussian signals, two-point decorrelation and statistical independence are equivalent.

22. To see this, note that under a linear transformation O = K · L, the probabilities, being densities with dO P(O) = dL P(L), transform as P(O) = P(L)/det K. Substituting this expression into the definition of H(O) and changing variables, it is straightforward to get H(O) = H(L) + log det K.
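The relation H(O) = H(L) + log det K of note 22 can be verified for a Gaussian signal, whose differential entropy has a closed form. This sketch makes several assumptions not in the article: the signal is two-dimensional, the covariance `cov_L` and filter `K` are hypothetical, entropies are in bits, and the absolute value of det K is used so that the check also covers filters with negative determinant.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose2(m):
    """Transpose of a 2x2 matrix."""
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]

def gaussian_entropy_bits(cov):
    """Differential entropy of a 2-D Gaussian: 0.5 * log2((2*pi*e)**2 * det(cov))."""
    return 0.5 * math.log2((2 * math.pi * math.e) ** 2 * det2(cov))

cov_L = [[2.0, 0.6], [0.6, 1.0]]   # hypothetical input covariance
K = [[1.5, -0.5], [0.2, 0.8]]      # hypothetical linear filter, det K = 1.3

# Under O = K . L the covariance transforms as K * cov_L * K^T.
cov_O = matmul2(matmul2(K, cov_L), transpose2(K))

h_L = gaussian_entropy_bits(cov_L)
h_O = gaussian_entropy_bits(cov_O)
shift = math.log2(abs(det2(K)))

print(f"H(O) - H(L) = {h_O - h_L:.6f} bits, log2|det K| = {shift:.6f} bits")
```

The entropy shift depends on K only through its determinant, so any filter with unit determinant, a rotation for example, leaves the entropy unchanged.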

23. In the case of primates, which are believed to have evolved in a forest-like environment, one finds that the proximity of R and G cones can be explained by the fact that most of the information in a forest is squeezed into a narrow spectral band centred about 550 nm. Thus one needs to sample that region more densely if one is to resolve different objects found in that spectral band. On the other hand, underwater, light in the spectral band between 550 nm and 610 nm is heavily absorbed by water, with the amount of absorption increasing dramatically with distance travelled. Thus if shallow-water fish had adopted pigments around 568 nm just like primates, they would not have been able to see far underwater. Shallow-water fish instead evolved cones that sampled near the infrared, an area where the signal underwater travels much farther before complete absorption. Additional discussion regarding the adaptation of the cone system of various species to the environment can be found in the excellent book of Lythgoe (1979).

24.  is a constant matrix depending only on one number, the rotation angle; it satisfies .

25. A rotation could have been done in the goldfish case also, but there the two channels Z1 and Z2 of (49) already have approximately equal S/N, so the degree of mixing is very small or negligible.
