Abstract
We investigate the psychophysical and computational processes underlying the perception of depth from texture cues in perspective projections. In particular, we analyse the similarities between the processes associated with perspective and orthographic projections. Based on a series of psychophysical experiments using white-noise stimuli presented in perspective projection, we suggest that the visual system may characterize the spatial-frequency spectrum by its average peak frequency (APF), the same characteristic used in orthographic projections. We demonstrate that normalization of the APF yields an estimate of depth in perspective projection; our previous studies suggest that in orthographic projections the output of the same normalization process represents a linear approximation of surface slant. Based on these results, together with previous psychophysical evidence, we propose a neural network model of shape from texture for the perspective view. Simulations of the model show qualitative agreement with human perception across a range of real and artificial stimuli.