
Memory modulated saliency: A computational model of the incremental learning of target locations in visual search

Pages 277-305 | Received 31 Oct 2011, Accepted 03 Mar 2013, Published online: 24 Apr 2013
 

Abstract

The top-down guidance of visual attention is one of the main factors allowing humans to effectively process vast amounts of incoming visual information. Nevertheless, we still lack a full understanding of the visual, semantic, and memory processes governing visual attention. In this paper, we present a computational model of visual search capable of predicting the most likely positions of target objects. The model does not require a separate training phase, but learns likely target positions incrementally, based on a memory of previous fixations. We evaluate the model on two search tasks and show that it outperforms saliency alone and comes close to the maximal performance of the Contextual Guidance Model (CGM; Torralba, Oliva, Castelhano, & Henderson, 2006; Ehinger, Hidalgo-Sotelo, Torralba, & Oliva, 2009), even though our model does not perform scene recognition or compute global image statistics. The search performance of our model can be further improved by combining it with the CGM.
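To make the mechanism described above concrete, here is a minimal sketch, not the authors' actual implementation, of a priority map in which bottom-up saliency is modulated by an incrementally updated memory of previously fixated target locations. The class and function names, the Gaussian representation of remembered locations, and the multiplicative combination rule are all illustrative assumptions based only on this abstract (the Gaussian choice is hinted at in Note 2 below).

```python
import numpy as np

def gaussian_map(shape, center, sigma):
    """Unit-height 2-D Gaussian centred on a (row, col) position."""
    rows, cols = np.ogrid[:shape[0], :shape[1]]
    return np.exp(-((rows - center[0]) ** 2 + (cols - center[1]) ** 2)
                  / (2 * sigma ** 2))

class MemoryModulatedSaliency:
    """Illustrative sketch: bottom-up saliency weighted by a memory map
    that is learned incrementally, one fixated target location at a time,
    with no separate training phase."""

    def __init__(self, shape, sigma=30.0):
        self.sigma = sigma             # assumed spread of a remembered location (pixels)
        self.memory = np.ones(shape)   # uniform prior before anything is remembered

    def update(self, target_location):
        # Incremental learning step: strengthen the memory map around the
        # location where the target was found on the previous trial.
        self.memory += gaussian_map(self.memory.shape, target_location, self.sigma)

    def priority(self, saliency):
        # Modulate the bottom-up saliency map by the normalized memory map
        # and renormalize, yielding a distribution over likely target positions.
        combined = saliency * (self.memory / self.memory.max())
        return combined / combined.sum()
```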

Acknowledgments

The support of the European Research Council under award number 203427 “Synchronous Linguistic and Visual Processing” is gratefully acknowledged.

We would like to thank Moreno I. Coco for sharing his data, and for numerous comments and suggestions regarding this work. We are also grateful to the authors of Ehinger, Hidalgo-Sotelo, Torralba, and Oliva (2009) and Torralba, Oliva, Castelhano, and Henderson (2006) for sharing the image corpora and eye-tracking data used in their studies.

A preliminary version of the study reported in this paper has been published as Dziemianko, Keller, and Coco (2011).

Notes

1. This approach does not require a separate training phase, which makes it more adaptable to different data sets, tasks, and experimental conditions.

2. The histograms of target positions (see below) suggest that the distribution of target locations is slightly bimodal, so a modest improvement may result from employing a mixture of Gaussians instead of a single Gaussian.
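As a sketch of this suggestion, one could compare a single Gaussian against a two-component mixture with scikit-learn and pick the fit with the lower Bayesian information criterion; the `positions` array below is synthetic and purely illustrative, since the actual target coordinates are not reproduced in this excerpt.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical (N, 2) array of target (x, y) positions in image coordinates.
rng = np.random.default_rng(0)
positions = rng.random((200, 2)) * [800, 600]

for k in (1, 2):
    gmm = GaussianMixture(n_components=k, covariance_type="full").fit(positions)
    print(f"{k} component(s): BIC = {gmm.bic(positions):.1f}")  # lower is better
```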

3. For the visual search data, the mean size of an object is 0.93° of visual angle horizontally and 1.92° vertically. For the visual count data, the mean size is 1.77° horizontally and 3.90° vertically.
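For reference, a size of θ degrees of visual angle corresponds to an on-screen extent of 2·D·tan(θ/2) at viewing distance D. The sketch below applies this standard relation; the viewing distance and pixel density are assumptions for illustration, as the excerpt does not report the experimental setup.

```python
import math

def visual_angle_to_pixels(theta_deg, viewing_distance_cm, px_per_cm):
    """On-screen extent (pixels) subtending theta_deg degrees of visual
    angle at the given viewing distance: s = 2 * D * tan(theta / 2)."""
    size_cm = 2 * viewing_distance_cm * math.tan(math.radians(theta_deg) / 2)
    return size_cm * px_per_cm

# Assumed setup: 60 cm viewing distance, ~38 px per cm (not given in the excerpt).
print(visual_angle_to_pixels(0.93, 60, 38))   # mean horizontal object extent
print(visual_angle_to_pixels(1.92, 60, 38))   # mean vertical object extent
```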

4. Thresholding works by selecting the points with the highest model values until the threshold is reached. For example, a threshold of 10% on a saliency map means that we select the most salient points until they cover 10% of the image; we then count how many of the fixations fall within these 10%. If we select 100% of the image, we trivially predict all fixations correctly.
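The procedure this note describes is straightforward to implement. The sketch below assumes the model output is a 2-D numpy array and that fixations are (x, y) pixel coordinates; both are assumptions about data formats not specified here.

```python
import numpy as np

def fixation_hit_rate(model_map, fixations, threshold=0.10):
    """Fraction of fixations that fall inside the top `threshold`
    proportion of the map, selected in order of decreasing value."""
    cutoff = np.quantile(model_map, 1.0 - threshold)  # value separating the top 10%
    selected = model_map >= cutoff                    # boolean mask of selected points
    hits = sum(selected[y, x] for x, y in fixations)  # note (row, col) = (y, x)
    return hits / len(fixations)
```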

5. Ehinger et al. (2009) designed their stimuli as follows: “For the target-present images, targets were spatially distributed across the image periphery (target locations ranged from 2.7° to 13° from the screen centre; median eccentricity was 8.6°), and were located in each quadrant of the screen with approximately equal frequency” (p. 950). The fact that the authors deliberately placed the targets at the screen periphery explains the bimodality of horizontal target positions (bottom panel). There is only a weak bimodality in vertical positions (bottom panel), probably because their target objects (always pedestrians) show a central bias vertically, which presumably counteracts the peripheral bias in the stimulus design.
