References
- Adeli, H., Vitu, F., & Zelinsky, G. J. (2017). A model of the superior colliculus predicts fixation locations during scene viewing and visual search. The Journal of Neuroscience, 37(6), 1453–1467. doi: 10.1523/JNEUROSCI.0825-16.2016
- Adeli, H., & Zelinsky, G. (2018). Deep-BCN: Deep networks meet biased competition to create a brain-inspired model of attention control. In Proceedings of the IEEE conference on computer vision and pattern recognition workshops (CVPRw) (pp. 1932–1942).
- Allport, D. A. (1980). Attention and performance. In G. Claxton (Ed.), Cognitive psychology (pp. 112–153). London: Routledge & Kegan Paul.
- Ballard, D. H., & Hayhoe, M. M. (2009). Modelling the role of task in the control of gaze. Visual Cognition, 17, 1185–1204. doi: 10.1080/13506280902978477
- Bau, D., Zhou, B., Khosla, A., Oliva, A., & Torralba, A. (2017). Network dissection: Quantifying interpretability of deep visual representations. In Proceedings of computer vision and pattern recognition (CVPR 2017).
- Beck, D., Pinsk, M., & Kastner, S. (2005). Symmetry perception in humans and macaques. Trends in Cognitive Sciences, 9, 405–406. doi: 10.1016/j.tics.2005.07.002
- Brewer, A., Liu, J., Wade, A., & Wandell, B. (2005). Visual field maps and stimulus selectivity in human ventral occipital cortex. Nature Neuroscience, 8, 1102–1109. doi: 10.1038/nn1507
- Bylinskii, Z., Judd, T., Oliva, A., Torralba, A., & Durand, F. (2016). What do different evaluation metrics tell us about saliency models? arXiv:1604.03605.
- Cadieu, C., Hong, H., Yamins, D., Pinto, N., Ardila, D., Solomon, E., … DiCarlo, J. (2014). Deep neural networks rival the representation of primate IT cortex for core visual object recognition. PLoS Computational Biology, 10, e1003963. doi: 10.1371/journal.pcbi.1003963
- Cadieu, C., Kouh, M., Pasupathy, A., Connor, C., Riesenhuber, M., & Poggio, T. (2007). A model of V4 shape selectivity and invariance. Journal of Neurophysiology, 98, 1733–1750. doi: 10.1152/jn.01265.2006
- Canziani, A., Culurciello, E., & Paszke, A. (2017). An analysis of deep neural network models for practical applications. arXiv:1605.07678v4.
- Cohen, M. A., Alvarez, G. A., Nakayama, K., & Konkle, T. (2016). Visual search for object categories is predicted by the representational architecture of high-level visual cortex. Journal of Neurophysiology, 117, 388–402. doi: 10.1152/jn.00569.2016
- Csurka, G., Dance, C. R., Fan, L., Willamowski, J., & Bray, C. (2004). Visual categorization with bags of keypoints. In Workshop on statistical learning in computer vision, ECCV (pp. 1–22).
- Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual attention. Annual Review of Neuroscience, 18, 193–222. doi: 10.1146/annurev.ne.18.030195.001205
- Desimone, R., Schein, S., Moran, J., & Ungerleider, L. (1985). Contour, color and shape analysis beyond the striate cortex. Vision Research, 25, 441–452. doi: 10.1016/0042-6989(85)90069-0
- Deubel, H., & Schneider, W. X. (1996). Saccade target selection and object recognition: Evidence for a common attentional mechanism. Vision Research, 36, 1827–1837. doi: 10.1016/0042-6989(95)00294-4
- DiCarlo, J., & Cox, D. (2007). Untangling invariant object recognition. Trends in Cognitive Sciences, 11, 333–341. doi: 10.1016/j.tics.2007.06.010
- Einhäuser, W., Spain, M., & Perona, P. (2008). Objects predict fixations better than early saliency. Journal of Vision, 8(14), 18. doi: 10.1167/8.14.18
- Engel, S., Glover, G., & Wandell, B. (1997). Retinotopic organization in human visual cortex and the spatial precision of functional MRI. Cerebral Cortex, 7, 181–192. doi: 10.1093/cercor/7.2.181
- Fize, D., Vanduffel, W., Nelissen, K., Denys, K., Chefd’Hotel, C., Faugeras, O., & Orban, G. (2003). The retinotopic organization of primate dorsal V4 and surrounding areas: A functional magnetic resonance imaging study in awake monkeys. The Journal of Neuroscience, 23, 7395–7406. doi: 10.1523/JNEUROSCI.23-19-07395.2003
- Freeman, J., & Simoncelli, E. (2011). Metamers of the ventral stream. Nature Neuroscience, 14, 1195–1201. doi: 10.1038/nn.2889
- Gattass, R., Sousa, A., Mishkin, M., & Ungerleider, L. (1997). Cortical projections of area V2 in the macaque. Cerebral Cortex, 7, 110–129. doi: 10.1093/cercor/7.2.110
- Grill-Spector, K., Weiner, K. S., Gomez, J., Stigliani, A., & Natu, V. S. (2018). The functional neuroanatomy of face perception: From brain measurements to deep neural networks. Interface Focus, 8, 20180013. doi: 10.1098/rsfs.2018.0013
- Harvey, B., & Dumoulin, S. (2011). The relationship between cortical magnification factor and population receptive field size in human visual cortex: Constancies in cortical architecture. Journal of Neuroscience, 31, 13604–13612. doi: 10.1523/JNEUROSCI.2572-11.2011
- He, K., Zhang, X., Ren, S., & Sun, J. (2015). Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. In Proceedings of the international conference on computer vision (ICCV) (pp. 1026–1034).
- He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In IEEE conference on computer vision and pattern recognition (CVPR 2016) (pp. 770–778).
- Hong, H., Yamins, D., Majaj, N., & DiCarlo, J. (2016). Explicit information for category-orthogonal object properties increases along the ventral stream. Nature Neuroscience, 19, 613–622. doi: 10.1038/nn.4247
- Hout, M. C., Robbins, A., Godwin, H. J., Fitzsimmons, G., & Scarince, C. (2017). Categorical templates are more useful when features are consistent: Evidence from eye-movements during search for societally important vehicles. Attention, Perception, & Psychophysics, 79, 1578–1592. doi: 10.3758/s13414-017-1354-1
- Huang, G., Liu, Z., Van Der Maaten, L., & Weinberger, K. Q. (2017). Densely connected convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 4700–4708).
- Huang, X., Shen, C., Boix, X., & Zhao, Q. (2015). SALICON: Reducing the semantic gap in saliency prediction by adapting deep neural networks. In Proceedings of the international conference on computer vision (ICCV).
- Ioffe, S., & Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International conference on machine learning.
- Kastner, S., De Weerd, P., Desimone, R., & Ungerleider, L. (1998). Mechanisms of directed attention in the human extrastriate cortex as revealed by functional MRI. Science, 282, 108–111. doi: 10.1126/science.282.5386.108
- Kastner, S., De Weerd, P., Pinsk, M., Elizondo, M., Desimone, R., & Ungerleider, L. (2001). Modulation of sensory suppression: Implications for receptive field sizes in the human visual cortex. Journal of Neurophysiology, 86, 1398–1411. doi: 10.1152/jn.2001.86.3.1398
- Khaligh-Razavi, S., & Kriegeskorte, N. (2014). Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Computational Biology, 10, e1003915. doi: 10.1371/journal.pcbi.1003915
- Kietzmann, T. C., McClure, P., & Kriegeskorte, N. (2019). Deep neural networks in computational neuroscience. Oxford Research Encyclopaedia of Neuroscience. doi: 10.1093/acrefore/9780190264086.013.46
- Kobatake, E., & Tanaka, K. (1994). Neuronal selectivities to complex object features in the ventral visual pathway of the macaque cerebral cortex. Journal of Neurophysiology, 71, 856–867. doi: 10.1152/jn.1994.71.3.856
- Kravitz, D., Kadharbatcha, S., Baker, C., Ungerleider, L., & Mishkin, M. (2013). The ventral visual pathway: An expanded neural framework for the processing of object quality. Trends in Cognitive Sciences, 17, 26–49. doi: 10.1016/j.tics.2012.10.011
- Kriegeskorte, N. (2015). Deep neural networks: A new framework for modelling biological vision and brain information processing. Annual Review of Vision Science, 1, 417–446. doi: 10.1146/annurev-vision-082114-035447
- Krizhevsky, A., Sutskever, I., & Hinton, G. (2012). ImageNet classification with deep convolutional neural networks. In Advances in neural information processing systems (pp. 1097–1105). Red Hook, NY: Curran Associates Inc.
- Larsson, J., & Heeger, D. (2006). Two retinotopic visual areas in human lateral occipital cortex. Journal of Neuroscience, 26, 13128–13142. doi: 10.1523/JNEUROSCI.1657-06.2006
- Li, G., & Yu, Y. (2015). Visual saliency based on multiscale deep features. In IEEE conference on computer vision and pattern recognition (CVPR 2015).
- Li, M., & Tsien, J. Z. (2017). Neural code—neural self-information theory on how cell-assembly code rises from spike time and neuronal variability. Frontiers in Cellular Neuroscience, 11, 236. doi: 10.3389/fncel.2017.00236
- Li, M., Xie, K., Kuang, H., Liu, J., Wang, D., & Fox, G. (2017). Spike-timing patterns conform to a gamma distribution with regional and cell type-specific characteristics. bioRxiv:145813.
- Lowe, D. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60, 91–110. doi: 10.1023/B:VISI.0000029664.99615.94
- Maxfield, J. T., Stalder, W. D., & Zelinsky, G. J. (2014). Effects of target typicality on categorical search. Journal of Vision, 14(12), 1–11. doi: 10.1167/14.12.1
- Maxfield, J. T., & Zelinsky, G. J. (2012). Searching through the hierarchy: How level of target categorization affects visual search. Visual Cognition, 20(10), 1153–1163. doi: 10.1080/13506285.2012.735718
- McKeefry, D., & Zeki, S. (1997). The position and topography of the human colour centre as revealed by functional magnetic resonance imaging. Brain, 120, 2229–2242. doi: 10.1093/brain/120.12.2229
- Mishkin, M., Ungerleider, L., & Macko, K. (1983). Object vision and spatial vision: Two cortical pathways. Trends in Neurosciences, 6, 414–417. doi: 10.1016/0166-2236(83)90190-X
- Nakamura, H., Gattass, R., Desimone, R., & Ungerleider, L. (1993). The modular organization of projections from areas V1 and V2 to areas V4 and TEO in macaques. The Journal of Neuroscience, 13, 3681–3691. doi: 10.1523/JNEUROSCI.13-09-03681.1993
- Nako, R., Wu, R., & Eimer, M. (2014). Rapid guidance of visual search by object categories. Journal of Experimental Psychology: Human Perception and Performance, 40(1), 50–60.
- Nassi, J. J., & Callaway, E. (2009). Parallel processing strategies of the primate visual system. Nature Reviews Neuroscience, 10, 360–372. doi: 10.1038/nrn2619
- Neider, M. B., & Zelinsky, G. J. (2006). Scene context guides eye movements during visual search. Vision Research, 46, 614–621. doi: 10.1016/j.visres.2005.08.025
- Nelson, W. W., & Loftus, G. R. (1980). The functional visual field during picture viewing. Journal of Experimental Psychology: Human Learning and Memory, 6(4), 391–399.
- Oliva, A., & Torralba, A. (2001). Modeling the shape of the scene: A holistic representation of the spatial envelope. International Journal of Computer Vision, 42, 145–175. doi: 10.1023/A:1011139631724
- Orban, G., Zhu, Q., & Vanduffel, W. (2014). The transition in the ventral stream from feature to real-world entity representations. Frontiers in Psychology, 5, 695.
- Pasupathy, A., & Connor, C. (2002). Population coding of shape in area V4. Nature Neuroscience, 5(12), 1332–1338. doi: 10.1038/nn972
- Peelen, M. V., & Kastner, S. (2011). A neural basis for real-world visual search in human occipitotemporal cortex. Proceedings of the National Academy of Sciences, 108(29), 12125–12130. doi: 10.1073/pnas.1101042108
- Rajimehr, R., Young, J., & Tootell, R. (2009). An anterior temporal face patch in human cortex, predicted by macaque maps. Proceedings of the National Academy of Sciences, 106, 1995–2000. doi: 10.1073/pnas.0807304106
- Razavian, A. S., Azizpour, H., Sullivan, J., & Carlsson, S. (2014). CNN features off-the-shelf: An astounding baseline for recognition. In Computer vision and pattern recognition workshops.
- Riesenhuber, M., & Poggio, T. (1999). Hierarchical models of object recognition in cortex. Nature Neuroscience, 2, 1019–1025. doi: 10.1038/14819
- Rousselet, G., Thorpe, S., & Fabre-Thorpe, M. (2004). How parallel is visual processing in the ventral pathway? Trends in Cognitive Sciences, 8, 363–370. doi: 10.1016/j.tics.2004.06.003
- Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., … Li, F. F. (2015). ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115, 211–252. doi: 10.1007/s11263-015-0816-y
- Sanchez, J., Perronnin, F., Mensink, T., & Verbeek, J. (2013). Image classification with the fisher vector: Theory and practice. International Journal of Computer Vision, 105, 222–245. doi: 10.1007/s11263-013-0636-x
- Schmidt, J., & Zelinsky, G. J. (2009). Search guidance is proportional to the categorical specificity of a target cue. Quarterly Journal of Experimental Psychology, 62(10), 1904–1914. doi: 10.1080/17470210902853530
- Serre, T., Kreiman, G., Kouh, M., Cadieu, C., Knoblich, U., & Poggio, T. (2007). A quantitative theory of immediate visual recognition. Progress in Brain Research, 165, 33–56. doi: 10.1016/S0079-6123(06)65004-8
- Serre, T., Oliva, A., & Poggio, T. (2007). A feedforward architecture accounts for rapid categorization. Proceedings of the National Academy of Sciences, 104, 6424–6429. doi: 10.1073/pnas.0700622104
- Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. In International conference on learning representations (ICLR 2015).
- Smith, A., Williams, A., & Greenlee, M. (2001). Estimating receptive field size from fMRI data in human striate and extrastriate visual cortex. Cerebral Cortex, 11, 1182–1190. doi: 10.1093/cercor/11.12.1182
- Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., … Rabinovich, A. (2015). Going deeper with convolutions. In IEEE conference on computer vision and pattern recognition (CVPR 2015).
- Tanaka, K. (1997). Mechanisms of visual object recognition: Monkey and human studies. Current Opinion in Neurobiology, 7, 523–529. doi: 10.1016/S0959-4388(97)80032-3
- Tarr, M. (1999). News on views: Pandemonium revisited. Nature Neuroscience, 2, 932–935. doi: 10.1038/14714
- Thorpe, S. J., Gegenfurtner, K. R., Fabre-Thorpe, M., & Bülthoff, H. H. (2001). Detection of animals in natural images using far peripheral vision. European Journal of Neuroscience, 14, 869–876. doi: 10.1046/j.0953-816x.2001.01717.x
- Tsotsos, J. K., Culhane, S. M., Wai, W. Y. K., Lai, Y., Davis, N., & Nuflo, F. (1995). Modeling visual attention via selective tuning. Artificial Intelligence, 78, 507–545. doi: 10.1016/0004-3702(95)00025-9
- Ungerleider, L., Galkin, T., Desimone, R., & Gattass, R. (2007). Cortical connections of area V4 in the macaque. Cerebral Cortex, 18, 477–499. doi: 10.1093/cercor/bhm061
- Van Essen, D. C., Lewis, J., Drury, H., Hadjikhani, N., Tootell, R., Bakircioglu, M., & Miller, M. (2001). Mapping visual cortex in monkeys and humans using surface-based atlases. Vision Research, 41, 1359–1378. doi: 10.1016/S0042-6989(01)00045-1
- Vicente, T., Hoai, M., & Samaras, D. (2015). Leave-one-out kernel optimization for shadow detection. In Proceedings of the international conference on computer vision (ICCV) (pp. 3388–3396).
- Wade, A., Brewer, A., Rieger, J., & Wandell, B. (2002). Functional measurements of human ventral occipital cortex: Retinotopy and colour. Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences, 357, 963–973. doi: 10.1098/rstb.2002.1108
- Wang, W., & Shen, J. (2017). Deep visual attention prediction. arXiv:1705.02544.
- Wolfe, J. M. (1994). Guided search 2.0: A revised model of visual search. Psychonomic Bulletin & Review, 1, 202–238. doi: 10.3758/BF03200774
- Yamins, D., Hong, H., Cadieu, C., Solomon, E., Seibert, D., & DiCarlo, J. (2014). Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proceedings of the National Academy of Sciences, 111, 8619–8624. doi: 10.1073/pnas.1403112111
- Yang, H., & Zelinsky, G. J. (2009). Visual search is guided to categorically-defined targets. Vision Research, 49, 2095–2103. doi: 10.1016/j.visres.2009.05.017
- Yu, C.-P., Hua, W.-Y., Samaras, D., & Zelinsky, G. J. (2013). Modeling clutter perception using parametric proto-object partitioning. In Advances in neural information processing systems (NIPS 2013).
- Yu, C.-P., Le, H., Zelinsky, G. J., & Samaras, D. (2015). Efficient video segmentation using parametric graph partitioning. In International conference on computer vision (ICCV).
- Yu, C.-P., Maxfield, J. T., & Zelinsky, G. J. (2016). Searching for category-consistent features: A computational approach to understanding visual category representation. Psychological Science, 27(6), 870–884. doi: 10.1177/0956797616640237
- Zagoruyko, S., & Komodakis, N. (2016). Wide residual networks. arXiv:1605.07146.
- Zeiler, M. D., & Fergus, R. (2014). Visualizing and understanding convolutional networks. In Proceedings of the European conference on computer vision (ECCV 2014).
- Zelinsky, G. J. (2008). A theory of eye movements during target acquisition. Psychological Review, 115(4), 787–835. doi: 10.1037/a0013118
- Zelinsky, G. J., Adeli, H., Peng, Y., & Samaras, D. (2013). Modelling eye movements in a categorical search task. Philosophical Transactions of the Royal Society B: Biological Sciences, 368(1628), 1–12. doi: 10.1098/rstb.2013.0058
- Zelinsky, G. J., Peng, Y., Berg, A. C., & Samaras, D. (2013). Modeling guidance and recognition in categorical search: Bridging human and computer object detection. Journal of Vision, 13(3), 30, 1–20. doi: 10.1167/13.3.30
- Zelinsky, G. J., Peng, Y., & Samaras, D. (2013). Eye can read your mind: Using eye fixations to classify search targets. Journal of Vision, 13(14), 10, 1–13. doi: 10.1167/13.14.10
- Zhang, M., Feng, J., Ma, K. T., Lim, J. H., Zhao, Q., & Kreiman, G. (2018). Finding any Waldo: Zero-shot invariant and efficient visual search. Nature Communications, 9, 3730. doi: 10.1038/s41467-018-06217-x
- Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., & Torralba, A. (2014). Object detectors emerge in deep scene CNNs. arXiv:1412.6856.