References
- Andrews, S., Tsochantaridsi, I., & Hofmann, T. (2002). Support vector machines for multiple-instance learning. In Advances in neural information processing systems (pp. 561–568).
- Banerji, S., Sinha, A., & Liu, C. (2013). A new bag of words LBP (BoWL) descriptor for scene image classification. Computer Analysis of Images and Patterns, 8047, 490–497. doi: 10.1007/978-3-642-40261-6_59
- Chatfield, K., Simonyan, K., Vedaldi, A., & Zisserman, A. (2014). Return of the devil in the details: Delving deep into convolutional nets. Computer Science, arXiv:1405.3531 v4 [cs.CV].
- Choset, H., & Nagatani, K. (2001). Topological simultaneous localization and mapping (SLAM): Toward exact localization without explicit localization. IEEE Transactions on Robotics and Automation, 17, 125–137. doi: 10.1109/70.928558
- Ding, X., Luo, Y., Li, Q., Cheng, Y., Cai, G., Munnoch, R., … Wang, B. (2018). Prior knowledge-based deep learning method for indoor object recognition and application. Systems Science & Control Engineering, 6(1), 249–257. doi: 10.1080/21642583.2018.1482477
- Durrant-Whyte, H., & Bailey, T. (2006). Simultaneous localization and mapping: Part I. IEEE Robotics & Automation Magazine, 13, 99–110. doi: 10.1109/MRA.2006.1638022
- Espinace, P., Kollar, T., Roy, N., & Soto, A. (2013). Indoor scene recognition by a mobile robot through adaptive object detection. Robotics and Autonomous Systems, 61(9), 932–947. doi: 10.1016/j.robot.2013.05.002
- Feng, F., Shen, B., & Liu, H. (2018). Visual object tracking: In the simultaneous presence of scale variation and occlusion. Systems Science & Control Engineering, 6(1), 456–466. doi: 10.1080/21642583.2018.1536899
- He, K., Zhang, X., Ren, S., & Sun, J. (2014). Spatial pyramid pooling in deep convolutional networks for visual recognition. 13th European conference on computer vision.
- He, K., Zhang, X., Ren, S., & Sun, J. (2015). Deep residual learning for image recognition. IEEE conference on computer vision and pattern recognition.
- Juneja, M., Vedaldi, A., Jawahar, C. V., & Zisserman, A. (2013). Blocks that shout: Distinctive parts for scene classification. In IEEE conference on computer vision and pattern recognition, Portland (pp. 923–930).
- Kheradpisheh, S. R., Ghodrati, M., Ganjtabesh, M., & Masquelier, T. (2016). Deep networks can resemble human feed-forward vision in invariant object recognition. Scientific Reports, 6, 32672. doi: 10.1038/srep32672
- Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in neural information processing systems (pp. 1097–1105).
- Lazebnik, S., Schmid, C., & Ponce, J. (2006). Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In Computer society conference on computer vision and pattern recognition (pp. 2169–2178).
- LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278–2324. doi: 10.1109/5.726791
- Li, E. J. (2018). Remotely sensed image scene deep learning and application: A case study of urban structure types recognition. Geography and Geo-Information Science, 34(6), 127–127.
- Li, X., Shi, J., Dong, Y., & Tao, D. (2015). Overview of scene image classification techniques. China Science: Information Science, 45(7), 827–848.
- Li, L. J., Su, H., Lim, Y., & Fei-Fei, L. (2010). Objects as attributes for scene classification. In European conference on computer vision, Heraklion (pp. 57–69).
- Li, L. J., Su, H., Xing, E. P., & Fei-Fei, L. (2010). Object bank: A high-level image representation for scene classification & semantic feature sparsification. In Advances in neural information processing systems (pp. 1378–1386).
- Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60, 91–110. doi: 10.1023/B:VISI.0000029664.99615.94
- Mao, J., Zhong, D., Hu, Y., Sheng, W., Xiao, G., & Qu, Z. (2018). An image authentication technology based on depth residual network. Systems Science & Control Engineering, 6(1), 57–70. doi: 10.1080/21642583.2018.1446056
- Matrhew, D. Z., Graham, W. T., & Fergus, R. (2011). Adaptive deconvolutional networks for mid and high level feature learning. In IEEE international conference on computer vision (pp. 2018–2025).
- Mishkin, D., Sergievskiy, N., & Matas, J. (2017). Systematic evaluation of convolution neural network advances on the Imagenet. Computer Vision and Image Understanding, 161, 11–19. doi: 10.1016/j.cviu.2017.05.007
- Navneet, D., & Bill, T. (2005). Histograms of oriented gradients for human detection. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1, 886–893.
- Oliva, A., & Torralba, A. (2001). Modeling the shape of the scene: A holistic representation of the spatial envelope. International Journal of Computer Vision, 42, 91–110. doi: 10.1023/A:1011139631724
- Oliva, A., & Torralba, A. (2006). Building the gist of a scene: The role of global image features in recognition. Progress in Brain Research, 155, 23–36. doi: 10.1016/S0079-6123(06)55002-2
- Parizi, S. N., Oberlin, J. G., & Felzenszwalb, P. F. (2012). Reconfigurable models for scene recognition. IEEE conference on computer vision and pattern.
- Perronnin, F., & Dance, C. (2007). Fisher kernels on visual vocabularies for image categorization. In Conference on computer vision and pattern recognition (pp. 1–8).
- Ponce, J., Hebert, M., Schmid, C., & Zisserman, A. (2006). Toward category-level object recognition. Berlin: Springer.
- Qayyum, A., Malik, A. S., Saad, N. M., Iqbal, M., Abdullah, M. F., Rasheed, W., … Jafaar, M. Y. B. (2017). Scene classification for aerial images based on CNN using sparse coding technique. International Journal of Remote Sensing, 38(8), 2662–2685. doi: 10.1080/01431161.2017.1296206
- Roy, S., Shivakumara, P., Jain, N., Khare, V., Dutta, A., Pal, U., & Lu, T. (2018). Rough-fuzzy based scene categorization for text detection and recognition in video. Pattern Recognition, 80, 64–82. doi: 10.1016/j.patcog.2018.02.014
- Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., … Fei-Fei, L. (2015). Imagenet large scale visual recognition challenge. International Journal of Computer Vision, 115(3), 211–252. doi: 10.1007/s11263-015-0816-y
- Sadeghi, F., & Tappen, M. F. (2012). Latent pyramidal regions for recognizing scenes. In European conference on computer vision, Florence (pp. 228–241).
- Se, S., Lowe, D. G., & Little, J. J. (2001). Vision-based mobile robot localization and mapping using scale-invariant features. In Proceedings of the IEEE international conference on robotics and automation (pp. 2051–2058).
- Simonyan, Karen, & Zisserman, Andrew. (2014). Very deep convolutional networks for large-scale image recognition. Computer Science, 1–14. arXiv:1409.1556 [cs.CV].
- Szegedy, C., Ioffe, S., & Vanhoucke, V. (2016). Inception-v4, Inception-ResNet and the impact of residual connections on learning. Retrieved from https://arxiv.org/pdf/1602.07261.pdf
- Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., … Rabinovich, A. (2015). Going deeper with convolutions. In IEEE conference on computer vision and pattern recognition (pp. 1–9).
- Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., & Wojna, Z. (2015). Rethinking the inception architecture for computer vision. Retrieved from https://arxiv.org/pdf/1512.00567.pdf
- Tang, P., Wang, H., & Kwong, S. (2017). G-MS2F: GoogLeNet based multi-stage feature fusion of deep CNN for scene recognition. Neurocomputing, 225, 188–197. doi: 10.1016/j.neucom.2016.11.023
- Tian, D. P. (2013). A review on image feature extraction and representation techniques. International Journal of Multimedia and Ubiquitous Engineering, 8(4), 385–395.
- Wang, Q., Li, Z., Wang, J., Liu, S., & Li, D. (2013). A fast processing method of foreign fiber images based on HSV color space. IFIP Advances in Information & Communication Technology, 392, 390–397. doi: 10.1007/978-3-642-36124-1_47
- Wang, B. F., Zhou, J. L., Tang, G. S., Di, K., Wan, W., Liu, C., & Wang, J. (2014). Research on visual localization method of lunar rover. Scientia Sinica (Informationis), 44, 452–460.
- Wu, W. R., Wang, D. Y., Xing, Y., Gong, X., & Liu, J. (2011). Binocular visual odometry algorithm and experimentation research for the lunar rover. Scientia Sinica Informationis, 41, 1415–1422.
- Xiao, J. X., Hays, J., Ehinger, K. A., Oliva, A., & Torralba, A. (2010). Sun database: Large-scale scene recognition from abbey to zoo. In IEEE conference on computer vision and pattern recognition, San Francisco (pp. 3485–3492).
- Yu, X. R., Wu, X. M., Luo, C. B., & Ren, P. (2017). Deep learning in remote sensing scene classification: A data augmentation enhanced convolutional neural network framework. GIScience & Remote Sensing, 54(5), 741–758. doi: 10.1080/15481603.2017.1323377
- Zeiler, M. D., & Fergus, R. (2014). Visualizing and understanding convolutional networks. In European conference on computer vision (pp. 818–833).
- Zhou, W. Y., He, X. H., Qing, L. B., Wan, Y. J., & Zheng, X. B. (2019). Recognizing building areas under construction in complex scenarios. Computer Systems & Applications, 28(1), 140–146.
- Zhou, L., Hu, D. W., & Zhou, Z. T. (2013). Scene recognition combining structural and textural features. Science China Information Sciences, 56(78), 1–14.