Research Article

Scene-level buildings damage recognition based on Cross Conv-Transformer

Pages 3987-4007 | Received 05 Jul 2023, Accepted 15 Sep 2023, Published online: 28 Sep 2023

References

  • Adriano, B., J. Xia, G. Baier, N. Yokoya, and S. Koshimura. 2019. “Multi-Source Data Fusion Based on Ensemble Learning for Rapid Building Damage Mapping During the 2018 Sulawesi Earthquake and Tsunami in Palu, Indonesia.” Remote Sensing 11: 886. doi:10.3390/rs11070886.
  • Akhmadiya, A., N. Nabiyev, K. Moldamurat, K. Dyusekeev, and S. Atanov. 2020. “Change Detection Based Building Damage Assessment Method Using Radar Imageries with GLCM Textural Parameters.” doi:10.20944/preprints202001.0225.v1.
  • Bazi, Y., L. Bashmal, M. Rahhal, R. Dayil, and N. Ajlan. 2021. “Vision Transformers for Remote Sensing Image Classification.” Remote Sensing 13 (3): 1–20. doi:10.3390/rs13030516.
  • Bialas, J., T. Oommen, and T. Havens. 2019. “Optimal Segmentation of High Spatial Resolution Images for the Classification of Buildings Using Random Forests.” International Journal of Applied Earth Observation and Geoinformation 82: 101895. doi:10.1016/j.jag.2019.06.005.
  • Chen, C., Q. Fan, N. Mallinar, T. Sercu, and R. Feris. 2019. “Big-Little Net: An Efficient Multi-Scale Feature Representation for Visual and Speech Recognition.” 7th International Conference on Learning Representations, ICLR, 1–20.
  • Chen, C., Q. Fan, and R. Panda. 2021. “CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification.” IEEE International Conference on Computer Vision, 347–356. doi:10.1109/ICCV48922.2021.00041.
  • Chen, S., X. Wang, and P. Xiao. 2018. “Urban Damage Level Mapping Based on Co-Polarization Coherence Pattern Using Multitemporal Polarimetric SAR Data.” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 11 (8): 2657–2667. doi:10.1109/JSTARS.2018.2818939.
  • Dosovitskiy, A., L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, and M. Dehghani. 2020. “An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale.” arXiv:2010.11929. http://arxiv.org/abs/2010.11929.
  • Duarte, D., F. Nex, and N. Kerle. 2020. “Detection of Seismic Façade Damages with Multi-Temporal Oblique Aerial Imagery.” GIScience & Remote Sensing 57 (5): 670–686. doi:10.1080/15481603.2020.1768768.
  • Fan, Q., C. F. R. Chen, H. Kuehne, et al. 2019a. “More is Less: Learning Efficient Video Representations by Big-Little Network and Depthwise Temporal Aggregation.” Advances in Neural Information Processing Systems 32.
  • Fan, X., G. Nie, Y. Deng, et al. 2019b. “Estimating Earthquake-Damage Areas Using Landsat-8 OLI Surface Reflectance Data.” International Journal of Disaster Risk Reduction 33: 275–283. doi:10.1016/j.ijdrr.2018.10.013.
  • Fan, X., G. Nie, C. Xia, et al. 2021. “Estimation of Pixel-Level Seismic Vulnerability of the Building Environment Based on Mid-resolution Optical Remote Sensing Images.” International Journal of Applied Earth Observation and Geoinformation 101: 102339. doi:10.1016/j.jag.2021.102339.
  • Foody, G. M. 2020. “Explaining the Unsuitability of the Kappa Coefficient in the Assessment and Comparison of the Accuracy of Thematic Maps Obtained by Image Classification.” Remote Sensing of Environment 239: 111630. doi:10.1016/j.rse.2019.111630.
  • Gebrehiwot, A., L. Beni, G. Thompson, P. Kordjamshidi, and T. Langan. 2019. “Deep Convolutional Neural Network for Flood Extent Mapping Using Unmanned Aerial Vehicles Data.” Sensors 19 (7): 1486. doi:10.3390/s19071486.
  • Graham, B., A. El-Nouby, H. Touvron, et al. 2021. “LeViT: A Vision Transformer in ConvNet’s Clothing for Faster Inference.” IEEE/CVF International Conference on Computer Vision, 12259–12269.
  • Gulati, A., J. Qin, C. C. Chiu, et al. 2020. “Conformer: Convolution-Augmented Transformer for Speech Recognition.” The Annual Conference of the International Speech Communication Association, INTERSPEECH, 10: 5036–5040. doi:10.21437/Interspeech.2020-3015.
  • Hong, D., Z. Han, J. Yao, et al. 2021. “SpectralFormer: Rethinking Hyperspectral Image Classification with Transformers.” IEEE Transactions on Geoscience and Remote Sensing 60: 1–15. doi:10.1109/TGRS.2021.3130716.
  • Huang, J., J. Tao, B. Liu, et al. 2020. “Multimodal Transformer Fusion for Continuous Emotion Recognition.” IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 3507–3511.
  • Jia, S., and Y. Wang. 2022. “Multiscale Convolutional Transformer with Center Mask Pretraining for Hyperspectral Image Classification.” arXiv:2203.04771. http://arxiv.org/abs/2203.04771.
  • Knyaz, V. A., V. V. Kniaz, and F. Remondino. 2018. “Image-to-Voxel Model Translation with Conditional Adversarial Networks.” The European Conference on Computer Vision (ECCV) Workshops, 1: 1–19.
  • Liu, C., S. M. E. Sepasgozar, Q. Zhang, et al. 2022. “A Novel Attention-Based Deep Learning Method for Post-Disaster Building Damage Classification.” Expert Systems with Applications 202: 117268. doi:10.1016/j.eswa.2022.117268.
  • Ma, L., Y. Liu, X. Zhang, et al. 2019. “Deep Learning in Remote Sensing Applications: A Meta-Analysis and Review.” ISPRS Journal of Photogrammetry and Remote Sensing 152: 166–177. doi:10.1016/j.isprsjprs.2019.04.015.
  • Mangalathu, S., H. Sun, C. C. Nweke, et al. 2020. “Classifying Earthquake Damage to Buildings Using Machine Learning.” Earthquake Spectra 36 (1): 183–208. doi:10.1177/8755293019878137.
  • Mangan, S., and U. Alon. 2003. “Structure and Function of the Feed-Forward Loop Network Motif.” Proceedings of the National Academy of Sciences 100 (21): 11980–11985. doi:10.1073/pnas.2133841100.
  • Mohammadian, A., and F. Ghaderi. 2022. “SiamixFormer: A Siamese Transformer Network for Building Detection and Change Detection from Bi-Temporal Remote Sensing Images.” arXiv:2208.00657. http://arxiv.org/abs/2208.00657.
  • Naito, S., H. Tomozawa, Y. Mori, et al. 2020. “Building-Damage Detection Method Based on Machine Learning Utilizing Aerial Photographs of the Kumamoto earthquake.” Earthquake Spectra 36 (3): 1166–1187. doi:10.1177/8755293019901309.
  • Prashath, R. R., N. Priyadharshini, and C. B. Lakshmi. 2021. “Aerial Image Based Calamity Monitoring Using Deep Learning for Emergency Responsive Applications.” IOP Conference Series: Materials Science and Engineering 1055: 012094. doi:10.1088/1757-899X/1055/1/012094.
  • Radford, A., and S. Tim. 2021. “Language Understanding.” Encyclopedia of Autism Spectrum Disorders, 2640–2640. doi:10.1007/978-3-319-91280-6_300915.
  • Rosso, M. M., G. Marasco, S. Aiello, et al. 2023. “Convolutional Networks and Transformers for Intelligent Road Tunnel Investigations.” Computers & Structures 275: 106918. doi:10.1016/j.compstruc.2022.106918.
  • Roy, R., S. S. Kulkarni, V. Soni, et al. 2022. “Transformer-Based Flood Scene Segmentation for Developing Countries.” arXiv:2210.04218. http://arxiv.org/abs/2210.04218.
  • Settou, T., M. K. Kholladi, and A. Ben Ali. 2022. “Improving Damage Classification Via Hybrid Deep Learning Feature Representations Derived from Post-Earthquake Aerial Images.” International Journal of Image and Data Fusion 13 (1): 1–20. doi:10.1080/19479832.2020.1864787.
  • Shen, Y., S. Zhu, T. Yang, C. Chen, D. Pan, J. Chen, L. Xiao, and Q. Du. 2022. “BDANet: Multiscale Convolutional Neural Network with Cross-Directional Attention for Building Damage Assessment from Satellite Images.” IEEE Transactions on Geoscience and Remote Sensing 60: 1–16. doi:10.1109/TGRS.2021.3080580.
  • Shi, L., F. Zhang, J. Xia, J. Xie, Z. Zhang, Z. Du, and R. Liu. 2021. “Identifying Damaged Buildings in Aerial Images Using the Object Detection Method.” Remote Sensing 13 (21): 4213. doi:10.3390/rs13214213.
  • Shocher, A., Y. Gandelsman, I. Mosseri, M. Yarom, M. Irani, W. T. Freeman, and T. Dekel. 2020. “Semantic Pyramid for Image Generation.” Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 7455–7464. doi:10.1109/CVPR42600.2020.00748.
  • Touvron, H., M. Cord, M. Douze, F. Massa, A. Sablayrolles, and H. Jégou. 2020. “Training Data-Efficient Image Transformers & Distillation Through Attention.” arXiv:2012.12877. http://arxiv.org/abs/2012.12877.
  • Vaswani, A., N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, and I. Polosukhin. 2017. “Attention is All You Need.” Advances in Neural Information Processing Systems 30: 5998–6008.
  • Voita, E., D. Talbot, F. Moiseev, R. Sennrich, and I. Titov. 2020. “Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned.” arXiv:1905.09418. doi:10.48550/arXiv.1905.09418.
  • Wang, Y., A. W. Z. Chew, and L. Zhang. 2022. “Building Damage Detection from Satellite Images After Natural Disasters on Extremely Imbalanced Datasets.” Automation in Construction 140: 104328. doi:10.1016/j.autcon.2022.104328.
  • Wang, X., and P. Li. 2020. “Extraction of Urban Building Damage Using Spectral, Height and Corner Information from VHR Satellite Images and Airborne LiDAR Data.” ISPRS Journal of Photogrammetry and Remote Sensing 159: 322–336. doi:10.1016/j.isprsjprs.2019.11.028.
  • Wang, W., E. Xie, X. Li, D. Fan, K. Song, D. Liang, T. Lu, P. Luo, and L. Shao. 2021. “Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction Without Convolutions.” arXiv:2102.12122. http://arxiv.org/abs/2102.12122.
  • Wu, H., B. Xiao, N. Codella, M. Liu, X. Dai, L. Yuan, and L. Zhang. 2021. “CvT: Introducing Convolutions to Vision Transformers.” arXiv:2103.15808. http://arxiv.org/abs/2103.15808.
  • Xiong, R., Y. Yang, D. He, K. Zheng, S. Zheng, and T. Liu. 2020. “On Layer Normalization in the Transformer Architecture.” International Conference on Machine Learning, 10524–10533. https://proceedings.mlr.press/v119/xiong20b.
  • Xu, J., Y. Zhang, and D. Miao. 2020. “Three-Way Confusion Matrix for Classification: A Measure Driven View.” Information Sciences 507: 772–794. doi:10.1016/j.ins.2019.06.064.
  • Yang, W., W. Zhang, and P. Luo. 2021. “Transferability of Convolutional Neural Network Models for Identifying Damaged Buildings Due to Earthquake.” Remote Sensing 13 (3): 1–20. doi:10.3390/rs13030504.
  • Yuan, L., Y. Chen, T. Wang, W. Yu, Y. Shi, Z. Jiang, and S. Yan. 2021. “Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet.” arXiv:2101.11986. doi:10.48550/arXiv.2101.11986.
  • Zhang, B., Z. Hu, P. Wu, H. Huang, and J. Xiang. 2023. “EPT: A Data-Driven Transformer Model for Earthquake Prediction.” Engineering Applications of Artificial Intelligence 4: 106176. doi:10.1016/j.engappai.2023.106176.
  • Zhao, L., J. Yang, et al. 2013. “Damage Assessment in Urban Areas Using Post-Earthquake Airborne PolSAR Imagery.” International Journal of Remote Sensing 34 (24): 8952–8966. doi:10.1080/01431161.2013.860566.
  • Zhu, X., D. Tuia, L. Mou, G. Xia, L. Zhang, F. Xu, and F. Fraundorfer. 2017. “Deep Learning in Remote Sensing: A Comprehensive Review and List of Resources.” IEEE Geoscience and Remote Sensing Magazine 5: 8–36. doi:10.1109/MGRS.2017.2762307.