Scene-level buildings damage recognition based on Cross Conv-Transformer

Lingfei Shi (a), Feng Zhang (a, b; corresponding author), Junshi Xia (c), Jibo Xie (d)

(a) School of Earth Sciences, Zhejiang University, Hangzhou, People’s Republic of China
(b) Zhejiang Provincial Key Laboratory of Geographic Information Science, Hangzhou, People’s Republic of China
(c) Geoinformatics Unit, RIKEN Center for Advanced Intelligence Project, Tokyo, Japan
(d) Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, People’s Republic of China

Pages 3987-4007 | Received 05 Jul 2023, Accepted 15 Sep 2023, Published online: 28 Sep 2023

https://doi.org/10.1080/17538947.2023.2261770

References

Adriano, B., J. Xia, G. Baier, N. Yokoya, and S. Koshimura. 2019. “Multi-Source Data Fusion Based on Ensemble Learning for Rapid Building Damage Mapping During the 2018 Sulawesi Earthquake and Tsunami in Palu, Indonesia.” Remote Sensing 11: 886. doi:10.3390/rs11070886.
Akhmadiya, A., N. Nabiyev, K. Moldamurat, K. Dyusekeev, and S. Atanov. 2020. “Change Detection Based Building Damage Assessment Method Using Radar Imageries with GLCM Textural Parameters.” doi:10.20944/preprints202001.0225.v1.
Bazi, Y., L. Bashmal, M. Rahhal, R. Dayil, and N. Ajlan. 2021. “Vision Transformers for Remote Sensing Image Classification.” Remote Sensing 13 (3): 1–20. doi:10.3390/rs13030516.
Bialas, J., T. Oommen, and T. Havens. 2019. “Optimal Segmentation of High Spatial Resolution Images for the Classification of Buildings Using Random Forests.” International Journal of Applied Earth Observation and Geoinformation 82: 101895. doi:10.1016/j.jag.2019.06.005.
Chen, C., Q. Fan, N. Mallinar, T. Sercu, and R. Feris. 2019. “Big-Little Net: An Efficient Multi-Scale Feature Representation for Visual and Speech Recognition.” 7th International Conference on Learning Representations, ICLR, 1–20.
Chen, C., Q. Fan, and R. Panda. 2021. “CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification.” IEEE International Conference on Computer Vision, 347–356. doi:10.1109/ICCV48922.2021.00041.
Chen, S., X. Wang, and P. Xiao. 2018. “Urban Damage Level Mapping Based on Co-Polarization Coherence Pattern Using Multitemporal Polarimetric SAR Data.” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 11 (8): 2657–2667. doi:10.1109/JSTARS.2018.2818939.
Dosovitskiy, A., L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, and M. Dehghani. 2020. “An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale.” arXiv:2010.11929. http://arxiv.org/abs/2010.11929.
Duarte, D., F. Nex, and N. Kerle. 2020. “Detection of Seismic Façade Damages with Multi-Temporal Oblique Aerial Imagery.” GIScience & Remote Sensing 57 (5): 670–686. doi:10.1080/15481603.2020.1768768.
Fan, Q., C. F. R. Chen, H. Kuehne, et al. 2019a. “More is Less: Learning Efficient Video Representations by Big-Little Network and Depthwise Temporal Aggregation.” Advances in Neural Information Processing Systems 32.
Fan, X., G. Nie, Y. Deng, et al. 2019b. “Estimating Earthquake-Damage Areas Using Landsat-8 OLI Surface Reflectance Data.” International Journal of Disaster Risk Reduction 33: 275–283. doi:10.1016/j.ijdrr.2018.10.013.
Fan, X., G. Nie, C. Xia, et al. 2021. “Estimation of Pixel-Level Seismic Vulnerability of the Building Environment Based on Mid-resolution Optical Remote Sensing Images.” International Journal of Applied Earth Observation and Geoinformation 101: 102339. doi:10.1016/j.jag.2021.102339.
Foody, G. M. 2020. “Explaining the Unsuitability of the Kappa Coefficient in the Assessment and Comparison of the Accuracy of Thematic Maps Obtained by Image Classification.” Remote Sensing of Environment 239: 111630. doi:10.1016/j.rse.2019.111630.
Gebrehiwot, A., L. Beni, G. Thompson, P. Kordjamshidi, and T. Langan. 2019. “Deep Convolutional Neural Network for Models for Identifying Damaged Buildings Aerial Vehicles Data.” Sensors 19 (7): 1486. doi:10.3390/s19071486.
Graham, B., A. El-Nouby, H. Touvron, et al. 2021. “Levit: A Vision Transformer in ConvNet’s Clothing for Faster Inference.” IEEE/CVF International Conference on Computer Vision, 12259–12269.
Gulati, A., J. Qin, C. C. Chiu, et al. 2020. “Conformer: Convolution-Augmented Transformer for Speech Recognition.” The Annual Conference of the International Speech Communication Association, INTERSPEECH, 10: 5036–5040. doi:10.21437/Interspeech.2020-3015.
Hong, D., Z. Han, J. Yao, et al. 2021. “SpectralFormer: Rethinking Hyperspectral Image Classification with Transformers.” IEEE Transactions on Geoscience and Remote Sensing 60: 1–15. doi:10.1109/TGRS.2021.3130716.
Huang, J., J. Tao, B. Liu, et al. 2020. “Multimodal Transformer Fusion for Continuous Emotion Recognition.” IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 3507–3511.
Jia, S., and Y. Wang. 2022. “Multiscale Convolutional Transformer with Center Mask Pretraining for Hyperspectral Image Classification.” arXiv:2203.04771. http://arxiv.org/abs/2203.04771.
Knyaz, V. A., V. V. Kniaz, and F. Remondino. 2018. “Image-to-Voxel Model Translation with Conditional Adversarial Networks.” The European Conference on Computer Vision (ECCV) Workshops, 1: 1–19.
Liu, C., S. M. E. Sepasgozar, Q. Zhang, et al. 2022. “A Novel Attention-Based Deep Learning Method for Post-Disaster Building Damage Classification.” Expert Systems with Applications 202: 117268. doi:10.1016/j.eswa.2022.117268.
Ma, L., Y. Liu, X. Zhang, et al. 2019. “Deep Learning in Remote Sensing Applications: A Meta-Analysis and Review.” ISPRS Journal of Photogrammetry and Remote Sensing 152: 166–177. doi:10.1016/j.isprsjprs.2019.04.015.
Mangalathu, S., H. Sun, C. C. Nweke, et al. 2020. “Classifying Earthquake Damage to Buildings Using Machine Learning.” Earthquake Spectra 36 (1): 183–208. doi:10.1177/8755293019878137.
Mangan, S., and U. Alon. 2003. “Structure and Function of the Feed-Forward Loop Network Motif.” The National Academy of Sciences 100 (21): 11980–11985. doi:10.1073/pnas.2133841100.
Mohammadian, A., and F. Ghaderi. 2022. “SiamixFormer: A Siamese Transformer Network for Building Detection and Change Detection from Bi-Temporal Remote Sensing Images.” arXiv:2208.00657. http://arxiv.org/abs/2208.00657.
Naito, S., H. Tomozawa, Y. Mori, et al. 2020. “Building-Damage Detection Method Based on Machine Learning Utilizing Aerial Photographs of the Kumamoto Earthquake.” Earthquake Spectra 36 (3): 1166–1187. doi:10.1177/8755293019901309.
Prashath, R. R., N. Priyadharshini, and C. B. Lakshmi. 2021. “Aerial Image Based Calamity Monitoring Using Deep Learning for Emergency Responsive Applications.” IOP Conference Series: Materials Science and Engineering 1055: 012094. doi:10.1088/1757-899X/1055/1/012094.
Radford, A., and S. Tim. 2021. “Language Understanding.” Encyclopedia of Autism Spectrum Disorders, 2640–2640. doi:10.1007/978-3-319-91280-6_300915.
Rosso, M. M., G. Marasco, S. Aiello, et al. 2023. “Convolutional Networks and Transformers for Intelligent Road Tunnel Investigations.” Computers & Structures 275: 106918. doi:10.1016/j.compstruc.2022.106918.
Roy, R., S. S. Kulkarni, V. Soni, et al. 2022. “Transformer-Based Flood Scene Segmentation for Developing Countries.” arXiv:2210.04218. http://arxiv.org/abs/2210.04218.
Settou, T., M. K. Kholladi, and A. Ben Ali. 2022. “Improving Damage Classification Via Hybrid Deep Learning Feature Representations Derived from Post-Earthquake Aerial Images.” International Journal of Image and Data Fusion 13 (1): 1–20. doi:10.1080/19479832.2020.1864787.
Shen, Y., S. Zhu, T. Yang, C. Chen, D. Pan, J. Chen, L. Xiao, and Q. Du. 2022. “BDANet: Multiscale Convolutional Neural Network with Cross-Directional Attention for Building Damage Assessment from Satellite Images.” IEEE Transactions on Geoscience and Remote Sensing 60: 1–16. doi:10.1109/TGRS.2021.3080580.
Shi, L., F. Zhang, J. Xia, J. Xie, Z. Zhang, Z. Du, and R. Liu. 2021. “Identifying Damaged Buildings in Aerial Images Using the Object Detection Method.” Remote Sensing 13 (21): 4213. doi:10.3390/rs13214213.
Shocher, A., G. Yossi, M. Inbar, Y. Michal, I. Michal, F. William, and D. Tali. 2020. “Semantic Pyramid for Image Generation.” Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 7455–7464. doi:10.1109/CVPR42600.2020.00748.
Touvron, H., C. Matthieu, D. Matthijs, M. Francisco, S. Alexandre, and J. Hervé. 2020. “Training Data-Efficient Image Transformers & Distillation Through Attention.” arXiv:2012.12877. http://arxiv.org/abs/2012.12877.
Vaswani, A., N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, and I. Polosukhin. 2017. “Attention is All You Need.” Advances in Neural Information Processing Systems 12: 5999–6009.
Voita, E., D. Talbot, F. Moiseev, R. Sennrich, and I. Titov. 2020. “Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned.” arXiv:1905.09418. doi:10.48550/arXiv.1905.09418.
Wang, Y., C. Alvin Wei, and L. Zhang. 2022. “Building Damage Detection from Satellite Images After Natural Disasters on Extremely Imbalanced Datasets.” Automation in Construction 140: 104328. doi:10.1016/j.autcon.2022.104328.
Wang, X., and P. Li. 2020. “Extraction of Urban Building Damage Using Spectral, Height and Corner Information from VHR Satellite Images and Airborne LiDAR Data.” ISPRS Journal of Photogrammetry and Remote Sensing 159: 322–336. doi:10.1016/j.isprsjprs.2019.11.028.
Wang, W., E. Xie, X. Li, D. Fan, K. Song, D. Liang, T. Lu, P. Luo, and L. Shao. 2021. “Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction Without Convolutions.” arXiv:2102.12122. http://arxiv.org/abs/2102.12122.
Wu, H., B. Xiao, N. Codella, M. Liu, X. Dai, L. Yuan, and L. Zhang. 2021. “CvT: Introducing Convolutions to Vision Transformers.” arXiv:2103.15808. http://arxiv.org/abs/2103.15808.
Xiong, R., Y. Yang, D. He, K. Zheng, S. Zheng, and T. Liu. 2020. “On Layer Normalization in the Transformer Architecture.” International Conference on Machine Learning, 10524–10533. https://proceedings.mlr.press/v119/xiong20b.
Xu, J., Y. Zhang, and D. Miao. 2020. “Three-Way Confusion Matrix for Classification: A Measure Driven View.” Information Sciences 507: 772–794. doi:10.1016/j.ins.2019.06.064.
Yang, W., W. Zhang, and P. Luo. 2021. “Transferability of Convolutional Neural Network Models for Identifying Damaged Buildings Due to Earthquake.” Remote Sensing 13 (3): 1–20. doi:10.3390/rs13030504.
Yuan, L., Y. Chen, T. Wang, W. Yu, Y. Shi, Z. Jiang, and S. Yan. 2021. “Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet.” arXiv:2101.11986. doi:10.48550/arXiv.2101.11986.
Zhang, B., Z. Hu, P. Wu, H. Huang, and J. Xiang. 2023. “EPT: A Data-Driven Transformer Model for Earthquake Prediction.” Engineering Applications of Artificial Intelligence 4: 106176. doi:10.1016/j.engappai.2023.106176.
Zhao, L., J. Yang, et al. 2013. “Damage Assessment in Urban Areas Using Post-Earthquake Airborne PolSAR Imagery.” International Journal of Remote Sensing 34 (24): 8952–8966. doi:10.1080/01431161.2013.860566.
Zhu, X., D. Tuia, L. Mou, G. Xia, L. Zhang, F. Xu, and F. Fraundorfer. 2017. “Deep Learning in Remote Sensing: A Comprehensive Review and List of Resources.” IEEE Geoscience and Remote Sensing Magazine 5: 8–36. doi:10.1109/MGRS.2017.2762307.