ABSTRACT
Object detection in optical remote sensing images has made remarkable progress in computer vision. Unlike objects in natural images, objects in remote sensing images are generally more complex and diverse. Most methods therefore design deeper networks to capture more discriminative features for each object region, while ignoring the semantic correlation between scenes and ground objects. In this work, we introduce a novel multi-knowledge learning module (MKLM) that combines two kinds of knowledge to adaptively enhance the original feature representation of the base detector through information interaction. Specifically, the internal knowledge module learns global and local contextual relationships (e.g. feature similarity, spatial location) among object proposals, while the external knowledge module learns human common-sense knowledge (e.g. co-occurrence, inherent properties) to impose constraints. The object states are then updated under the guidance of these different kinds of knowledge. MKLM is easy to extend and transfer and can be embedded in any detection paradigm. Extensive experiments on two challenging remote sensing image object detection datasets (DOTA and RSOD) show that the proposed method achieves superior detection performance compared with many state-of-the-art methods and baselines.
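To make the idea concrete, the following is a minimal, hypothetical sketch (not the authors' implementation) of how per-proposal features might be refined with an internal relation term built from feature similarity and an external prior built from class co-occurrence statistics; all module names, shapes, and design choices here are illustrative assumptions.

```python
# Conceptual sketch only: a simplified multi-knowledge refinement step.
# Internal knowledge  -> similarity-based relation among object proposals.
# External knowledge  -> class co-occurrence prior imposed as a constraint.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiKnowledgeLearning(nn.Module):
    def __init__(self, feat_dim, num_classes, cooccurrence):
        super().__init__()
        # Internal knowledge: learn an appearance-similarity relation.
        self.query = nn.Linear(feat_dim, feat_dim)
        self.key = nn.Linear(feat_dim, feat_dim)
        # External knowledge: fixed (num_classes x num_classes) co-occurrence
        # matrix, e.g. counted from training-set labels (assumed input).
        self.register_buffer("cooccurrence", cooccurrence)
        self.class_embed = nn.Linear(num_classes, feat_dim)
        self.fuse = nn.Linear(feat_dim * 2, feat_dim)

    def forward(self, proposal_feats, class_logits):
        # proposal_feats: (N, feat_dim); class_logits: (N, num_classes)
        # Internal relation: attention over proposals by feature similarity.
        q, k = self.query(proposal_feats), self.key(proposal_feats)
        relation = F.softmax(q @ k.t() / q.shape[-1] ** 0.5, dim=-1)
        internal = relation @ proposal_feats
        # External constraint: propagate class scores through co-occurrence
        # statistics and embed them back into feature space.
        probs = F.softmax(class_logits, dim=-1)
        external = self.class_embed(probs @ self.cooccurrence)
        # Fuse both knowledge terms and residually update the features.
        update = self.fuse(torch.cat([internal, external], dim=-1))
        return proposal_feats + update
```

Such a module could be inserted after the RoI feature extractor of a standard two-stage detector, so the refined features feed the existing classification and regression heads unchanged.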
Acknowledgements
The authors wish to thank Jiaming Han, Jian Ding, and Yong Liu for providing their implementation source code, which greatly facilitated the comparison experiments.
Disclosure statement
No potential conflict of interest was reported by the author(s).