ABSTRACT
Smart manufacturing is a major development trend, and intelligent defect recognition is essential in smart manufacturing systems for both quality control and decision-making. However, both the recognition performance and the interpretability of current methods still need to be improved. The Vision Transformer (ViT), currently a research hotspot, offers outstanding performance and interpretability in image recognition and has shown potential for intelligent defect recognition. ViT, however, requires large numbers of samples, whereas small-sample datasets are common in real-world cases. Such datasets contain less information, which causes ViT to overfit and misclassify and greatly impedes its application. To address this problem, a multi-scale spatial feature fusion-based ViT is proposed for small-sample defect recognition. The proposed method simulates human vision to extract multi-level features of defects, and three improved ViTs are built to fuse these features. The experimental results indicate that the proposed method achieves improved performance on small-sample defect recognition. Compared with other deep learning (DL) and defect recognition methods, it improves accuracy by 1.5% to 20.07% on wood defects and achieves an accuracy of 100% on steel defects. Furthermore, the visualization results show that the proposed method is interpretable, which is helpful for defect analysis.
Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grant Nos. 52205523, U21B2029 and 52188102, and by the Key R&D Program of Hubei Province under Grant No. 2021AAB001.
Disclosure statement
No potential conflict of interest was reported by the author(s).