DOI: 10.1002/suco.202400848 ISSN: 1464-4177

Crack semantic segmentation performance of various attention modules in different scenarios

Junwen Zheng, Lingkun Chen, Nan Chen, Qizhi Chen, Qingyun Miao, Hesong Jin, Lizhong Jiang, Tuan Ngo

Abstract

As the core of the Transformer, the attention mechanism is crucial in model design. However, the performance of attention modules varies across datasets. Additionally, hyperparameter settings significantly affect the performance of attention modules, complicating the selection of an appropriate module. To fill the current research gap regarding the performance of attention modules in crack recognition, this study investigates five highly cited attention modules across three datasets with different styles. By setting nine combinations of learning rates and batch sizes, we conducted 135 comparative experiments. Precision, Recall, F1‐score, and mIoU were used as evaluation metrics to analyze the recognition accuracy and loss‐curve convergence of each attention module. Parameter count, frames per second (FPS), and floating‐point operations (FLOPs) were used to compare the computational efficiency of each module. The results indicate that channel attention (CA), the bottleneck attention module (BAM), and the convolutional block attention module (CBAM) outperform dual attention (DA) and self‐attention (SA) in crack recognition and resistance to interference. The modules achieve good convergence with learning rate/batch size combinations of 1 × 10−4/4, 1 × 10−4/8, and 1 × 10−4/16. We recommend 1 × 10−4/4 as the initial hyperparameter setting in future work. Although DA and SA have higher parameter counts than CA, BAM, and CBAM, the FPS and FLOPs values of the modules differ only minimally at the same batch size.
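The evaluation metrics named in the abstract (Precision, Recall, F1‐score, and mIoU) are standard for binary crack segmentation. The sketch below is not from the paper; it is a minimal illustration, assuming binary masks where True marks a crack pixel, with mIoU averaged over the two classes (crack and background):

```python
import numpy as np

def segmentation_metrics(pred, gt):
    """Precision, Recall, F1, and mIoU for binary segmentation masks.

    pred, gt: boolean arrays of the same shape (True = crack pixel).
    mIoU averages the IoU of the crack class and the background class.
    """
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()        # crack pixels correctly found
    fp = np.logical_and(pred, ~gt).sum()       # background flagged as crack
    fn = np.logical_and(~pred, gt).sum()       # crack pixels missed
    tn = np.logical_and(~pred, ~gt).sum()      # background correctly ignored

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    iou_crack = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    iou_bg = tn / (tn + fp + fn) if (tn + fp + fn) else 0.0
    miou = (iou_crack + iou_bg) / 2
    return precision, recall, f1, miou

# Toy 4x4 masks (hypothetical data, for illustration only)
pred = np.array([[1, 1, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]], dtype=bool)
gt = np.array([[1, 0, 0, 0],
               [0, 1, 1, 0],
               [0, 0, 0, 0],
               [0, 0, 0, 0]], dtype=bool)
p, r, f1, miou = segmentation_metrics(pred, gt)
```

Because crack pixels are typically a small minority of each image, mIoU averaged over both classes can look deceptively high; reporting the crack-class IoU alongside it is a common safeguard.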