TY - GEN
T1 - AM-UNet: Attention Mamba U-Net for Medical Image Segmentation
AU - Wang, Meiyun
AU - Guo, Changlu
AU - Yi, Yugen
PY - 2026
Y1 - 2026
N2 - Recently, architectures such as Mamba that leverage State Space Models (SSMs) have shown strong potential to rival conventional CNN and Transformer architectures. SSMs are deep sequence models known for their ability to handle long-sequence tasks, efficiently capturing intricate inter-sequence relationships with linear computational overhead. However, previous skip-connection methods have not fully bridged the semantic gap between encoder and decoder features, which may lead to insufficient feature fusion and consequently hinder fine-detail recovery. To address this problem, we propose the Attention Mamba UNet (AM-UNet), which integrates the traditional U-shaped architecture with Visual State Space (VSS) blocks to exploit richer contextual information. We further enhance the architecture by embedding a novel attention module into the skip-connection framework, where dilated convolutions broaden the receptive field with no extra processing burden, while a cross mechanism facilitates optimal feature fusion between the encoder and decoder, mitigating the semantic gap and enabling a more comprehensive understanding of spatial dependencies. Experiments on the ISIC17, ISIC18, and ACDC datasets show that AM-UNet delivers superior results compared to existing methods on medical image segmentation tasks.
AB - Recently, architectures such as Mamba that leverage State Space Models (SSMs) have shown strong potential to rival conventional CNN and Transformer architectures. SSMs are deep sequence models known for their ability to handle long-sequence tasks, efficiently capturing intricate inter-sequence relationships with linear computational overhead. However, previous skip-connection methods have not fully bridged the semantic gap between encoder and decoder features, which may lead to insufficient feature fusion and consequently hinder fine-detail recovery. To address this problem, we propose the Attention Mamba UNet (AM-UNet), which integrates the traditional U-shaped architecture with Visual State Space (VSS) blocks to exploit richer contextual information. We further enhance the architecture by embedding a novel attention module into the skip-connection framework, where dilated convolutions broaden the receptive field with no extra processing burden, while a cross mechanism facilitates optimal feature fusion between the encoder and decoder, mitigating the semantic gap and enabling a more comprehensive understanding of spatial dependencies. Experiments on the ISIC17, ISIC18, and ACDC datasets show that AM-UNet delivers superior results compared to existing methods on medical image segmentation tasks.
KW - Attention
KW - Dilated Convolution
KW - Mamba
KW - Medical Image Segmentation
U2 - 10.1007/978-981-95-6123-0_41
DO - 10.1007/978-981-95-6123-0_41
M3 - Article in proceedings
SN - 978-981-95-6122-3
VL - 16360
T3 - Lecture Notes in Computer Science
SP - 435
EP - 445
BT - Proceedings of the 19th Chinese Conference on Biometric Recognition, CCBR 2025
PB - Springer
T2 - 19th Chinese Conference on Biometric Recognition
Y2 - 21 November 2025 through 23 November 2025
ER -