
Abstract

Accurate segmentation of organs-at-risk (OARs) in head and neck CT scans is crucial for radiotherapy planning. The CNN-based decoder of Swin UNETR limits its capacity to integrate information from multiple organ positions, which is essential for accurate medical segmentation. The proposed Transformer Decoder-enhanced Swin UNETR model is designed for multi-organ segmentation on the OpenKBP dataset. Its decoder uses transformer layers with cross-attention to draw on global context when producing segmentation masks. Squeeze-and-excitation (SE) blocks combined with spatial attention mechanisms further strengthen the feature representation, allowing the model to focus on the most relevant image regions. The proposed model achieves an average Dice score of 81.75% and an average HD95 of 2.464, surpassing both the Swin UNETR baseline (54.13% Dice, 5.760 HD95) and nnU-Net (65% Dice, 4.8 HD95). It segments difficult anatomical structures with high precision, including the brainstem (91.50% Dice, 1.600 HD95) and the mandible (94.00% Dice, 1.400 HD95). By segmenting treatment-critical regions more accurately, the enhanced model offers clinicians a tool for safer and more effective radiation therapy.
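For readers unfamiliar with the Dice score reported above, the following is a minimal sketch of how a per-organ and average Dice score could be computed from flattened label maps. It is not the authors' evaluation code: the function names `dice_score` and `mean_dice`, the integer organ labels, and the convention of scoring two empty masks as 1.0 are all assumptions made for illustration.

```python
from typing import Dict, Sequence, Tuple


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice = 2|P ∩ T| / (|P| + |T|) for two flattened binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    # Voxels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground voxels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


def mean_dice(
    pred_labels: Sequence[int],
    target_labels: Sequence[int],
    organ_ids: Sequence[int],
) -> Tuple[Dict[int, float], float]:
    """Per-organ Dice and their unweighted mean for multi-label maps."""
    scores = {}
    for organ in organ_ids:
        # Binarize each label map for the current organ (one-vs-rest).
        p = [1 if v == organ else 0 for v in pred_labels]
        t = [1 if v == organ else 0 for v in target_labels]
        scores[organ] = dice_score(p, t)
    return scores, sum(scores.values()) / len(scores)


# Toy example with hypothetical labels: 0 = background, 1 and 2 = two organs.
pred = [0, 1, 1, 2, 2, 0]
target = [0, 1, 0, 2, 2, 2]
per_organ, average = mean_dice(pred, target, organ_ids=[1, 2])
```

In this toy example, organ 1 overlaps on one voxel out of three foreground voxels in total (Dice 2/3) and organ 2 on two out of five (Dice 0.8), so the unweighted average is about 0.733. Reported percentages such as 81.75% correspond to this value multiplied by 100.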

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.
