Hyper-Connected Transformer Network for Co-Learning Multi-Modality PET-CT Features
- Authors
- Bi, Lei; Fu, Xiaohang; Liu, Qiufang; Song, Shaoli; Feng, David Dagan; Fulham, Michael; Kim, Jinman
- Subject
- FOS: Computer and information sciences
- Computer Vision and Pattern Recognition (cs.CV)
- Image and Video Processing (eess.IV)
- FOS: Electrical engineering, electronic engineering, information engineering
- Computer Science - Computer Vision and Pattern Recognition
- Electrical Engineering and Systems Science - Image and Video Processing
- Language
- English
[18F]-Fluorodeoxyglucose (FDG) positron emission tomography-computed tomography (PET-CT) has become the imaging modality of choice for diagnosing many cancers. Co-learning complementary PET-CT imaging features is a fundamental requirement for automatic tumor segmentation and for developing computer-aided cancer diagnosis systems. We propose a hyper-connected transformer (HCT) network that integrates a transformer network (TN) with hyper-connected fusion for multi-modality PET-CT images. The TN was leveraged for its ability to model global dependencies in image feature learning, achieved by using image patch embeddings with a self-attention mechanism to capture image-wide contextual information. We extended the single-modality definition of the TN to multiple TN-based branches that separately extract image features. We then introduced hyper-connected fusion to fuse the contextual and complementary image features across the multiple transformers in an iterative manner. Our results with two non-small cell lung cancer and soft-tissue sarcoma datasets show that HCT achieved better segmentation accuracy than state-of-the-art methods. We also show that HCT produces consistent performance across various image fusion strategies and network backbones.
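The abstract describes per-modality transformer branches whose features are fused iteratively. The sketch below illustrates that general idea only: a minimal NumPy toy with scaled dot-product self-attention applied to each modality's patch embeddings, plus a hypothetical averaging-and-feedback fusion rule standing in for the paper's hyper-connected fusion. The function names (`self_attention`, `hyper_connected_fusion`), the fusion rule, and all shapes are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def self_attention(x):
    """Scaled dot-product self-attention over token rows of x (tokens, dim).
    Learned query/key/value projections are omitted (identity assumed)."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

def hyper_connected_fusion(pet_tokens, ct_tokens, n_iters=3):
    """Toy iterative fusion: refine each modality branch with self-attention,
    average the branches (hypothetical cross-modality connection), and feed
    the fused features back into both branches each iteration."""
    pet, ct = pet_tokens, ct_tokens
    for _ in range(n_iters):
        pet = self_attention(pet)
        ct = self_attention(ct)
        fused = 0.5 * (pet + ct)   # assumed fusion rule, not from the paper
        pet = pet + fused          # residual feedback into each branch
        ct = ct + fused
    return fused

rng = np.random.default_rng(0)
pet = rng.standard_normal((16, 32))   # 16 patch embeddings, dimension 32
ct = rng.standard_normal((16, 32))
out = hyper_connected_fusion(pet, ct)
print(out.shape)   # (16, 32): one fused token per patch
```

In practice each branch would be a full transformer encoder with learned projections, and the fusion connections would themselves be learned; the toy above only mirrors the two-branch, iterative-exchange structure the abstract outlines.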
18 Pages