Semantic segmentation empowers various real-world applications. However, the substantial computational cost, such as the O(k²) time complexity of multi-headed self-attention with respect to the number of tokens k, poses challenges for deploying these models on edge devices with constrained hardware resources. This paper introduces a novel family of backbones designed for real-time semantic segmentation, referred to as the Linear and Re-parameter Vision Transformer (LARFormer). In particular, we introduce a Re-parameter Mobile Block (RMB), which employs three branches during training that are merged into a single branch at inference. Furthermore, we introduce Linear Separable Self-Attention (LSSA), which reduces the attention complexity from O(k²) to O(k). Extensive experiments on the ADE20K and Pascal VOC 2012 datasets demonstrate the effectiveness of the proposed LARFormer, which achieves a promising trade-off between segmentation accuracy and inference speed.
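The abstract does not give the exact formulation of LSSA, but the O(k²)-to-O(k) reduction it claims is characteristic of separable attention schemes, in which per-token scores produce a single global context vector instead of a full k×k attention map. The sketch below is a minimal, hypothetical illustration of that general idea in NumPy; the function name, weight matrices (Wi, Wk, Wv, Wo), and ReLU choice are assumptions, not the paper's actual design.

```python
import numpy as np

def separable_self_attention(x, Wi, Wk, Wv, Wo):
    """Hypothetical O(k) separable attention over k tokens of dimension d.

    x: (k, d) token matrix; Wi: (d, 1); Wk, Wv, Wo: (d, d).
    No k x k attention matrix is ever formed, so cost is O(k * d) in k.
    """
    scores = x @ Wi                          # (k, 1) one scalar score per token
    scores = np.exp(scores - scores.max())   # numerically stable softmax
    scores = scores / scores.sum()           # softmax over the k tokens
    keys = x @ Wk                            # (k, d) key projections
    context = (scores * keys).sum(axis=0)    # (d,) single global context vector
    values = np.maximum(x @ Wv, 0.0)         # (k, d) value projections with ReLU
    out = values * context                   # broadcast: each token gated by context
    return out @ Wo                          # (k, d) output projection
```

Because the interaction between tokens is mediated only by the (d,)-sized context vector, the cost grows linearly in the token count k rather than quadratically, which is the trade-off such designs make: global information is compressed into one vector instead of pairwise token-to-token scores.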