Cascade instance segmentation, inspired by cascade object detection, has recently achieved notable performance. However, because they lack global information, many methods produce incomplete segmentations, such as missing edge regions and discontinuities within instances. To address this problem, we propose an effective and flexible semantic head that extracts enhanced spatial context information. A vision transformer generates global context features, and a convolutional network generates spatial context features. Combining the outputs of the two modules yields enhanced semantic features for segmentation. Extensive experiments show that the enhanced semantic head achieves 40.6% and 42.3% mask AP with the cascade predictors HTC and DSC, surpassing the respective baselines by about 0.9 and 1.4 percentage points. The enhanced semantic head is generic and improves the performance of different cascade predictors.
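To make the two-branch design concrete, the following is a minimal NumPy sketch, not the implementation evaluated in the experiments. It assumes a single-head self-attention layer as a stand-in for the vision transformer, a 3x3 depthwise convolution as a stand-in for the convolutional network, and element-wise addition as the fusion step, since the combination operator is not specified above. All function names and the random weights are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def global_context(feat, rng):
    """Global branch (assumed): single-head self-attention over all positions.

    feat has shape (C, H, W); every position attends to every other position,
    which is what gives this branch an image-wide receptive field.
    """
    c, h, w = feat.shape
    tokens = feat.reshape(c, h * w).T  # (H*W, C), one token per position
    # Hypothetical random projections standing in for learned weights.
    wq, wk, wv = (rng.standard_normal((c, c)) / np.sqrt(c) for _ in range(3))
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    attn = softmax(q @ k.T / np.sqrt(c), axis=-1)  # (H*W, H*W)
    return (attn @ v).T.reshape(c, h, w)


def spatial_context(feat, kernel):
    """Spatial branch (assumed): 3x3 depthwise convolution with zero padding.

    kernel has shape (C, 3, 3); each channel is filtered independently.
    """
    c, h, w = feat.shape
    padded = np.pad(feat, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(feat)
    for dy in range(3):
        for dx in range(3):
            out += kernel[:, dy, dx][:, None, None] * padded[:, dy:dy + h, dx:dx + w]
    return out


def enhanced_semantic_features(feat, seed=0):
    """Fuse both branches; element-wise addition is an assumption of this sketch."""
    rng = np.random.default_rng(seed)
    kernel = rng.standard_normal((feat.shape[0], 3, 3)) / 3.0
    return global_context(feat, rng) + spatial_context(feat, kernel)


if __name__ == "__main__":
    feat = np.random.default_rng(1).standard_normal((8, 6, 5))
    print(enhanced_semantic_features(feat).shape)
```

The fused map keeps the input shape (C, H, W), so under these assumptions it could be passed to a cascade predictor's mask heads in place of a plain semantic feature map. Concatenation followed by a 1x1 convolution would be an equally plausible fusion choice.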