The rapid growth of video data has recently drawn significant attention to video instance segmentation in the field of computer vision. In this work, we propose a method for online multiple object segmentation. The proposed method describes each object by its mask coefficients with respect to a set of generated prototypes. Instead of tracking multiple objects in image or feature space, we address segmentation and tracking directly in the mask coefficient space, which is stable and discriminative for temporal matching. We validate the proposed method experimentally on the DAVIS 2019 dataset.
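
As a rough illustration of the idea, the following minimal sketch shows how an object mask could be assembled from prototypes and a coefficient vector, and how objects in consecutive frames could be associated by comparing their coefficient vectors. It is not the implementation used in this work: the sigmoid-and-threshold assembly, the cosine similarity measure, the greedy one-to-one assignment, and all names (`assemble_mask`, `match_by_coefficients`, `min_similarity`) are assumptions made for this example only.

```python
import math


def assemble_mask(prototypes, coeffs, threshold=0.5):
    """Combine K prototype maps (each H x W) linearly with K coefficients,
    squash with a sigmoid, and threshold to get a binary mask.
    (Assumed assembly rule, chosen for illustration.)"""
    height, width = len(prototypes[0]), len(prototypes[0][0])
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            logit = sum(c * p[y][x] for c, p in zip(coeffs, prototypes))
            prob = 1.0 / (1.0 + math.exp(-logit))
            row.append(prob > threshold)
        mask.append(row)
    return mask


def cosine_similarity(a, b):
    """Cosine similarity between two coefficient vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def match_by_coefficients(prev_coeffs, curr_coeffs, min_similarity=0.5):
    """Greedy one-to-one matching in coefficient space (hypothetical scheme).
    Returns a dict mapping current-frame index -> previous-frame index."""
    pairs = sorted(
        (
            (cosine_similarity(p, c), i, j)
            for i, p in enumerate(prev_coeffs)
            for j, c in enumerate(curr_coeffs)
        ),
        reverse=True,
    )
    used_prev, matches = set(), {}
    for sim, i, j in pairs:
        if sim < min_similarity:
            break  # remaining pairs are even less similar
        if i in used_prev or j in matches:
            continue  # each object is matched at most once
        used_prev.add(i)
        matches[j] = i
    return matches


# Toy data: two 2x2 prototypes that activate on the left and right columns.
prototypes = [
    [[4.0, -4.0], [4.0, -4.0]],
    [[-4.0, 4.0], [-4.0, 4.0]],
]
prev_frame = [[1.0, 0.0], [0.0, 1.0]]      # coefficients of two tracked objects
curr_frame = [[0.1, 0.9], [0.95, 0.05]]    # detections in the next frame

left_mask = assemble_mask(prototypes, prev_frame[0])
matches = match_by_coefficients(prev_frame, curr_frame)
```

In this toy case, current detection 0 is associated with previous object 1 and current detection 1 with previous object 0, because their coefficient vectors point in nearly the same direction. The greedy loop is used only to keep the sketch dependency-free; an optimal assignment solver could be substituted.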