Visual SLAM has made significant progress in recent years and is widely used in mobile robots, autonomous driving, and AR/VR applications, all of which interact closely with their environment. However, most SLAM algorithms rely on the static-world assumption, which hinders the adoption of visual SLAM in natural dynamic environments. This paper proposes a visual odometry method for dynamic environments that combines instance-level segmentation with multi-object tracking and uses a new tracking strategy. Using the instance segmentation, the strategy divides each image into a static region and potentially dynamic regions. The static region is tracked by minimizing the reprojection error, while the potentially dynamic regions are tracked by minimizing the photometric error. Our experimental results show that, in dynamic environments, the proposed algorithm accurately estimates both the camera pose and the motion of moving objects. Compared with state-of-the-art algorithms, the proposed algorithm has much lower complexity while achieving similar accuracy.
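To make the two error terms in the tracking strategy concrete, the following is a minimal sketch of how a reprojection residual and a photometric residual might be computed. It is not the paper's implementation: the pinhole camera model, the nearest-pixel intensity lookup, and all function and parameter names (`project`, `reprojection_residuals`, `photometric_residuals`, `K`, `R`, `t`) are assumptions made for illustration.

```python
import numpy as np

def project(K, R, t, points_3d):
    """Project Nx3 world points to pixel coordinates (assumed pinhole model)."""
    cam = points_3d @ R.T + t          # world frame -> camera frame
    uv = cam @ K.T                     # apply camera intrinsics
    return uv[:, :2] / uv[:, 2:3]      # perspective division

def reprojection_residuals(K, R, t, points_3d, observed_px):
    """Geometric residual per static feature: projected minus observed pixel."""
    return project(K, R, t, points_3d) - observed_px

def photometric_residuals(ref_img, cur_img, ref_px, cur_px):
    """Intensity residual per pixel of a potentially dynamic region.

    Pixel coordinates are (x, y) and are rounded to the nearest integer;
    this sketch does no sub-pixel interpolation or bounds checking.
    """
    r = np.rint(ref_px).astype(int)
    c = np.rint(cur_px).astype(int)
    cur_vals = cur_img[c[:, 1], c[:, 0]].astype(float)
    ref_vals = ref_img[r[:, 1], r[:, 0]].astype(float)
    return cur_vals - ref_vals
```

In a sketch like this, a pose estimator would minimize the sum of squared reprojection residuals over the camera pose (`R`, `t`) for features in the static region, and the sum of squared photometric residuals over each object's motion for pixels inside its segmentation mask.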