Trichomoniasis is a common sexually transmitted disease caused by Trichomonas vaginalis (TV), and automatic TV detection is an important problem in video object detection. However, existing algorithms cannot efficiently identify and localize TV in microscope camera footage; defocus, motion blur, resolution, and computational efficiency remain the major challenges. To bridge this gap, we propose to learn the invariant characteristics of the moving TV by capturing optical flow. To exploit this motion information, we introduce OF-YOLO, a general-purpose framework for capturing motion features. We evaluate it on a dataset of 1278 Trichomonas video clips comprising 51336 frames. Experimental results show that OF-YOLO significantly boosts detection performance in real-world scenes.