In human-computer interaction, a moving hand gesture undergoes morphological changes when projected into the two-dimensional video space. In this paper, an AdaBoost classifier is first used to detect the presence of a gesture in a video frame and initialize the tracker, and the k-means algorithm is used to build an accurate color-space model of the hand. The back-projection image is decomposed into a gesture motion region and a color interference region, and the Surendra algorithm is used to model the interference region. The correctness of the tracking is then verified with the Bhattacharyya distance and a perceptual hash coefficient. When the target is lost, the position of the gesture is re-detected using a Gaussian mixture model and a Bayesian skin-color model. Finally, the tracked gesture is recognized by a machine learning algorithm. The proposed method combines feature-space segmentation modeling with a probability density map to overcome interference in motion tracking, and achieves high tracking accuracy.
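The tracking-correctness check mentioned above can be illustrated with a minimal sketch. The code below is not the paper's implementation; it is a NumPy-only illustration of the two similarity measures named in the text, assuming normalized color histograms for the Bhattacharyya distance and a grayscale patch whose sides are cropped to a multiple of the hash size for a simple average-hash variant of the perceptual hash:

```python
import numpy as np

def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two histograms (normalized internally)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    bc = np.sum(np.sqrt(p * q))  # Bhattacharyya coefficient in [0, 1]
    return -np.log(bc) if bc > 0 else np.inf

def average_hash(gray, hash_size=8):
    """Simple perceptual hash: block-average to hash_size x hash_size,
    then threshold each cell against the mean (returns a boolean vector)."""
    h, w = gray.shape
    gray = gray[:h - h % hash_size, :w - w % hash_size]  # crop to multiple
    bh, bw = gray.shape[0] // hash_size, gray.shape[1] // hash_size
    small = gray.reshape(hash_size, bh, hash_size, bw).mean(axis=(1, 3))
    return (small > small.mean()).flatten()

def hash_similarity(h1, h2):
    """1.0 for identical hashes, 0.0 for completely different ones."""
    return 1.0 - np.count_nonzero(h1 != h2) / h1.size

# A tracked candidate would be accepted when the histogram distance is low
# and the hash similarity between template and candidate patch is high.
template_hist = np.array([0.2, 0.3, 0.5])
candidate_hist = np.array([0.2, 0.3, 0.5])
d = bhattacharyya_distance(template_hist, candidate_hist)  # near 0: same model
```

In a real tracker the histograms would come from the k-means color model of the hand region and the hash from the current tracking window, but those components are outside the scope of this sketch.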