Feature selection is a technique for selecting, from the full feature set, the subset of relevant features that contributes most to the accuracy of a classifier. It has been an active research field for the past ten to fifteen years. Because of the large volume of data and the high computational complexity caused by its high dimensionality, processing on a single machine takes too long. To address this shortcoming, this paper proposes a feature selection method based on a Spark-based parallel binary gray wolf optimization algorithm (SPBGWO). The method offers higher search performance, mitigates the algorithm's tendency to fall into local optima, and improves its convergence efficiency. Combining the improved search capability of the modified gray wolf algorithm with the computing power of the Spark platform reduces the cost of handling large data volumes and heavy computation, making the algorithm simpler, faster, better able to solve problems, and less demanding of storage.