Nonlinear classification problems can often be reduced to linear classification in some higher-dimensional feature space. Kernel machines such as support vector machines (SVMs) map the input features into such a space implicitly via the kernel trick and use all of the mapped features for classification. This paper proposes an explicit polynomial feature expansion combined with a feature selection method that eliminates 'useless' expanded features before a linear classifier is designed. The method is almost automatic and easier to apply than selecting kernel functions and parameters for kernel machines. In our experiments, it achieves good generalization on most of the benchmark datasets, making it a candidate method for solving general pattern recognition problems.
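The pipeline described above — explicit polynomial expansion, selection of informative expanded features, then a linear classifier — can be sketched as follows. This is an illustrative sketch using scikit-learn components, not the paper's exact algorithm; the degree, the number of retained features, and the univariate selection score are placeholder choices.

```python
# Sketch of the abstract's pipeline: explicit polynomial feature expansion,
# feature selection to drop 'useless' expanded features, then a linear
# classifier. All hyperparameters here (degree=3, k=5) are illustrative.
from sklearn.datasets import make_moons
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# A toy nonlinear (non-linearly-separable) problem.
X, y = make_moons(n_samples=400, noise=0.2, random_state=0)

clf = make_pipeline(
    PolynomialFeatures(degree=3, include_bias=False),  # explicit expansion
    StandardScaler(),                                  # put features on a common scale
    SelectKBest(f_classif, k=5),                       # keep only informative features
    LogisticRegression(max_iter=1000),                 # linear classifier in expanded space
)
score = cross_val_score(clf, X, y, cv=5).mean()
print(f"cross-validated accuracy: {score:.2f}")
```

Unlike a kernel SVM, every retained feature here is an explicit monomial of the inputs, so the resulting linear model remains inspectable.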