Privacy preservation is a sensitive and important issue in this ever-growing and highly connected digital era. Functional encryption is a paradigm for computing on encrypted data: it allows users to retrieve the evaluation of a function on encrypted data without revealing the data itself, effectively protecting users' privacy. However, existing functional encryption implementations are still too slow for practical deployment, especially in machine learning applications that involve huge amounts of data. In this article, we present a high-performance implementation of inner-product functional encryption (IPFE) based on ring learning with errors (RLWE) on graphics processing units (GPUs). We systematically investigate strategies for implementing the number theoretic transform (NTT), the most time-consuming operation in the IPFE scheme, and select the best one for each security level. We further propose novel techniques to parallelize Gaussian sampling. Compared to the existing AVX2 implementation, our implementation on an RTX 2060 GPU achieves $34.24\times$, $40.02\times$, $156.30\times$ and $18.76\times$ speed-ups for Setup, Encrypt, KeyGen and Decrypt, respectively. Finally, we propose a fast privacy-preserving support vector machine (SVM) that classifies data securely using our GPU-accelerated IPFE scheme. On average, our implementation classifies one input with 591 support vectors in 688 ms ($< 1$ second), which is $33.12\times$ faster than the AVX2 version.
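To make concrete why the number theoretic transform dominates the cost of RLWE-based schemes, the following minimal CPU sketch shows polynomial multiplication in $\mathbb{Z}_q[x]/(x^n+1)$ carried out through an NTT and checked against schoolbook multiplication. It is an illustration only, not the GPU implementation described here: the toy parameters ($q = 17$, $n = 4$), the quadratic-time transform, and all function names are assumptions chosen for readability.

```python
def find_psi(n, q):
    """Return a primitive 2n-th root of unity mod q (n must be a power of two)."""
    for x in range(2, q):
        # For n a power of two, x^n = -1 (mod q) implies x has order exactly 2n.
        if pow(x, n, q) == q - 1:
            return x
    raise ValueError("no primitive 2n-th root of unity exists for these parameters")


def ntt(a, omega, q):
    """Naive O(n^2) transform; real implementations use O(n log n) butterflies."""
    n = len(a)
    return [sum(a[j] * pow(omega, i * j, q) for j in range(n)) % q
            for i in range(n)]


def negacyclic_mul(a, b, q):
    """Multiply a and b in Z_q[x]/(x^n + 1) using the NTT."""
    n = len(a)
    psi = find_psi(n, q)
    omega = psi * psi % q          # primitive n-th root of unity
    psi_inv = pow(psi, -1, q)      # modular inverse (Python 3.8+)
    n_inv = pow(n, -1, q)
    # Pre-scale by powers of psi so a cyclic convolution becomes negacyclic.
    a_t = [a[i] * pow(psi, i, q) % q for i in range(n)]
    b_t = [b[i] * pow(psi, i, q) % q for i in range(n)]
    # Pointwise product in the transform domain.
    c_hat = [x * y % q for x, y in zip(ntt(a_t, omega, q), ntt(b_t, omega, q))]
    # Inverse transform, then undo the psi scaling.
    c_t = ntt(c_hat, pow(omega, -1, q), q)
    return [c_t[i] * n_inv * pow(psi_inv, i, q) % q for i in range(n)]


def schoolbook_mul(a, b, q):
    """Reference O(n^2) multiplication with the reduction x^n = -1."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            k = i + j
            if k < n:
                c[k] = (c[k] + a[i] * b[j]) % q
            else:
                c[k - n] = (c[k - n] - a[i] * b[j]) % q
    return c


if __name__ == "__main__":
    q = 17
    a, b = [1, 2, 3, 4], [5, 6, 7, 8]
    print(negacyclic_mul(a, b, q) == schoolbook_mul(a, b, q))
```

Each multiplication needs two forward transforms and one inverse transform, so any speed-up of the transform carries over directly to operations built on polynomial multiplication.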