Artificial intelligence (AI) models have been successfully applied in areas such as computer vision and intelligent manufacturing, and they have great potential to improve the efficiency and security of power control systems. Before an AI model is selected and deployed in a commercial power control system, all candidate models must be tested and their performance evaluated. However, few studies address the systematic process of testing AI models and evaluating their performance. To this end, this paper presents a novel AI model testing and evaluation framework and introduces the associated performance evaluation metrics and methods. The testing process is then described, with model training, model testing, result analysis, and model evaluation each covered in detail. To illustrate the evaluation methodology, we present a case study on malicious traffic identification that compares the proposed text CNN with benchmark deep learning models in terms of efficiency and accuracy.