When the power grid is in an abnormal or faulty state, the substation monitoring system generates a large volume of alarm information, much of which cannot be exploited in depth. To address this problem, this paper proposes a data preprocessing method based on a semantic framework of alarm information, which provides data support for power grid fault diagnosis. First, a semantic frame is constructed and its semantic slots are filled using the Horspool string-matching algorithm, which enables automatic classification and extraction of alarm information. Second, the automatically classified and extracted alarm information is fed into the power grid fault diagnosis program, which addresses the data preprocessing stage of a fault diagnosis architecture based on data mining technology. Finally, the reliability of the proposed method is verified through a case study.
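To illustrate the slot-filling idea, the following is a minimal sketch of Horspool matching applied to one alarm message. The slot names, the keyword vocabulary, and the helper `fill_slots` are hypothetical and chosen only for illustration. They are not taken from the method evaluated in this paper.

```python
def horspool_find(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    if m > n:
        return -1
    # Bad-character shift table built from every pattern character except the last.
    # Later (rightmost) occurrences overwrite earlier ones, giving the smallest safe shift.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            return pos
        # Shift by the text character aligned with the pattern's last position.
        pos += shift.get(text[pos + m - 1], m)
    return -1


# Hypothetical semantic frame: each slot has an assumed keyword vocabulary.
SLOT_KEYWORDS = {
    "device": ["breaker", "transformer", "busbar"],
    "event": ["trip", "overcurrent", "reclose"],
}


def fill_slots(alarm):
    """Fill each slot with the first vocabulary keyword found in the alarm text."""
    text = alarm.lower()
    frame = {}
    for slot, keywords in SLOT_KEYWORDS.items():
        frame[slot] = next(
            (kw for kw in keywords if horspool_find(text, kw) != -1), None
        )
    return frame
```

In this sketch, an alarm whose text contains none of a slot's keywords leaves that slot as `None`, so unclassified messages can be separated from those passed on to fault diagnosis.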