Critical infrastructure such as the power system faces attacks from diverse threat actors and scenarios, so network attack detection methods must generalize well. Machine learning-based attack detection has become the de facto standard solution in this field. However, existing research has focused primarily on settings where the training and testing distributions are consistent, overlooking how well the models generalize to different distributions. This paper examines two key dimensions: the impact of background traffic and the role of statistical features in the detection process. By embedding identical attack behaviors in different background flows, we reveal significant performance disparities across traffic scenarios (data domains), which highlights the insufficient generalizability of existing methods. By comparing DDoS attacks with XSS attacks, we identify the limitations of relying solely on statistical features to detect different attack behaviors. From this perspective, we advocate a comprehensive approach that considers both features and behaviors: security domain knowledge should be incorporated before machine learning is applied, in order to improve the generalization performance of machine learning-based attack detection in the complex and dynamic network environment of power systems.
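The background-traffic effect can be illustrated with a toy sketch. It is not the experimental setup of this paper: the single packet-rate feature, the traffic rates, and the helper names `make_domain`, `fit_threshold`, and `accuracy` are all assumptions chosen for illustration. The same attack increment is embedded in two synthetic background domains, and a detector fitted on one domain is then scored on both.

```python
import random
from statistics import mean

def make_domain(bg_rate, n=200, attack_extra=50.0, seed=0):
    """Synthesize (packet_rate, label) pairs for one traffic domain.

    The attack behavior is identical in every domain: it adds
    `attack_extra` packets/s on top of the background rate.
    All numbers here are illustrative assumptions.
    """
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2  # alternate benign (0) and attack (1) flows
        rate = rng.gauss(bg_rate, 5.0) + (attack_extra if label else 0.0)
        data.append((rate, label))
    return data

def fit_threshold(train):
    """One-feature statistical detector: midpoint between the class means."""
    benign = mean(x for x, y in train if y == 0)
    attack = mean(x for x, y in train if y == 1)
    return (benign + attack) / 2.0

def accuracy(threshold, data):
    """Fraction of flows whose thresholded rate matches the label."""
    return sum((x > threshold) == bool(y) for x, y in data) / len(data)

# Same attack, two different background-traffic domains.
domain_a = make_domain(bg_rate=20.0, seed=1)   # quiet background
domain_b = make_domain(bg_rate=120.0, seed=2)  # busy background

threshold = fit_threshold(domain_a)            # train on domain A only
acc_in = accuracy(threshold, domain_a)         # in-domain evaluation
acc_cross = accuracy(threshold, domain_b)      # cross-domain evaluation

print(f"in-domain accuracy:    {acc_in:.2f}")
print(f"cross-domain accuracy: {acc_cross:.2f}")
```

In this sketch the threshold learned on the quiet domain sits well below every flow in the busy domain, so benign traffic there is flagged as an attack and cross-domain accuracy falls to chance level, even though the attack behavior itself is unchanged.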