Kubernetes, an open-source project initiated by Google for managing and orchestrating containers on cloud platforms, has become the preferred choice for deploying large-scale containerized microservice architectures. However, challenges remain in allocating hardware resources and in load balancing external access, both of which must achieve high reliability with low power consumption. This paper introduces the native scheduling and load balancing strategies in Kubernetes and highlights their current shortcomings. It then presents four existing optimization and improvement strategies that address these shortcomings, along with related research. Finally, the paper proposes research directions and development suggestions for large-scale cluster applications and diversified business expansion.