Fault-tolerant quantum computing based on the surface code has emerged as an attractive candidate for achieving robust noise resistance in practical large-scale quantum computers. To achieve universality, magic state preparation is a commonly used approach for introducing non-Clifford gates. Here, we present a hardware-efficient and scalable protocol for arbitrary logical state preparation in the rotated surface code, and further experimentally implement it on the \textit{Zuchongzhi} 2.1 superconducting quantum processor. An average logical fidelity of \hhl{$0.8983 \pm 0.0002$} across different logical states at distance three is achieved, \hhl{taking into account both state preparation and measurement errors.} In particular, \hhl{the magic states $|A^{\pi/4}\rangle_L$, $|H\rangle_L$, and $|T\rangle_L$ are prepared non-destructively with logical fidelities of $0.8771 \pm 0.0009$, $0.9090 \pm 0.0009$, and $0.8890 \pm 0.0010$, respectively, which are higher than the state distillation protocol thresholds of 0.859 (for H-type magic states) and 0.827 (for T-type magic states).} Our work provides a viable and efficient avenue for generating high-fidelity raw logical magic states, which is essential for realizing non-Clifford logical gates in the surface code.
Comment: In this version, we do not employ readout error mitigation to remove measurement errors (in the previous version, we used a readout transition matrix to mitigate the measurement error). We believe this provides a more predictive assessment of the actual fidelity when generating and consuming magic states for a non-Clifford gate, since consuming the state involves measurement.