To combine the computing-in-memory (CIM) architecture with magnetoresistive random-access memory (MRAM), a resistance-sum-column architecture based on spin-transfer torque MRAM has been proposed to alleviate the low-resistance problem of 2-terminal MRAM in CIM. In this paper, we propose a resistance-sum-row architecture with a hybrid reference, based on high-performance 3-terminal voltage-controlled spin-orbit torque MRAM (VC-SOT-MRAM). The new architecture allows one-step parallel XNOR operation, thus enabling ultra-high integration and throughput. The hybrid reference, which combines a self-reference and an auxiliary reference, is shown to effectively overcome read errors. Our scheme is evaluated on a 28 nm technology node, showing extremely low latency (0.9 ns/write,