This brief proposes a split-wordline (WL) 6T SRAM computing-in-memory (CIM) macro for high-signal-margin, high-throughput bit-serial multiply-accumulate (MAC) operations with 8-b inputs, 8-b weights, and 20-b outputs. The proposed architecture improves signal margin with input-flipping WL control that limits the maximum number of simultaneously turned-on WLs, throughput with a dual-bitline (BL) sensing scheme and pipelining, and energy efficiency with an input-aware, trip-point-based ADC and supply voltage adjustment of a near-memory processor (NMP). The macro is fabricated in 65-nm technology and achieves 152.32 bit-wise TOPS/W and 20.69 bit-wise TOPS/mm². In addition, it achieves 91.23% inference accuracy on the CIFAR-10 image classification dataset with the ResNet-20 network, a 0.47% degradation from the software accuracy.
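As an illustration only, the following Python sketch models a bit-serial MAC with one possible form of input flipping. It is a behavioral sketch under stated assumptions, not the circuit described in this brief: the function names, the unsigned operands, the majority-based flip rule, and the choice of 16 accumulated rows (picked only because 16 unsigned 8-b x 8-b products fit in 20 bits) are all hypothetical. The idea shown is that when more than half of the input bits in a bit-plane are 1, the complemented bits can be applied instead and the partial sum recovered from the sum of the weights, so at most half of the rows are active in any cycle.

```python
def bit_serial_mac(inputs, weights, n_bits=8):
    """Reference unsigned bit-serial MAC: one input bit-plane per cycle."""
    acc = 0
    for b in range(n_bits):
        # Partial sum of the weights whose input bit b is 1.
        partial = sum(w for x, w in zip(inputs, weights) if (x >> b) & 1)
        acc += partial << b  # weight the partial sum by the bit significance
    return acc


def bit_serial_mac_flipped(inputs, weights, n_bits=8):
    """Bit-serial MAC with a hypothetical majority-based input-flipping rule.

    Returns (result, max_active), where max_active is the largest number of
    rows driven in any single bit-plane cycle.
    """
    n = len(inputs)
    total_w = sum(weights)  # assumed to be known in advance
    acc = 0
    max_active = 0
    for b in range(n_bits):
        bits = [(x >> b) & 1 for x in inputs]
        flip = sum(bits) > n // 2  # assumption: flip when ones are the majority
        if flip:
            bits = [1 - v for v in bits]
        max_active = max(max_active, sum(bits))
        partial = sum(w for v, w in zip(bits, weights) if v)
        if flip:
            # sum(w * x_b) = sum(w) - sum(w * (1 - x_b))
            partial = total_w - partial
        acc += partial << b
    return acc, max_active


if __name__ == "__main__":
    xs = [200, 255, 17, 128, 99, 250, 3, 64]
    ws = [12, 255, 0, 77, 130, 45, 201, 9]
    result, max_active = bit_serial_mac_flipped(xs, ws)
    print(result, max_active)
```

In this model the flipped variant returns the same result as the reference loop while never driving more than half of the rows in a cycle, which is the property that limits the number of simultaneously turned-on WLs.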