3D NAND flash-based storage devices, i.e., Solid-State Drives (SSDs), are increasingly regarded as promising candidates to lead the flash industry thanks to their rapidly growing density. However, 3D NAND SSDs have relatively high flash command latency, which gives rise to chip-blocking writes and, in turn, to long-tail read latency. Data replication is a viable strategy for increasing data availability, but it adds time overhead to both reads and writes, which aggravates the original chip-blocking write problem. We first reveal that the conventional scheme writes a whole page even when flushing a smaller portion of data, which wastes time. Based on this observation, we propose a novel scheme, RUSM (Replicate Using Subpage Merging), which reclaims the time wasted by the conventional page write operation and uses it to improve the replication mechanism. Through experiments, we show that RUSM mitigates the chip-blocking write problem and improves read performance at low overhead.