High display resolutions and frame rates have advanced the demand for visual quality. However, both also impose a tremendous memory bandwidth requirement on a video coding system. In this study, an efficient lossless embedded compression (EC) algorithm is proposed to save memory bandwidth while preserving visual quality. The proposed lossless EC algorithm incorporates three core techniques: tree partition, half-pixel prediction and group-based binary coding. Tree partition classifies the pixels of a 16 × 8 block into Trunk, Branch and Leaf. Based on this partition, half-pixel prediction produces separate residues for Trunk, Branch and Leaf. Group-based binary coding then converts these residues into efficient codewords. The lossless compression ratio (CR) of the proposed EC is 2.24 on average, which reduces memory bandwidth by 55.4%. The EC algorithm is implemented in 0.18 μm CMOS technology. The maximum throughput reaches 6.4 Gpixels/s, which can accommodate 3840 × 2160 video at 60 fps. The experimental results demonstrate that the proposed design achieves better hardware efficiency, at 337 Gpixels/J and 83.5 Kpixels/s/gate.
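
The headline figures can be cross-checked with simple arithmetic. The sketch below is not part of the proposed design; it assumes that the bandwidth saving follows directly from the average CR as 1 - 1/CR, and that the display pixel rate is width × height × frame rate. The function names are hypothetical and chosen only for illustration.

```python
# Illustrative arithmetic only; function names are hypothetical.

def bandwidth_saving(cr: float) -> float:
    """Fraction of memory traffic removed, assuming saving = 1 - 1/CR."""
    return 1.0 - 1.0 / cr


def pixel_rate(width: int, height: int, fps: int) -> int:
    """Display pixel rate in pixels per second."""
    return width * height * fps


if __name__ == "__main__":
    saving = bandwidth_saving(2.24)       # average CR reported in the abstract
    rate = pixel_rate(3840, 2160, 60)     # 3840 x 2160 at 60 fps
    headroom = 6.4e9 / rate               # reported maximum throughput / display rate
    print(f"bandwidth saving: {saving * 100:.1f}%")
    print(f"display pixel rate: {rate / 1e9:.3f} Gpixels/s")
    print(f"throughput headroom: {headroom:.1f}x")
```

Under these assumptions, a CR of 2.24 corresponds to the reported saving of about 55.4%, and the 3840 × 2160 at 60 fps display rate is roughly 0.5 Gpixels/s. A video codec typically reads and writes each pixel several times per frame (for example, for reference frame access), so the margin between the display rate and the 6.4 Gpixels/s peak is plausibly what absorbs that extra traffic.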