This paper presents a fast, low-cost context-adaptive binary arithmetic coding (CABAC) encoder for the H.264/MPEG-4 AVC video coding standard, obtained through optimizations at both the algorithm level and the architecture level. First, at the algorithm level, we process binarization and context generation in parallel, which reduces each encoding iteration from five cycles in the previous design to three or four cycles. Second, at the architecture level, we shorten the renormalization loop by employing one-skipping and bit-parallelism, and we reduce the hardware cost of the arithmetic coder by merging its three different modes. The implemented design achieves an operating frequency of 333 MHz with a gate count of only 13.3K.
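To make the idea behind a bit-parallel renormalization concrete, the following is a minimal software sketch and not the architecture proposed in this paper. It assumes the 9-bit range register and the renormalization threshold of 256 used by the H.264 CABAC encoder, and it shows only how the number of left shifts can be found in one step instead of one shift per loop iteration. The function names are hypothetical, and one-skipping and the output-bit handling of the low register are not modeled.

```python
RANGE_BITS = 9                             # assumed width of the CABAC range register
RENORM_THRESHOLD = 1 << (RANGE_BITS - 1)   # 256: range must be >= this after renormalization


def _check_range(rng: int) -> None:
    """Reject values that cannot occur in a 9-bit, non-zero range register."""
    if not 0 < rng < (1 << RANGE_BITS):
        raise ValueError("range must be in 1..511")


def renorm_shift_iterative(rng: int) -> int:
    """Count shifts the sequential way: one left shift per loop iteration."""
    _check_range(rng)
    shifts = 0
    while rng < RENORM_THRESHOLD:
        rng <<= 1
        shifts += 1
    return shifts


def renorm_shift_parallel(rng: int) -> int:
    """Compute the same shift count in one step from the leading-one position.

    In hardware this corresponds to a leading-zero detector feeding a
    barrel shifter (illustrative assumption, not taken from the paper).
    """
    _check_range(rng)
    return max(0, RANGE_BITS - rng.bit_length())
```

Both functions return the same shift count for every valid range value; the second replaces a data-dependent loop with a single priority-encoding step, which is the general motivation for a bit-parallel renormalizer.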