Motivated by commodity detection in retail, this paper studies the problem of generating object bounding boxes in retail images. Existing object detection algorithms are poorly suited to this setting because commodities in intelligent retail containers are densely packed. We therefore present a unified object detection framework for dense scenes in retail images. Built on YOLOv3, it consists of a hierarchical labeling pattern, a Similarity Recognition Network (SRN) subnetwork, and an optimized non-maximum suppression (NMS) algorithm. We build three datasets to verify the effectiveness of the proposed framework: two are used to train the object detection network, and the third is used to train the SRN subnetwork. Experimental results show that the proposed framework significantly improves detection in dense retail scenes compared with traditional object detection algorithms.
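To illustrate why dense scenes are hard for standard detectors, the following is a minimal sketch of the conventional greedy NMS step that detectors such as YOLOv3 typically apply to their raw predictions. It is not the optimized NMS proposed in this paper, whose details are not given here. The function names `iou` and `greedy_nms`, the `[x1, y1, x2, y2]` box format, and the example boxes, scores, and thresholds are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as [x1, y1, x2, y2]."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Standard greedy NMS (baseline, not the paper's optimized variant).

    Repeatedly keeps the highest-scoring remaining box and discards every
    other box whose IoU with it exceeds iou_threshold.
    Returns the indices of the kept boxes in descending score order.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


# Hypothetical example: boxes 0 and 1 are two distinct, tightly packed items
# whose boxes overlap heavily; box 2 is an isolated item.
boxes = [[0, 0, 10, 10], [3, 0, 13, 10], [20, 0, 30, 10]]
scores = [0.9, 0.8, 0.7]

kept_default = greedy_nms(boxes, scores, iou_threshold=0.5)
kept_relaxed = greedy_nms(boxes, scores, iou_threshold=0.6)
```

In this example the second item is suppressed at the 0.5 threshold even though it is a separate object, because its box overlaps its neighbor's. Raising the threshold recovers it, but would also admit more duplicate detections in general. This trade-off is one reason a plain NMS step can struggle when items are densely packed.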