Simulating the shifting character of visual attention, we propose a novel concept of hierarchical saliency and develop a detection framework for it. First, a given image is over-segmented into a coarse layer and a fine layer, which correspond to superpixels at two scales. Then, we estimate the saliency maps from coarse to fine. In the coarse layer, we present a new self-adaptive algorithm to construct the superpixel graph and employ the manifold ranking approach to optimize it. In the fine layer, sparse reconstruction is used to obtain the salient regions. Finally, we propose a Restricted Voting Strategy (RVS) to fuse the saliency maps of the two layers into one hierarchical saliency map. Unlike prior methods, the targets in the final map are labeled layer-wise. The final result can be applied directly to higher-level computer vision tasks in various situations. To meet the need for hierarchical saliency evaluation, we construct the CAS-HAS dataset. We exhaustively evaluate the framework on the proposed dataset and three benchmark datasets. The experimental performance is comparable with that of state-of-the-art approaches.
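
For readers unfamiliar with the manifold ranking step mentioned for the coarse layer, the following is a minimal sketch of the commonly used closed-form ranking on a graph, f* = (D - alpha * W)^{-1} y. It is not the self-adaptive graph construction proposed here: the affinity matrix `W`, the query vector `y`, the function name `manifold_ranking`, and the value of `alpha` are all assumptions made for illustration.

```python
import numpy as np


def manifold_ranking(W, y, alpha=0.99):
    """Rank graph nodes by relevance to the query nodes marked in y.

    W     : (n, n) symmetric, non-negative affinity matrix between superpixels
            (hypothetical input; every node is assumed to have a neighbor).
    y     : (n,) indicator vector, 1 for query nodes and 0 otherwise.
    alpha : trade-off between smoothness and fitting (assumed value).
    """
    W = np.asarray(W, dtype=float)
    y = np.asarray(y, dtype=float)
    # Degree matrix: row sums of the affinity matrix on the diagonal.
    D = np.diag(W.sum(axis=1))
    # Unnormalized closed form: solve (D - alpha * W) f = y for f.
    return np.linalg.solve(D - alpha * W, y)


# Toy example: four superpixels connected in a chain, node 0 is the query.
W_chain = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
scores = manifold_ranking(W_chain, np.array([1, 0, 0, 0]))
```

In this toy chain the ranking score decays with graph distance from the query node, which is the behavior a saliency method relies on when it propagates seed labels across a superpixel graph.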