Pedestrian attribute recognition aims to predict the presence of each attribute in a predefined set from a given image, and it plays an important role in video surveillance applications. Most existing works model the task as a multi-label classification problem. Although effective, they ignore the correlations among attributes. In this work, to learn multiple attributes jointly, the attributes are modeled as a subspace, and a dictionary is introduced to represent that subspace. Furthermore, to extract convolutional features that are better suited to attribute prediction, the dictionary is modeled as a network layer that is learned jointly with the convolutional network. Finally, a novel learning algorithm is proposed to optimize the dictionary and the convolutional network cooperatively. Extensive experimental analyses and evaluations on the two largest pedestrian attribute benchmarks, PETA and PA-100K, demonstrate that the proposed method achieves state-of-the-art performance.
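
To make the idea of a dictionary modeled as a network layer more concrete, the following is a minimal sketch of one possible reading, not the algorithm proposed in this work. It assumes that a convolutional feature vector is first mapped to coefficients over a small set of dictionary atoms, and that the atoms are then combined into one logit per attribute, so that all attributes share the same low-dimensional subspace. The class name `DictionaryLayer`, the parameters `feature_dim`, `num_atoms`, and `num_attributes`, the random initialization, and the sigmoid output are all illustrative assumptions; training is omitted entirely.

```python
import math
import random


def sigmoid(x):
    """Logistic function mapping a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix, stored as a list of rows, by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


class DictionaryLayer:
    """Hypothetical layer (an assumption, not the paper's definition).

    Maps a feature vector to attribute probabilities through a dictionary
    of `num_atoms` atoms shared by all attributes.
    """

    def __init__(self, feature_dim, num_atoms, num_attributes, seed=0):
        rng = random.Random(seed)
        # Encoder: feature vector -> coefficients over the atoms.
        # Shape: num_atoms x feature_dim.
        self.encoder = [
            [rng.gauss(0.0, 0.1) for _ in range(feature_dim)]
            for _ in range(num_atoms)
        ]
        # Dictionary: each column is one atom in attribute space.
        # Shape: num_attributes x num_atoms.
        self.dictionary = [
            [rng.gauss(0.0, 0.1) for _ in range(num_atoms)]
            for _ in range(num_attributes)
        ]

    def forward(self, features):
        # Coefficients of the input over the dictionary atoms.
        codes = matvec(self.encoder, features)
        # Every attribute logit is a combination of the same atoms,
        # which is how correlated attributes can share structure.
        logits = matvec(self.dictionary, codes)
        return [sigmoid(z) for z in logits]


# Usage: 8-dimensional features, 3 atoms, 5 attributes (arbitrary sizes).
layer = DictionaryLayer(feature_dim=8, num_atoms=3, num_attributes=5)
probabilities = layer.forward([0.5] * 8)
```

Because the number of atoms is smaller than the number of attributes in this sketch, the attribute logits are confined to a subspace spanned by the atoms, in contrast to a multi-label classifier that scores each attribute independently.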