Ridgelet can theoretically approximate low-level image features, and the ridgelet filter is constructed independently of the training sample, but it usually requires to preset a lot of parameters to achieve an ideal representation of complex scenes. Convolutional neural networks (CNNs) can adaptively exploit the high-level image features, but the training process depends on the selection of training samples. Hence, in this paper, the advantages of both the two aspects are fully considered for image feature extraction and a multi-resolution CNNs framework (MRCNNs) is proposed by fusing high-and-low-level features via ridgelet and CNNs. In the proposed method, multi-resolution low-level features are captured by the constructed ridgelet filters, and multi-resolution high-level features are exploited by the trainable convolutional filters. Then the ridgelet features and convolutional features are fused to reduce the dependence of CNNs on training samples, and consequently, improve the classification ability of CNNs. The proposed MRCNNs method is conducted on three very high-resolution (VHR) remote sensing images and compared with several deep-learning approaches. Experimental results present the effectiveness of the proposed method for VHR images classification.