Depression is a serious mental disorder that affects millions of people worldwide, and predicting its severity at an early stage is important. This paper focuses on predicting the degree of depression from audio signals, rather than only judging whether a speaker is depressed. To incorporate dynamic temporal information from the frequency domain, we propose the Audio Delta Ternary Patterns (ADTP) algorithm, which operates in the spectrogram feature space. Moreover, we design an integrated multi-stream model that uses joint tuning layers to encode temporal movement features together with high-level spectral and MFCC features, and predicts Beck Depression Inventory-II (BDI-II) scores from speech signals. Experiments on the AVEC2014 dataset show that our method outperforms several previous methods in predicting depression scores from audio data.