More than three hundred million people worldwide suffer from depression, and failure to diagnose and treat this mental health disorder in a timely manner worsens their suffering. Depression is a global problem that affects not only people's emotions but also their physical and mental states. Because early detection is critical for treatment, a depression detector must be both accurate and practical, and the most important and challenging problem is designing an effective and robust detection model. To address this problem, we propose a hybrid deep learning model, RoBERTa-BiLSTM, to extract features from depression text sequences. Sequence models process tokens sequentially and therefore require longer computation time, whereas Transformer models parallelize processing and execute faster. Our model consolidates the strengths of the sequence model and the Transformer model while suppressing the limitations of the sequence model. Specifically, it first maps words into a compact, meaningful word embedding space through the Robustly Optimized BERT approach (RoBERTa), and then effectively captures long-distance contextual semantics using a Bidirectional Long Short-Term Memory (BiLSTM) network. On the DAIC-WOZ and EATD-Corpus benchmarks, our experiments demonstrate that our model outperforms state-of-the-art methods by a substantial margin.
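The two-stage pipeline described above (contextual embeddings fed into a bidirectional recurrent layer) can be sketched in NumPy. This is a minimal illustration, not the authors' implementation: random vectors stand in for the contextual embeddings a real RoBERTa encoder would produce, the dimensions (`D`, `H`, `T`) are hypothetical, and the classifier head is a simple logistic layer over mean-pooled states.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM cell step; gates stacked as [input, forget, output, cell]."""
    H = h.shape[0]
    z = W @ x + U @ h + b
    i, f, o = sigmoid(z[:H]), sigmoid(z[H:2*H]), sigmoid(z[2*H:3*H])
    g = np.tanh(z[3*H:])
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

def lstm_pass(seq, W, U, b, H):
    """Unroll one LSTM direction over the sequence; return all hidden states."""
    h, c = np.zeros(H), np.zeros(H)
    outs = []
    for x in seq:
        h, c = lstm_step(x, h, c, W, U, b)
        outs.append(h)
    return np.stack(outs)

def bilstm(seq, params, H):
    """Run forward and reversed passes, concatenate per-step hidden states."""
    fwd = lstm_pass(seq, *params["fwd"], H)
    bwd = lstm_pass(seq[::-1], *params["bwd"], H)[::-1]
    return np.concatenate([fwd, bwd], axis=1)   # shape (T, 2H)

D, H, T = 768, 64, 12   # hypothetical: RoBERTa-base dim, hidden size, seq length
params = {
    k: (rng.normal(0, 0.1, (4*H, D)), rng.normal(0, 0.1, (4*H, H)), np.zeros(4*H))
    for k in ("fwd", "bwd")
}

# Placeholder for the contextual embeddings a real RoBERTa encoder would emit.
embeddings = rng.normal(size=(T, D))
states = bilstm(embeddings, params, H)          # long-distance context, both ways
pooled = states.mean(axis=0)                    # simple mean pooling over time
w_clf = rng.normal(0, 0.1, 2*H)
prob_depressed = sigmoid(w_clf @ pooled)        # binary depression score
print(states.shape, float(prob_depressed))
```

Running both directions lets the state at each token condition on the full transcript, which is the property the BiLSTM stage contributes on top of the RoBERTa embeddings.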