Accurate diagnosis plays an important role in clinical decision-making. To reduce diagnostic errors, many researchers have developed models that assist physicians by predicting the most probable diagnoses during a patient’s visit. In this paper, we focus on developing models for clinical diagnostic decision support using a real-world Electronic Health Record (EHR) dataset with 592,715 patient visits. We propose a novel diagnosis prediction framework based on Bidirectional Encoder Representations from Transformers (BERT) that classifies diseases from textual clinical notes and age information in EHR data. It differs from previously proposed models in how the input representation is constructed from four special embeddings and in the composition of the classification layer. We conduct experiments on a multi-class classification task over 1,987 diagnosis codes. The experimental results demonstrate that our models outperform baselines trained with advanced text classification methods. We also investigate how different types of clinical text and age information influence model performance.