Attention Deficit Hyperactivity Disorder (ADHD) is a prevalent childhood mental disorder characterized mainly by inattention, hyperactivity, and impulsiveness. ADHD is not self-limiting: its symptoms may not resolve on their own as the child grows. It is therefore crucial to study diagnostic methods and to provide patients with early intervention that keeps symptoms under control. Interactive diagnostic tasks are currently regarded as a feasible means of quantitative evaluation. In this paper, we present a novel approach to ADHD diagnosis that applies a multi-stream graph-based model to data collected during the Schulte Grid task. Specifically, we propose an interactive scene-driven graph construction method that builds scene graphs from spatio-temporal constraint relations, so that unstructured information in the interactive scene can be modeled effectively. To address the challenge of limited ADHD motion data, we extract both coarse-grained statistical features from the time-frequency domain and fine-grained features that capture local variation in the time series. Moreover, we use lightweight multi-stream graph neural networks to model the entire testing process from interactive scene information and motion sensor data, and then fuse the streams for ADHD subtype classification. We conducted comparison and ablation experiments on a real-world dataset to evaluate the effectiveness of the proposed model. The results show that our approach outperforms the baseline models, demonstrating its potential to improve ADHD diagnosis.
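To make the distinction between coarse-grained and fine-grained features concrete, the following is a minimal sketch for a single motion-sensor channel. It is not the feature set used in this work: the abstract does not specify the statistics, so the choice of mean, standard deviation, and dominant frequency as coarse features, the windowed mean absolute difference as the fine feature, and the names `coarse_features`, `fine_features`, `window`, and `stride` are all illustrative assumptions.

```python
import cmath
import math
import statistics


def coarse_features(signal, sample_rate_hz):
    """Whole-recording statistics (hypothetical choice of features).

    Assumes len(signal) >= 2 so that at least one frequency bin exists.
    """
    n = len(signal)
    mean = statistics.fmean(signal)
    centered = [x - mean for x in signal]
    # Naive O(n^2) DFT magnitudes for bins 1..n//2; adequate for short signals.
    magnitudes = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        magnitudes.append(abs(coeff))
    # Bin index of the largest spectral magnitude (bins start at 1).
    peak_bin = 1 + max(range(len(magnitudes)), key=magnitudes.__getitem__)
    return {
        "mean": mean,
        "std": statistics.pstdev(signal),
        "dominant_freq_hz": peak_bin * sample_rate_hz / n,
    }


def fine_features(signal, window, stride):
    """Mean absolute first difference per sliding window (assumed feature).

    Assumes window >= 2 so that every window yields at least one difference.
    """
    features = []
    for start in range(0, len(signal) - window + 1, stride):
        chunk = signal[start:start + window]
        diffs = [abs(b - a) for a, b in zip(chunk, chunk[1:])]
        features.append(statistics.fmean(diffs))
    return features
```

Under these assumptions, `coarse_features` summarizes the whole recording in one small dictionary, while `fine_features` returns one value per window and so preserves where in the test the movement changed most.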