Background: To date, single biomarkers have had limited utility in guiding RA clinical care. Machine learning algorithms may better identify and stratify RA patients with differential outcomes.

Objectives: To determine whether unsupervised machine learning methods can be employed in a racially and ethnically diverse RA cohort to identify clusters of patients with different disease activity trajectories, as measured by DAS28ESR.

Methods: Data are derived from the longitudinal, observational University of California, San Francisco RA Cohort. Along with routine labs, medications and disease activity assessments, a multi-biomarker disease activity (MBDA) panel was obtained at each visit. The MBDA measures 12 serum biomarkers. Four patient clusters were identified by unsupervised K-prototype clustering after collapsing all observations into a cross-sectional dataset. Plots were created to display longitudinal disease activity trajectories for each cluster. Lasso regression was applied to identify biomarkers associated with DAS28ESR.

Results: We identified 4 distinct clusters in our cohort (Table 1) with visually different disease activity trajectories (Fig. 1). Cluster 1 (n=116) was the oldest (63.6±9.7 years), had the highest proportion of Asian participants (n=73, 63%), and had the most study visits and the longest disease duration. Cluster 2 (n=70) had the highest mean DAS28ESR (5.5±0.7) and the highest mean dose of prednisone (8.6±4.9 mg/day). Cluster 3 had the fewest participants (n=14), the fewest study visits, and the lowest biologic use (28.6%). Cluster 4 was the largest cluster (n=173), with the shortest disease duration (4.9±3.8 years) and the highest biologic use (61.3%). In the Lasso regressions, leptin had significant positive associations with DAS28ESR in the whole group as well as in Clusters 2 and 4. EGF had negative associations with DAS28ESR in the whole group and in Clusters 1 and 4. CRP had positive associations with DAS28ESR in the whole group and in Cluster 1.
YKL40 and VCAM1 were found to have significant associations in Clusters 1 and 3, respectively.

Conclusion: We identified 4 unique clusters of RA patients in a racially and ethnically diverse longitudinal cohort with different disease activity trajectories and biomarkers associated with disease activity. Although additional work is needed to explore longitudinal outcomes in each cluster, the application of machine learning methods may identify unique combinations of patient and disease characteristics influencing RA clinical outcomes.

Table 1. Demographics and clinical characteristics of the RA cohort. Biomarkers significantly associated with DAS28ESR were determined by Lasso regression. Values listed are per standard deviation of each biomarker.

                          Overall (N=373)    Cluster 1 (N=116)  Cluster 2 (N=70)   Cluster 3 (N=14)   Cluster 4 (N=173)
Demographics:
Age                       54.8 ± 3.6         63.6 ± 9.7         50.8 ± 14.9        58.2 ± 15.8        50.3 ± 12.1
Female Sex                318 (85%)          101 (87%)          57 (81%)           11 (79%)           149 (86%)
Race:
  Hispanic/Latino         181 (49%)          22 (19%)           47 (67%)           5 (36%)            107 (62%)
  Asian                   123 (33%)          73 (63%)           8 (11%)            6 (43%)            36 (21%)
  Black                   35 (9%)            12 (10%)           8 (11%)            2 (14%)            13 (8%)
  White & Other           34 (9%)            9 (7%)             7 (10%)            1 (7%)             17 (10%)
Clinical Characteristics:
Rheumatoid Factor         315 (85%)          104 (90%)          56 (80%)           13 (93%)           142 (82%)
ACPA                      297 (80%)          98 (85%)           54 (77%)           12 (86%)           133 (77%)
Disease Duration (years)  7.8 ± 7.6          13.7 ± 9.7         5.4 ± 4.6          6.7 ± 5.7          4.9 ± 3.8
csDMARD                   344 (92%)          108 (93%)          63 (90%)           13 (93%)           160 (93%)
Biologic DMARD            185 (50%)          45 (39%)           30 (43%)           4 (29%)            106 (61%)
Prednisone Dose (mg/day)  6.7 ± 3.8          6.0 ± 4.0          8.6 ± 4.9          5.8 ± 1.4          6.3 ± 2.8
Body Mass Index           28.2 ± 4.5         26.6 ± 3.8         28.7 ± 3.8         28.0 ± 6.2         29.1 ± 4.7
DAS28ESR                  4.2 ± 1.1          4.2 ± 1.0          5.5 ± 0.8          3.9 ± 0.8          3.7 ± 0.9
Lasso Results:
EGF                       -0.16*             -0.41***           --                 --                 -0.20**
Leptin                    0.15**             --                 0.21*              --                 0.21**
C-reactive protein        0.34**             0.51***            --                 --                 --
VCAM1                     --                 --                 --                 -0.73*             --
YKL40                     --                 0.26*              --                 --                 --

EGF: epidermal growth factor; VCAM1: vascular cell adhesion protein 1; YKL40: chitinase-3-like protein 1. *p

Figure 1. DAS28ESR trajectory plots with 95% CIs for the whole cohort and by cluster.

Acknowledgements: This work was supported by the Rheumatology Research Foundation Scientist Development Award.

Disclosure of Interests: None declared
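
Illustrative sketch (not the authors' code): the Methods name K-prototype clustering, which groups records with both numeric and categorical fields. The block below shows the standard K-prototypes dissimilarity, squared Euclidean distance on numeric features plus a weight gamma times the number of categorical mismatches, and the step that assigns a patient to the nearest prototype. The feature choice (standardized age and DAS28ESR, biologic use), the prototype values, and gamma=1.0 are hypothetical assumptions for illustration only; the abstract does not report which variables or weights were used.

```python
from typing import List, Sequence, Tuple

# A record is (numeric features, categorical features). All names here are hypothetical.
Record = Tuple[Sequence[float], Sequence[str]]


def mixed_dissimilarity(a: Record, b: Record, gamma: float = 1.0) -> float:
    """Standard K-prototypes dissimilarity between two mixed-type records."""
    # Numeric part: squared Euclidean distance.
    numeric_part = sum((x - y) ** 2 for x, y in zip(a[0], b[0]))
    # Categorical part: simple matching (count of differing categories).
    categorical_part = sum(1 for x, y in zip(a[1], b[1]) if x != y)
    return numeric_part + gamma * categorical_part


def assign_to_prototype(patient: Record, prototypes: List[Record], gamma: float = 1.0) -> int:
    """Return the index of the prototype closest to the patient."""
    distances = [mixed_dissimilarity(patient, p, gamma) for p in prototypes]
    return distances.index(min(distances))


# Hypothetical prototypes: ([standardized age, standardized DAS28ESR], [biologic use])
prototypes: List[Record] = [
    ([1.0, 0.0], ["no"]),
    ([-0.5, 1.2], ["no"]),
    ([-0.4, -0.5], ["yes"]),
]
patient: Record = ([-0.3, -0.4], ["yes"])

nearest = assign_to_prototype(patient, prototypes)
print(nearest)
```

In a full K-prototypes run this assignment step alternates with a prototype update (means for numeric features, modes for categorical ones) until assignments stop changing; gamma controls how much the categorical fields influence the grouping.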