Context Metformin is the first-line drug for treating diabetes but has a high failure rate.

Objective To identify demographic and clinical factors available in the electronic health record (EHR) that predict metformin failure.

Methods We identified a cohort of patients with at least one abnormal diabetes screening test who initiated metformin at three sites (Arizona, Mississippi, and Minnesota). The cohort comprised 22,047 metformin initiators (48% female, mean age 57 ± 14 years), including 2,141 African Americans, 440 Asians, 962 Other/Multi-racial individuals, 1,539 Hispanics, and 16,764 Non-Hispanic whites. We defined metformin failure as either the lack of a target hemoglobin A1c (

Results In this large, diverse population, we observed a high rate of metformin failure (33%). The XGBoost model that included baseline hemoglobin A1c, age, sex, and race/ethnicity achieved high discrimination (C-index 0.731; 95% CI 0.722, 0.740) for risk of metformin failure. Baseline hemoglobin A1c was the most influential feature, with higher levels associated with metformin failure. The addition of other clinical factors improved model performance (0.745; 95% CI 0.737, 0.754, p

Conclusions Baseline hemoglobin A1c was the strongest predictor of metformin failure, and additional factors substantially improved performance. This suggests that routinely available clinical data could be used to identify patients at high risk of metformin failure who might benefit from closer monitoring and earlier treatment intensification.
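
For readers unfamiliar with the C-index reported above, the following is a minimal sketch of how Harrell's concordance index can be computed, assuming a time-to-event setting with right censoring. It is not the authors' code: the function name `harrell_c_index`, the simplified handling of tied times, and the toy data are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times       -- follow-up time for each patient
    events      -- 1 if failure was observed, 0 if censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification (assumption): pairs with tied times are skipped
        # Let `a` be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # the earlier patient was censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # the patient who failed first was ranked as higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up toy data, not taken from the study.
toy_times = [5, 8, 12, 20, 30]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.4, 0.6, 0.5, 0.1]
c_index = harrell_c_index(toy_times, toy_events, toy_risk)
print(f"C-index: {c_index:.3f}")
```

On this scale, 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk, which is the scale on which the reported values of 0.731 and 0.745 are read.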