Datamart calibration. The circles represent A) the initial broad datamart identified using codified data, B) the second refined datamart, in which electronic notes containing the terms "polycystic ovary syndrome" or "PCOS" were found, and C) patients from the entire Research Population Data Registry database without codified exclusion criteria. The overlaps represent patients who were found using codified data and who also had a PCOS term in the note (AXB), or patients with a PCOS term in the note and without exclusion criteria (BXC). Of note, patients without exclusion criteria are also found in A and AXB, but are not shown here for clarity. The numbers in the orange circles give the number of charts with a confirmed PCOS diagnosis over the total number of charts reviewed by an expert (CKW), along with the percentage confirmed. The white box indicates the patients with evaluable charts who were not included in the broad definition datamart (no codified terms identified) but who did have a PCOS term in their note and were included in the refined datamart.

Table S1. ICD-9 codes for diagnoses and procedures and laboratory values used for inclusion and exclusion in the broad PCOS datamart. Patients were all female, 18-74 years of age (current), with any of the listed parameters measured at Massachusetts General Hospital or Brigham and Women's Hospital.

Table S2. Inclusion and exclusion criteria used to create the second refined PCOS datamart. Patients were all female, 18-40 years of age at first identification of any listed parameter from records at Massachusetts General Hospital or Brigham and Women's Hospital. (DOCX 36 kb)