Additional file 5: Figure S5. The regression model and its residuals are consistent across multiple training and verification cohorts and across potential confounding variables (array synthesis, ethnicity, sample collection site, and BMI). (A) The age regression residuals are consistent across multiple sub-training sets. The Training Cohort and Verification Cohorts were merged into a single large cohort (N = 1675). Two training sets were created, each of size 698, which left 279 samples as a holdout set. An elastic net regression model was trained on each of the training sets, and residuals were then calculated on the holdout set of 279 samples (data points) using each of the models (axes). High correlation between the residuals of the two models suggests a low variance-error term, which is consistent with the residuals potentially being biologically relevant. The result shown is representative of 100 simulated training set splits. (B) Age regression residuals (axes) are correlated when samples are assayed on different array formats, with different training sets, and with different algorithm parameters (subplots). Each dot is a single donor assayed for a single combination of array type, training set, and algorithm parameters. Not all samples were assayed on all combinations of array type. Values on the x- and y-axes are residuals, which removes the correlation that any two models share simply because both are correlated with chronological age. (C) The regression model yields similar results across samples binned by ethnicity and sample collection site. Each dot is a single donor and each line shows the regression fit for a group of donors. Chronological age (x-axis) and age predicted by the peptide array regression model (y-axis) are shown. The legend shows the correlation coefficient, regression slope and intercept, and number of samples in each bin. Data shown are from the regression model learned on the Training Cohort and applied to the Verification Cohort.
(D) The intercept (shift) and slope (interaction) terms associated with BMI, ethnicity, and collection state are not statistically significant. One-way ANOVA test statistics are shown in the table. (E) Age-associated antibody-peptide binding to di-serine N-terminus peptides is highly consistent in a long-term study. A total of N = 16 donors consented to regular blood draws for > 1 year. Donors with > 5 samples over > 1 year (N = 13) had consistent age-regression values over this time period. Data are shown for all donors (lines).
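The split-and-compare procedure described for panel A can be illustrated with a minimal Python sketch. It is not the authors' code. The data are synthetic stand-ins for peptide array intensities. The function names, the hand-written coordinate descent solver, the penalty settings (`alpha`, `l1_ratio`), and the residual sign convention (predicted minus chronological age) are all assumptions made for illustration. Only the cohort sizes (1675 total, two training sets of 698, holdout of 279) come from the legend.

```python
import numpy as np


def fit_elastic_net(X, y, alpha=0.1, l1_ratio=0.5, n_iter=100):
    """Fit an elastic net by cyclic coordinate descent.

    Minimizes (1/(2n))*||y - Xw - b||^2 + alpha*l1_ratio*||w||_1
    + 0.5*alpha*(1 - l1_ratio)*||w||^2. Returns (weights, intercept).
    The penalty values are illustrative, not taken from the study.
    """
    n, p = X.shape
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean          # centering absorbs the intercept
    col_sq = (Xc ** 2).sum(axis=0) / n
    w = np.zeros(p)
    resid = yc.copy()                         # tracks yc - Xc @ w
    for _ in range(n_iter):
        for j in range(p):
            resid += Xc[:, j] * w[j]          # remove feature j's contribution
            rho = Xc[:, j] @ resid / n
            shrunk = np.sign(rho) * max(abs(rho) - alpha * l1_ratio, 0.0)
            w[j] = shrunk / (col_sq[j] + alpha * (1.0 - l1_ratio))
            resid -= Xc[:, j] * w[j]          # add the updated contribution back
    return w, y_mean - x_mean @ w


def holdout_residual_correlation(X, y, n_train=698, n_holdout=279, seed=0):
    """Train two models on disjoint training sets and correlate holdout residuals."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    train_a = idx[:n_train]
    train_b = idx[n_train:2 * n_train]
    holdout = idx[2 * n_train:2 * n_train + n_holdout]
    residuals = []
    for train in (train_a, train_b):
        w, b = fit_elastic_net(X[train], y[train])
        # Assumed sign convention: predicted age minus chronological age.
        residuals.append(X[holdout] @ w + b - y[holdout])
    return np.corrcoef(residuals[0], residuals[1])[0, 1]


# Synthetic cohort (hypothetical): 1675 donors, 30 features.
rng = np.random.default_rng(1)
X = rng.normal(size=(1675, 30))
w_true = rng.normal(size=30)
age = 50.0 + X @ w_true + rng.normal(scale=5.0, size=1675)

r = holdout_residual_correlation(X, age, seed=0)
print(f"Correlation of holdout residuals between the two models: {r:.3f}")
```

Calling `holdout_residual_correlation` with a range of `seed` values mimics the repeated random splits mentioned in the legend. In this synthetic setup the two residual vectors correlate strongly because the per-donor deviation from the fitted trend is shared by both models, while the estimation error of each model is comparatively small.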