Introduction
Early childhood development can be described by an underlying latent construct. Global comparisons of children’s development are hindered by the lack of a validated metric that is comparable across cultures and contexts, especially for children under age 3 years. We constructed and validated a new metric, the Developmental Score (D-score), using existing data from 16 longitudinal studies.

Methods
Studies had item-level developmental assessment data for children 0–48 months and longitudinal outcomes at ages >4–18 years, including measures of IQ and receptive vocabulary. Existing data from 11 low-income, middle-income and high-income countries were merged for >36 000 children. Item mapping produced 95 ‘equate groups’ of same-skill items across 12 different assessment instruments. A statistical model was built using the Rasch model with item difficulties constrained to be equal in a subset of equate groups, linking instruments to a common scale, the D-score, a continuous metric with interval-scale properties. D-score-for-age z-scores (DAZ) were evaluated for discriminant, concurrent and predictive validity to outcomes in middle childhood to adolescence.

Results
Concurrent validity of DAZ with original instruments was strong (average r=0.71), with few exceptions. In approximately 70% of data rounds collected across studies, DAZ discriminated between children above/below cut-points for low birth weight (