Purpose: The /ɹ/ productions of young children acquiring American English are highly variable and often inaccurate, with [w] as the most common substitution error. One acoustic indicator of the goodness of children's /ɹ/ productions is the difference between the frequencies of the second formant (F2) and the third formant (F3), with a smaller F3-F2 difference being associated with a perceptually more adultlike /ɹ/. This study analyzed how effectively automatically extracted F3-F2 differences characterize young children's productions of /ɹ/ and /w/, in comparison with manually coded measurements. Method: Automated F3-F2 differences were extracted from productions of a variety of /ɹ/- and /w/-initial words spoken by 3- to 4-year-old monolingual preschoolers (N = 117; 2,278 tokens in total). These automated measures were compared with phoneme goodness ratings of the children's productions, made by untrained adult listeners (n = 132) on a visual analog scale, and with narrow transcriptions of the productions into four categories: [ɹ], [w], and two intermediate categories. Results: Data visualizations show a weak relationship between automated F3-F2 differences and both listener ratings and narrow transcriptions. Mixed-effects models suggest that the automated F3-F2 difference only modestly predicts listener ratings (R² = 0.37) and narrow transcriptions (R² = 0.32). Conclusion: The weak relationship between the automated F3-F2 difference and both listener ratings and narrow transcriptions suggests that these automated acoustic measures are of questionable reliability and utility in assessing preschool children's mastery of the /ɹ/-/w/ contrast.