Large imaging surveys will rely on photometric redshifts (photo-$z$'s), which are typically estimated through machine learning methods. Currently planned spectroscopic surveys will not be deep enough to produce a representative training sample for LSST, so we seek methods to improve the photo-$z$ estimates obtained from non-representative training samples. Spectroscopic training samples for photo-$z$'s are biased towards redder, brighter galaxies, which also tend to be at lower redshift than the typical galaxy observed by LSST. This bias leads to poor photo-$z$ estimates, with outlier fractions nearly four times larger than for a representative training sample. In this paper, we apply the concept of training sample augmentation, in which we augment non-representative training samples with simulated galaxies possessing otherwise unrepresented features. When we select simulated galaxies with $(g-z)$ color, $i$-band magnitude, and redshift outside the range of the original training sample, we are able to reduce the outlier fraction of the photo-$z$ estimates by nearly 50% and the normalized median absolute deviation (NMAD) by 56%. When compared to a fully representative training sample, augmentation can recover nearly 70% of the increase in the outlier fraction and 80% of the increase in NMAD. Training sample augmentation is a simple and effective way to improve training samples for photo-$z$'s without requiring additional spectroscopic samples.
Comment: 10 pages, 4 figures, submitted to ApJ Letters
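
The selection step and the two quality metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The dictionary keys (`g_minus_z`, `i_mag`, `z`), the rule that a simulated galaxy qualifies when any one of the three features lies outside the training range, the 1.4826 scaling in the NMAD, and the 0.15 outlier threshold are all assumptions chosen for illustration; the abstract does not specify them.

```python
import numpy as np

def outside_range(values, reference):
    """True where a value lies outside [min, max] of the reference sample."""
    return (values < reference.min()) | (values > reference.max())

def select_augmentation(train, sim):
    """Boolean mask of simulated galaxies unrepresented in the training sample.

    Assumption: a galaxy qualifies if ANY of (g-z) color, i-band magnitude,
    or redshift falls outside the training range. The paper may combine
    the three criteria differently.
    """
    mask = np.zeros(len(sim["z"]), dtype=bool)
    for key in ("g_minus_z", "i_mag", "z"):  # hypothetical column names
        mask |= outside_range(sim[key], train[key])
    return mask

def nmad(z_phot, z_spec):
    """Normalized median absolute deviation of dz = (z_phot - z_spec)/(1 + z_spec).

    Assumption: the common 1.4826 * median(|dz - median(dz)|) convention.
    """
    dz = (z_phot - z_spec) / (1.0 + z_spec)
    return 1.4826 * np.median(np.abs(dz - np.median(dz)))

def outlier_fraction(z_phot, z_spec, threshold=0.15):
    """Fraction of galaxies with |dz| above the threshold (0.15 is an assumed value)."""
    dz = np.abs(z_phot - z_spec) / (1.0 + z_spec)
    return np.mean(dz > threshold)

# Toy inputs, invented purely to exercise the functions.
train = {
    "g_minus_z": np.array([1.0, 2.0]),
    "i_mag": np.array([18.0, 22.0]),
    "z": np.array([0.1, 0.8]),
}
sim = {
    "g_minus_z": np.array([1.5, 0.5, 1.2]),
    "i_mag": np.array([20.0, 21.0, 24.0]),
    "z": np.array([0.5, 0.4, 0.3]),
}
augment_mask = select_augmentation(train, sim)
```

In this toy example the first simulated galaxy sits inside the training range on all three features and is not selected, while the second (bluer) and third (fainter) are. The selected galaxies would then be appended to the training sample before the photo-$z$ estimator is retrained.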