Federated Learning (FL) is a technique for training models on distributed edge devices using local data samples. Differential Privacy (DP) can be combined with FL to provide a formal privacy guarantee for sensitive on-device data. Our goal is to train a large neural network language model (NNLM) on compute-constrained devices while preserving privacy using FL and DP. However, the noise required to guarantee differential privacy increases with model size, which often prevents convergence. We propose Partial Embedding Updates (PEU), a novel technique that reduces the impact of DP noise by decreasing payload size. Furthermore, we adopt Low Rank Adaptation (LoRA) and Noise Contrastive Estimation (NCE) to reduce the memory demands of large models on compute-constrained devices. We demonstrate, both in simulation and on real devices, that this combination of techniques makes it possible to train large-vocabulary language models while preserving accuracy and privacy.
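The following is a minimal sketch (not the paper's implementation) of why DP noise grows more harmful as the payload gets larger, which is the motivation for reducing payload size. It assumes the standard Gaussian mechanism with per-update norm clipping; the function names (`privatize`, `snr`) and the specific dimensions are hypothetical choices for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def privatize(update, clip_norm=1.0, noise_multiplier=1.0):
    """Gaussian mechanism sketch: clip the client update to clip_norm,
    then add i.i.d. Gaussian noise with sigma = noise_multiplier * clip_norm
    to every coordinate."""
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / norm)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise

def snr(dim, clip_norm=1.0, noise_multiplier=1.0):
    """Per-coordinate signal after clipping is on the order of C / sqrt(d),
    while per-coordinate noise std is sigma = z * C, independent of d.
    So signal-to-noise per coordinate shrinks as the payload dimension grows."""
    signal = clip_norm / np.sqrt(dim)
    return signal / (noise_multiplier * clip_norm)

small = snr(dim=10_000)      # hypothetical reduced payload
large = snr(dim=1_000_000)   # hypothetical full large-vocabulary payload
print(small / large)         # roughly 10: 100x smaller payload, ~10x better SNR
```

Under this model, shrinking the transmitted payload by a factor of 100 improves per-coordinate signal-to-noise by a factor of about 10, which is the kind of effect a payload-reduction technique like PEU exploits.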