Deep learning methods for image dehazing achieve impressive results. Yet collecting the ground-truth hazy/dehazed image pairs needed to train such networks is cumbersome. We propose to use Non-Local Image Dehazing (NLD), an existing physics-based technique, to provide the dehazed images required to train a network. Upon close inspection, we find that NLD suffers from several shortcomings, and we propose novel extensions to improve it. The new method, termed NLD++, consists of 1) denoising the input image as a pre-processing step to avoid noise amplification, and 2) introducing a constrained optimization that respects physical constraints. NLD++ produces superior results to NLD at the expense of increased computational cost. To offset that cost, we propose NLDNet++, a fully convolutional network that is trained on pairs of hazy images and images dehazed by NLD++. This eliminates the need for the hard-to-obtain ground-truth hazy/dehazed image pairs that existing deep learning methods require. We evaluate the performance of NLDNet++ on standard data sets and find that it compares favorably with existing methods.
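
To illustrate what a physical constraint can look like in dehazing, the following minimal sketch assumes the standard atmospheric scattering model I = tJ + (1 - t)A, where I is the hazy pixel, J the haze-free radiance, A the airlight, and t the transmission. The abstract does not state this model explicitly, and the sketch is not the NLD++ optimization itself. The function names, the `t_min` floor, and the per-pixel tuple representation are hypothetical choices made for illustration only.

```python
# Illustrative sketch only: per-pixel haze model and a physically constrained
# inversion. Pixels and airlight are RGB tuples with values in [0, 1].
# All names and defaults here are assumptions, not taken from NLD or NLD++.

def add_haze(clear, airlight, t):
    """Apply the scattering model I = t * J + (1 - t) * A to one pixel."""
    return tuple(t * j + (1.0 - t) * a for j, a in zip(clear, airlight))


def transmission_lower_bound(hazy, airlight):
    """Smallest t that keeps the recovered radiance non-negative.

    J_c >= 0 in every channel c requires t >= 1 - I_c / A_c.
    """
    return max(0.0, max(1.0 - i / a for i, a in zip(hazy, airlight)))


def recover_radiance(hazy, airlight, t, t_min=0.1):
    """Invert the model as J = (I - A) / t + A under simple constraints.

    The transmission estimate is clamped to [max(t_min, lower bound), 1],
    and the result is clipped to the valid intensity range [0, 1].
    """
    t = min(1.0, max(t, t_min, transmission_lower_bound(hazy, airlight)))
    return tuple(min(1.0, max(0.0, (i - a) / t + a))
                 for i, a in zip(hazy, airlight))


if __name__ == "__main__":
    airlight = (0.9, 0.9, 0.9)
    hazy = add_haze((0.2, 0.4, 0.6), airlight, t=0.5)
    # With the correct transmission, the clear pixel is recovered.
    print(recover_radiance(hazy, airlight, t=0.5))
    # A transmission estimate that is too small is raised to the lower
    # bound, so no channel of the result falls outside [0, 1].
    print(recover_radiance(hazy, airlight, t=0.05, t_min=0.01))
```

Clamping the transmission before inverting, rather than only clipping the output, keeps the estimate consistent with the model; an unconstrained estimate that is too small would otherwise produce negative radiance.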