Proper mask usage in public areas has been shown to be critical in the efforts to reduce infection spread in circumstances such as the COVID-19 pandemic. In this paper, we propose mask usage detection approach based on deep learning: a Mask Regional-Convolutional Neural Network (Mask R-CNN) that provides segmentation of faces and masks, and another CNN using a novel Soft Attention unit to detect the correctness of the mask usage. We also provide a small instance segmented subset of the Masked Faces (MAFA) dataset for instance segmentation problems. We use the Mask R-CNN to provide instance segmentations of faces and face masks to the visual relationship detection CNN and predict improperly and properly worn face masks. Various CNN architectures such as ResNet50 were tested and compared to determine its effectiveness for the above task. We evaluate the CNN architectures on accuracy, precision, recall, and specificity of detecting properly worn masks. The best performing network was determined to be the ResNet50V2 architecture with 76.27% accuracy, 84.76% precision, 74.38% recall, and 79.20% specificity.