Activation sparsity can improve compute efficiency and resource utilization in sparsity-aware neural network accelerators. While spatial sparsification of activations is a well-studied topic in the DNN literature, introducing and exploiting spatio-temporal sparsity has received far less attention. Yet it resonates strongly with the ongoing shift in DNN applications from static-data signal processing (e.g., image processing) to real-time stream processing (e.g., video and audio) on embedded edge devices. Towards the goal of exploiting temporal sparsity, this paper introduces a new DNN layer, called the Delta Activation Layer, whose sole purpose is to promote both spatial and temporal sparsity of activations during training with time-distributed data. The Delta Activation Layer may be used either during vanilla training or during a refinement phase. We have implemented it as an extension of the standard TensorFlow-Keras (2.0) library and applied it to training deep neural networks on several datasets, including the Human Action Recognition (UCF101) dataset. We report an almost 3x improvement in activation sparsity, with a loss of model accuracy that is recoverable after prolonged training. All source code for this project is available upon request for academic purposes.
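To illustrate the underlying delta idea, the sketch below shows how thresholded frame-to-frame differences of activations yield high temporal sparsity on slowly varying stream data. This is a minimal NumPy illustration under our own assumptions (the `delta_activations` function, the threshold value, and the synthetic activation sequence are all hypothetical), not the paper's Keras implementation:

```python
import numpy as np

def delta_activations(seq, threshold=0.05):
    """Thresholded delta encoding of a temporal activation sequence.

    seq: array of shape (T, ...) holding activations over T timesteps.
    The first frame is passed through unchanged; subsequent frames are
    replaced by their difference from the previous frame, with small
    differences (below `threshold`) zeroed out to promote temporal sparsity.
    """
    deltas = np.empty_like(seq)
    deltas[0] = seq[0]
    diff = seq[1:] - seq[:-1]
    diff[np.abs(diff) < threshold] = 0.0
    deltas[1:] = diff
    return deltas

def sparsity(x):
    """Fraction of exactly-zero entries."""
    return float(np.mean(x == 0.0))

# Synthetic, slowly varying "activations": most frame-to-frame changes are tiny,
# mimicking consecutive video frames.
rng = np.random.default_rng(0)
base = rng.random((1, 64))
seq = np.maximum(base + 0.01 * rng.standard_normal((16, 64)).cumsum(axis=0), 0.0)

dense_sparsity = sparsity(seq)                     # plain ReLU-style output
delta_sparsity = sparsity(delta_activations(seq))  # delta-encoded output
print(f"dense: {dense_sparsity:.2f}, delta: {delta_sparsity:.2f}")
```

On such slowly varying inputs, nearly all deltas fall below the threshold and are zeroed, so the delta-encoded sequence is far sparser than the dense one; a downstream sparsity-aware accelerator could skip those zero entries.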