While many methods, such as Local Interpretable Model-agnostic Explanations (LIME), Integrated Gradients, and Layer-wise Relevance Propagation (LRP), have been developed to explain how recurrent neural networks make predictions, the explanations they generate often vary dramatically. There is no consensus about which explainability method most accurately and robustly identifies the features important for a model's prediction. We consider a classification task over sequences of events of different types and apply both gradient-based and attention-based explanation models to compute explanations at the event-type level. We show that attention-based models yield higher similarity scores between explanations for models initialized with different random seeds; however, significant differences between model runs remain. We develop an optimization-based method to find a low-loss, high-accuracy path between two sets of trained weights, which lets us study how model explanations morph between different local minima. We use this low-loss path to provide insight into why explanations vary on two sentiment datasets.
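
The idea of connecting two sets of trained weights by a low-loss curve can be sketched as follows. This is an illustrative toy, not the paper's actual model: the loss function, endpoints, Bezier parameterization, and hyperparameters below are all assumptions chosen for demonstration, in the spirit of mode-connectivity methods that optimize a curve's control point so that loss stays low along the whole path.

```python
import numpy as np

def loss(w):
    # Toy non-convex loss: per-coordinate barriers plus weight decay.
    return float(np.sum(np.sin(3.0 * w) ** 2 + 0.1 * w ** 2))

def loss_grad(w):
    # d/dw [sin(3w)^2 + 0.1 w^2] = 3 sin(6w) + 0.2 w
    return 3.0 * np.sin(6.0 * w) + 0.2 * w

# Two local minima of the sin term, separated by a loss barrier at pi/6;
# these stand in for two trainings from different random seeds.
w1 = np.zeros(5)
w2 = np.full(5, np.pi / 3.0)
ts = np.linspace(0.0, 1.0, 51)

def bezier(theta, t):
    # Quadratic Bezier curve: endpoints fixed, control point theta learned.
    return (1 - t) ** 2 * w1 + 2 * t * (1 - t) * theta + t ** 2 * w2

def path_loss(theta):
    # Average loss along the curve, approximated on a grid of t values.
    return sum(loss(bezier(theta, t)) for t in ts) / len(ts)

def path_loss_grad(theta):
    # Chain rule: d gamma(t) / d theta = 2 t (1 - t).
    g = np.zeros_like(theta)
    for t in ts:
        g += 2 * t * (1 - t) * loss_grad(bezier(theta, t))
    return g / len(ts)

theta = 0.5 * (w1 + w2)          # start from the straight-line path
loss_linear = path_loss(theta)   # average loss of linear interpolation
for _ in range(1000):            # gradient descent on the control point
    theta -= 0.05 * path_loss_grad(theta)
loss_curved = path_loss(theta)
# The optimized curve keeps its endpoints but bends toward lower-loss
# regions, so its average loss falls below that of the straight line.
```

Evaluating explanations at weights sampled along such a curve is one way to observe how attributions change continuously between two local minima rather than comparing only the two endpoints.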