The quantum switch is a vital component for the development of the quantum internet. In an entanglement distribution network, a quantum switch generates elementary entanglement with its clients and then performs entanglement swapping to distribute end-to-end entanglement of sufficiently high fidelity between clients. The fidelity threshold is a quality-of-service requirement specified by the clients, as dictated by the application they run on the network. We consider a discrete-time model for a quantum switch that, in each time step, attempts to generate fresh elementary entanglement with its clients in the form of maximally entangled qubit pairs, or Bell pairs. Each attempt succeeds probabilistically, and the successfully generated Bell pairs are stored in noisy quantum memories until they can be swapped. We focus on establishing the value of distilling the stored Bell pairs prior to entanglement swapping in the presence of their inevitable aging, i.e., decoherence. For a simple instance of a switch with two clients, an exponential decay of entanglement fidelity, and a well-known probabilistic but heralded two-to-one distillation protocol, we use the Markov decision process (MDP) framework to identify, for a given threshold on end-to-end entanglement fidelity, the optimal policy (to wait, to distill, or to swap) that maximizes throughput. We compare the switch's performance under the optimal distillation-enabled policy with its performance under a policy that excludes distillation. Simulations of the two policies demonstrate the improvements that are possible in principle via optimal use of distillation with respect to the average throughput, average fidelity, and jitter of end-to-end entanglement, as functions of the fidelity threshold. Our model thus helps capture the role of entanglement distillation in mitigating the effects of decoherence in a quantum switch in an entanglement distribution network, adding to the growing literature on quantum switches.
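The three ingredients of the model (memory decay, distillation, and swapping) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the text above: the stored pairs are treated as Werner states, the exponential decay is taken to relax toward the fully mixed value of 1/4 with a hypothetical memory time `memory_time`, and the two-to-one protocol is taken to be BBPSSW-style. The function names are likewise hypothetical.

```python
import math


def decay(fidelity, steps, memory_time):
    """Fidelity of a stored pair after `steps` time steps.

    Assumed form: exponential relaxation toward 1/4 (fully mixed state).
    """
    return 0.25 + (fidelity - 0.25) * math.exp(-steps / memory_time)


def swap(f1, f2):
    """End-to-end fidelity after swapping two Werner pairs (assumed noiseless swap)."""
    return f1 * f2 + (1 - f1) * (1 - f2) / 3


def distill(f1, f2):
    """BBPSSW-style two-to-one distillation of two Werner pairs (assumed protocol).

    Returns (output fidelity on success, heralded success probability).
    """
    p_success = (
        f1 * f2
        + f1 * (1 - f2) / 3
        + (1 - f1) * f2 / 3
        + 5 * (1 - f1) * (1 - f2) / 9
    )
    f_out = (f1 * f2 + (1 - f1) * (1 - f2) / 9) / p_success
    return f_out, p_success


# Example: two pairs on one link age for 2 steps, are distilled, and the
# result is swapped with a fresh pair on the other link.
aged = decay(0.9, steps=2, memory_time=20.0)
f_distilled, p = distill(aged, aged)
f_end_to_end = swap(f_distilled, 0.9)
```

In an MDP formulation, functions like these would define the state transitions for the wait, distill, and swap actions, with the reward tied to delivering an end-to-end pair whose fidelity meets the threshold.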