Exponential growth in bandwidth demand, spurred by emerging network services with diverse characteristics and stringent performance requirements, drives the need for dynamic operation of optical networks, efficient use of spectral resources, and automation. One of the main challenges in dynamic, resource-efficient Elastic Optical Networks (EONs) is spectrum fragmentation. Fragmented, stranded spectrum slots lead to poor resource utilization and increase the blocking probability of incoming service requests. Conventional approaches to Spectrum Defragmentation (SD) apply various criteria to decide when to defragment and which portion of the spectrum to defragment. However, these policies often address only a subset of the tasks related to defragmentation, are not adaptable, and have limited automation potential. To address these issues, we propose DeepDefrag, a novel framework based on reinforcement learning that covers the main aspects of the SD process: determining when to perform defragmentation, which connections to reconfigure, and which part of the spectrum to reallocate them to. DeepDefrag outperforms the well-known Older-First First-Fit (OF-FF) defragmentation heuristic, achieving lower blocking probability with lower defragmentation overhead.
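As an illustration of why stranded slots cause blocking, the following minimal sketch models a single link as a list of occupied/free slots and applies First-Fit allocation before and after an idealized compaction. It is not the DeepDefrag framework or the OF-FF heuristic. The function names (`first_fit`, `largest_free_block`, `compact`) and the eight-slot link are hypothetical, and the compaction ignores connection boundaries and multi-link continuity constraints.

```python
from typing import List, Optional


def first_fit(spectrum: List[bool], demand: int) -> Optional[int]:
    """Return the lowest start index of `demand` contiguous free slots, or None.

    `spectrum[i]` is True when slot i is occupied (assumed single-link model).
    """
    run = 0
    for i, occupied in enumerate(spectrum):
        run = 0 if occupied else run + 1
        if run == demand:
            return i - demand + 1
    return None


def largest_free_block(spectrum: List[bool]) -> int:
    """Length of the longest run of contiguous free slots."""
    best = run = 0
    for occupied in spectrum:
        run = 0 if occupied else run + 1
        best = max(best, run)
    return best


def compact(spectrum: List[bool]) -> List[bool]:
    """Idealized defragmentation: push all occupied slots to the low end.

    Assumption: every occupied slot can be moved freely, which real
    connections (with fixed widths and routes) do not allow in general.
    """
    return sorted(spectrum, reverse=True)


# Hypothetical link: four free slots in total, but never three in a row.
fragmented = [True, False, False, True, False, True, False, True]
defragmented = compact(fragmented)

blocked = first_fit(fragmented, 3)      # no contiguous block of 3 exists
accepted = first_fit(defragmented, 3)   # free slots are now contiguous
```

In this toy link the total free capacity is the same before and after compaction; only the contiguity changes, which is what determines whether the three-slot request can be served.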