Reinforcement learning presents a promising approach to bolstering cybersecurity through intelligent agents that learn from their environment and adapt to new threats. In cybersecurity, reinforcement learning has applications including intrusion detection, malware classification, and vulnerability analysis. However, current reinforcement learning algorithms such as Deep Q-Learning rely on deep neural networks, whose high computational cost makes them unsuitable for deployment on edge devices. To overcome this challenge, we propose two solutions for efficient reinforcement learning on edge devices. The first is a Hyperdimensional Reinforcement Learning algorithm that uses a lightweight, brain-inspired model to learn an optimal policy in an unknown environment, exploiting properties of hyperdimensional computing that enable robust, real-time learning. The second is a heterogeneous CPU-FPGA platform that maximizes the FPGA's computing capabilities by applying hardware optimizations to hyperdimensional computing's critical operations. Our platform achieves faster learning and higher energy efficiency than state-of-the-art reinforcement learning accelerators while maintaining the same or better quality of learning, and it additionally improves the RL model's learning throughput and robustness. Together, our proposed solutions offer efficient and scalable alternatives for reinforcement learning on edge devices, supporting online, real-time learning with minimal memory capacity.
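To make the idea of hyperdimensional Q-learning concrete, the following is a minimal sketch, not the paper's actual algorithm: it assumes discrete states encoded as random bipolar hypervectors, one model hypervector per action, Q-values approximated by a normalized dot product, and a standard temporal-difference update that bundles the scaled state encoding into the chosen action's model. All names (`state_hvs`, `models`, `q_values`) and the toy chain environment are hypothetical illustrations.

```python
import numpy as np

# Hypothetical HDC Q-learning sketch on a toy 5-state chain environment.
D = 2048  # hypervector dimensionality (assumed; real systems vary)
rng = np.random.default_rng(0)

n_states, n_actions = 5, 2
# Random bipolar hypervectors encode each discrete state (nearly orthogonal).
state_hvs = rng.choice([-1.0, 1.0], size=(n_states, D))
# One model hypervector per action accumulates TD-weighted state encodings.
models = np.zeros((n_actions, D))

def q_values(s):
    # Q(s, a) ~ similarity between the state encoding and each action model.
    return models @ state_hvs[s] / D

alpha, gamma, eps = 0.1, 0.9, 0.2
for episode in range(300):
    s = 0
    for _ in range(20):
        # Epsilon-greedy action selection over the similarity-based Q-values.
        a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(q_values(s)))
        # Toy chain: action 1 moves right, action 0 moves left; reward at the end.
        s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s2 == n_states - 1 else 0.0
        # TD update: bundle the state hypervector into the chosen action's
        # model, scaled by the temporal-difference error.
        td = r + gamma * np.max(q_values(s2)) - q_values(s)[a]
        models[a] += alpha * td * state_hvs[s]
        s = s2
        if r > 0:
            break

# The learned policy should prefer moving right (action 1) toward the reward.
print(int(np.argmax(q_values(0))))
```

Because the state hypervectors are nearly orthogonal at high dimensionality, the bundling update approximates a tabular Q-learning step while using only additions and dot products, which is what makes the critical operations amenable to FPGA acceleration.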