Offloading techniques are considered one of the key enablers of deep neural network (DNN)-based artificial intelligence (AI) services on end devices with limited computing resources. However, offloading DNN layers involves a hard combinatorial optimization problem. To this end, we develop a deep reinforcement learning (DRL)-based offloading algorithm that computes DNN layers with minimum end-to-end inference latency. We combine a long short-term memory (LSTM) network and a graph neural network (GNN) for state embedding, which exploits spatial correlation across the network to accelerate training and temporal correlation over time to reduce the overhead of state monitoring. With this embedding, our DRL algorithm can draw multiple actions from a single state observation and adapt, without retraining, to new environments unseen during the training phase. We show through extensive simulations that our algorithm outperforms existing ones in terms of both latency and robustness to feedback delay, which is inevitable in practice, achieving a performance improvement of up to 29.6% in some scenarios.