Deep reinforcement learning in POMDPs for 3-D palletization problem
- Resource Type
- Conference
- Authors
- Bo, Ai; Lu, Junguo; Zhao, Chunyu
- Source
- 2022 China Automation Congress (CAC), pp. 577-582, Nov. 2022
- Subject
- Aerospace
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Transportation
Deep learning
Three-dimensional displays
Automation
Heuristic algorithms
Reinforcement learning
Markov processes
Trajectory
bin packing problem (BPP)
deep reinforcement learning (DRL)
partially observable Markov decision process (POMDP)
- Language
- ISSN
- 2688-0938
The online 3D palletization problem is a generic variant in the family of bin packing problems (BPP). However, conventional deep reinforcement learning (DRL) methods perform well only on combinatorial optimization problems modeled as Markov decision processes (MDPs). Because online BPP reveals only fragments of the incoming item sequence, it is hard to describe the online 3D palletization problem as an MDP. We therefore formulate the online 3D palletization problem as a partially observable Markov decision process (POMDP) and propose a novel DRL method that estimates the state from observation trajectories. We also devise a DRL framework and train agents in environments with different box types. The results show that our method is effective across a range of experimental settings and achieves higher space utilization than conventional heuristic algorithms.
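To make the setting in the abstract concrete, the following is a minimal sketch of an online palletization loop, not the authors' method. It assumes a toy height-map pallet, integer box sizes, no rotation, and no stability check. The names `OnlinePalletEnv`, `first_fit`, and `run_episode` are hypothetical. The policy sees only the current box, which illustrates the partial observability, while a bounded history of past observations stands in for the observation trajectory. The placement rule is a simple first-fit heuristic of the kind a learned policy would be compared against, and the score is space utilization.

```python
from collections import deque


class OnlinePalletEnv:
    """Toy online 3-D palletization environment (illustrative assumption only)."""

    def __init__(self, length, width, height, boxes):
        self.L, self.W, self.H = length, width, height
        self.boxes = list(boxes)  # full item sequence, hidden from the policy
        self.heights = [[0] * width for _ in range(length)]  # pallet height map
        self.t = 0
        self.packed_volume = 0

    def observe(self):
        """Return only the current box (l, w, h), or None when the sequence ends."""
        return self.boxes[self.t] if self.t < len(self.boxes) else None

    def base_height(self, x, y, l, w):
        """Height at which a box of footprint l x w would rest at corner (x, y)."""
        return max(self.heights[i][j]
                   for i in range(x, x + l) for j in range(y, y + w))

    def place(self, x, y):
        """Place the current box at corner (x, y) and advance to the next box."""
        l, w, h = self.boxes[self.t]
        top = self.base_height(x, y, l, w) + h
        for i in range(x, x + l):
            for j in range(y, y + w):
                self.heights[i][j] = top
        self.packed_volume += l * w * h
        self.t += 1

    def skip(self):
        """Discard the current box when it cannot be placed."""
        self.t += 1

    def utilization(self):
        """Packed volume divided by pallet volume."""
        return self.packed_volume / (self.L * self.W * self.H)


def first_fit(env, box):
    """Heuristic baseline: first corner where the box fits under the height limit."""
    l, w, h = box
    for x in range(env.L - l + 1):
        for y in range(env.W - w + 1):
            if env.base_height(x, y, l, w) + h <= env.H:
                return x, y
    return None


def run_episode(env, history_len=3):
    """Pack boxes one by one, keeping a bounded history of observations."""
    history = deque(maxlen=history_len)  # the observation trajectory
    while (box := env.observe()) is not None:
        history.append(box)
        pos = first_fit(env, box)
        if pos is None:
            env.skip()
        else:
            env.place(*pos)
    return env.utilization(), list(history)


if __name__ == "__main__":
    env = OnlinePalletEnv(2, 2, 2, [(2, 2, 1), (1, 1, 1), (2, 2, 1)])
    print(run_episode(env, history_len=2))
```

In this sketch the first-fit rule ignores `history`. A DRL agent in the POMDP setting would instead feed that trajectory to a state estimator, for example a recurrent network, before choosing a placement.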