In this paper, we propose a model-free adaptive learning solution to a model-following control problem. The approach employs policy iteration to find an optimal adaptive control solution, using a moving finite horizon of model-following error measurements. In addition, the control strategy is designed using a projection mechanism based on Lagrange dynamics, which allows real-time tuning of the derived actor-critic structures to find the optimal model-following strategy and to sustain the optimized adaptation performance. Finally, the efficacy of the proposed framework is demonstrated through a comparison with sliding mode and high-order model-free adaptive control approaches.
Keywords: model reference adaptive systems, reinforcement learning, adaptive critics, control systems, stochastic systems, nonlinear systems
Comment: IEEE Transactions on Automatic Control