In this paper, the optimal containment control problem for a class of unknown nonlinear multi-agent systems (MASs) is studied via a time-aggregation (TA) based model-free reinforcement learning (RL) algorithm. By introducing TA-based event-states, event-controls, and an integrated reward, a model-free TA-based policy iteration (TA-PI) approach is synthesized in which the policy evaluation and policy improvement steps are executed only over a finite set of event-states, so the optimal control protocol is obtained with lower computational requirements. Moreover, the control input is updated intermittently, only when the event-set is visited, which greatly reduces the update frequency of the controller. The proposed learning algorithm therefore saves computational resources in both the learning process and the control updates. Furthermore, owing to the finite predefined event-set, the developed TA-PI algorithm requires neither a function approximator nor state discretization, which permits a strict convergence analysis via mathematical induction. Finally, simulation results are given to demonstrate the feasibility and effectiveness of the proposed algorithm.