As signal integrity (SI) issues become critical in high-bandwidth, high-density applications, SI analysis and optimization are necessary. The SI optimization loop, which comprises design, modeling, simulation, analysis, and revision, is repetitive and confined to specific applications. To overcome these recurrent issues, we propose a reinforcement learning (RL) model for SI and power leakage optimization in 3D X-Point memory operation. We define the Markov decision process (MDP) components to reflect the optimization problem, and the RL model shows learning convergence. Compared with the original design, the optimal design improves crosstalk by 6.2%, IR drop by 17.7%, and power leakage by 25.3%.