Recent years have seen a growing number of attempts to improve treatment outcomes through patient-specific precision medicine. Genome-based customized treatment, a representative form of precision medicine, is receiving considerable attention as the cost and time of genetic analysis fall dramatically. However, because it is difficult to directly determine the effects of the thousands of gene variants present in cancer cells, existing methods rely on statistical associations derived from patient data. This study proposes a novel technique for finding gene variants that increase the effectiveness of treatment and for ranking them by importance. To quantify the effect of individual genes on patient treatment, we design the Treatment Effect function based on a proportional hazards model. A feature selection technique based on reinforcement learning is introduced to extract effective genes from the large pool of gene candidates. We verify that the proposed algorithm can find effective gene variants without any medical knowledge or statistical analysis techniques.
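For readers unfamiliar with the proportional hazards model mentioned above, the following is a minimal sketch of its standard building block, the Cox negative partial log-likelihood, under the hazard form h(t | x) = h0(t) exp(beta . x). It is not the Treatment Effect function of this study, whose definition is not given here. The function name `cox_neg_partial_loglik`, the argument layout, and the assumption that no two patients share the same observed time are illustrative choices only.

```python
import numpy as np


def cox_neg_partial_loglik(beta, X, time, event):
    """Negative Cox partial log-likelihood (illustrative; assumes no tied times).

    beta  : (p,) coefficient vector
    X     : (n, p) covariates, e.g. treatment and gene-variant indicators
    time  : (n,) observed follow-up times
    event : (n,) 1 if the event was observed, 0 if censored
    """
    beta = np.asarray(beta, dtype=float)
    X = np.asarray(X, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)

    # Sort by descending time so the risk set of row i is rows 0..i.
    order = np.argsort(-time, kind="stable")
    X, event = X[order], event[order]

    # Linear predictor (log relative hazard) for each patient.
    eta = X @ beta

    # Running log-sum-exp gives log(sum of exp(eta)) over each risk set.
    log_risk = np.logaddexp.accumulate(eta)

    # Only patients with an observed event contribute a term.
    return -np.sum((eta - log_risk)[event])
```

Minimizing this quantity over `beta` (for instance with a general-purpose optimizer) yields log hazard ratios for the covariates. One hypothetical way to connect this to gene-specific treatment effects would be to include a treatment-by-variant interaction column in `X`, though that design is an assumption and not something stated in the text.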