Developing intelligent jamming methods to counter the multifunction radar (MFR) has become a vital task in electronic warfare, because the MFR can switch between working modes according to information about its surrounding environment. In this article, we address two limitations of conventional jamming methods: strong dependence on prior knowledge and inaccurate selection strategies. Specifically, we study the joint optimization of jamming type selection and power control (JO-JTSPC) in a general radar countermeasure scenario. We first model the sequential decision-making task JO-JTSPC as a Markov decision process (MDP). Then, because the action space can be designed in different ways, we develop two algorithms, dueling double deep Q-learning and hybrid proximal policy optimization, to solve the optimization problem. Taking into account the threat level of the various MFR working modes and the jamming effect each one requires, we design the reward function of the MDP as a weighted sum of a mode switching factor, a jamming performance factor, and a jamming power factor. Further, the learned policies of these algorithms are derived from the designed reinforcement learning elements. Extensive simulation results demonstrate that the proposed algorithms learn highly adaptive policies in radar countermeasure scenarios and achieve good jamming performance.
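The reward structure described above can be illustrated with a minimal sketch. It assumes only what the abstract states, namely that the reward is a weighted sum of three factors. The names `RewardWeights` and `jamming_reward`, the default weight values, and the sign and scaling of each factor are hypothetical and chosen for illustration; they are not taken from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    """Hypothetical weights; the abstract does not state their values."""
    mode_switch: float = 0.5
    jamming_performance: float = 0.3
    jamming_power: float = 0.2


def jamming_reward(
    mode_switch_factor: float,
    jamming_performance_factor: float,
    jamming_power_factor: float,
    weights: RewardWeights = RewardWeights(),
) -> float:
    """Return the weighted sum of the three reward factors.

    How each factor is computed (and whether a power term enters as a
    penalty) is an assumption left to the caller.
    """
    return (
        weights.mode_switch * mode_switch_factor
        + weights.jamming_performance * jamming_performance_factor
        + weights.jamming_power * jamming_power_factor
    )
```

In this sketch a caller would compute the three factors from the observed MFR mode transition and the chosen jamming action, then pass them in; changing the weights shifts the trade-off between forcing mode switches, jamming effect, and power expenditure.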