Online learning can improve chatbots’ conversational abilities. Although online learning enhances the diversity of a chatbot’s statements, it also opens the door to corruption: a chatbot may be manipulated into generating offensive responses such as racist remarks and hate speech. The key to keeping chatbots from being corrupted is offensive-response detection. To date, training datasets for offensive-language detection have focused only on individual response sentences, disregarding the user input that elicited them. In this paper, we introduce a dialogue-based offensive-response dataset consisting of 110K input-response chat records, which fills this gap in response detection for chatbots. We then build two challenging tasks on the dataset: an offensive-response detection task and a corrupted-chatbot purification task. In addition, we propose a strong benchmark method for these tasks: an encoder-classifier model that detects offensive input-response pairs, and a one-shot reinforcement learning (RL) method that rapidly reduces the probability of generating offensive responses.
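The core idea of the detection task, classifying the (input, response) pair jointly rather than the response sentence alone, can be sketched as follows. This is a minimal illustrative sketch, not the paper’s model: the function names (`encode_pair`, `is_offensive`), the toy lexicon, and the hand-set weights are all hypothetical stand-ins; the actual benchmark presumably uses a learned neural encoder and classifier.

```python
# Hypothetical sketch of pair-wise offensive-response detection.
# A bag-of-words encoder and hand-set linear weights stand in for
# the paper's encoder-classifier, to show the input-response pairing idea.

OFFENSIVE_LEXICON = {"hate", "stupid"}  # toy lexicon, illustration only


def encode_pair(user_input: str, response: str) -> list:
    """Encode the (input, response) pair jointly, not the response alone."""
    def feats(text: str) -> list:
        toks = text.lower().split()
        hits = sum(t in OFFENSIVE_LEXICON for t in toks)
        return [hits / max(len(toks), 1), float(len(toks))]
    # Concatenating both sides gives the classifier dialogue context.
    return feats(user_input) + feats(response)


def is_offensive(user_input: str, response: str, threshold: float = 0.1) -> bool:
    """Score the pair; a real model would learn these weights from data."""
    x = encode_pair(user_input, response)
    score = 0.2 * x[0] + 1.0 * x[2]  # response-side offensiveness dominates
    return score > threshold


print(is_offensive("how are you", "I hate you, stupid"))
print(is_offensive("tell me a joke", "Why did the chicken cross the road?"))
```

Because the encoder sees the user input as well as the response, a downstream classifier can, in principle, distinguish a genuinely offensive reply from one that merely quotes or refuses an offensive prompt, which is precisely what response-only datasets cannot capture.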