Real-time fire detection systems are important and complex examples of cyber-physical systems that can be developed to optimize human escape routes in an emergency with respect to the distance to exits. Various fire detection techniques have been introduced, ranging from traditional sensor-based detection to advanced deep learning-based techniques. However, only a handful of deep learning-based approaches address real-time fire detection, even though efficient fire perception is crucial for detecting fire early. In this paper, a multi-stage architecture is proposed with two modules: a convolutional autoencoder that extracts anomaly region proposals, and a convolutional neural network classifier that selects among those region proposals. The accuracy and efficiency of the proposed architecture are confirmed experimentally. With its focus on surveillance cameras, the presented architecture is suitable for real-time fire detection.