We consider the problem of zero-shot anomaly detection, in which a model is pre-trained to detect anomalies in images belonging to seen classes and expected to detect anomalies from unseen classes at test time. State-of-the-art anomaly detection (AD) methods can achieve exceptional results when training images are abundant, but they fail catastrophically in zero-shot scenarios where real examples are unavailable. However, with the emergence of multi-modal models such as CLIP, it is possible to use knowledge from other modalities (e.g., text) to compensate for the lack of visual information and improve AD performance. In this work, we propose PromptAD, a dual-branch framework which uses prior knowledge about both normal and abnormal behaviours, expressed as text prompts, to detect anomalies even in unseen classes. More specifically, it uses CLIP as a backbone encoder network together with an additional dual-branch vision-language decoding network that processes both normality and abnormality information. The normality branch establishes a profile of normality, while the abnormality branch models anomalous behaviours, guided by natural language text prompts. As the two branches capture complementary information or ‘views’, we propose a ‘cross-view contrastive learning’ (CCL) component which regularizes each view with additional reference information from the other view. We further propose a cross-view mutual interaction (CMI) strategy to promote the mutual exploration of useful knowledge from each branch. We show that PromptAD outperforms existing baselines in zero-shot anomaly detection on key benchmark datasets and analyse the role of each component in ablation studies.
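To make the cross-view contrastive idea concrete, the sketch below shows a generic symmetric InfoNCE-style loss between features from two branches, where matched items across the views act as positives and all other pairings in the batch act as negatives. This is a minimal NumPy illustration under assumed shapes and names (`cross_view_contrastive_loss`, the temperature value), not the paper's actual CCL implementation, which additionally involves text-prompt guidance.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    # Project features onto the unit hypersphere, as in CLIP-style training.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def cross_view_contrastive_loss(feat_a, feat_b, temperature=0.07):
    """Symmetric cross-view contrastive (InfoNCE-style) loss.

    feat_a, feat_b: (N, D) feature batches from the two branches/views,
    where row i of feat_a corresponds to row i of feat_b (a positive pair).
    """
    a = l2_normalize(feat_a)
    b = l2_normalize(feat_b)
    logits = a @ b.T / temperature          # (N, N) cosine similarities
    idx = np.arange(len(a))                 # diagonal entries are positives

    def cross_entropy(lg):
        # Row-wise softmax cross-entropy with the diagonal as the target.
        lg = lg - lg.max(axis=1, keepdims=True)   # numerical stability
        log_probs = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -log_probs[idx, idx].mean()

    # Average the a->b and b->a directions so each view regularizes the other.
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))
```

With well-aligned views the matched pairs dominate the similarity matrix and the loss approaches zero; mismatched views yield a loss near log N.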