Research in cognitive science offers valuable insights for the design, development, and evaluation of artificial intelligence (AI) systems. In this short position paper, we leverage first principles from cognitive science to propose novel methods for testing pre-trained large language models (LLMs), specifically to assess their commonsense reasoning abilities. The test cases are intended to assess a model's ability 1) to analyze various dimensions of a prototype, and 2) to discover subtle and implied meanings (e.g., in proverbs) across languages. We hope the ideas presented in this paper will spark interdisciplinary discussions on the robust auditing and evaluation of LLMs.