Advancements in virtual reality (VR) technology let us rethink interactive 3D modeling by creating 3D content intuitively and directly in 3D space. However, creating a detailed 3D model with conventional VR-based modeling in fully manual mode is laborious and tedious, since users need to carefully draw almost the entire surface. In this paper, we present BuildingSketch, a freehand mid-air sketching system that uses deep learning techniques to model structured buildings. The user freely draws a few key strokes in mid-air with their fingers to represent the desired shape; our system then interprets the strokes using a deep neural network and generates a detailed building model with a procedural modeling method. After creating several building blocks one by one, the user can freely move, rotate, and combine the blocks to form a complex building model. We demonstrate the effectiveness and efficiency of BuildingSketch, as well as its ease of use for novice users, by presenting a variety of building models.