Crowd counting is a challenging task that aims to estimate the number of people present in a single image. The problem is relevant to a variety of applications, including urban planning, forensic science, and surveillance and security. In this paper, we propose and evaluate a multi-stream convolutional neural network that, in an end-to-end fashion, receives an image as input and generates a density map representing the spatial distribution of people; the number of people in the image is then estimated from the density map. Each stream of the network uses filters of a different size, and therefore a different receptive field, in order to deal with the extremely unconstrained scale and perspective changes that are complex issues in the crowd counting context. Although simple, the proposed architecture achieves effective results on two challenging datasets, UCF_CC_50 and ShanghaiTech.
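
To illustrate how a count can be read off a density map, the following is a minimal sketch rather than the network or training pipeline itself. It assumes the common convention of placing one unit-mass Gaussian at each annotated head position, so that the map sums to the number of people; the kernel size, sigma, and the function names `gaussian_kernel`, `density_map`, and `count_from_density` are hypothetical choices made for this example and are not specified in the text above.

```python
import math


def gaussian_kernel(size, sigma):
    """Return a size x size Gaussian kernel normalized to sum to 1."""
    half = size // 2
    kernel = [
        [math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]
    total = sum(map(sum, kernel))
    return [[v / total for v in row] for row in kernel]


def density_map(height, width, heads, size=5, sigma=1.0):
    """Build a density map with one unit-mass Gaussian per head.

    `heads` is a list of (row, col) positions assumed to lie inside the image.
    """
    kernel = gaussian_kernel(size, sigma)
    half = size // 2
    dmap = [[0.0] * width for _ in range(height)]
    for r, c in heads:
        # Keep only the kernel cells that fall inside the image.
        cells = []
        for dy in range(size):
            for dx in range(size):
                y, x = r + dy - half, c + dx - half
                if 0 <= y < height and 0 <= x < width:
                    cells.append((y, x, kernel[dy][dx]))
        # Renormalize so each person contributes exactly 1, even at borders.
        mass = sum(v for _, _, v in cells)
        for y, x, v in cells:
            dmap[y][x] += v / mass
    return dmap


def count_from_density(dmap):
    """Estimate the number of people as the sum over the density map."""
    return sum(map(sum, dmap))


# Hypothetical annotations: three heads in a 16 x 16 image.
heads = [(2, 3), (10, 12), (0, 0)]
dmap = density_map(16, 16, heads)
estimated = count_from_density(dmap)
print(round(estimated))
```

In this sketch the density map is built from known head positions; in the proposed approach the map would instead be the network's output, and only the final summation step applies.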