Over the last decade, demand for artificial intelligence has surged, driven by advances in machine and deep learning techniques together with the exploitation of hardware acceleration. However, achieving higher prediction accuracy and extending machine learning to complex tasks requires a substantial volume of training data. Smaller machine learning models can be trained effectively on limited data, but for larger models such as deep neural networks, the demand for training data grows rapidly with the number of parameters. As the volume of training data to be processed has outpaced the growth of computing power in individual machines, there is a pressing need to distribute the deep learning workload across multiple machines, transforming centralized systems into distributed ones. These distributed systems introduce new challenges, chief among them the efficient parallelization of the training process and the maintenance of a consistent model across machines. This paper provides an extensive overview of the current state of the art: it delineates the challenges and opportunities of distributed deep learning in contrast with the conventional centralized approach, examines the methodologies employed in distributed deep learning, and surveys the systems currently operating in this domain.