Deep neural networks (DNNs) have shown outstanding expressive and learning ability in machine learning tasks. The stochastic gradient descent (SGD) algorithm is widely used to train machine learning models because it is simple to use, converges quickly, and performs reliably. In the era of big data, asynchronous parallel implementations of SGD have been widely adopted to accelerate the training of machine learning models in distributed computing environments. However, training DNNs involves a vast number of time-consuming computations. The emerging Sunway (SW) many-core processor offers powerful computing capacity and high memory bandwidth, which makes it particularly suitable for parallel tasks with large computational overhead. In this paper, we present a parameter-update scheme that efficiently scales the asynchronous stochastic gradient descent (ASGD) method to multiple processors. Our method is built on swCaffe, an efficient parallel deep learning framework for the Sunway TaihuLight supercomputer. We evaluate our implementation by training the LeNet network on the MNIST dataset and the CIFAR10-quick network on the CIFAR-10 dataset. Experimental results show that our implementation achieves considerable speedup on the TaihuLight supercomputer compared with the synchronous stochastic gradient descent (SSGD) used in swCaffe.