Kernel methods play a pivotal role in machine learning. Unfortunately, algorithms that use them must work with an n × n kernel matrix, which is memory intensive. In this paper, we present a parallel, approximate matrix factorization algorithm that loads onto each processor only the data it needs, enabling parallel processing. Our method reduces the space requirement for the kernel matrix from O(n^2) to O(np/m), where n is the number of data instances, p the reduced matrix dimension (p ≪ n), and m the number of processors.
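The abstract does not spell out the factorization itself. As a rough illustration of the general idea (not the authors' algorithm), the sketch below uses a Nyström-style rank-p approximation: instead of the full n × n kernel matrix, only an n × p factor is stored, which is the O(np) half of the O(np/m) bound; splitting the factor's rows across m processors would give each one an (n/m) × p slice.

```python
import numpy as np

def rbf(X, Y, gamma=0.1):
    """RBF kernel between the rows of X and the rows of Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
n, p = 1000, 50
X = rng.standard_normal((n, 2))

# Sample p landmark points; form only the n x p and p x p kernel blocks.
idx = rng.choice(n, size=p, replace=False)
C = rbf(X, X[idx])           # n x p
W = rbf(X[idx], X[idx])      # p x p

# Nystrom approximation K ~= C W^+ C^T, kept as a single factor
# G = C W^{-1/2}, so only O(np) values are stored instead of O(n^2).
U, s, _ = np.linalg.svd(W)   # W is symmetric PSD
keep = s > 1e-8 * s[0]       # drop near-zero modes for stability
G = C @ (U[:, keep] / np.sqrt(s[keep]))

K_exact = rbf(X, X)          # built here only to measure the error
err = np.linalg.norm(K_exact - G @ G.T) / np.linalg.norm(K_exact)
print(f"factor shape: {G.shape}, relative error: {err:.3f}")
```

The kernel function, the landmark-sampling scheme, and the parameter values here are illustrative assumptions; the paper's method and its error behavior may differ.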