This paper proposes a scalable and efficient cache update technique to improve the performance of in-memory cluster computing in Spark, a popular open-source system for big data computing. Although the memory cache speeds up data processing in Spark, its data immutability constraint requires reloading the whole resilient distributed dataset (RDD) when part of its data is updated. This constraint makes RDD updates inefficient. To address this problem, we divide an RDD into partitions and propose the partial-update RDD (PRDD) method, which enables users to replace individual partitions of an RDD. We devise two solutions to the RDD partition problem: a dynamic programming algorithm and a nonlinear programming method. Experimental results suggest that PRDD achieves a 4.32x speedup compared with the original RDD in Spark. We apply PRDD to a billing system for Chunghwa Telecom (CHT), the largest telecommunication company in Taiwan. Our results show that the PRDD-based billing system outperforms the original billing system at CHT by a factor of 24 in throughput. We also evaluate PRDD using the TPC-H benchmark, which also yields promising results.
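
The partial-update idea can be illustrated with a minimal sketch in plain Python. It is not the PRDD implementation and does not use the Spark API; the class `PartitionedCache`, its `loader` callback, and the `loads` counter are hypothetical names introduced only to show why replacing individual partitions is cheaper than reloading the whole cached dataset.

```python
from typing import Callable, Dict, List


class PartitionedCache:
    """Toy stand-in for a cached dataset whose partitions can be replaced one by one."""

    def __init__(self, num_partitions: int, loader: Callable[[int], List[int]]):
        self.num_partitions = num_partitions
        self.loader = loader  # hypothetical: reads one partition from the source
        self.loads = 0        # number of partition loads performed so far
        self.partitions: Dict[int, List[int]] = {}
        for pid in range(num_partitions):
            self._load(pid)

    def _load(self, pid: int) -> None:
        # Replace the cached copy of one partition with fresh data.
        self.partitions[pid] = self.loader(pid)
        self.loads += 1

    def full_reload(self) -> None:
        # Baseline behaviour: any change forces every partition to be reloaded.
        for pid in range(self.num_partitions):
            self._load(pid)

    def partial_update(self, changed: List[int]) -> None:
        # Partial update: reload only the partitions whose data changed.
        for pid in sorted(set(changed)):
            self._load(pid)

    def collect(self) -> List[int]:
        # Concatenate the partitions in partition-id order.
        return [x for pid in range(self.num_partitions) for x in self.partitions[pid]]


# Assumed toy source: four partitions of three integers each.
source = {pid: [pid * 10 + i for i in range(3)] for pid in range(4)}
cache = PartitionedCache(4, lambda pid: list(source[pid]))

source[2] = [999]          # only partition 2 changes at the source
cache.partial_update([2])  # one extra load instead of four
```

In this sketch a single changed partition costs one load, whereas `full_reload` costs one load per partition. How the data is split into partitions determines how many partitions an update touches, which is the question the RDD partition problem addresses.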