We describe the design and performance of the GRAPE-MPs, a series of SIMD accelerator boards for quadruple-, hexuple-, and octuple-precision arithmetic. A GRAPE-MPs processor consists of a number of processing elements (PEs) and memory components that handle data in quadruple, hexuple, or octuple precision, and is implemented on either a structured ASIC chip or an FPGA chip. GRAPE-MP (quadruple precision) uses a structured ASIC chip from eASIC Corp., which has 6 PEs and operates at a clock frequency of 100 MHz. The theoretical peak quadruple-precision performance of a single board is 1.2 Gflops, and the achieved performance for the Feynman loop integrals is about 0.5 Gflops. GRAPE-MP4/6/8 (quadruple/hexuple/octuple precision) use an FPGA chip from Altera Corporation. For example, in the current implementation, MP8 has 10 PEs and operates at a clock frequency of 70 MHz. We also present performance results for systems with multiple GRAPE-MPs boards. The achieved performance of four MP8 boards is about 1.6 Gflops, roughly 90 times faster than a single CPU core computing at comparable precision. We show that our hardware-based approach to evaluating the Feynman loop integrals in high-precision arithmetic is highly effective.
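The peak figure quoted for GRAPE-MP can be reproduced with a short calculation. The sketch below is illustrative only: the function name `peak_gflops` and the value of 2 floating-point operations per PE per clock cycle are assumptions, chosen because they match the stated 6 PEs, 100 MHz, and 1.2 Gflops. The abstract does not give a peak figure for MP8, so the MP8 value computed here holds only under that same assumption.

```python
def peak_gflops(num_pe, clock_mhz, flops_per_cycle=2):
    """Theoretical peak in Gflops for one board.

    flops_per_cycle=2 is an assumption (one multiply and one add per PE
    per cycle); it is not stated in the text but reproduces 1.2 Gflops.
    """
    return num_pe * clock_mhz * flops_per_cycle / 1000


# GRAPE-MP: 6 PEs at 100 MHz (figures from the text).
grape_mp_peak = peak_gflops(6, 100)

# Fraction of peak reached on the Feynman loop integrals (about 0.5 Gflops).
grape_mp_efficiency = 0.5 / grape_mp_peak

# MP8: 10 PEs at 70 MHz; the peak is an estimate under the same assumption.
mp8_peak_estimate = peak_gflops(10, 70)

print(f"GRAPE-MP peak: {grape_mp_peak:.1f} Gflops")
print(f"GRAPE-MP efficiency: {grape_mp_efficiency:.0%}")
print(f"MP8 estimated peak per board: {mp8_peak_estimate:.1f} Gflops")
```

Under this assumption, the achieved 0.5 Gflops corresponds to a little over 40% of the single-board peak.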