Accelerating Queries of Big Data Systems by Storage-Side CPU-FPGA Co-Design.

Resource Type: Article
Authors: Zhan, Jinyu; Jiang, Wei; Li, Ying; Wu, Junting; Zhu, Jianping; Yu, Jinghuan
Source: IEEE Transactions on Computer-Aided Design of Integrated Circuits & Systems. Jul 2022, Vol. 41 Issue 7, p2128-2141. 14p.
Subject: *BIG data
*PARTICIPATORY design
*ELECTRONIC data processing
*FIELD programmable gate arrays
*GATE array circuits
*SQL
Language
ISSN: 0278-0070

Abstract

As a promising technology for big data systems, the storage and computing separated architecture has attracted increasing attention from well-known companies, such as Tencent, IBM, Facebook, and Microsoft. Under this new architecture, conventional query engines like Hive and Presto choose all the original data from storage nodes and send them to computing nodes to be filtered, causing high data transmission overhead and great I/O bandwidth fluctuation. To address this problem, we design a novel data processing framework to prefilter data on the storage side, and then propose a CPU-FPGA (field-programmable gate array) co-design to accelerate the queries with the purpose of reducing the communication overheads and the workloads of computing nodes. To obtain the optimal efficiency of CPU-FPGA co-processing, a workload-aware task scheduler is designed to allocate query tasks to CPU or FPGA according to the estimation of the filtering data size and processing time of query tasks. A data projection scheme is designed to support data in RCFile format, which is widely used in modern systems, such as Tencent and Facebook applications. To make full use of the high parallelism of FPGA, we formulate the SQL conditions of combined predicates into Boolean parameters, and design two filtering schemes on FPGA (i.e., a parallel sequential filter for fixed-length data types and a parallel pipeline filter for variable-length data types). Experiments on the TPC-H benchmark and a Tencent data set demonstrate the efficiency of our approach, which can save up to 72.28% and 80.16% of time overheads compared with Presto and Hive, respectively. [ABSTRACT FROM AUTHOR]
