As a promising technology for big data systems, the architecture that separates storage from computing has attracted increasing attention from major companies such as Tencent, IBM, Facebook, and Microsoft. Under this architecture, conventional query engines such as Hive and Presto read all the original data from the storage nodes and send it to the computing nodes to be filtered, causing high data transmission overhead and large fluctuations in I/O bandwidth. To address this problem, we design a novel data processing framework that prefilters data on the storage side, and we propose a CPU-FPGA (field-programmable gate array) co-design that accelerates queries, with the aim of reducing both the communication overhead and the workload of the computing nodes. To maximize the efficiency of CPU-FPGA co-processing, a workload-aware task scheduler allocates each query task to the CPU or the FPGA according to the estimated filtering data size and processing time of the task. A data projection scheme is designed to support data in the RCFile format, which is widely used in modern systems such as Tencent and Facebook applications. To make full use of the high parallelism of the FPGA, we formulate the SQL conditions of combined predicates as Boolean parameters and design two filtering schemes on the FPGA: a parallel sequential filter for fixed-length data types and a parallel pipeline filter for variable-length data types. Experiments on the TPC-H benchmark and a Tencent data set demonstrate the efficiency of our approach, which reduces time overhead by up to 72.28% and 80.16% compared with Presto and Hive, respectively.
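The idea of a workload-aware scheduler can be illustrated with a minimal sketch. It is not the scheduler described in the paper: the cost model (a fixed per-task setup cost plus data size divided by throughput), the greedy earliest-finish rule, and every name and number below (`Device`, `schedule`, `throughput_mb_s`, `setup_s`) are assumptions chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical processing unit on a storage node (CPU or FPGA)."""
    name: str
    throughput_mb_s: float      # assumed sustained filtering throughput
    setup_s: float              # assumed fixed per-task cost (e.g. offload transfer)
    busy_until_s: float = 0.0   # time at which already-assigned work finishes

    def estimate_finish(self, size_mb: float, now_s: float) -> float:
        # A task starts once the device is free, pays the setup cost,
        # then filters its data at the device's throughput.
        start = max(now_s, self.busy_until_s)
        return start + self.setup_s + size_mb / self.throughput_mb_s


def schedule(task_sizes_mb, devices, now_s=0.0):
    """Greedily give each task to the device with the earliest estimated finish."""
    plan = []
    for size_mb in task_sizes_mb:
        best = min(devices, key=lambda d: d.estimate_finish(size_mb, now_s))
        best.busy_until_s = best.estimate_finish(size_mb, now_s)
        plan.append((size_mb, best.name))
    return plan
```

Under these assumptions, small tasks tend to stay on the CPU because the FPGA's fixed setup cost dominates, while large scans go to the FPGA, where higher throughput outweighs that cost. A real scheduler would replace the constants with measured estimates.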