Fully Homomorphic Encryption (FHE), which enables arbitrary computation directly on encrypted data, is a promising foundation for privacy-oriented applications and could pave the way for widespread adoption of cloud computing with ideal security. A key challenge for FHE is a speed- and area-optimized implementation of the Number Theoretic Transform (NTT), the most computation-intensive primitive in FHE. Most existing works concentrate on NTT implementations with small moduli and limited levels of parallelism, whereas NTT designs that cover a wider range of parameters with high scalability are not yet fully developed. This paper proposes an FPGA-based hardware accelerator for NTT with high speed and area efficiency. We first propose a novel algorithmic formulation of NTT modeled on tensor products, which provides high flexibility in parameter sets and high scalability in the number of processing elements (PEs). Different levels of parallelism are then explored to balance the trade-off between performance and area efficiency. Using stride permutations, we build a conflict-free data flow control that significantly simplifies the memory access pattern and contributes to higher NTT performance. Implemented on a Xilinx Virtex-7 platform, our RTL-based design outperforms state-of-the-art FPGA works customized for FHE by 1.21× to 2.73× in performance and 1.11× to 9.81× in area efficiency. On average, it improves the resource usage of LUTs, FFs, BRAMs, and DSPs by 2.49×, 1.25×, 2.53×, and 2.15×, respectively.
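To make the tensor-product view and the role of stride permutation concrete, the following is a minimal software sketch of the textbook Cooley-Tukey factorization NTT_{rs} = (NTT_r ⊗ I_s) · T · (I_r ⊗ NTT_s) · L, where L is a stride permutation and T is a diagonal twiddle matrix. It is not the RTL design proposed in this paper. The function names (`naive_ntt`, `stride_permute`, `ntt_tensor`) and the toy parameters (modulus q = 17, 8th root of unity w = 2) are assumptions chosen for illustration only.

```python
def naive_ntt(x, w, q):
    """Reference O(n^2) NTT: X[k] = sum_j x[j] * w^(j*k) mod q."""
    n = len(x)
    return [sum(x[j] * pow(w, j * k, q) for j in range(n)) % q
            for k in range(n)]


def stride_permute(x, r):
    """Stride permutation L: y[i*s + j] = x[j*r + i], with s = len(x) // r.

    It gathers the elements read at stride r into r contiguous blocks.
    """
    s = len(x) // r
    return [x[j * r + i] for i in range(r) for j in range(s)]


def ntt_tensor(x, w, q, r=2):
    """NTT via (NTT_r tensor I_s) * T * (I_r tensor NTT_s) * L.

    Assumes w is a primitive len(x)-th root of unity mod q and that
    len(x) is a power of r (illustrative restriction).
    """
    n = len(x)
    if n <= r:
        return naive_ntt(x, w, q)
    assert n % r == 0, "length must be divisible by the radix r"
    s = n // r

    # Step 1: stride permutation L.
    y = stride_permute(x, r)

    # Step 2: I_r tensor NTT_s, i.e. r independent size-s NTTs
    # on contiguous blocks (these could run in parallel).
    w_s = pow(w, r, q)
    blocks = [ntt_tensor(y[i * s:(i + 1) * s], w_s, q, r) for i in range(r)]

    # Step 3: diagonal twiddle matrix T, entry (i, k2) is w^(i*k2).
    z = [[blocks[i][k2] * pow(w, i * k2, q) % q for k2 in range(s)]
         for i in range(r)]

    # Step 4: NTT_r tensor I_s, i.e. s independent size-r NTTs
    # taken across the blocks.
    w_r = pow(w, s, q)
    out = [0] * n
    for k2 in range(s):
        col = naive_ntt([z[i][k2] for i in range(r)], w_r, q)
        for k1 in range(r):
            out[k1 * s + k2] = col[k1]
    return out


# Toy check: 2 has multiplicative order 8 modulo 17.
x = [1, 2, 3, 4, 5, 6, 7, 8]
assert ntt_tensor(x, 2, 17) == naive_ntt(x, 2, 17)
```

In this sketch, Steps 2 and 4 each consist of independent small transforms, which is the property a hardware design can exploit by mapping them to parallel PEs, while the stride permutation in Step 1 fixes a regular access pattern. The radix `r` is a free parameter here; how the actual accelerator chooses radices and schedules memory accesses is not specified by this example.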