CRYSTALS-Kyber is the first quantum-resilient, lattice-based Public Key Encryption (PKE)/Key Encapsulation Mechanism (KEM) cryptosystem selected for standardization by the ongoing National Institute of Standards and Technology post-quantum cryptography (NIST PQC) process. This work presents a lightweight and efficient FPGA-based hardware implementation of the Number Theoretic Transform (NTT) polynomial multiplication unit, which is the major bottleneck in the Kyber scheme. As a first step, we present an optimized modular multiplication architecture that combines the KRED and lookup-table-based algorithms and reduces slice utilization by 16.7%. This multiplier is used in a pipelined NTT/INTT architecture that is completely BRAM-free and instead uses three FIFOs for coefficient storage. To our knowledge, this is the most compact FPGA-based NTT architecture for Kyber to date. Experimental results benchmarked on comparable FPGA devices show that the proposed design is 36-75% better than state-of-the-art implementations in terms of hardware efficiency for NTT/INTT calculations and $3.4-4.4\times$ better for the Point-wise Multiplication (PWM) operation.
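For readers unfamiliar with the KRED reduction named above, the following is a minimal software sketch of the underlying identity, not the proposed hardware architecture and not its lookup-table component. It assumes only the Kyber modulus $q = 3329 = 13 \cdot 2^8 + 1$; the function names `kred`, `kred2`, `prescale`, and `mul_mod` are hypothetical and chosen for illustration. Because $13 \cdot 2^8 \equiv -1 \pmod q$, splitting a product into low and high parts at bit 8 gives a value congruent to $13 \cdot c$, so the extra factor has to be cancelled, here by pre-scaling one operand.

```python
Q = 3329  # Kyber modulus
K = 13    # q = K * 2**M + 1
M = 8
assert Q == K * (1 << M) + 1


def kred(c: int) -> int:
    """One K-RED step: returns a value congruent to K * c (mod Q).

    Uses c = c_high * 2**M + c_low and K * 2**M == -1 (mod Q).
    """
    c_low = c & ((1 << M) - 1)   # low M bits
    c_high = c >> M              # remaining high part (floor shift)
    return K * c_low - c_high


def kred2(c: int) -> int:
    """Two K-RED steps: congruent to K**2 * c (mod Q), with a much smaller range."""
    return kred(kred(c))


def prescale(w: int) -> int:
    """Multiply a constant (e.g. a twiddle factor) by K**-2 mod Q, done once offline."""
    k_inv = pow(K, -1, Q)        # modular inverse, Python 3.8+
    return (w * k_inv * k_inv) % Q


def mul_mod(a: int, w_scaled: int) -> int:
    """Compute a * w mod Q, where w_scaled = prescale(w).

    The final `% Q` stands in for a small correction step; it is an
    assumption of this sketch, not a description of the hardware.
    """
    return kred2(a * w_scaled) % Q
```

As a usage example, `mul_mod(a, prescale(w))` equals `(a * w) % Q` for any coefficients `a` and `w` in the range `0` to `Q - 1`. Pre-scaling is attractive when one operand is a fixed constant, as twiddle factors are in an NTT, because the cost of cancelling the $13^2$ factor is paid once rather than per multiplication.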