With the prevalence of CMOS cameras in computer vision applications, rolling shutter (RS) artifacts increasingly appear in captured videos. However, existing video super-resolution algorithms assume that motion is globally consistent within each video frame and that no rolling shutter effect is present. Video super-resolution for footage captured with RS cameras is challenging because the model must simultaneously learn the row-wise local pixel displacements and the global structure of the frame for both RS correction and super-resolution. Unlike existing works, we address the more realistic problem of joint rolling shutter correction and super-resolution (RS-SR). We introduce a novel architecture, the deformable Patch Attention Network (PatchNet), that utilizes the patch-recurrence property along with deformable receptive fields to learn the global and local structure of the video. Specifically, PatchNet leverages a bi-directional motion field in the feature space to extract relevant information from neighboring patches using an attention mechanism, and deformable fields, computed with deformable convolutions, to extract local pixel-level information for joint rolling shutter correction and super-resolution. Our work is the first to tackle the task of RS correction and super-resolution on the recently released BS-RSCD dataset. Experiments on the BS-RSCD and FastecRS datasets demonstrate that our model performs favorably against various state-of-the-art approaches. Project details are available at https://akashagupta.com/publication/wacv23_patchnet/project.html
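
To make the row-wise pixel displacement mentioned above concrete, the following toy sketch simulates a rolling shutter skew and then inverts it. It is not PatchNet and not the authors' code. It assumes a purely horizontal camera pan at constant speed, so that row r is displaced by r times a fixed per-row shift, and it uses wrap-around shifting for simplicity. The function names and the `shift_per_row` parameter are hypothetical.

```python
import numpy as np


def row_shifts(num_rows, shift_per_row):
    # Assumed toy model: rows are read out one after another, so under a
    # constant horizontal pan, row r is displaced by r * shift_per_row pixels.
    return np.rint(np.arange(num_rows) * shift_per_row).astype(int)


def simulate_rolling_shutter(frame, shift_per_row):
    # Apply the row-dependent horizontal shift (with wrap-around at the border).
    shifts = row_shifts(frame.shape[0], shift_per_row)
    return np.stack([np.roll(row, s) for row, s in zip(frame, shifts)])


def undo_rolling_shutter(skewed, shift_per_row):
    # Invert the skew by shifting every row back by the same amount.
    shifts = row_shifts(skewed.shape[0], shift_per_row)
    return np.stack([np.roll(row, -s) for row, s in zip(skewed, shifts)])


if __name__ == "__main__":
    # A vertical bar at column 2 becomes a slanted bar after the RS skew.
    frame = np.zeros((4, 8))
    frame[:, 2] = 1.0
    skewed = simulate_rolling_shutter(frame, shift_per_row=1.0)
    print(np.argmax(skewed, axis=1))
    restored = undo_rolling_shutter(skewed, shift_per_row=1.0)
    print(np.array_equal(restored, frame))
```

In real footage the per-row displacement is unknown and varies across the frame with scene depth and object motion, which is why a correction model has to estimate it from the video rather than apply a fixed shift as this sketch does.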