Recent advances in model pruning have enabled sparsity-aware deep neural network accelerators that improve the energy efficiency and performance of inference tasks. We introduce SONA, a novel transform-domain neural network accelerator in which convolution operations are replaced by element-wise multiplications with sparse-orthogonal weights. SONA employs an output-stationary dataflow coupled with an energy-efficient memory organization to reduce the overhead of sparse-orthogonal transform-domain kernels, which are processed concurrently without conflicts. Weights in SONA are non-uniformly quantized with bit-sparse canonical-signed-digit (CSD) representations to reduce multiplications to simple additions. Moreover, for sparse fully-connected layers (FCLs), SONA introduces column-based-block structured pruning, which is integrated into the same architecture while maintaining full multiply-and-accumulate (MAC) array utilization. Compared to prior dense and sparse neural network accelerators, SONA reduces inference energy by $5.1\times$ and $2.4\times$ and increases performance by $5.2\times$ and $2.1\times$, respectively, for convolution layers. For sparse FCLs, SONA reduces inference energy by $2.4\times$ and increases performance by $2\times$ compared to prior work.
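To make the CSD claim concrete: canonical-signed-digit form encodes an integer with digits $\{-1, 0, +1\}$ such that no two adjacent digits are nonzero, so multiplying an activation by a CSD-coded weight reduces to a few shifts and additions/subtractions, one per nonzero digit. The sketch below is an illustration of the standard CSD (non-adjacent form) recoding, not the paper's hardware implementation; the function names `csd_encode` and `csd_mul` are illustrative.

```python
def csd_encode(n: int) -> list[int]:
    """Recode integer n into CSD digits {-1, 0, +1}, least-significant first.

    The standard non-adjacent-form recurrence: when the current bit is set,
    emit +1 if n = 1 (mod 4) and -1 if n = 3 (mod 4), which guarantees
    no two adjacent nonzero digits.
    """
    digits = []
    while n != 0:
        if n % 2:
            d = 2 - (n % 4)  # +1 or -1
            n -= d           # n becomes even
        else:
            d = 0
        digits.append(d)
        n //= 2
    return digits


def csd_mul(x: int, w: int) -> int:
    """Multiply x by weight w using only shifts and adds/subtracts,
    one add per nonzero CSD digit of w."""
    acc = 0
    for i, d in enumerate(csd_encode(w)):
        if d:
            acc += d * (x << i)  # in hardware: a shifted add or subtract
    return acc
```

For example, $w = 7$ recodes to $8 - 1$ (digits $[-1, 0, 0, 1]$), so the multiply costs one subtraction instead of three additions under a plain binary expansion; bit-sparse quantization in this spirit bounds the number of nonzero digits per weight.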