Future surveys of the Universe face the dual challenge of data size and data statistics. The non-Gaussianity induced by gravity presents severe difficulties for robust data analysis, since the sampling distributions are unknown. In this landscape, machine learning and extreme data compression will play an essential role in handling the data size and complexity, in order to reach scientific goals such as identifying the driver of the accelerated expansion of the Universe and resolving the tensions between early- and late-Universe observations. This thesis consists of three main parts. As a first application, we show that we can recover robust parameter constraints by combining emulation and compression for a challenging dataset such as KiDS-450, which consists of only 24 band powers. In particular, we build a Gaussian Process (GP) emulator for two different test cases: the first at the level of the band powers, and the second for the coefficients of the summary data massively compressed with the MOPED algorithm. In the former, cosmological parameter inference is accelerated by a factor of ~10-30 compared to the Boltzmann solver CLASS, the exact factor depending on whether we include the GP uncertainty in the inference. Importantly, for future surveys the gain can be up to ~1000 when the Limber approximation is used. The GP formalism, along with the MOPED compression algorithm, also gives us the option of dropping the Limber approximation, without which each forward simulation is inconveniently slow. A comparison of the Kullback-Leibler divergence between the emulator likelihood and the CLASS likelihood, together with an uncertainty analysis of the inferred parameters, shows that including the GP uncertainty does not justify the additional computational cost. In fact, the mean predictor from the GP is faster and requires a smaller memory footprint.
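The MOPED compression mentioned above reduces the full data vector to one coefficient per parameter, using vectors built from the data covariance and the derivatives of the theory mean. As a minimal illustrative sketch of the standard Gram-Schmidt construction (not the thesis pipeline itself), assuming the covariance and gradients are given as numpy arrays:

```python
import numpy as np

def moped_vectors(cov, grads):
    """MOPED compression vectors, one per parameter, via the
    Gram-Schmidt construction of Heavens, Jimenez & Lahav (2000).

    cov   : (n, n) covariance of the n-dimensional data vector
    grads : (p, n) derivatives of the theory mean w.r.t. each parameter
    Returns b with shape (p, n), satisfying b_i^T cov b_j = delta_ij.
    """
    cinv = np.linalg.inv(cov)
    p, n = grads.shape
    b = np.zeros((p, n))
    for m in range(p):
        g = grads[m]
        num = cinv @ g
        norm = g @ cinv @ g
        # orthogonalise against the previously built vectors
        for q in range(m):
            proj = g @ b[q]
            num -= proj * b[q]
            norm -= proj ** 2
        b[m] = num / np.sqrt(norm)
    return b

def compress(b, data):
    """Reduce an n-dimensional data vector to p MOPED coefficients."""
    return b @ data
```

The key property is that the compressed coefficients have unit, uncorrelated variance, so the compressed likelihood is cheap to evaluate regardless of the size of the original data vector.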
Importantly, the number of summary statistics will be large (~10,000), and the speed of the emulated MOPED coefficients depends on the number of parameters rather than on the number of summary statistics. Hence the gain in speed is very large; in the non-Limber case, this speed-up factor can be ~100,000. While the above pipeline elegantly provides a solution for speeding up computation via the MOPED algorithm, an important ingredient we would like to propagate through the analysis is the n(z) uncertainty. Going from the n(z) distributions and the 3D matter power spectrum, P(k,z), to the weak lensing and intrinsic alignment power spectra, and then to the band powers (and, optionally, the MOPED coefficients), the most expensive step is the calculation of P(k,z). The latter can be computed either with a Boltzmann solver such as CLASS or with large-scale N-body simulations. Hence, in the second part of this thesis, we develop, document and share an emulator for P(k,z) based on a semi-parametric Gaussian Process. Our code enables the calculation of the following quantities: the non-linear matter power spectrum with or without baryon feedback, the linear matter power spectrum at a fixed redshift, and the weak lensing and intrinsic alignment power spectra. In addition, the first and second derivatives of the 3D matter power spectrum with respect to the input parameters can also be calculated. The emulator is accurate when tested on an independent set of parameters drawn from the prior, with the fractional uncertainty centered on zero. It is also ~300 times faster than CLASS, hence opening up the possibility of sampling cosmological and nuisance parameters in a Monte Carlo routine. The software (emuPK) is distributed with a set of pre-trained GP models, based on 1000 Latin Hypercube (LH) samples, which are roughly distributed according to the priors used in current weak lensing analyses.
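To give a rough picture of why the GP mean predictor is so fast once trained: the expensive solve against the kernel matrix is done once at training time, and each prediction is then a single kernel evaluation and dot product. The following is a minimal sketch with fixed hyperparameters; the class, kernel choice and values are illustrative and not emuPK's actual interface:

```python
import numpy as np

def rbf_kernel(x1, x2, amp, ls):
    """Squared-exponential kernel with per-dimension length scales."""
    d = (x1[:, None, :] - x2[None, :, :]) / ls
    return amp * np.exp(-0.5 * np.sum(d ** 2, axis=-1))

class GPEmulator:
    """Minimal GP mean predictor with fixed hyperparameters, as a
    stand-in for a trained emuPK model (no predictive variance)."""

    def __init__(self, X, y, amp=1.0, ls=None, jitter=1e-6):
        self.X = X
        self.amp = amp
        self.ls = ls if ls is not None else np.ones(X.shape[1])
        K = rbf_kernel(X, X, amp, self.ls) + jitter * np.eye(len(X))
        self.alpha = np.linalg.solve(K, y)   # cached once: K^{-1} y

    def predict(self, x_new):
        # Mean predictor k(x*, X) K^{-1} y: fast, small memory footprint
        k_star = rbf_kernel(np.atleast_2d(x_new), self.X, self.amp, self.ls)
        return k_star @ self.alpha
```

Dropping the predictive variance is precisely the trade-off discussed in the first part: the mean predictor alone is cheaper in both time and memory.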
In the third part, we use the KiDS+VIKING-450 dataset to test our emulator, emuPK. We also introduce two new components in the weak lensing likelihood analysis: a double-sum approach (essentially recasting the standard numerical integration as a double sum) for calculating the weak lensing and intrinsic alignment power spectra, and a novel Bayesian hierarchical method for estimating the n(z) distributions. Early results using the novel n(z) distributions together with the emulator indicate a promising avenue.
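The general idea of recasting a line-of-sight integral as a sum can be illustrated on the simpler Limber form of the lensing power spectrum (the thesis's double-sum approach targets the exact non-Limber integrals, so this is an analogy, not the method itself). The quadrature becomes a weighted sum over a comoving-distance grid, evaluated for all multipoles at once; the `pk_interp` callable below is hypothetical:

```python
import numpy as np

def trapezoid_weights(x):
    """Trapezoidal quadrature weights so that w @ f equals trapz(f, x)."""
    h = np.diff(x)
    w = np.zeros_like(x, dtype=float)
    w[:-1] += h / 2
    w[1:] += h / 2
    return w

def limber_cl(ells, chi, lens_weight, pk_interp):
    """Limber C_ell written as a double sum over (ell, chi) grid points.

    Recasts  C_ell = \int dchi W(chi)^2 / chi^2 P((ell+1/2)/chi, chi)
    as a weighted sum; pk_interp(k, chi) is a hypothetical callable
    returning the 3D matter power spectrum on a grid.
    """
    w = trapezoid_weights(chi) * lens_weight ** 2 / chi ** 2
    k = (ells[:, None] + 0.5) / chi[None, :]
    integrand = pk_interp(k, np.broadcast_to(chi, k.shape))
    return integrand @ w          # sum over the chi axis for every ell
```

Writing the integral this way turns the whole computation into array operations, which is what makes it fit naturally with an emulated P(k,z).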