Crowdsourced measurements make it possible to assess the performance of a communication network from an end-user perspective, but the characteristics of the resulting data pose new challenges for QoE modeling. In contrast to existing laboratory or network measurements, measurements taken on the end-user device consist primarily of a large number of short samples, each of which is nevertheless rich in measured parameters, including many user-, application-, and device-related parameters. To test the applicability of such data and to facilitate its integration, we applied four QoE models from the literature to 290k worldwide video streaming measurements from a commercial data set collected from August to October 2020. In this work, we first describe the crowdsourced video streaming data set to provide insights into the properties of video streaming KPIs in the real world. Second, we run four popular QoE models on this data set, compare the resulting QoE scores, and derive the impact of individual KPIs for each model. We show that the models assess the QoE differently and, in some cases, arrive at contradictory conclusions. These results indicate that more work and more subjective studies, based on real-world data such as the data set presented here, are needed to extend the current QoE models.