This study examines part-of-speech (POS) tagging in Malaysia's multilingual context. We evaluate three approaches within Bahasa Rojak Analytics, a Malay-based language processing system: CRF-QTAG, semi-supervised CRF models, and a rule-based system. Comparing their performance, we find that retraining has a strong effect on accuracy. CRF-QTAG and the semi-supervised CRF models improved substantially after retraining, whereas the rigidity of the rule-based system led to underperformance. The study also highlights two challenges: the linguistic nuances of code-mixed languages and the dependence on labeled data. Our findings suggest that semi-supervised models can help address data scarcity and adapt to linguistic evolution. We further recommend research on refining rule-based approaches, with emphasis on linguistic comprehension and rule definition, to improve their adaptability and accuracy. Addressing these challenges could lead to more inclusive and precise language technologies tailored to Malaysia's linguistic diversity.