This study was designed to evaluate the susceptibility of various performance validity tests (PVTs) to limited English proficiency (LEP). A battery of free-standing and embedded PVTs was administered to 95 undergraduate students at a Romanian university, who were randomly assigned to the control (n = 65) or the experimental malingering group (n = 30). Overall correct classification (OCC) at the first cutoff to clear .90 specificity (with group membership as criterion) was used as the main metric to compare PVTs. Mean OCC for free-standing PVTs (.784) was comparable to mean OCC for embedded PVTs (.780). Cutoffs on embedded PVTs often had to be made more conservative to meet the specificity standard. Contrary to our predictions, embedded PVTs with high verbal mediation outperformed those with low verbal mediation (mean OCC .807 versus .719). Although multivariate models of PVTs performed very well (mean OCC = .892), several individual free-standing and embedded PVTs produced comparable mean OCC (.863-.895). Other embedded PVTs had trivial sensitivity (.03-.13) at ≥ .90 specificity. PVTs administered in both languages (English and Romanian) provided conclusive evidence of both the deleterious effects of LEP and the cross-cultural validity of existing methods of performance validity testing. Results defied most of our a priori predictions: level of verbal mediation was an influential but not decisive factor in the classification accuracy of PVTs; free-standing PVTs were not necessarily superior to embedded PVTs; multivariate models of performance validity assessment outperformed most, but not all, of their individual components. Our findings suggest that some PVTs may be inherently unfit to be used with examinees with LEP. The multiple unexpected findings signal a fundamental uncertainty about the psychometric properties of instruments developed and validated in North America when applied to examinees outside the US or Canada.
Although several existing PVTs have the potential to be useful with examinees with LEP, their relevant psychometric properties should be independently verified in new target populations to ensure the validity of their clinical interpretation. The classification accuracy observed in native speakers of English cannot be assumed to transfer to members of linguistically and culturally different communities; doing so risks potentially consequential errors in performance validity assessment. Of course, the abundance of counterintuitive findings also serves as a note of caution: our findings may not generalize to different samples.
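
For readers who wish to reproduce the main metric, the following is a minimal sketch of how OCC at the first cutoff to clear .90 specificity might be computed for a single PVT, with group membership as the criterion. It is an illustration under stated assumptions, not the analysis code used in the study: the function name `first_cutoff_at_specificity` is hypothetical, and the sketch assumes that lower scores indicate less credible performance, so that a score at or below the cutoff counts as a failure.

```python
from typing import Optional, Sequence, Tuple


def first_cutoff_at_specificity(
    control: Sequence[float],
    malingering: Sequence[float],
    min_specificity: float = 0.90,
) -> Optional[Tuple[float, float, float, float]]:
    """Return (cutoff, sensitivity, specificity, occ) for the most liberal
    cutoff whose specificity is at least `min_specificity`.

    Assumption (not from the source): a score <= cutoff is a failed PVT.
    Returns None if no observed score meets the specificity standard.
    """
    # Candidate cutoffs are the observed scores, scanned from the most
    # liberal (highest) to the most conservative (lowest).
    candidates = sorted(set(control) | set(malingering), reverse=True)
    n_total = len(control) + len(malingering)

    for cutoff in candidates:
        # Controls who pass are true negatives; malingerers who fail
        # are true positives.
        true_neg = sum(score > cutoff for score in control)
        true_pos = sum(score <= cutoff for score in malingering)

        specificity = true_neg / len(control)
        if specificity >= min_specificity:
            sensitivity = true_pos / len(malingering)
            occ = (true_pos + true_neg) / n_total
            return cutoff, sensitivity, specificity, occ

    return None
```

Because specificity can only rise as the cutoff becomes more conservative, the first cutoff that clears the standard is also the one that preserves the most sensitivity. For PVTs on which higher scores indicate invalid performance, the comparisons and the scan direction would need to be reversed.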