To evaluate the ability of the SAPS 3 score to predict hospital mortality in a large external validation cohort. Prospective observational study. A total of 28,357 patients from 147 Italian ICUs participating in the Project Margherita national database of the Gruppo italiano per la Valutazione degli interventi in Terapia Intensiva (GiViTI). No interventions. Discrimination was evaluated through ROC analysis and overall goodness-of-fit through the Cox calibration test. Although discrimination was good, calibration was poor. Both the general equation and the South-Europe Mediterranean countries equation overestimated hospital mortality, overall (standardized mortality ratio [SMR] 0.73, 95% CI 0.72-0.75, for both equations) and consistently across risk classes. Overprediction was confirmed in important subgroups, with SMR values ranging between 0.47 and 0.82. The conclusion strictly supported by our data is that the SAPS 3 score calibrates inadequately in a large sample of Italian ICU patients and therefore should not be used for benchmarking, at least in Italian settings.
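To make the two reported quantities concrete, the following is a minimal sketch of how an SMR and a ROC area can be computed from per-patient outcomes and predicted death probabilities. It is not the study's analysis code. The function names `smr_with_ci` and `auc`, the toy data, and the log-normal confidence interval (which treats the expected death count as fixed) are all assumptions made for illustration; the abstract does not state how its interval was derived.

```python
import math


def smr_with_ci(outcomes, predicted, z=1.96):
    """Return (SMR, lower, upper), where SMR = observed / expected deaths.

    outcomes:  1 if the patient died in hospital, 0 otherwise
    predicted: model-predicted probability of hospital death per patient
    """
    observed = sum(outcomes)
    expected = sum(predicted)  # expected deaths = sum of predicted risks
    smr = observed / expected
    # Assumption: log-normal approximation with the expected count held fixed.
    half_width = z / math.sqrt(observed)
    return smr, smr * math.exp(-half_width), smr * math.exp(half_width)


def auc(outcomes, predicted):
    """Area under the ROC curve via the Mann-Whitney formulation.

    Equals the probability that a randomly chosen non-survivor has a higher
    predicted risk than a randomly chosen survivor (ties count as 0.5).
    """
    died = [p for y, p in zip(outcomes, predicted) if y == 1]
    survived = [p for y, p in zip(outcomes, predicted) if y == 0]
    wins = sum((d > s) + 0.5 * (d == s) for d in died for s in survived)
    return wins / (len(died) * len(survived))


# Hypothetical toy data, not taken from the study.
outcomes = [1, 0, 0, 1, 0, 0, 0, 1]
predicted = [0.9, 0.2, 0.4, 0.5, 0.1, 0.6, 0.3, 0.8]

smr, lower, upper = smr_with_ci(outcomes, predicted)
print(f"SMR = {smr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
print(f"AUC = {auc(outcomes, predicted):.2f}")
```

In this toy example the model ranks non-survivors above survivors almost perfectly, yet the SMR is below 1 because the predicted risks sum to more deaths than were observed. This is the same qualitative pattern the abstract reports: good discrimination alongside overprediction.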