The main utility of QSAR models is their ability to predict activities or properties of new chemicals, and this external predictive ability is evaluated by means of various validation criteria. As a measure for such evaluation, the OECD guidelines have proposed the predictive squared correlation coefficient Q²F1 (Shi et al.). However, other validation criteria have been proposed by other authors: the Golbraikh-Tropsha method, r²m (Roy), Q²F2 (Schüürmann et al.), and Q²F3 (Consonni et al.). In QSAR studies these measures usually agree, but not always, so doubts can arise when they give contradictory results. It is likely that none of the aforementioned criteria is the best in every situation, so a comparative study using simulated data sets is proposed here, applying the threshold values suggested by the proponents or those widely used in QSAR modeling. In addition, a different and simple external validation measure, the concordance correlation coefficient (CCC), is proposed and compared with the other criteria. Very large data sets were used to study the general behavior of the validation measures, and the concordance correlation coefficient was shown to be the most restrictive. On simulated data sets of a more realistic size, CCC agreed with the other validation measures in accepting models as predictive about 96% of the time, and in almost all the examples it was the most precautionary. The proposed concordance correlation coefficient also works well on real data sets, where it seems to be more stable and helps in making decisions when the validation measures are in conflict. Since it is conceptually simple, and given its stability and restrictiveness, we propose the concordance correlation coefficient as a complementary, or alternative, more prudent measure of the external predictivity of a QSAR model.
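
To make the comparison concrete, the following is a minimal sketch of how CCC and Q²F1 could be computed on an external prediction set. It assumes the usual definition of the concordance correlation coefficient and the usual form of Q²F1, in which the denominator uses the training-set mean. The function names `ccc` and `q2_f1` and the toy numbers are illustrative assumptions, not taken from the study.

```python
from statistics import fmean


def ccc(observed, predicted):
    """Concordance correlation coefficient between observed and predicted values.

    Assumed definition:
    2*S_xy / (S_xx + S_yy + n*(mean_x - mean_y)**2)
    """
    n = len(observed)
    mean_obs = fmean(observed)
    mean_pred = fmean(predicted)
    s_xy = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    s_xx = sum((o - mean_obs) ** 2 for o in observed)
    s_yy = sum((p - mean_pred) ** 2 for p in predicted)
    return 2 * s_xy / (s_xx + s_yy + n * (mean_obs - mean_pred) ** 2)


def q2_f1(observed, predicted, train_mean):
    """Q2_F1 on an external set: 1 - PRESS / SS about the training-set mean."""
    press = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss = sum((o - train_mean) ** 2 for o in observed)
    return 1 - press / ss


# Toy external set (made-up numbers): predictions are perfectly correlated
# with the observations but shifted upward by one unit.
observed = [1.0, 2.0, 3.0]
shifted = [2.0, 3.0, 4.0]

print(ccc(observed, observed))        # perfect agreement
print(ccc(observed, shifted))         # penalized for the systematic shift
print(q2_f1(observed, shifted, 2.0))  # training mean of 2.0 is an assumption
```

In this toy case the shifted predictions have a perfect linear correlation with the observations, yet CCC falls well below 1 because it also accounts for the distance of the points from the line of perfect agreement.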
