protonation states and docking
OK, this topic deserves a much longer and more detailed discussion. I will do that another time; for now, just a quick reaction to a question on the zinc-fans mailing list at docking.org:
> Why was there a need to create these subsets based
> on pH of ligands in ZINC database?
>
> Metalloenzymes deprotonate thiols, sulfonamides, and hydroxamic acids, for example. Thus you must create the deprotonated forms to “get the right answer for the right reasons”. See Irwin JJ, Raushel FM, Shoichet BK, “Virtual screening against metalloenzymes for inhibitors and substrates.”, Biochemistry, 2005, 44(37), 12316-28. DOI <http://dx.doi.org/10.1021/bi050801k>.
While it is definitely better to use ligands protonated at a specific pH that corresponds to the target environment, as suggested in the response quoted above, than to use neutral forms (which are what many software tools typically create and which dominate databases), I do not think this is sufficiently sophisticated. When a ligand is bound to a protein, the appropriate protonation state should be determined locally for each functional group. There are lots of examples of binding with protonation states that are “unexpected” at physiological pH. The correct choice of protonation states has to consider both the receptor and the ligand environment and all the surrounding effects (e.g. the serine protease catalytic triad Asp-His-Ser is capable of deprotonating even an alcohol). Therefore, the protonation state is not something that can be decided a priori without considering the docking pose.

One correct (but very time-consuming) solution is to enumerate all feasible protonation states and dock each of them. A more efficient correct solution is what eHiTS does: it chooses the protonation state on the fly during the docking run, considering the local environment and the score values that can be reached with different states for each functional group, so that a single run can find the correct pose with the right protonation state even if that differs from the state in the input file.
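To make the contrast between the two strategies concrete, here is a minimal Python sketch. It is not the eHiTS algorithm: the group names, the pose environment, and the score values are all made up for illustration, and the toy score is assumed to be a simple sum of independent per-group terms. Under that assumption, picking the best state for each group locally gives the same answer as enumerating every combination.

```python
from itertools import product

# Hypothetical ionizable groups of one ligand and the states each may adopt.
GROUP_STATES = {
    "thiol": ("neutral", "deprotonated"),
    "amine": ("neutral", "protonated"),
    "hydroxyl": ("neutral", "deprotonated"),
}

# Hypothetical local environment of each group in one docking pose.
ENVIRONMENT = {
    "thiol": "zinc",
    "amine": "carboxylate",
    "hydroxyl": "hydrophobic",
}

# Made-up score contributions (lower is better); unlisted pairs score 0.0.
TOY_SCORES = {
    ("deprotonated", "zinc"): -3.0,
    ("protonated", "carboxylate"): -2.0,
    ("neutral", "hydrophobic"): -0.5,
    ("deprotonated", "hydrophobic"): 1.0,
    ("protonated", "hydrophobic"): 1.0,
}


def group_score(state, neighbour):
    """Toy score of one group in one state next to one kind of neighbour."""
    return TOY_SCORES.get((state, neighbour), 0.0)


def enumerate_states(group_states):
    """Yield every combination of per-group protonation states."""
    names = sorted(group_states)
    for combo in product(*(group_states[name] for name in names)):
        yield dict(zip(names, combo))


def total_score(assignment, environment):
    """Sum the independent per-group terms for one full assignment."""
    return sum(group_score(s, environment[g]) for g, s in assignment.items())


def best_by_enumeration(group_states, environment):
    """Brute force: score every combination and keep the best one."""
    return min(enumerate_states(group_states),
               key=lambda a: total_score(a, environment))


def best_by_local_choice(group_states, environment):
    """Pick the best state for each group from its own environment only."""
    return {
        g: min(states, key=lambda s: group_score(s, environment[g]))
        for g, states in group_states.items()
    }


if __name__ == "__main__":
    print(best_by_enumeration(GROUP_STATES, ENVIRONMENT))
    print(best_by_local_choice(GROUP_STATES, ENVIRONMENT))
```

With three two-state groups the enumeration scores 8 full assignments, and that count multiplies with every additional group (and, in a real docking run, each assignment would need its own docking). The local choice evaluates only 6 group-state terms. In a real scoring function the groups are not fully independent, so this agreement should be read as a property of the toy model only.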
ZZ

May 16th, 2008 at 3:51 pm
I agree. I do hope you will have a chance to report to the community on the use of your algorithm. I would like to volunteer the DUD dataset for this purpose: http://dud.docking.org. Kinases are fruitful targets for testing multiple tautomeric states of aromatic heterocycles, and metalloenzymes are good targets for testing the effects of deprotonation in the presence of (e.g.) Zn. Both are included in ZINC.
May 17th, 2008 at 11:13 am
John,
Thanks for making these very useful databases available to the public. We have already used DUD for benchmarking our ligand-based screening tool LASSO http://www.simbiosys.ca/ehits_lasso/ ; see our publication: “LASSO - ligand activity by surface similarity order: a new tool for ligand based virtual screening”,
Journal of Computer-Aided Molecular Design, http://dx.doi.org/10.1007/s10822-007-9164-5,
published online 18 January 2008. You can access a copy here:
http://www.simbiosys.ca/ehits_lasso/lasso_paper_JCAM_2008.pdf
We have also referred to it in several conference presentations, e.g.:
http://www.simbiosys.ca/science/presentations/2007-pfizer/LASSO-2007_Nov.pdf
http://www.simbiosys.ca/science/presentations/echeminfo-2007/LASSO_Wombat_Poster.pdf
We will definitely use it to report the results of the new eHiTS version with full protonation treatment (see the next blog post for details).
ZZ.