Archive for the 'Science' Category

eHiTS v12 is released

Monday, July 2nd, 2012

Cross posting from CCL:

Simulated Biomolecular Systems Inc. (SimBioSys) is happy to announce the release of eHiTS 12. eHiTS is a fragment-based molecular docking application that employs a statistically derived scoring function, and includes family-based enhancements for pose generation and scoring.

New improvements and features in eHiTS 12 include:

  • A new, rigorous protonation state handling mechanism that employs hydrogen bond network optimization and on-the-fly evaluation of states. Output structures now include explicit hydrogens.
  • A more accurate algorithm for classification of targets into protein families, considering both geometric and sequence criteria.
  • Family-based detection of hot-spots in binding sites, and utilization of those as attraction points during docking.
  • Docking constraints can be defined by users by fixing ligand fragments to specified locations, or by requiring specified receptor-ligand interactions to be satisfied.
  • A newly trained scoring function using an extended and highly curated knowledgebase of ligand-receptor complexes.

eHiTS’ approach to the docking problem has been unique in various aspects. It divides the ligand into fragments that are docked independently everywhere in the binding pocket, identifies compatible sets of fragment poses that reconstruct full ligand poses, and then optimizes the poses within the binding site. This algorithmic approach guarantees a comprehensive and unbiased sampling of the conformational space. An on-the-fly assessment of protonation states further alleviates potential biases, and reduces the burden of structure preparation on users. The use of protein family knowledge in eHiTS has been shown to improve pose prediction and virtual screening performance. Emphasis is given in eHiTS to ease of use, and optional automated assignment of hydrogens and charges is available.
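As a rough illustration of the "dock fragments independently, then reconnect compatible poses" idea, here is a toy Python sketch. It is not the eHiTS implementation: the data layout, the `compatible` and `assemble` helpers, and the 0.5 distance tolerance are all assumptions made for the example, and a brute-force enumeration stands in for whatever search eHiTS actually uses.

```python
from itertools import product
from math import dist

# Hypothetical toy data: one list of candidate poses per fragment.
# pose = (score, link_to_previous_fragment_xyz, link_to_next_fragment_xyz)
# Lower score is better; None marks a chain end.
fragment_poses = [
    [(-2.0, None, (1.0, 0.0, 0.0)), (-1.5, None, (5.0, 5.0, 0.0))],
    [(-1.0, (1.1, 0.0, 0.0), (2.5, 0.0, 0.0)), (-3.0, (9.0, 9.0, 9.0), (9.5, 9.0, 9.0))],
    [(-0.5, (2.4, 0.1, 0.0), None)],
]

def compatible(pose_a, pose_b, tol=0.5):
    """Neighbouring fragment poses fit together if their link points nearly coincide."""
    return dist(pose_a[2], pose_b[1]) <= tol

def assemble(fragment_poses, tol=0.5):
    """Return sorted (total_score, pose_indices) for every chain of neighbour-compatible poses."""
    results = []
    for combo in product(*(range(len(poses)) for poses in fragment_poses)):
        chain = [fragment_poses[i][j] for i, j in enumerate(combo)]
        if all(compatible(a, b, tol) for a, b in zip(chain, chain[1:])):
            results.append((sum(pose[0] for pose in chain), combo))
    return sorted(results)

full_poses = assemble(fragment_poses)
print(full_poses)
```

Only one combination survives in this toy data, because the well-scoring second pose of the middle fragment sits too far from its neighbours to be reconnected. A real docking program would follow this step with optimization of the assembled pose in the binding site.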

A major advancement in the technology in the new version is the introduction of target-sites. Prior to docking, local minima are detected by probing the binding pocket thoroughly. Those local minima, along with optional user constraints and automated constraints, are used as target-sites – hot-spots that are targeted during pose generation and are promoted by scoring. The use of target-sites gives rise to greater accuracy, and reduced dependence on initial conditions.
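The idea of detecting local minima by probing a pocket can be sketched on a small grid. The snippet below is only an illustration under simplifying assumptions: the probe energies are invented, the grid is a 2-D slice rather than a 3-D pocket, and `local_minima` is a hypothetical helper, not an eHiTS function.

```python
def local_minima(grid):
    """Return (row, col, value) for cells strictly lower than all of their neighbours."""
    rows, cols = len(grid), len(grid[0])
    minima = []
    for r in range(rows):
        for c in range(cols):
            # Collect the up-to-8 surrounding cells, clipped at the grid border.
            neighbours = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(grid[r][c] < n for n in neighbours):
                minima.append((r, c, grid[r][c]))
    return minima

# Hypothetical probe energies on a 2-D slice of a pocket (arbitrary units).
probe_energy = [
    [0.0, -0.2, 0.1, 0.3],
    [-0.1, -1.5, -0.3, 0.2],
    [0.2, -0.4, 0.0, -0.9],
    [0.4, 0.1, -0.2, -0.5],
]

target_sites = local_minima(probe_energy)
print(target_sites)
```

In this toy grid two cells qualify as minima. In the scheme described above, such points would be merged with user-defined and automated constraints before pose generation.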

Evaluation copies of eHiTS 12 can be obtained by submitting a demo request on our website:
http://www.simbiosys.com/products/demo_request.html

About SimBioSys

SimBioSys Inc. is a Toronto-based company dedicated to the development of scientific tools for drug discovery and organic synthesis planning. It retains a constant focus on the innovation of algorithms to provide improved throughput and accuracy in the fields of flexible docking and virtual screening. SimBioSys is also a pioneer in the field of computer-aided retrosynthetic analysis, where it supports chemists through the challenges of organic synthesis. With attention to detail, ease-of-use and improved productivity, SimBioSys has built a strong reputation for delivering state-of-the-art scientific solutions to biotechnology, pharmaceutical and other companies in the chemical industry.
www.simbiosys.com

For additional information please contact:
Orr Ravitz, PhD
Chief Operating Officer
(416) 741-4263
ravitz (_) simbiosys.com

ARChem is making a leap forward by including Stereochemistry in its retrosynthetic analysis engine

Thursday, May 10th, 2012

Our understanding of the role chirality plays in the activity of drugs has been steadily growing in recent decades. Although we cannot always explain mechanistically why different enantiomers can manifest strikingly diverging pharmacological behaviors, we can often measure significant differences in their binding affinity, selectivity and ADME properties. Even for drugs that are currently marketed as racemic mixtures there is often evidence that one of the enantiomers dominates the pharmacology of the drug. It is not surprising, therefore, that stereo-selective methods and chiral starting materials have become pivotal to synthesis in this domain.

Including stereochemistry into our synthesis planning tool, ARChem, has been a major undertaking at SimBioSys. The development encompassed many layers, from algorithmic perception of the full spectrum of stereogenic types, through representation of stereochemical reactions, to the proper depiction of molecules. We are very excited to release the first version of ARChem to address stereochemistry. It offers the following capabilities:

  1. Full perception of stereochemistry in the target molecule.

  2. Matching of literature precedents with proper chirality during the retrosynthetic analysis.

  3. Matching proper enantiomers from the collection of starting materials.

The synthesis for Azalanstat below demonstrates the utility of these features. The synthetic route suggested by ARChem generates one of the chiral centers of the target molecule using an enantioselective reaction step taken from a specific literature example, whereas the other chiral center is introduced using a chiral starting material.

Azalanstat

All steps in the plan are supported by literature examples, and all starting materials are found in catalogs of commercial suppliers.

While we hope you share our satisfaction with this accomplishment, our work on stereochemistry is far from complete. The next few months will be dedicated to developing the capability of generating enantioselective reaction rules. This will allow ARChem to provide the novelty and robustness it achieves in the synthesis design of achiral compounds, and will further enhance its usefulness as a synthetic idea generator.

Movie on: ARChem in a nutshell

Monday, November 28th, 2011

One of our customers recently asked us to provide him with a short presentation explaining our retrosynthetic analysis software, ARChem, so that he would be able to advertise it to potential users within his organization. Since, to paraphrase the old adage, a clip is worth a thousand slides, we opted for a 5-minute video.

It’s not easy to squeeze the essence of a product like ARChem into a short video, since it has so many facets: the search engine, the solutions display, solutions filtering, interfacing with reaction databases, not to mention all the science that is at work under the hood. So we decided to focus on the core value of ARChem: the ability to harvest knowledge from experimental data, and to convert that knowledge into ideas. In 5 minutes we show, without discussing the fascinating underlying technology, the available search strategies, solutions viewing and construction, sharing ideas with your fellow researchers, and viewing literature examples. Please see the movie at:

ARChem movie http://www.simbiosys.com/archem/video/

We hope you will find it interesting.

eHiTS Tune and Score methods are published with validation results

Wednesday, November 16th, 2011

Our article on eHiTS Tune and Score is now available online:

Improving molecular docking through eHiTS’ tunable scoring function
Journal of Computer-Aided Molecular Design
DOI: 10.1007/s10822-011-9482-5

The article contains a lot of useful information about the eHiTS Score and Tune algorithms, and validates them on a number of test sets, including CDK2 and BACE1 for pose prediction, DUD for virtual screening and PDBBind for affinity prediction. eHiTS results are compared with other programs’ published results where such data were available. For example, enrichment results on the DUD set were compared using results from Cross et al. (DOI: 10.1021/ci900056c).

eHiTS comparison on DUD

In conclusion, the article states that knowledge-based approaches are mainstream methods today, because they benefit from the ever expanding base of experimental data and from continuous progress in computational methods, and that score tuning is a natural extension of that concept. The authors also hope to encourage wider use of the score tuning methodology and the creation of test sets in the user community.

See the full article here
http://www.springerlink.com/content/r1t66167718h5110/

Zsolt Zsoldos from SimBioSys to present at the MCADD Fall 2011 Seminar on Oct 5th

Wednesday, September 14th, 2011

MCADD announcement:

=================

The Montreal Computer-Aided Drug Design (MCADD) organizing committee is announcing that the Fall 2011 seminar will feature Zsolt Zsoldos of SimBioSys Inc. (http://www.simbiosys.ca/). He will be presenting his talk entitled “Automated tuning of eHiTS scoring weights specific to protein families”. The seminar will be October 5th at 3pm in room 501 of the Goodman Cancer Center of McGill University. The seminar will be followed by a wine and cheese reception. We look forward to seeing you there and please feel free to forward this email to anyone interested in attending the seminar and/or joining the MCADD Group (students and post-docs are welcome).

Additionally, we invite you to follow and receive announcements from the MCADD community on LinkedIn. Just ask to join the Montreal Computer-Aided Drug Design group
( http://www.linkedin.com/groups?gid=2983304&trk=myg_ugrp_ovr ).

Christopher Corbeil
Chair, 4th MCADD Organizing Committee

Organizing Committee members:
Pierre Bonneau, Boehringer Ingelheim (Canada) Ltd.
Araz Jakalian, Boehringer Ingelheim (Canada) Ltd.
Enrico O. Purisima, NRC-BRI
Constatin Yannopolous, Vertex Canada

==============================================================
MCADD Seminar

Date: October 5th, 2011

Location:    Room 501 (Karp Conference Room) Goodman Cancer Center, McGill University, 1160 Pine Ave. West, Montreal, Quebec

Time:     3:00pm - Seminar: /Automated tuning of eHiTS scoring weights specific to protein families/, Zsolt Zsoldos,  SimBioSys Inc. Toronto, Canada

4:00pm - Cocktail/Wine

===============================================
Abstract

The molecular docking paradigm has thus far failed to produce a generic approach that would deliver accurate pose prediction capabilities, and reliable rank-ordering of conformations and ligands consistently for any biological system of interest. This reality, which has been addressed by numerous methodology papers and comparative studies, has been largely attributed to the inability of scoring functions to capture different chemical interaction types at a uniform level of accuracy. Several studies attempted to develop guidelines for choosing the most suitable docking and scoring method for a specific problem based on protein family classification of the target, dominant interactions, and other properties of the studied system. Consensus techniques, on the other hand, try to synergistically integrate information from multiple sources, assuming agreement between different methods is indicative of more accurate values. Both approaches, however, have shown only limited success in improving binding mode and activity prediction capabilities.

An alternative solution, and arguably a more rigorous one, would be to tailor the scoring function for the system of interest. eHiTS uses a novel scoring method consisting of a statistical knowledge base focused on interacting surface points and physical terms combined with an adaptive parameter scheme. This approach offers users the capability to fine-tune the scoring function using their data and thus incorporate their full body of knowledge in a systematic and automatic fashion. In many realistic drug discovery scenarios, structural and ligand-activity information is sufficient in a statistical sense to adjust a limited set of parameters representing the relative weights of the various terms in the eHiTS scoring function. During tuning, receptor targets are clustered according to the chemical and shape similarity of the active site, and weight sets are optimized for each family. Pharmacophore constraint descriptions are thus generated automatically from the recurring interaction patterns observed in a specific active set profile. These constraints can be used for constrained docking or pharmacophore-enhanced scoring schemes.

In this talk, an overview of the eHiTS’ tuning utility will be given, outlining the underlying methodology. Results will be presented showing the enhancements achieved by the tuning process on docking and scoring performance.

Score tuning, available in eHiTS, is gaining ground in docking

Friday, April 29th, 2011

You may have come across a recent paper by a group of researchers from UCSD, Leeds and Stony Brook that utilized eHiTS for identifying targets for drug repurposing. The paper “A Machine Learning-Based Method To Improve Docking Scoring Functions and Its Application to Drug Repurposing” (http://pubs.acs.org/doi/abs/10.1021/ci100369f) introduces an “inverse screening” scenario in which one searches for receptors that may bind a compound, in this case a known drug, and will suggest new therapeutic benefits for the molecule. The authors demonstrate, experimentally and using benchmarks, how customizing the score to the receptors of interest improves the ability to identify active compounds, and to give a rough estimate of their relative activity.

The authors chose to use eHiTS not only because of its good performance as a docking tool, but also because it facilitates the score tuning exercise. eHiTS reports to the user not only a single score value, but also the individual terms, 20 in total, that build-up this value. The study examined various ways to recombine some of the score terms such that receptor-specific properties can be reproduced with improved accuracy.
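To make the idea of recombining individual score terms concrete, here is a minimal Python sketch. Everything in it is assumed for illustration: the three term names, their values and both weight sets are invented (eHiTS reports 20 terms, and their real names are not given here), and a plain weighted sum is used only because it is the simplest recombination to show.

```python
# Hypothetical score terms for three ligands (names and values are invented).
ligand_terms = {
    "lig_a": {"hbond": -3.0, "hydrophobic": -1.0, "steric_clash": 0.5},
    "lig_b": {"hbond": -1.0, "hydrophobic": -3.5, "steric_clash": 0.2},
    "lig_c": {"hbond": -0.5, "hydrophobic": -0.5, "steric_clash": 0.0},
}

def combined_score(terms, weights):
    """Weighted sum of the individual score terms (lower is better)."""
    return sum(weights[name] * value for name, value in terms.items())

def rank(ligand_terms, weights):
    """Ligand names ordered from best (lowest) to worst combined score."""
    return sorted(ligand_terms, key=lambda lig: combined_score(ligand_terms[lig], weights))

# Assumed weight sets: a generic one and one emphasising hydrogen bonding.
default_weights = {"hbond": 1.0, "hydrophobic": 1.0, "steric_clash": 1.0}
polar_site_weights = {"hbond": 2.0, "hydrophobic": 0.5, "steric_clash": 1.0}

default_ranking = rank(ligand_terms, default_weights)
tuned_ranking = rank(ligand_terms, polar_site_weights)
print(default_ranking, tuned_ranking)
```

With the generic weights `lig_b` ranks first; with the weights that emphasise hydrogen bonding `lig_a` overtakes it. Tuning amounts to choosing such weights (or a non-linear combination) so that the ranking agrees with known data for the receptor of interest.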

An erratum for the paper was published yesterday (http://pubs.acs.org/doi/abs/10.1021/ci2001346), revising all the “native” eHiTS results, i.e. the out-of-the-box results without additional tuning. The original paper displayed results that were not in line with results obtained by us and others (see, for example: http://pubs.acs.org/doi/abs/10.1021/ci100374f), which prompted us to contact the authors. A few good words about the authors are in place. Once they heard our concerns, they acted swiftly and with full transparency to elucidate the problems, and once the source of the error was identified, they moved rapidly to publish the correction. This level of openness, integrity and cooperation should not be taken for granted, and we salute the team of researchers for their approach.

The root of the error was in the misidentification of the relevant scoring value in the eHiTS output. As stated in the erratum, eHiTS’ output includes an Energy value and a Score value. The Score value is the term that should be used in pose prediction and virtual screening scenarios. It is a scoring scheme that is trained on PDB complexes and is designed to reproduce crystallographic poses with high fidelity. In the scoring function training, many ligand poses are generated for each PDB complex, and the scoring function is optimized to generate good score-RMSD correlation. Implicitly this process involves positive and negative data – the correct poses, which are the objective and are to be promoted, vs. unrealistic poses, which are rejected and suppressed. The eHiTS-Energy, on the other hand, is a scoring scheme that is designed to rank-order known active molecules. It is trained to produce score-binding affinity correlation, and therefore it is trained on positive data only. Hence, the prospects of eHiTS-Energy differentiating between actives and inactives are slim, which is demonstrated in the ROC charts of the original paper. eHiTS-Score, as shown in the erratum, strongly outperforms the Energy in almost all cases, and generally shows good screening capabilities in most cases.
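For readers less familiar with ROC charts, the area under the ROC curve can be computed directly as the fraction of active/inactive pairs that a score orders correctly. The sketch below uses invented scores and assumes that lower scores are better; it illustrates the metric only and does not reproduce any result from the paper or the erratum.

```python
def roc_auc(active_scores, inactive_scores):
    """Fraction of (active, inactive) pairs where the active has the lower (better) score.

    Ties count as half a win. 1.0 means perfect separation, 0.5 is random.
    """
    wins = 0.0
    for active in active_scores:
        for inactive in inactive_scores:
            if active < inactive:
                wins += 1.0
            elif active == inactive:
                wins += 0.5
    return wins / (len(active_scores) * len(inactive_scores))

# Invented docking scores, lower is better.
actives = [-9.1, -8.4, -6.0]
inactives = [-7.0, -5.5, -5.0, -4.2]

auc = roc_auc(actives, inactives)
print(round(auc, 3))
```

A scoring scheme trained only on actives has no particular reason to push inactives toward worse scores, so its value on this metric can sit close to 0.5 even when it ranks the actives among themselves reasonably well.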

The main two conclusions of the paper are that (i) score tuning is a powerful approach to improve docking results, specifically in screening, and that (ii) non-linear methods for combining the scoring terms are superior to linear methods in this respect. We strongly support both observations. In fact, we have learned those lessons during the development of eHiTS’ scoring function, and therefore eHiTS adopted these principles a couple of years ago. Family-based scoring is available for many cases, and non-linear methods are central in its implementation. For dozens of protein-families, for which several complexes are available in the PDB, eHiTS provides a customized scoring which is invoked automatically by analyzing the geometry of any receptor provided by the user. When the user’s target is not matched to any family in eHiTS’ knowledge-base, a default scoring scheme is used. More about the tuning approaches in eHiTS can be found in this presentation:

http://www.simbiosys.com/science/presentations/2010-03-acs/eHiTS_239_ACS_website.pdf

We are pleased to see that score customization is increasingly recognized as a promising path for improving the molecular docking paradigm. The upcoming version of eHiTS will include even more sophisticated methods of utilizing experimental data. More about this in future posts.

Posted by Orr

SimBioSys presentations at the Spring 2011 ACS meeting

Friday, April 1st, 2011

We gave two presentations this week at the Spring 2011 ACS meeting in Anaheim,  Calif. One was about eHiTS, in the session: “Docking and Scoring: A Review of Docking Programs”, and the other was about ARChem, in the computer aided synthesis design symposium in honour of Prof. J.B Hendrickson. Both presentations are now available online for anyone to review and comment. Feel free to post your comments here on this blog post or provide feedback offline.

Orr Ravitz presented an overview of the CASD field with lessons learnt from the past and suggested ways forward. Presentation title: “Back to the future of synthesis planning: how new technology and new resources revitalize the vision of computer aided synthesis design”;
view presentation. ( http://www.simbiosys.com/science/presentations/2011-03-acs/ARChem_241_ACS_web.pdf )

Zsolt Zsoldos presented eHiTS 2009.1 results on the cleaned-up Astex set (goal: assessment of docking pose prediction accuracy) and DUD set (goal: assessment of the virtual screening power of docking). The data was curated and mandated by the symposium organisers, Dr. Greg Warren and Dr. Neysa Nevins. Lessons learnt: with cleaner, better data, one gets better docking accuracy! See the latest eHiTS results with this data set in our presentation: “Recent developments in the eHiTS ligand docking and scoring software”;
view presentation. ( http://www.simbiosys.com/science/presentations/2011-03-acs/ACS2011_eHiTS.pdf )

posted by: Aniko

Computer-aided Organic Synthesis Design session at the upcoming ACS meeting

Monday, March 21st, 2011

We, at SimBioSys, are honoured to present at the upcoming Spring 2011 ACS meeting in a full day session devoted to computer-aided organic synthesis design. The session, titled “50 Years of Computers in Organic Chemistry: Symposium in Honor of James B. Hendrickson”, will be held this Sunday, Mar 27. Please visit this session (Location: Anaheim Convention Center, Room 213C) and our talk (at 10:40 am). It should be a great and rare event.

Dr. Rachelle Bienstock,  ACS CINF Chair, Program Committee wrote
( at http://www.acscinf.org/meetings/241/highlights241.php ):

Dr. Martin Walker has organized a very special symposium for this Spring Anaheim meeting in honor of his mentor, Dr. James Hendrickson, “Fifty Years of Computers in Organic Chemistry: A Symposium in Honor of James B. Hendrickson”. Dr. Hendrickson, Professor Emeritus of Chemistry, Brandeis University, was a pioneer in the field of computer-aided organic synthesis design and was one of the early visionaries in this field. The designer of the programs SYNGEN and WebReactions, much current work in the field is built on the early work of his research group. Many of the successful students who trained with Dr. Hendrickson over his long career, or those whose work was built on ideas and concepts originating from Dr. Hendrickson’s work, will be speaking in this symposium including Dr. Paul A. Wender, Bergstrom Professor in Chemistry, Stanford University; Dr. Phil S. Baran, Professor, Scripps Research Institute; Dr. Valentina Eigner-Pitto, InfoChem GmbH and Dr. Orr Ravitz, SimBioSys Inc.

To see more on SimBioSys’ upcoming ACS presentation in this session visit:
http://www.simbiosys.com/science/presentations/2011-03-acs/abstract1.htm

ACS’ most read articles & eHiTS news

Tuesday, March 1st, 2011

We have just discovered some great eHiTS results in the e-mail notification from ACS publishing, titled: “Most Read Articles in January from the Journal of Chemical Information and Modeling”. In the list of papers that received the highest exposure in January is:

Virtual Decoy Sets for Molecular Docking Benchmarks
Izhar Wallach and Ryan Lilien
DOI:  http://c.acs.org/ceyra/321469/70/350639/6508/0/S/0/0/tvma.html

The paper demonstrates how the composition of a screening benchmark (the DUD set in this case), and particularly the physical properties of the decoy set, affects the enrichment outcome. This is a valuable insight for us as software developers, and an important one for computational chemists at large. The paper presents the screening performance of eHiTS, as well as that of Glide, on the DUD targets, and although the paper does not attempt to compare the programs, they seem to be on a par with each other in retrieving active molecules in virtual screening.

Another publication with eHiTS, presenting a successful virtual screening workflow for Human IKK-2 Inhibitors was also brought to my attention in the last few days:
February 24, 2011:  PLoS ONE 6(2): e16903. “Identification of Human IKK-2 Inhibitors of Natural Origin (Part I): Modeling of the IKK-2 Kinase Domain, Virtual Screening and Activity Assays”
Sala E, Guasch L, Iwaszkiewicz J, Mulero M, Salvado M-J, et al. (2011)
DOI:10.1371/journal.pone.0016903
http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0016903

Both of these were added to our list of user papers with SimBioSys tools:
http://www.simbiosys.com/science/presentations/2011-03-acs/index.html 
We are looking forward to your paper with eHiTS results, please let us know about it, as soon as published, so that your work is added to our ever growing list of user publications.

To learn more about eHiTS, please contact us by e-mail or join our docking talk at the upcoming ACS 2011 Spring meeting, in Anaheim, California:

COMP: 59
(http://abstracts.acs.org/chem/241nm/program/view.php?obj_id=63915&terms=)
“Recent developments in the eHiTS ligand docking and scoring software”
Zsolt Zsoldos, Orr Ravitz,  SimBioSys Inc., Toronto, Ontario, Canada
Session: Docking and Scoring: A Review of Docking Programs
Time: Monday, Mar 28, 2011 10:20 AM
Location: Anaheim Convention Center, Room 213 B

see more details at:
http://www.simbiosys.com/science/presentations/2011-03-acs/index.html

Can we trust docking results?

Friday, September 3rd, 2010

The question is asked and answered by a group of researchers from the University of Warsaw in a recently published paper (http://onlinelibrary.wiley.com/doi/10.1002/jcc.21643/abstract). They performed a comparison of 7 docking and scoring programs to evaluate pose prediction and score accuracy on a large set of 1300 PDB complexes. They performed a fairly thorough study asking some important questions, such as how the starting ligand conformations influence the results and how the results differ for small or large ligands, mostly hydrophobic or mostly polar interaction. The good news they report is that, statistically, overall results do not seem to be influenced by the starting conformations, although there is a slight advantage in some programs for the X-ray conformation, which is understandable. The bad news is that ligand size does matter: while we are very successful with small, fairly rigid molecules, large floppy ones still prove to be hard to handle for all programs. The really ugly news is that none of the scoring functions provided adequate correlation with binding energy.

The results are divided into 3 major sections: pose prediction accuracy, score correlation with experimental binding energy and score-rmsd correlation (ranking performance of the scoring functions). The authors’ conclusion of the pose prediction exercise can be summarized by the following quote:

“On the basis of those results, we can order programs in the following way: GOLD ~ eHiTS > Surflex > Glide > LigandFit > FlexX > AutoDock. The best programs have the average RMSD top score around 2.7 A, and it increases to nearly 4.5 A for the weakest FlexX. As expected, better results were observed for best pose conformations (Fig. 4). For those poses, the mean RMSD value was even below 2 A for GOLD, eHiTS, and Surflex. … Moreover, the percentage of pairs for which top score conformation is below 2 A shows that even for the best programs the success rate is below 60%, and in some cases even below 40%.”
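The two quantities in the quote, the RMSD of a docked pose from the crystal pose and the share of complexes whose top-scored pose falls below 2 Å, can be computed as in the following sketch. The coordinates are invented, and the `rmsd` helper assumes identical atom ordering in both structures with no symmetry correction, which real evaluations usually have to handle.

```python
from math import sqrt

def rmsd(pose, reference):
    """Root-mean-square deviation between two equally ordered lists of (x, y, z) atoms."""
    if len(pose) != len(reference):
        raise ValueError("pose and reference need the same number of atoms")
    squared = sum((p - r) ** 2 for a, b in zip(pose, reference) for p, r in zip(a, b))
    return sqrt(squared / len(pose))

def success_rate(rmsds, cutoff=2.0):
    """Fraction of complexes whose pose RMSD is below the cutoff (in Angstrom)."""
    return sum(1 for value in rmsds if value < cutoff) / len(rmsds)

# Invented crystal pose and two invented top-scored poses.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
top_poses = {
    "complex_1": [(0.1, 0.0, 0.0), (1.6, 0.0, 0.0), (1.6, 1.5, 0.0)],  # shifted by 0.1 A
    "complex_2": [(3.0, 0.0, 0.0), (4.5, 0.0, 0.0), (4.5, 1.5, 0.0)],  # shifted by 3.0 A
}

rmsds = {name: rmsd(pose, reference) for name, pose in top_poses.items()}
rate = success_rate(list(rmsds.values()))
print(rmsds, rate)
```

Here one of the two toy poses is within 2 Å of the reference, giving a success rate of 0.5.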

Based on the score-energy correlation performance, the authors divided the programs into three categories. The best one is “composed of functions implemented in eHiTS and in Surflex, which gave Pearson correlation 0.38 and 0.33, respectively. Moreover, for eHiTS scoring function very high-Spearman correlation was obtained…” The Pearson correlations for the middle and worst categories are in the range of 0.17-0.25 and less than 0.1, respectively. The authors rightly conclude that the score-energy correlation results are inadequate even “for the best program, namely, eHiTS”.
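Pearson and Spearman correlations measure different things: Pearson rewards a linear relationship, while Spearman only asks whether the rank order agrees. The self-contained sketch below shows this on invented numbers; it does not use any data from the paper.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Invented data: the second list rises monotonically but not linearly with the first.
scores = [1.0, 2.0, 3.0, 4.0, 5.0]
affinities = [1.0, 4.0, 9.0, 16.0, 100.0]

r_pearson = pearson(scores, affinities)
r_spearman = spearman(scores, affinities)
print(round(r_pearson, 3), round(r_spearman, 3))
```

Because the invented affinities preserve the order of the scores exactly, the Spearman value is 1 while the Pearson value is noticeably lower. A scoring function can therefore rank compounds usefully even when its linear correlation with binding energy is modest.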

Finally, in the ranking performance comparison (correlation of score with quality of poses), AutoDock achieved the highest correlation, 0.32, with eHiTS a close second at ~0.3. So, what is the authors’ final conclusion regarding the question in the title? Here is the quote with the answer:

“Thus, can we trust docking programs? The answer must be given individually for two aspects of docking programs. In terms of pose prediction, we can say that GOLD and eHiTS performance is accurate enough … In the case of scoring functions, the answer must be negative, as virtually no correlations could be observed between docking score and in vitro binding affinities … the empirically derived functions have now reached the saturation of year-to-year improvement … The future direction should be either to use statistical approach based on increasing number of X-ray protein-ligand complexes, as can be determined from the results obtained by eHiTS scoring functions, or to develop completely new approaches in terms of predicting in vivo activity of the ligand.”

I am very happy to see that eHiTS came up among the top two contenders for all three aspects of the comparison (while the other top performer was a different program in each of the three aspects). On the other hand, I agree with the authors that there is still a lot of room and need for significant improvements both in terms of pose prediction (~60% success rate) and score accuracy (~0.4 correlation). Furthermore, we definitely need thorough, large-scale performance comparisons such as this one in the future to continuously assess the state of the art until some programs (hopefully with eHiTS remaining in the lead) reach adequate performance.


Posted by Zsolt