Archive for the 'Science' Category

Movie on: ARChem in a nutshell

Monday, November 28th, 2011

One of our customers recently asked us for a short presentation explaining our retrosynthetic analysis software, ARChem, so that he could promote it to potential users within his organization. Since, to paraphrase the old adage, a clip is worth a thousand slides, we opted for a 5-minute video.

It’s not easy to squeeze the essence of a product like ARChem into a short video, since it has so many facets: the search engine, the solutions display, solutions filtering, and interfacing with reaction databases, not to mention all the science at work under the hood. So we decided to focus on the core value of ARChem: the ability to harvest knowledge from experimental data and to convert that knowledge into ideas. In 5 minutes we show, without discussing the fascinating underlying technology, the available search strategies, solutions viewing and construction, sharing ideas with your fellow researchers, and viewing literature examples. Please see the movie at:

ARChem movie http://www.simbiosys.com/archem/video/

We hope you will find it interesting.

eHiTS Tune and Score methods are published with validation results

Wednesday, November 16th, 2011

Our article on eHiTS Tune and Score is now available online:

Improving molecular docking through eHiTS’ tunable scoring function
Journal of Computer-Aided Molecular Design
DOI: 10.1007/s10822-011-9482-5

The article contains lots of useful information about the eHiTS Score and Tune algorithms and validates them on a number of test sets, including CDK2 and BACE1 for pose prediction, DUD for virtual screening and PDBBind for affinity prediction. eHiTS results are compared with other programs’ published results where such data were available. For example, enrichment results on the DUD set were compared using results from Cross et al. (DOI: 10.1021/ci900056c).

eHiTS comparison on DUD

In conclusion, the article states that knowledge-based approaches are mainstream methods today, because they benefit from the ever-expanding base of experimental data and from continuous progress in computational methods, and that score tuning is a natural extension of that concept. The authors also hope to encourage wider use of the score tuning methodology and the creation of test sets in the user community.

See the full article here
http://www.springerlink.com/content/r1t66167718h5110/

Zsolt Zsoldos from SimBioSys to present at the MCADD Fall 2011 Seminar on Oct 5th

Wednesday, September 14th, 2011

MCADD announcement:

=================

The Montreal Computer-Aided Drug Design (MCADD) organizing committee is announcing that the Fall 2011 seminar will feature Zsolt Zsoldos of SimBioSys Inc. (http://www.simbiosys.ca/). He will be presenting his talk entitled “Automated tuning of eHiTS scoring weights specific to protein families”. The seminar will be held on October 5th at 3pm in room 501 of the Goodman Cancer Center of McGill University. The seminar will be followed by a wine and cheese reception. We look forward to seeing you there, and please feel free to forward this email to anyone interested in attending the seminar and/or joining the MCADD Group (students and post-docs are welcome).

Additionally, we invite you to follow and receive announcements from the MCADD community on LinkedIn. Just ask to join the Montreal Computer-Aided Drug Design group
( http://www.linkedin.com/groups?gid=2983304&trk=myg_ugrp_ovr ).

Christopher Corbeil
Chair, 4th MCADD Organizing Committee

Organizing Committee members:
Pierre Bonneau, Boehringer Ingelheim (Canada) Ltd.
Araz Jakalian, Boehringer Ingelheim (Canada) Ltd.
Enrico O. Purisima, NRC-BRI
Constatin Yannopolous, Vertex Canada

==============================================================
MCADD Seminar

Date: October 5th, 2011

Location:    Room 501 (Karp Conference Room) Goodman Cancer Center, McGill University, 1160 Pine Ave. West, Montreal, Quebec

Time:     3:00pm - Seminar: /Automated tuning of eHiTS scoring weights specific to protein families/, Zsolt Zsoldos,  SimBioSys Inc. Toronto, Canada

4:00pm - Cocktail/Wine

===============================================
Abstract

The molecular docking paradigm has thus far failed to produce a generic approach that would deliver accurate pose prediction capabilities, and reliable rank-ordering of conformations and ligands consistently for any biological system of interest. This reality, which has been addressed by numerous methodology papers and comparative studies, has been largely attributed to the inability of scoring functions to capture different chemical interaction types at a uniform level of accuracy. Several studies attempted to develop guidelines for choosing the most suitable docking and scoring method for a specific problem based on protein family classification of the target, dominant interactions, and other properties of the studied system. Consensus techniques, on the other hand, try to synergistically integrate information from multiple sources, assuming agreement between different methods is indicative of more accurate values. Both approaches, however, have shown only limited success in improving binding mode and activity prediction capabilities.

An alternative solution, and arguably a more rigorous one, would be to tailor the scoring function for the system of interest. eHiTS uses a novel scoring method consisting of a statistical knowledge base focused on interacting surface points and physical terms combined with an adaptive parameter scheme. This approach offers users the capability to fine-tune the scoring function using their data and thus incorporate their full body of knowledge in a systematic and automatic fashion. In many realistic drug discovery scenarios, structural and ligand-activity information is sufficient in a statistical sense to adjust a limited set of parameters representing the relative weights of the various terms in the eHiTS scoring function. During tuning, receptor targets are clustered according to the chemical and shape similarity of the active site, and weight sets are optimized for each family. Pharmacophore constraint descriptions are thus generated automatically from the recurring interaction patterns observed in a specific active set profile. These constraints can be used for constrained docking or pharmacophore-enhanced scoring schemes.

In this talk, an overview of the eHiTS tuning utility will be given, outlining the underlying methodology. Results will be presented showing the enhancements achieved by the tuning process on docking and scoring performance.

Score tuning, available in eHiTS, is gaining ground in docking

Friday, April 29th, 2011

You may have come across a recent paper by a group of researchers from UCSD, Leeds and Stony Brook that utilized eHiTS for identifying targets for drug repurposing. The paper “A Machine Learning-Based Method To Improve Docking Scoring Functions and Its Application to Drug Repurposing” (http://pubs.acs.org/doi/abs/10.1021/ci100369f) introduces an “inverse screening” scenario in which one searches for receptors that may bind a compound, in this case a known drug, with the aim of suggesting new therapeutic benefits for the molecule. The authors demonstrate, experimentally and using benchmarks, how customizing the score to the receptors of interest improves the ability to identify active compounds and to give a rough estimate of their relative activity.

The authors chose to use eHiTS not only because of its good performance as a docking tool, but also because it facilitates the score tuning exercise. eHiTS reports to the user not only a single score value, but also the individual terms, 20 in total, that make up this value. The study examined various ways to recombine some of the score terms such that receptor-specific properties can be reproduced with improved accuracy.
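
To make the idea of recombining score terms concrete, here is a minimal sketch of the simplest (linear) case: each reported term is multiplied by a receptor-specific weight and the products are summed. The term names, values and weights below are invented for illustration only; they are not the actual eHiTS terms, and the paper itself also explores non-linear combinations.

```python
from typing import Mapping


def combined_score(terms: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Linear recombination of score terms: sum of weight * term.

    Terms without an explicit weight keep a default weight of 1.0.
    """
    return sum(weights.get(name, 1.0) * value for name, value in terms.items())


# Hypothetical score terms for one docked pose (names are assumptions).
pose_terms = {"hbond": -2.0, "hydrophobic": -1.5, "steric_clash": 0.5}

# Hypothetical weight set tuned for one receptor family.
family_weights = {"hbond": 1.5, "hydrophobic": 0.8}

print(combined_score(pose_terms, family_weights))
```

With an empty weight set the function falls back to the plain sum of the terms, which corresponds to an untuned score in this toy setup.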

An erratum for the paper was published yesterday (http://pubs.acs.org/doi/abs/10.1021/ci2001346), revising all the “native” eHiTS results, i.e. the out-of-the-box results without additional tuning. The original paper displayed results that were not in line with results obtained by us and others (see, for example: http://pubs.acs.org/doi/abs/10.1021/ci100374f), which prompted us to contact the authors. A few good words about the authors are in order. Once they heard our concerns, they acted swiftly and with full transparency to elucidate the problems, and once the source of the error was identified, they moved rapidly to publish the correction. This level of openness, integrity and cooperation should not be taken for granted, and we salute the team of researchers for their approach.

The root of the error was the misidentification of the relevant scoring value in the eHiTS output. As stated in the erratum, eHiTS’ output includes an Energy value and a Score value. The Score value is the term that should be used in pose prediction and virtual screening scenarios. It is a scoring scheme that is trained on PDB complexes and is designed to reproduce crystallographic poses with high fidelity. In the scoring function training, many ligand poses are generated for each PDB complex, and the scoring function is optimized to generate good score-RMSD correlation. Implicitly this process involves positive and negative data: the correct poses, which are the objective and are to be promoted, vs. unrealistic poses, which are rejected and suppressed. The eHiTS-Energy, on the other hand, is a scoring scheme that is designed to rank-order known active molecules. It is trained to produce score-binding affinity correlation, and therefore it is trained on positive data only. Hence, the prospects of eHiTS-Energy differentiating between actives and inactives are slim, as demonstrated by the ROC charts of the original paper. eHiTS-Score, as shown in the erratum, strongly outperforms the Energy in almost all cases and shows good screening capabilities in most of them.
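
For readers less familiar with ROC charts, the following sketch shows one common way to reduce such a chart to a single number, the area under the ROC curve (AUC), computed here as the fraction of active/decoy pairs in which the active gets the better score. It assumes that a lower (more negative) score is better, and the score values are made up for illustration; they are not taken from the paper or from eHiTS output.

```python
def roc_auc(active_scores, decoy_scores):
    """Area under the ROC curve via pairwise comparison.

    Assumes a lower score is better. Ties count as half a win.
    1.0 means perfect separation, 0.5 means no better than random.
    """
    if not active_scores or not decoy_scores:
        raise ValueError("need at least one active and one decoy")
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a < d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(active_scores) * len(decoy_scores))


# Invented example scores (assumption: more negative = predicted better binder).
actives = [-9.0, -7.5, -6.0]
decoys = [-8.0, -5.0, -4.0, -3.0]

print(round(roc_auc(actives, decoys), 3))
```

A scoring scheme that was never shown inactive examples during training has no particular reason to push this number far above 0.5.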

The two main conclusions of the paper are that (i) score tuning is a powerful approach to improve docking results, specifically in screening, and that (ii) non-linear methods for combining the scoring terms are superior to linear methods in this respect. We strongly support both observations. In fact, we learned those lessons during the development of eHiTS’ scoring function, and eHiTS therefore adopted these principles a couple of years ago. Family-based scoring is available for many cases, and non-linear methods are central to its implementation. For dozens of protein families for which several complexes are available in the PDB, eHiTS provides a customized scoring which is invoked automatically by analyzing the geometry of any receptor provided by the user. When the user’s target is not matched to any family in eHiTS’ knowledge base, a default scoring scheme is used. More about the tuning approaches in eHiTS can be found in this presentation:

http://www.simbiosys.com/science/presentations/2010-03-acs/eHiTS_239_ACS_website.pdf

We are pleased to see that score customization is increasingly recognized as a promising path for improving the molecular docking paradigm. The upcoming version of eHiTS will include even more sophisticated methods of utilizing experimental data. More about this in future posts.

Posted by Orr

SimBioSys presentations at the Spring 2011 ACS meeting

Friday, April 1st, 2011

We gave two presentations this week at the Spring 2011 ACS meeting in Anaheim, Calif. One was about eHiTS, in the session “Docking and Scoring: A Review of Docking Programs”, and the other was about ARChem, in the computer-aided synthesis design symposium in honour of Prof. J.B. Hendrickson. Both presentations are now available online for anyone to review and comment on. Feel free to post your comments here on this blog post or provide feedback offline.

Orr Ravitz presented an overview of the CASD field, with lessons learnt from the past and suggested ways forward. Presentation title: “Back to the future of synthesis planning: how new technology and new resources revitalize the vision of computer aided synthesis design”;
view presentation. ( http://www.simbiosys.com/science/presentations/2011-03-acs/ARChem_241_ACS_web.pdf )

Zsolt Zsoldos presented eHiTS 2009.1 results on the cleaned-up Astex set (goal: assessment of docking pose prediction accuracy) and the DUD set (goal: assessment of the virtual screening power of docking). The data was curated and mandated by the symposium organisers, Dr. Greg Warren and Dr. Neysa Nevins. Lesson learnt: with cleaner, better data, one gets better docking accuracy! See the latest eHiTS results with this data set in our presentation: “Recent developments in the eHiTS ligand docking and scoring software”;
view presentation. ( http://www.simbiosys.com/science/presentations/2011-03-acs/ACS2011_eHiTS.pdf )

posted by: Aniko

Computer-aided Organic Synthesis Design session at the upcoming ACS meeting

Monday, March 21st, 2011

We, at SimBioSys, are honoured to present at the upcoming Spring 2011 ACS meeting in a full-day session devoted to computer-aided organic synthesis design. The session will be held this Sunday, Mar 27, and is titled: “50 Years of Computers in Organic Chemistry: Symposium in Honor of James B. Hendrickson”. Please visit this session (Location: Anaheim Convention Center, Room 213C) and our talk (at 10:40 am). It should be a great and rare event.

Dr. Rachelle Bienstock, ACS CINF Chair, Program Committee wrote
( at http://www.acscinf.org/meetings/241/highlights241.php ):

Dr. Martin Walker has organized a very special symposium for this Spring Anaheim meeting in honor of his mentor, Dr. James Hendrickson, “Fifty Years of Computers in Organic Chemistry: A Symposium in Honor of James B. Hendrickson”. Dr. Hendrickson, Professor Emeritus of Chemistry, Brandeis University, was a pioneer in the field of computer-aided organic synthesis design and was one of the early visionaries in this field. The designer of the programs SYNGEN and WebReactions, much current work in the field is built on the early work of his research group. Many of the successful students who trained with Dr. Hendrickson over his long career, or those whose work was built on ideas and concepts originating from Dr. Hendrickson’s work, will be speaking in this symposium including Dr. Paul A. Wender, Bergstrom Professor in Chemistry, Stanford University; Dr. Phil S. Baran, Professor, Scripps Research Institute; Dr. Valentina Eigner-Pitto, InfoChem GmbH and Dr. Orr Ravitz, SimBioSys Inc.

To see more on SimBioSys’ upcoming ACS presentation in this session visit:
http://www.simbiosys.com/science/presentations/2011-03-acs/abstract1.htm

ACS’ most read articles & eHiTS news

Tuesday, March 1st, 2011

We have just discovered some great eHiTS results in the e-mail notification from ACS publishing, titled: “Most Read Articles in January from the Journal of Chemical Information and Modeling”. In the list of papers that received the highest exposure in January is:

Virtual Decoy Sets for Molecular Docking Benchmarks
Izhar Wallach and Ryan Lilien
DOI:  http://c.acs.org/ceyra/321469/70/350639/6508/0/S/0/0/tvma.html

The paper demonstrates how the composition of a screening benchmark (the DUD set in this case), and particularly the physical properties of the decoy set, affect the enrichment outcome. This is a valuable insight for us as software developers, and an important one for computational chemists at large. The paper presents the screening performance of eHiTS, as well as that of Glide, on the DUD targets, and although the paper does not attempt to compare the programs, they seem to be on a par with each other in retrieving active molecules in virtual screening.
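
As a reminder of what "enrichment" measures, here is a small sketch of the commonly used enrichment factor: the proportion of actives found in the top-ranked fraction of a screened library, divided by the proportion of actives in the whole library. It assumes a lower score is better, and the scores and labels are invented for illustration; they do not come from the paper.

```python
def enrichment_factor(scores, is_active, fraction=0.1):
    """Enrichment factor for the top `fraction` of a ranked library.

    Assumes a lower score is better. A value of 1.0 means the top of the
    list is no richer in actives than a random selection would be.
    """
    total_actives = sum(1 for flag in is_active if flag)
    if total_actives == 0:
        raise ValueError("the library contains no actives")
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0])
    n_top = max(1, round(len(ranked) * fraction))
    actives_in_top = sum(1 for _, flag in ranked[:n_top] if flag)
    return (actives_in_top / n_top) / (total_actives / len(ranked))


# Invented library of 10 compounds, 2 of which are actives.
scores = [-9.0, -8.5, -7.0, -6.5, -6.0, -5.5, -5.0, -4.5, -4.0, -3.5]
labels = [True, True, False, False, False, False, False, False, False, False]

print(enrichment_factor(scores, labels, fraction=0.2))
```

Because the denominator depends on how many decoys there are and how easy they are to tell apart from the actives, the same program can show very different enrichment on differently composed benchmarks.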

Another publication with eHiTS, presenting a successful virtual screening workflow for Human IKK-2 Inhibitors was also brought to my attention in the last few days:
February 24, 2011:  PLoS ONE 6(2): e16903. “Identification of Human IKK-2 Inhibitors of Natural Origin (Part I): Modeling of the IKK-2 Kinase Domain, Virtual Screening and Activity Assays”
Sala E, Guasch L, Iwaszkiewicz J, Mulero M, Salvado M-J, et al. (2011)
DOI:10.1371/journal.pone.0016903
http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0016903

Both of these were added to our list of user papers with SimBioSys tools:
http://www.simbiosys.com/science/presentations/2011-03-acs/index.html 
We are looking forward to your paper with eHiTS results. Please let us know about it as soon as it is published, so that your work can be added to our ever-growing list of user publications.

To learn more about eHiTS, please contact us by e-mail or join our docking talk at the upcoming ACS 2011 Spring meeting in Anaheim, California:

COMP: 59
(http://abstracts.acs.org/chem/241nm/program/view.php?obj_id=63915&terms=)
“Recent developments in the eHiTS ligand docking and scoring software”
Zsolt Zsoldos, Orr Ravitz,  SimBioSys Inc., Toronto, Ontario, Canada
Session: Docking and Scoring: A Review of Docking Programs
Time: Monday, Mar 28, 2011 10:20 AM
Location: Anaheim Convention Center, Room 213 B

see more details at:
http://www.simbiosys.com/science/presentations/2011-03-acs/index.html

Can we trust docking results?

Friday, September 3rd, 2010

The question is asked and answered by a group of researchers from the University of Warsaw in a recently published paper (http://onlinelibrary.wiley.com/doi/10.1002/jcc.21643/abstract). They compared 7 docking and scoring programs to evaluate pose prediction and score accuracy on a large set of 1300 PDB complexes. The study is fairly thorough and asks some important questions, such as how the starting ligand conformations influence the results and how the results differ for small or large ligands and for mostly hydrophobic or mostly polar interactions. The good news they report is that, statistically, overall results do not seem to be influenced by the starting conformations, although there is a slight advantage in some programs for the X-ray conformation, which is understandable. The bad news is that ligand size does matter: while we are very successful with small, fairly rigid molecules, large floppy ones still prove to be hard to handle for all programs. The really ugly news is that none of the scoring functions provided adequate correlation with binding energy.

The results are divided into 3 major sections: pose prediction accuracy, score correlation with experimental binding energy and score-rmsd correlation (ranking performance of the scoring functions). The authors’ conclusion of the pose prediction exercise can be summarized by the following quote:

“On the basis of those results, we can order programs in the following way: GOLD ~ eHiTS > Surflex > Glide > LigandFit > FlexX > AutoDock. The best programs have the average RMSD top score around 2.7 A, and it increases to nearly 4.5 A for the weakest FlexX. As expected, better results were observed for best pose conformations (Fig. 4). For those poses, the mean RMSD value was even below 2 A for GOLD, eHiTS, and Surflex. … Moreover, the percentage of pairs for which top score conformation is below 2 A shows that even for the best programs the success rate is below 60%, and in some cases even below 40%.”
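
The two figures quoted above, the average RMSD of the top-scored pose and the percentage of complexes where that pose is below 2 A, are straightforward to compute once the per-complex RMSD values are known. The sketch below uses invented RMSD values; it is only meant to illustrate the two metrics, not to reproduce the paper's numbers.

```python
def mean_rmsd(rmsds):
    """Average RMSD (in Angstrom) of the top-scored pose over all complexes."""
    return sum(rmsds) / len(rmsds)


def success_rate(rmsds, cutoff=2.0):
    """Fraction of complexes whose top-scored pose is below the RMSD cutoff."""
    hits = sum(1 for value in rmsds if value < cutoff)
    return hits / len(rmsds)


# Invented top-score RMSD values for five complexes (assumption, in Angstrom).
top_score_rmsds = [0.8, 1.5, 2.7, 4.1, 1.9]

print(round(mean_rmsd(top_score_rmsds), 2))
print(success_rate(top_score_rmsds))
```

Note that a few badly mis-docked ligands can pull the mean RMSD well above 2 A even when most complexes are docked successfully, which is why both numbers are worth reporting.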

Based on the score-energy correlation performance, the authors divided the programs into three categories. The best one is “composed of functions implemented in eHiTS and in Surflex, which gave Pearson correlation 0.38 and 0.33, respectively. Moreover, for eHiTS scoring function very high-Spearman correlation was obtained…” The Pearson correlations for the middle and worst categories are in the range of 0.17-0.25 and less than 0.1, respectively. The authors rightly conclude that the score-energy correlation results are inadequate even “for the best program, namely, eHiTS”.
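
For readers who want to see the difference between the two correlation measures mentioned here, the following sketch computes both with plain Python. Pearson measures linear agreement between the values themselves, while Spearman is Pearson applied to the ranks, so it rewards any monotonic relationship. The data in the example are invented and unrelated to the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Invented data: a perfectly monotonic but non-linear relationship.
predicted = [1.0, 2.0, 3.0, 4.0, 5.0]
measured = [1.0, 4.0, 9.0, 16.0, 25.0]

print(round(pearson(predicted, measured), 3))
print(round(spearman(predicted, measured), 3))
```

A scoring function can therefore rank ligands well (high Spearman) while still being a poor linear predictor of the binding energy values (modest Pearson).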

Finally, in the ranking performance comparison (correlation of score with quality of poses), AutoDock achieved the highest correlation, 0.32, with eHiTS a close second at ~0.3. So, what is the authors’ final conclusion in answer to the question in the title? Here is the quote with the answer:

“Thus, can we trust docking programs? The answer must be given individually for two aspects of docking programs. In terms of pose prediction, we can say that GOLD and eHiTS performance is accurate enough … In the case of scoring functions, the answer must be negative, as virtually no correlations could be observed between docking score and in vitro binding affinities … the empirically derived functions have now reached the saturation of year-to-year improvement … The future direction should be either to use statistical approach based on increasing number of X-ray protein-ligand complexes, as can be determined from the results obtained by eHiTS scoring functions, or to develop completely new approaches in terms of predicting in vivo activity of the ligand.”

I am very happy to see that eHiTS came out among the top two contenders in all three aspects of the comparison (while the other top performer was a different program in each of the three aspects). On the other hand, I agree with the authors that there is still a lot of room and need for significant improvement both in pose prediction (~60% success rate) and in score accuracy (~0.4 correlation). Furthermore, we definitely need more such thorough, large-scale performance comparisons in the future to continuously assess the state of the art until some programs (hopefully with eHiTS remaining in the lead) reach adequate performance.


Posted by Zsolt

SimBioSys presentations at the Fall 2010 ACS meeting

Thursday, August 26th, 2010

SimBioSys co-founders, Prof. Peter Johnson (UK) and Dr. Zsolt Zsoldos (Canada), along with our collaborator Dr. Sean Ekins (USA), delivered five talks at this past ACS meeting. They said that the meeting was a great opportunity to catch up with people they know and to meet new people. According to them, many of the other talks they attended were inspiring, and now, as they are making their journeys back home, their presentations are being posted here to share the science with you:

http://www.simbiosys.com/science/presentations/index.html

Peter Johnson et al.: “Automated retrosynthetic analysis: An old flame rekindled”
view slides
http://www.simbiosys.com/science/presentations/2010-08-acs/ARChem_ACS_Boston_2010_final.pdf
Sean Ekins et al.: “LASSO-ing potential pregnane X receptor agonists”
view slides
http://www.simbiosys.com/science/presentations/2010-08-acs/ACS2010_LASSO.pdf
Zsolt Zsoldos et al.: “How eHiTS solves the docking and scoring problems”
view slides
http://www.simbiosys.com/science/presentations/2010-08-acs/ACS2010_eHiTS_lessons.pdf
Zsolt Zsoldos et al.: “Scoring performance of eHiTS on the CSAR dataset”
view slides
http://www.simbiosys.com/science/presentations/2010-08-acs/ACS2010_CSAR_ehits_score.pdf
Zsolt Zsoldos et al.: “Protein-ligand docking on the Cell/BE processor with eHiTS Lightning”
view slides
http://www.simbiosys.com/science/presentations/2010-08-acs/ACS2010_HPC_ehits.pdf

posted by: Aniko

Meet up with SimBioSys at the Fall ACS Meeting in Boston in 10 days

Wednesday, August 11th, 2010

Only 10 days are left until the upcoming ACS meeting in Boston (Sun Aug 22 - Thurs Aug 26), and most attendees are preparing their personal schedules: the must-go lectures, the booths on the expo floor showing the latest and greatest technology, and the social networking and get-togethers.

SimBioSys will be no exception. We will be showcasing our latest product releases at booth #945. The focus will be on the new:
* ARChem 2010 release:
http://www.simbiosys.com/blog/2010/07/06/a-new-archem-release-integrable-more-efficient-and-better-performing/
* The upcoming CLiDE v 4.0 release that is currently in BETA testing - and shows significant improvement in recognition of chemical structures from PDF files and images.
* eHiTS, the exciting participant of the first CSAR benchmark exercise!

The science, algorithms and software design that are embodied in these products will be discussed in five different talks given by:

Zsolt Zsoldos:

COMP 25
How eHiTS solves the docking and scoring problems
Session: Drug Discovery (08:30 AM - 11:45 AM)
Time: Sunday, August 22, 2010 09:30 AM
Location: Boston Convention & Exhibition Center
Room: Room 154

COMP 59
Protein-ligand docking on the Cell/BE processor with eHiTS Lightning
Session: Scripting & Programming
Time: Sunday, August 22, 2010 03:50 PM
Location: Boston Convention & Exhibition Center
Room: Room 157A

COMP 122
Scoring performance of eHiTS on the CSAR dataset
Session: The Community Structure-Activity Resource (CSAR) Scoring Challenge (09:00 AM - 11:50 AM)
Time: Monday, August 23, 2010 10:05 AM
Location: Boston Convention & Exhibition Center
Room: Room 157B

Peter Johnson:
CINF 42
Automated retrosynthetic analysis: An old flame rekindled
Session: The Journal of Chemical Information and Modeling’s 50th Anniversary Symposium
Time: Monday, August 23, 2010 - 11:40 AM
Location: Boston Convention & Exhibition Center
Room: Room 156A

Sean Ekins, who has been collaborating with us in the past few months:
TOXI 4
LASSO-ing potential pregnane X receptor agonists
Session: General Papers
Time: Sunday, August 22, 2010 09:00 AM
Location: Boston Convention & Exhibition Center
Room: Room 252B

We hope you will find these talks interesting and that you will catch up with the SimBioSys researchers either following these presentations or at the booth. If you would like to schedule a meeting with us in advance please contact: aniko *at* simbiosys dot com

Have a great ACS meeting, and an enjoyable trip to Boston.
posted by Aniko