ARChem is making a leap forward by including Stereochemistry in its retrosynthetic analysis engine

May 10th, 2012

Our understanding of the role chirality plays in the activity of drugs has grown steadily in recent decades. Although we cannot always explain mechanistically why different enantiomers manifest strikingly divergent pharmacological behaviors, we can often measure significant differences in their binding affinity, selectivity and ADME properties. Even for drugs that are currently marketed as racemic mixtures, there is often evidence that one of the enantiomers dominates the pharmacology of the drug. It is not surprising, therefore, that stereoselective methods and chiral starting materials have become pivotal to synthesis in this domain.

Incorporating stereochemistry into our synthesis planning tool, ARChem, has been a major undertaking at SimBioSys. The development encompassed many layers, from algorithmic perception of the full spectrum of stereogenic types, through representation of stereochemical reactions, to the proper depiction of molecules. We are very excited to release the first version of ARChem to address stereochemistry. It offers the following capabilities:

  1. Full perception of stereochemistry in the target molecule.

  2. Matching of literature precedents with proper chirality during the retrosynthetic analysis.

  3. Matching proper enantiomers from the collection of starting materials.

The synthesis of Azalanstat below demonstrates the utility of these features. The synthetic route suggested by ARChem generates one of the chiral centers of the target molecule using an enantioselective reaction step taken from a specific literature example, whereas the other chiral center is introduced using a chiral starting material.

Azalanstat

All steps in the plan are supported by literature examples, and all starting materials are found in catalogs of commercial suppliers.

While we hope you share our satisfaction with this accomplishment, our work on stereochemistry is far from complete. The next few months will be dedicated to developing the capability of generating enantioselective reaction rules. This will allow ARChem to bring to chiral targets the same novelty and robustness it achieves in the synthesis design of achiral compounds, and will further enhance its usefulness as a synthetic idea generator.

Stereochemistry in ARChem, and the state of CADD during the ACS meeting in San Diego

March 7th, 2012

We would like to invite you to two events that will take place during the ACS national meeting in San Diego later this month. The first is a talk describing a significant and highly anticipated development in our retrosynthetic analysis project, ARChem, and the second is a special symposium, cosponsored by SimBioSys and co-organized by one of us, reviewing the state of the art and the future of computer-aided drug design.

The introduction of stereochemistry capabilities in ARChem is very significant. Most other synthesis design tools and, until recently, our own automated retrosynthetic analysis system ignored any stereochemistry in the target molecule; likewise, the reaction rules in ARChem did not include any stereochemical information. Yet most of the molecules in the pharma industry today include at least one stereocenter. In a CINF symposium, Professor A. Peter Johnson will present, for the first time, the various facets of stereochemistry in ARChem, including: representation and perception of stereogenic types, representation of stereochemical reaction rules and strategies, stereochemical reaction searching and example extraction, and enantioselective rule application. Examples of the retrosynthetic analysis of stereochemically complex targets will be given to demonstrate the new capabilities of the system:

Title: Advanced reaction searching: A comprehensive treatment of stereoselectivity in reactions
Date/Time: Tuesday, March 27, 2012 - 03:45 PM
Location: San Diego Convention Center, room 27A
http://abstracts.acs.org/chem/243nm/program/view.php?obj_id=122967

SimBioSys is also a proud sponsor of a special session that will bring together thought leaders from industry and academia who are at the forefront of drug discovery to discuss the role computational methods have been playing in the field and the expectations for the future. Eight talks will feature opinionated accounts of the state of the art in different sub-disciplines (e.g. QM methods, docking, MD simulations), and will describe the strengths and shortcomings of the methods in real-life scenarios. The session is organized by our own Orr Ravitz, with Chris Corbeil from CCG and Jason Cross from Cubist, and is guaranteed to be a fascinating event.
Title: Computer-Aided Drug Design: Hopes, Reality and Prospects
Date/Time: Monday, March 26, 2012; 8:15 - 11:35 am & 1:00 - 4:30 pm
Location: San Diego Convention Center, room 27A

If you would like to meet Orr Ravitz and/or Peter Johnson and discuss our tools, learn about our research projects, pursue collaborations or simply say hi or get to know us, please contact us in advance or during the meeting by email or by phone.

Have a wonderful spring meeting in sunny California!

SimBioSys to present at the upcoming BAGIM meeting

January 12th, 2012

We are pleased to announce that Drs. Zsolt Zsoldos and Orr Ravitz from SimBioSys will be presenting at the upcoming BAGIM (Boston Area Group for Informatics and Modeling) meeting, on Thurs., Jan 19 2012.

Title: Family based scoring and docking constraints in eHiTS

Abstract: eHiTS uses a novel, statistical knowledge-based scoring method consisting of interacting surface points and physical terms combined with an adaptive parameter scheme. This approach offers users the capability to fine-tune the scoring function using their data and thus incorporate their full body of knowledge in a systematic and automatic fashion. (See recent paper: “Improving molecular docking through eHiTS’ tunable scoring function”, O. Ravitz, Z. Zsoldos, and A. Simon, Journal of Computer-Aided Molecular Design, 25(11), 1033-1051 (2011) DOI: 10.1007/s10822-011-9482-5).

Prior to the talk, a short overview of the CLiDE program will be given. CLiDE is a tool for extraction of chemical structures from documents and images. The program is capable of transforming 2D molecular depictions into standard chemical file formats, and it interfaces with the major chemical editors. The technical and scientific challenges of perceiving chemistry from images of varied qualities and representation conventions will be discussed, and examples for how those issues are addressed by CLiDE will be given. We will show how the different versions of CLiDE meet the different needs of chemists, as well as of IT and IP professionals constructing chemical databases from publications, patents, and reports.

More info on the event at the BAGIM page: http://bagim.org/next_meeting.html

If you are in the Boston area and would like to meet with us before or after the event, please let us know: just leave a comment on this post or send us an e-mail.

Holiday Greetings from the SimBioSys Team

December 15th, 2011

Thank you for your support and best wishes for the year ahead.

The SimBioSys Team

Happy Holidays

Movie on: ARChem in a nutshell

November 28th, 2011

One of our customers recently asked us to provide him with a short presentation explaining our retrosynthetic analysis software, ARChem, so that he would be able to advertise it to potential users within his organization. Since, to paraphrase the old adage, a clip is worth a thousand slides, we opted for a 5-minute video.

It’s not easy to squeeze the essence of a product like ARChem into a short video, since it has so many facets: the search engine, the solutions display, solutions filtering and interfacing with reaction databases, not to mention all the science that is at work under the hood. So we decided to focus on the core value of ARChem: the ability to harvest knowledge from experimental data, and to convert that knowledge into ideas. In 5 minutes we show, without discussing the fascinating underlying technology, the available search strategies, solutions viewing and construction, sharing ideas with your fellow researchers, and viewing literature examples. Please see the movie at:

ARChem movie http://www.simbiosys.com/archem/video/

We hope you will find it interesting.

eHiTS Tune and Score methods are published with validation results

November 16th, 2011

Our article on eHiTS Tune and Score is now available online:

Improving molecular docking through eHiTS’ tunable scoring function
Journal of Computer-Aided Molecular Design
DOI: 10.1007/s10822-011-9482-5

The article contains a great deal of useful information about the eHiTS Score and Tune algorithms, and validates them on a number of test sets, including CDK2 and BACE1 for pose prediction, DUD for virtual screening and PDBBind for affinity prediction. eHiTS results are compared with other programs’ published results where such data were available. For example, enrichment results on the DUD set were compared using results from Cross et al. (DOI: 10.1021/ci900056c).

eHiTS comparison on DUD
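
For readers less familiar with enrichment as a virtual screening metric, the idea can be sketched in a few lines of Python. This is only an illustration: the function name, the convention that lower scores are better, and the toy data are assumptions of this sketch, and it is not the evaluation code used in the article.

```python
def enrichment_factor(scored, fraction):
    """Enrichment factor at the given top fraction of a ranked list.

    `scored` is a list of (score, is_active) pairs. Lower scores are
    assumed to be better (an assumption of this sketch).
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(scored, key=lambda pair: pair[0])
    n_total = len(ranked)
    n_actives = sum(1 for _, active in ranked if active)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need at least one compound and one active")
    n_top = max(1, round(n_total * fraction))
    hits_top = sum(1 for _, active in ranked[:n_top] if active)
    # Ratio of the active rate in the top slice to the active rate overall.
    return (hits_top / n_top) / (n_actives / n_total)


# Toy data (made up): 10 compounds, 2 actives, both ranked in the top 20%.
toy = [(-9.1, True), (-8.7, True), (-7.0, False), (-6.5, False),
       (-6.1, False), (-5.9, False), (-5.2, False), (-4.8, False),
       (-4.0, False), (-3.3, False)]
ef_20 = enrichment_factor(toy, 0.2)
```

An enrichment factor of 1 means the ranking is no better than random selection; larger values mean actives are concentrated near the top of the list.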

In conclusion, the article states that knowledge-based approaches are mainstream methods today because they benefit from the ever-expanding base of experimental data and from continuous progress in computational methods, and that score tuning is a natural extension of that concept. The authors also hope to encourage wider use of the score tuning methodology and the creation of test sets in the user community.

See the full article here
http://www.springerlink.com/content/r1t66167718h5110/

Zsolt Zsoldos from SimBioSys to present at the MCADD Fall 2011 Seminar on Oct 5th

September 14th, 2011

MCADD announcement:

=================

The Montreal Computer-Aided Drug Design (MCADD) organizing committee is announcing that the Fall 2011 seminar will feature Zsolt Zsoldos of SimBioSys Inc. (http://www.simbiosys.ca/). He will be presenting his talk entitled “Automated tuning of eHiTS scoring weights specific to protein families”. The seminar will be October 5th at 3pm in room 501 of the Goodman Cancer Center of McGill University. The seminar will be followed by a wine and cheese reception. We look forward to seeing you there, and please feel free to forward this email to anyone interested in attending the seminar and/or joining the MCADD Group (students and post-docs are welcome).

Additionally, we invite you to follow and receive announcements from the MCADD community on LinkedIn. Just ask to join the Montreal Computer-Aided Drug Design group
( http://www.linkedin.com/groups?gid=2983304&trk=myg_ugrp_ovr ).

Christopher Corbeil
Chair, 4th MCADD Organizing Committee

Organizing Committee members:
Pierre Bonneau, Boehringer Ingelheim (Canada) Ltd.
Araz Jakalian, Boehringer Ingelheim (Canada) Ltd.
Enrico O. Purisima, NRC-BRI
Constatin Yannopolous, Vertex Canada

==============================================================
MCADD Seminar

Date: October 5th, 2011

Location:    Room 501 (Karp Conference Room) Goodman Cancer Center, McGill University, 1160 Pine Ave. West, Montreal, Quebec

Time:     3:00pm - Seminar: /Automated tuning of eHiTS scoring weights specific to protein families/, Zsolt Zsoldos,  SimBioSys Inc. Toronto, Canada

4:00pm - Cocktail/Wine

===============================================
Abstract

The molecular docking paradigm has thus far failed to produce a generic approach that would deliver accurate pose prediction capabilities, and reliable rank-ordering of conformations and ligands consistently for any biological system of interest. This reality, which has been addressed by numerous methodology papers and comparative studies, has been largely attributed to the inability of scoring functions to capture different chemical interaction types at a uniform level of accuracy. Several studies attempted to develop guidelines for choosing the most suitable docking and scoring method for a specific problem based on protein family classification of the target, dominant interactions, and other properties of the studied system. Consensus techniques, on the other hand, try to synergistically integrate information from multiple sources, assuming agreement between different methods is indicative of more accurate values. Both approaches, however, have shown only limited success in improving binding mode and activity prediction capabilities.

An alternative solution, and arguably a more rigorous one, would be to tailor the scoring function for the system of interest. eHiTS uses a novel scoring method consisting of a statistical knowledge base focused on interacting surface points and physical terms combined with an adaptive parameter scheme. This approach offers users the capability to fine-tune the scoring function using their data and thus incorporate their full body of knowledge in a systematic and automatic fashion. In many realistic drug discovery scenarios, structural and ligand-activity information is sufficient in a statistical sense to adjust a limited set of parameters representing the relative weights of the various terms in the eHiTS scoring function. During tuning, receptor targets are clustered according to the chemical and shape similarity of the active site, and weight sets are optimized for each family. Pharmacophore constraint descriptions are thus generated automatically from the recurring interaction patterns observed in a specific active set profile. These constraints can be used for constrained docking or pharmacophore-enhanced scoring schemes.
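
To make the idea of adjusting relative weights per family concrete, here is a minimal Python sketch. It is not the eHiTS tuning algorithm: it assumes, purely for illustration, that the score is a linear combination of term values and that the weights are fitted by ordinary least squares against measured activities. The family names, term values and activities are made up.

```python
def fit_weights(term_rows, targets):
    """Least-squares weights w minimizing sum((row . w - target)^2).

    Solves the normal equations (X^T X) w = X^T y by Gaussian
    elimination with partial pivoting. Linear fitting is an assumption
    of this sketch, not a description of any particular program.
    """
    n = len(term_rows[0])
    a = [[sum(r[i] * r[j] for r in term_rows) for j in range(n)]
         for i in range(n)]
    b = [sum(r[i] * y for r, y in zip(term_rows, targets)) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("terms are linearly dependent; cannot fit")
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution on the upper-triangular system.
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][k] * w[k] for k in range(i + 1, n))
        w[i] = (b[i] - tail) / a[i][i]
    return w


def tune_per_family(training):
    """Fit one weight set per family; `training` maps a family name to
    a pair (term_rows, targets)."""
    return {family: fit_weights(rows, ys)
            for family, (rows, ys) in training.items()}


# Hypothetical training data: two score terms per complex.
training = {
    "family_A": ([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]],
                 [2.0, -1.0, 1.0, 3.0]),
    "family_B": ([[1.0, 0.0], [0.0, 1.0], [1.0, 2.0]],
                 [0.5, 1.5, 3.5]),
}
weights = tune_per_family(training)
```

Each family ends up with its own weight set, so the same raw term values can be combined differently depending on the family the receptor belongs to.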

In this talk, an overview of the eHiTS tuning utility will be given, outlining the underlying methodology. Results will be presented showing the enhancements achieved by the tuning process in docking and scoring performance.

Join SimBioSys summer 2011 webinar series on CLiDE, eHiTS and ARChem

August 2nd, 2011

Whether you are at work, at home or on a trip this summer, you can stay informed about the latest software tools of SimBioSys. Three of our products: CLiDE, eHiTS and ARChem, will be showcased in rotation on our weekly seminar series. Join us for these one hour online sessions given every Thursday at noon, EDT.

Starting July 14th we presented the CLiDE (Chemical Literature Data Extraction) office tool, which can extract chemical structures embedded in PDF files, Word documents, JPEG and TIFF files, and other document and picture formats. CLiDE is a productivity and convenience tool: it saves the time and trouble of copying useful, and often complex, structures from an image into a chemical editor or an e-lab notebook. It is useful for your everyday work, as well as for creating chemical knowledge-bases from journal articles, patents, and web content. (http://pubs.acs.org/doi/abs/10.1021/ci800449t)

There will be two more sessions for CLiDE:
* Thurs., Aug 4, 12 noon EDT
* Thurs., Aug 25, 12 noon EDT

On July 21st we presented eHiTS and its utilities (LASSO, CheVi, Score and Tune) for molecular docking and virtual screening. With its exhaustive conformational search, automated protonation state handling mechanism, and tunable scoring function, eHiTS provides one of the top-performing algorithms in the field: “the fastest” [1], “the most accurate” [2], and “the easiest to use with automated protonation/tautomerization assignments” [3].

There will be two more sessions for eHiTS
* Thurs., Aug 11, 12 noon EDT
* Thurs., Sep 1, 12 noon EDT

On July 28th we presented ARChem, the newest tool to help organic chemists with synthesis planning. Synthetic chemists in industry nowadays face an enormous challenge: to develop novel chemicals faster, safer, greener and cheaper. To solve this multi-dimensional problem, most chemists make some use of reaction databases, but these are most helpful when the synthesis of the target entity has already appeared in the literature.

ARChem Route Designer goes well beyond this: it is a computer system designed to support the organic synthetic chemist in planning the synthesis of novel as well as known compounds. Its features include:

* reaction rules generated by automated mining of large reaction databases
* application of those rules on-the-fly in a retrosynthetic fashion to convert a novel chemical target all the way to readily available starting materials
* display of information from multiple resources (such as literature reactions from Reaxys (*), and starting materials from multiple vendors) in the system
* scoring the many alternatives based on various criteria (shortest path, highest yield, lowest material cost and other options)
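
The last feature, scoring alternatives by different criteria, can be pictured with a small Python sketch. The `Route` class, the criterion names and the example numbers are all hypothetical and serve only to illustrate the idea of re-ranking the same set of routes by different objectives; they do not describe how ARChem itself is implemented.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class Route:
    """A hypothetical synthesis route, reduced to a few summary numbers."""
    name: str
    step_yields: list      # fractional yield of each step, e.g. 0.9
    material_cost: float   # total cost of starting materials

    @property
    def n_steps(self):
        return len(self.step_yields)

    @property
    def overall_yield(self):
        # The overall yield of a linear route is the product of step yields.
        return prod(self.step_yields)


# Each criterion maps a route to a sort key where smaller is better.
CRITERIA = {
    "shortest_path": lambda r: r.n_steps,
    "highest_yield": lambda r: -r.overall_yield,
    "lowest_cost": lambda r: r.material_cost,
}


def rank_routes(routes, criterion):
    """Return the routes ordered from best to worst for one criterion."""
    return sorted(routes, key=CRITERIA[criterion])


routes = [
    Route("A", [0.90, 0.80], 120.0),
    Route("B", [0.95, 0.90, 0.90], 40.0),
    Route("C", [0.60], 300.0),
]
by_steps = [r.name for r in rank_routes(routes, "shortest_path")]
by_yield = [r.name for r in rank_routes(routes, "highest_yield")]
by_cost = [r.name for r in rank_routes(routes, "lowest_cost")]
```

In this toy set the shortest route is neither the cheapest nor the highest yielding, which is why being able to switch criteria matters.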

There will be two more sessions for ARChem
* Thurs., Aug 18, 12 noon EDT
* Thurs., Sep 8, 12 noon EDT

The webinar sessions are live, and they give you an opportunity to ask questions and receive immediate feedback. If you miss a session, you can always view its recording or join us for the next session on the product of your interest.

Don’t miss this opportunity, register now at:
http://www.simbiosys.com/products/webinar_request.html
We are looking forward to seeing you at our summer webinars!

posted by: Aniko

References:
[1]: Quote from Dr. Katie Simmons, University of Leeds, UK
[2]: http://onlinelibrary.wiley.com/doi/10.1002/jcc.21643/abstract
[3]: Quote from Dr. Mihaly Mezei, Mount Sinai School of Medicine, NY, USA

Notes:

(*) Reaxys and Reaxys Data represented in ARChem Webinar is used with kind permission of the copyright owner Elsevier Properties SA.
Copyright 2010-2011 (c), Elsevier Properties SA, All rights reserved. Authorized use only.  Reaxys(r) is a trademark owned and  protected by Elsevier Properties SA and used under license.

Score tuning, available in eHiTS, is gaining ground in docking

April 29th, 2011

You may have come across a recent paper by a group of researchers from UCSD, Leeds and Stony Brook that utilized eHiTS for identifying targets for drug repurposing. The paper “A Machine Learning-Based Method To Improve Docking Scoring Functions and Its Application to Drug Repurposing” (http://pubs.acs.org/doi/abs/10.1021/ci100369f) introduces an “inverse screening” scenario in which one searches for receptors that may bind a compound, in this case a known drug, and thereby suggests new therapeutic benefits for the molecule. The authors demonstrate, experimentally and using benchmarks, how customizing the score to the receptors of interest improves the ability to identify active compounds and to give a rough estimate of their relative activity.

The authors chose to use eHiTS not only because of its good performance as a docking tool, but also because it facilitates the score tuning exercise. eHiTS reports to the user not only a single score value, but also the individual terms, 20 in total, that make up this value. The study examined various ways to recombine some of the score terms such that receptor-specific properties can be reproduced with improved accuracy.

An erratum for the paper was published yesterday (http://pubs.acs.org/doi/abs/10.1021/ci2001346), revising all the “native” eHiTS results, i.e. the out-of-the-box results without additional tuning. The original paper displayed results that were not in line with results obtained by us and others (see, for example: http://pubs.acs.org/doi/abs/10.1021/ci100374f), which prompted us to contact the authors. A few good words about the authors are in order. Once they heard our concerns, they acted swiftly and with full transparency to elucidate the problems, and once the source of the error was identified, they moved rapidly to publish the correction. This level of openness, integrity and cooperation should not be taken for granted, and we salute the team of researchers for their approach.

The root of the error was the misidentification of the relevant scoring value in the eHiTS output. As stated in the erratum, eHiTS’ output includes an Energy value and a Score value. The Score value is the one that should be used in pose prediction and virtual screening scenarios. It is a scoring scheme that is trained on PDB complexes and is designed to reproduce crystallographic poses with high fidelity. In the scoring function training, many ligand poses are generated for each PDB complex, and the scoring function is optimized to generate good score-RMSD correlation. Implicitly, this process involves both positive and negative data: the correct poses, which are the objective and are to be promoted, versus unrealistic poses, which are rejected and suppressed. The eHiTS-Energy, on the other hand, is a scoring scheme that is designed to rank-order known active molecules. It is trained to produce score-binding affinity correlation, and therefore it is trained on positive data only. Hence, the prospects of eHiTS-Energy differentiating between actives and inactives are slim, as demonstrated in the ROC charts of the original paper. eHiTS-Score, as shown in the erratum, strongly outperforms the Energy in almost all cases, and generally shows good screening capabilities.
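
The ROC comparison mentioned above boils down to one question: how often does an active compound score better than an inactive one? A minimal Python sketch of that calculation follows. The function name, the assumption that lower scores are better, and the toy scores are illustrative only and are not taken from the paper or the erratum.

```python
def roc_auc(active_scores, inactive_scores):
    """Area under the ROC curve, computed as the fraction of
    (active, inactive) pairs in which the active has the better score.

    Lower scores are assumed to be better (an assumption of this
    sketch); ties count as half a win.
    """
    if not active_scores or not inactive_scores:
        raise ValueError("need at least one active and one inactive")
    wins = 0.0
    for a in active_scores:
        for d in inactive_scores:
            if a < d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(active_scores) * len(inactive_scores))


# Toy scores (made up) for three actives and four inactives.
actives = [-9.0, -7.5, -6.0]
inactives = [-8.0, -5.0, -4.0, -3.0]
auc = roc_auc(actives, inactives)
```

A value of 1.0 means every active outranks every inactive, while 0.5 is what random scoring would give. A scoring scheme that never saw negative data during training has no particular reason to land far above 0.5.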

The two main conclusions of the paper are that (i) score tuning is a powerful approach to improve docking results, specifically in screening, and that (ii) non-linear methods for combining the scoring terms are superior to linear methods in this respect. We strongly support both observations. In fact, we learned those lessons during the development of eHiTS’ scoring function, and eHiTS therefore adopted these principles a couple of years ago. Family-based scoring is available for many cases, and non-linear methods are central to its implementation. For dozens of protein families for which several complexes are available in the PDB, eHiTS provides customized scoring, which is invoked automatically by analyzing the geometry of any receptor provided by the user. When the user’s target is not matched to any family in eHiTS’ knowledge-base, a default scoring scheme is used. More about the tuning approaches in eHiTS can be found in this presentation:

http://www.simbiosys.com/science/presentations/2010-03-acs/eHiTS_239_ACS_website.pdf
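
The family-or-default selection described above can be pictured with a short Python sketch. Everything in it is hypothetical: the family names, the weight values, the use of only three terms, and the simple weighted sum (the actual implementation relies on non-linear methods and identifies the family from the receptor geometry, neither of which is modelled here).

```python
# Hypothetical weight sets; a real scoring function would have many more
# terms and would not necessarily combine them linearly.
DEFAULT_WEIGHTS = [1.0, 1.0, 1.0]
FAMILY_WEIGHTS = {
    "kinase": [1.2, 0.8, 1.0],
    "protease": [0.9, 1.1, 1.3],
}


def select_weights(family):
    """Return the family-specific weights, or the default set when the
    family is unknown (or None)."""
    return FAMILY_WEIGHTS.get(family, DEFAULT_WEIGHTS)


def combined_score(terms, family=None):
    """Weighted sum of score terms using the selected weight set."""
    weights = select_weights(family)
    if len(terms) != len(weights):
        raise ValueError("number of terms must match number of weights")
    return sum(w * t for w, t in zip(weights, terms))


terms = [1.0, 2.0, 3.0]
kinase_score = combined_score(terms, "kinase")
fallback_score = combined_score(terms, "unmatched_family")
```

The point of the sketch is only the control flow: a matched family gets its own parameters, and anything else falls back to a general-purpose set.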

We are pleased to see that score customization is increasingly recognized as a promising path for improving the molecular docking paradigm. The upcoming version of eHiTS will include even more sophisticated methods of utilizing experimental data. More about this in future posts.

Posted by Orr

SimBioSys presentations at the Spring 2011 ACS meeting

April 1st, 2011

We gave two presentations this week at the Spring 2011 ACS meeting in Anaheim, Calif. One was about eHiTS, in the session “Docking and Scoring: A Review of Docking Programs”, and the other was about ARChem, in the computer-aided synthesis design symposium in honour of Prof. J.B. Hendrickson. Both presentations are now available online for anyone to review and comment on. Feel free to post your comments here on this blog post or provide feedback offline.

Orr Ravitz presented an overview of the CASD field, with lessons learnt from the past and suggested ways forward. Presentation title: “Back to the future of synthesis planning: how new technology and new resources revitalize the vision of computer aided synthesis design”;
view presentation. ( http://www.simbiosys.com/science/presentations/2011-03-acs/ARChem_241_ACS_web.pdf )

Zsolt Zsoldos presented eHiTS 2009.1 results on the cleaned-up Astex set (goal: assessment of docking pose prediction accuracy) and the DUD set (goal: assessment of the virtual screening power of docking). The data was curated and mandated by the symposium organisers, Dr. Greg Warren and Dr. Neysa Nevins. Lesson learnt: with cleaner, better data, one gets better docking accuracy! See the latest eHiTS results with this data set in our presentation: “Recent developments in the eHiTS ligand docking and scoring software”;
view presentation. ( http://www.simbiosys.com/science/presentations/2011-03-acs/ACS2011_eHiTS.pdf )

posted by: Aniko