Can we trust docking results?
September 3rd, 2010
The question is asked and answered by a group of researchers from the University of Warsaw in a recently published paper (http://onlinelibrary.wiley.com/doi/10.1002/jcc.21643/abstract). They compared 7 docking and scoring programs, evaluating pose prediction and scoring accuracy on a large set of 1300 PDB complexes. The study is fairly thorough and asks some important questions, such as how the starting ligand conformation influences the results, and how the results differ for small versus large ligands and for mostly hydrophobic versus mostly polar interactions. The good news they report is that, statistically, overall results do not seem to be influenced by the starting conformation, although some programs show a slight advantage when started from the X-ray conformation, which is understandable. The bad news is that ligand size does matter: while we are very successful with small, fairly rigid molecules, large floppy ones still prove hard to handle for all programs. The really ugly news is that none of the scoring functions provided adequate correlation with binding energy.
“On the basis of those results, we can order programs in the following way: GOLD ~ eHiTS > Surflex > Glide > LigandFit > FlexX > AutoDock. The best programs have the average RMSD top score around 2.7 Å, and it increases to nearly 4.5 Å for the weakest FlexX. As expected, better results were observed for best pose conformations (Fig. 4). For those poses, the mean RMSD value was even below 2 Å for GOLD, eHiTS, and Surflex. … Moreover, the percentage of pairs for which top score conformation is below 2 Å shows that even for the best programs the success rate is below 60%, and in some cases even below 40%.”
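The figures quoted above (mean RMSD of the top-scored pose, mean RMSD of the best pose, and the share of complexes whose top-scored pose is below 2 Å) can be computed from per-pose RMSD values. The following is a minimal Python sketch, not the paper's actual protocol; the function name `pose_statistics`, the input layout, and the sample numbers are assumptions made purely for illustration.

```python
from statistics import mean


def pose_statistics(rmsds_by_complex, cutoff=2.0):
    """Summarise pose-prediction accuracy over a set of complexes.

    rmsds_by_complex: one list per complex, holding the RMSD (in angstroms)
    of each returned pose to the crystal pose, ordered best-scored first.
    """
    top = [poses[0] for poses in rmsds_by_complex]     # top-scored pose
    best = [min(poses) for poses in rmsds_by_complex]  # closest pose of all
    return {
        "mean_top_rmsd": mean(top),
        "mean_best_rmsd": mean(best),
        # fraction of complexes whose top-scored pose is below the cutoff
        "top_success_rate": sum(r < cutoff for r in top) / len(top),
    }


# Made-up RMSD values for three complexes (illustration only).
example = [[1.0, 3.0], [4.0, 1.5], [2.5, 2.5]]
stats = pose_statistics(example)
print(stats)
```

The gap between `mean_top_rmsd` and `mean_best_rmsd` is a rough measure of how often a program generates a good pose but fails to rank it first, which is the scoring problem the paper highlights.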
SimBioSys presentations at the Fall 2010 ACS meeting
August 26th, 2010
SimBioSys co-founders, Prof. Peter Johnson (UK) and Dr. Zsolt Zsoldos (Canada), along with our collaborator Dr. Sean Ekins (USA), delivered five talks at this past ACS meeting. They said that the meeting was a great opportunity to catch up with people they know and to meet new people. According to them, many of the other talks they attended were inspiring, and now, as they are making their journeys back home, their presentations are being posted here to share the science with you:
http://www.simbiosys.com/science/presentations/index.html
Peter Johnson et al.: “Automated retrosynthetic analysis: An old flame rekindled”
view slides
http://www.simbiosys.com/science/presentations/2010-08-acs/ARChem_ACS_Boston_2010_final.pdf
Sean Ekins et al.: “LASSO-ing potential pregnane X receptor agonists”
view slides
http://www.simbiosys.com/science/presentations/2010-08-acs/ACS2010_LASSO.pdf
Zsolt Zsoldos et al.: “How eHiTS solves the docking and scoring problems”
view slides
http://www.simbiosys.com/science/presentations/2010-08-acs/ACS2010_eHiTS_lessons.pdf
Zsolt Zsoldos et al.: “Scoring performance of eHiTS on the CSAR dataset”
view slides
http://www.simbiosys.com/science/presentations/2010-08-acs/ACS2010_CSAR_ehits_score.pdf
Zsolt Zsoldos et al.: “Protein-ligand docking on the Cell/BE processor with eHiTS Lightning”
view slides
http://www.simbiosys.com/science/presentations/2010-08-acs/ACS2010_HPC_ehits.pdf
posted by: Aniko
Meet up with SimBioSys at the Fall ACS Meeting in Boston in 10 days
August 11th, 2010
Only 10 days are left until the upcoming ACS meeting in Boston (Sun Aug 22 - Thurs Aug 26), and most of the people attending are preparing their personal schedules: the must-see lectures, the booths on the expo floor showing the latest and greatest technology, and the social networking and get-togethers.
SimBioSys will be there too. We will be showcasing our latest product releases at booth #945. The focus will be on the new:
* ARChem 2010 release:
http://www.simbiosys.com/blog/2010/07/06/a-new-archem-release-integrable-more-efficient-and-better-performing/
* The upcoming CLiDE v4.0 release, which is currently in beta testing and shows significant improvement in the recognition of chemical structures from PDF files and images.
* eHiTS, the exciting participant of the first CSAR benchmark exercise!
The science, algorithms and software design that are embodied in these products will be discussed in five different talks given by:
Zsolt Zsoldos:
COMP 25
How eHiTS solves the docking and scoring problems
Session: Drug Discovery (08:30 AM - 11:45 AM)
Time: Sunday, August 22, 2010 09:30 AM
Location: Boston Convention & Exhibition Center
Room: Room 154
COMP 59
Protein-ligand docking on the Cell/BE processor with eHiTS Lightning
Session: Scripting & Programming
Time: Sunday, August 22, 2010 03:50 PM
Location: Boston Convention & Exhibition Center
Room: Room 157A
COMP 122
Scoring performance of eHiTS on the CSAR dataset
Session: The Community Structure-Activity Resource (CSAR) Scoring Challenge (09:00 AM - 11:50 AM)
Time: Monday, August 23, 2010 10:05 AM
Location: Boston Convention & Exhibition Center
Room: Room 157B
Peter Johnson:
CINF 42
Automated retrosynthetic analysis: An old flame rekindled
Session: The Journal of Chemical Information and Modeling’s 50th Anniversary Symposium
Time: Monday, August 23, 2010 - 11:40 AM
Location: Boston Convention & Exhibition Center
Room: Room 156A
Sean Ekins, who has been collaborating with us over the past few months:
TOXI 4
LASSO-ing potential pregnane X receptor agonists
Session: General Papers
Time: Sunday, August 22, 2010 09:00 AM
Location: Boston Convention & Exhibition Center
Room: Room 252B
We hope you will find these talks interesting and that you will catch up with the SimBioSys researchers either after these presentations or at the booth. If you would like to schedule a meeting with us in advance, please contact: aniko *at* simbiosys dot com
Have a great ACS meeting, and an enjoyable trip to Boston.
posted by Aniko
A new ARChem release: integrable, more efficient and better performing
July 6th, 2010
One of the aspects of maturation is the transition from the egocentric viewpoint to a phase where one engages and considers others. It is true for kids who begin to understand and cope with social situations. It is true for soccer players, or scientists for that matter, who understand that it is not all about personal skills and knowledge, but also about how you use them in team play. And it is true for software applications that move on from proving their algorithms’ capabilities to becoming integrable with other applications and merging into a workflow that creates real value for the user.
Since the previous release, work has continued on improving reaction rule generation in ARChem as well as the retrosynthetic search. Significant progress has been made in detecting and highlighting potential functional group interference. The chemoselectivity issue is a challenge that requires a combination of data mining, profound chemical perception, and supplemental expert knowledge bases. Another area that saw significant improvement is scoring. The retrosynthetic search commonly generates a vast solution space with hundreds, and possibly thousands, of paths. Navigating systematically through all the options is typically too time consuming, so scoring becomes pivotal in prioritizing the solutions for the user to inspect. Scoring now better reflects a chemist’s assessment of the feasibility of a synthetic route. It balances synthetic depth, reliability of individual reaction steps, yield, wastage, chemical interference and other considerations.
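To make the idea of balancing several terms concrete, here is a toy Python sketch of a route score. It is not ARChem's scoring function: the `Step` fields, the `route_score` formula, and the penalty weights are all hypothetical, and wastage is left out for brevity.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class Step:
    reliability: float     # hypothetical 0..1 confidence that the step works
    expected_yield: float  # hypothetical 0..1 yield of the step
    interference: bool     # True if a competing functional group was flagged


def route_score(steps, depth_penalty=0.1, interference_penalty=0.2):
    """Toy score for a linear route: higher is better."""
    overall_yield = prod(s.expected_yield for s in steps)
    overall_reliability = prod(s.reliability for s in steps)
    flagged = sum(s.interference for s in steps)
    score = overall_yield * overall_reliability
    score -= depth_penalty * len(steps)     # prefer shorter routes
    score -= interference_penalty * flagged  # penalise flagged steps
    return score


# Two made-up routes built from identical steps, differing only in length.
short_route = [Step(0.9, 0.8, False)] * 2
long_route = [Step(0.9, 0.8, False)] * 4
ranked = sorted(
    [("short", short_route), ("long", long_route)],
    key=lambda item: route_score(item[1]),
    reverse=True,
)
print([name for name, _ in ranked])
```

Even in this simplified form, the ranking depends heavily on the chosen weights, which is why striking the right balance between the terms is the hard part.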
Alongside the major improvements in the underlying technology, the focus of the last few months has been on usability and integrability:
- Reaction examples are directly linked to the Reaxys records for full data and literature access.
- Starting materials arrived at during the search now point to the corresponding records in online chemical vendors’ catalogues.
- Costs of starting materials are displayed, and route cost is evaluated.
- When a rule is used in the analysis, the example reactions that were used to generate the rule are now ordered by relevance to the synthetic route.
- The solution space can be pruned using user-defined filters.
- Changes to the GUI make navigating the solutions more efficient, and the general look and feel of the system is more polished and refined.
Here is an example that demonstrates some of the features mentioned above, and also elegantly validates the concept of automated retrosynthetic chemistry. The suggested route was ranked number 1 by the system. It consists of a sequence of three reaction rules that simplify the target all the way to commercially available starting materials, shown with their associated prices per mole. In this particular case, every suggested transform had an exact match in the set of reactions that generated the respective rule during the automated process of retrosynthetic rule extraction. All the examples, and the exact matches, can be accessed via the links provided along the retrosynthetic tree. At the bottom right we show a literature reference for a synthesis of the molecule, validating the route. ARChem offers a set of 28 distinct solutions that constitute a gateway to a much larger solution space, which can be accessed through the “n of m transforms” links. The user can build different solutions by selecting any of the suggested alternative transforms.
ARChem has come a long way from its proof-of-concept days. It is now maturing into a tool that can offer real benefits to the medicinal or process chemist, not least thanks to the continuous feedback that we get from users. In the next few months substantial changes are anticipated in all aspects of the system. Maturity does not mean stagnation: ARChem is at the forefront of the field of computer-aided synthesis design, and intensive R&D guarantees that major advances are still to come. Stay tuned.
posted by Orr
An interesting discussion on: The Ideal Synthesis
June 30th, 2010
We are dedicated readers of Derek Lowe’s wonderful blog about the pharmaceutical industry and drug discovery. With his witty style, Lowe covers many aspects of this field and sheds light on many facets that are not always obvious to people like us who are not directly involved in drug discovery. It is no surprise that the blog has attracted a sizable group of commentators who add their own experienced perspectives to the posts.
One of his latest entries discussed the concept of the “Ideal Synthesis”. While it is largely an elusive notion, what constitutes a good synthesis is an important question that we constantly discuss with ARChem’s users. After all, ARChem typically generates a whole range of synthetic routes to target molecules, and while the user can browse through them all and choose the more useful routes for the specific scenario, the system does offer its own prioritization of solutions to assist the user. The rank ordering of synthetic routes tries to mimic a chemist’s perspective, but this in itself is not a well-defined entity. Although we know what the essential components are (yield, minimal wastage, few synthetic steps, and robust reactions), striking the (or a) right balance between the terms is tricky. Lowe’s blog post, the paper it refers to, and the ensuing discussion there are very helpful.
links:
http://pipeline.corante.com/archives/2010/06/29/the_ideal_synthesis.php
http://pubs.acs.org/doi/abs/10.1021/jo1006812
posted by Aniko
Induced Protonation State Changes Upon Binding
March 31st, 2010
An interesting article was published recently in the Biophysical Journal (Volume 98, Issue 5, 872-880, 3 March 2010, doi:10.1016/j.bpj.2009.11.016), in which biophysicists recognise the importance of induced protonation state changes upon binding and mention that one of the key practical applications is in structure-based drug design.
Dr. Alexey Onufriev from Virginia Tech and his team investigated three types of ligands (small molecule, protein and nucleic acid) and their ionization state changes upon protein-ligand binding. They concluded that in all three cases substantial changes can be observed in both the ligand and the receptor ionization states upon binding.
This is a very important observation for virtual screening and docking, because it supports our belief that the protonation states of proteins and ligands cannot and should not be prepared and/or fixed in advance for virtual screening experiments. Therefore eHiTS’ method of assigning the protonation states on-the-fly is probably the best method offered to date to solve this problem. For more info on eHiTS’ automated protonation state handling, please see our technical note with the same title on this page: http://www.simbiosys.com/ehits/ehits_technical_notes.html
posted by Aniko
Presentations at the Fields Institute and at the Spring 2010 ACS meeting
March 8th, 2010
The Fields Institute, located in Toronto, is a center for mathematical research activity: a place where mathematicians from Canada and abroad, from business, industry and financial institutions, come together to carry out research and formulate problems of mutual interest.
SimBioSys founder and CSO, Zsolt Zsoldos, who is both a mathematician / computer scientist and a chemist, was recently invited to speak at one of the Fields’ Seminars. This was a great honour and a recognition of the scientific work he does at SimBioSys with his team of exceptional and talented researchers. The title of the March 2nd, 2010 presentation was: “Algorithmic and mathematical challenges in protein-ligand docking and scoring”, a topic that has been a significant part of Zsolt’s work over the past 10 years. Squeezing it into a one-hour session was a huge challenge in itself. Nevertheless, there were many sparkling eyes in the audience, and hopefully the topic created enough interest that we’ll see a few more mathematicians in this challenging field of science in the future. You can check out Zsolt’s talk at: http://www.simbiosys.com/science/presentations/index.html#2010
The audio and slides of the talk will also be posted shortly on the Fields Institute’s web site at: http://www.fields.utoronto.ca/audio/#optimization_seminar
Another interesting upcoming talk by a SimBioSys scientist will be given by Orr Ravitz at the spring 2010 ACS meeting in San Francisco. He will be talking about “Improving molecular docking through eHiTS’ tunable scoring function”, in the Drug Discovery session on Monday March 22, 2010 at 10:00 am.
Abstract: The molecular docking paradigm has been hampered by the lack of a generically well performing scoring function. We present two complementary family-based approaches for score-tuning that improve docking performance using experimental data. One technique treats the relative weights of the eHiTS energy terms as parameters that can be adjusted to improve score-RMSD correlations. The other technique employs ligand-based similarity to rescale the docking score such that better enrichment factors are achieved in virtual screening. We discuss the algorithmic details of the methods, and demonstrate the effects of score tuning on a variety of targets, including CDK2, BACE1 and AChBP, as well as on common benchmarks. We observe an average improvement of 10% in the top-rank pose RMSD, and a similar improvement for docking success (top pose under 2 Å). An average EF(1%) of 15 is achieved for the targets in the DUD set.
http://abstracts.acs.org/chem/239nm/program/view.php?obj_id=9832&terms=
Should be a discussion starter! Please join us for the session if you’ll be at the ACS meeting in SFO in two weeks, and contact us if you would like to meet with us during the days of the conference.
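For readers unfamiliar with the EF(1%) figure in the abstract, the enrichment factor compares the hit rate among the top-ranked fraction of a screened library with the hit rate in the whole library. Below is a short Python sketch using that common definition; the function name `enrichment_factor`, its parameters, and the sample data are assumptions for illustration and are not taken from eHiTS.

```python
def enrichment_factor(scores, is_active, fraction=0.01, lower_is_better=True):
    """EF = (actives in top fraction / size of top fraction)
            / (all actives / library size)."""
    ranked = sorted(
        zip(scores, is_active),
        key=lambda pair: pair[0],
        reverse=not lower_is_better,
    )
    n_total = len(ranked)
    n_top = max(1, round(n_total * fraction))
    hits_top = sum(active for _, active in ranked[:n_top])
    hits_total = sum(is_active)
    if hits_total == 0:
        raise ValueError("the library contains no actives")
    return (hits_top / n_top) / (hits_total / n_total)


# Made-up library: 200 compounds, 10 actives, score equals rank position.
scores = list(range(200))
active_ids = {0, 3, 10, 20, 30, 40, 50, 60, 70, 80}
labels = [i in active_ids for i in range(200)]
ef = enrichment_factor(scores, labels)  # top 1% = 2 compounds, 1 active
print(round(ef, 2))
```

With 5% actives in the library, an EF(1%) of 1 would mean the ranking is no better than random selection, and 20 would be the maximum achievable in this toy data set.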
posted by Aniko
CLiDE – making chemical information a lot more accessible
January 27th, 2010
As scientists we all learn to cope with ever-growing amounts of information coming from various sources. Scientific information, like virtually all types of information, is predominantly delivered in electronic formats: journal articles, patents, e-books, wiki pages, blogs, etc. We need this information to be readily accessible and searchable, and we archive it on our personal PCs and on our organization’s servers and knowledge bases. As chemists, we have wonderful visualization techniques that allow us to sift through incredible amounts of data and information, but exactly here there is a strong disconnect between the availability of information and its accessibility. 2D images of molecules are pivotal to the way we digest chemistry, and yet, as images, they are not readily amenable to our data mining tools. It would be great if publishers of chemistry articles were to retain the original structures in their electronic documents, and there is no doubt that this will happen some time in the future. But for now, we need a tool that can translate chemistry images into a connection table format, allowing data from the literature to be integrated into existing chemistry software.
CLiDE is an optical chemical structure recognition engine. It extracts connection tables of molecules from 2D images in various formats: PDF, PostScript, JPEG, BMP, PNG, and TIFF. CLiDE has been around for some time now, but in the last two years it finally got the development boost it deserved, making it a cool and useful instrument for every chemist’s toolkit. It is now equipped with a sleek GUI that can be used to read PDF as well as a variety of image file formats. Any time you come across a structure of interest, simply select it, extract it, and save it, or send it to your favourite chemical editor (currently ChemDraw, ISISDraw and SymyxDraw are supported). We all know the feeling of looking at a page full of structures that are relevant to our work and wanting to transfer them to another application, such as an Excel spreadsheet or a docking program, when redrawing them in a graphic editor is tedious and prone to mistakes. CLiDE takes away this hassle. It comes in three flavours that can process a single image at a time (standard), a whole document at a time (professional), or a full library of documents in one go (batch).
Below you’ll find a demo clip of the new CLiDE product. Please contact us to obtain a password to watch it.
In Memoriam: Peter Csizmadia
December 15th, 2009
I just learnt some terrible news that deeply saddened me. Peter Csizmadia, one of the founders of ChemAxon, and the father of Marvin, disappeared on a mountain climbing expedition in China in October 2009. I learnt about Peter a few months ago, when I saw his breathtaking picture on one of the summits of the world with a ChemAxon memorabilia:
http://picasaweb.google.com/real.csizi/PeterCsizmadia#5409623424222466162
Very brave, and very talented - I thought - when I read about his background at the time. Since I know his brother Csizi, the other founder of ChemAxon, and many other exceptionally talented people at ChemAxon, this did not come as a surprise to me. ChemAxon IS a great team of exceptionally talented and hard working people, conquering even the most difficult peaks.
My sincere condolences to the family, the company, and to science. The loss of Peter is a great tragedy. However, his short life was not in vain: the fruits of his work, like Marvin, make his memory live forever.
posted by Aniko
ARChem 2009.1 is released
December 10th, 2009
2009 has been a year of major progress for ARChem, and the system has hit a number of significant milestones that secured its leading position in the field. We wanted to share a few of our achievements, and to extend our gratitude to the many users whose comments have made an impact on the system.
- Chemistry: Several changes to the chemical perception algorithms have been implemented. They improve the way target molecules are addressed, and the way reaction rules are extracted and clustered from reaction databases. Those improvements have made a small set of manually coded reaction rules obsolete, and have enhanced the system’s capability to deal with some of the challenging aspects of organic synthesis such as chemical interference, stereochemistry and regioselectivity.
- Data: As a knowledge-based system, ARChem is highly dependent on the quality and quantity of reaction data encapsulated in commercial databases. We are therefore grateful and proud to have further tightened our relationships with two leaders of the chemical information publishing industry: Elsevier and Symyx. Both the CrossFire Beilstein and Cheminform databases have been fully integrated into the system, covering a vast spectrum of chemical reactions and offering valuable supporting information through the system.
- Breaking up starting materials: The search down a branch of the retrosynthetic tree stops whenever a starting material from the educts database is found. Sometimes it is desirable to break such compounds down into even simpler precursors, since they are expensive to purchase, not in stock, etc. The user can now exclude starting materials matching the target molecules, and find synthetic routes to those compounds.
- Viewing solutions: The ability to browse through the manifold of generated solutions has been dramatically improved by a synoptic view of reaction steps. The user can see a “preview” of the various solutions by inspecting the list of the next proposed precursors, and jump directly to the associated solutions.
- System design: ARChem is now a more complete system which can be used not only as a local installation, but also as an online service. A queueing system, security features, accelerated search times and many other features have upgraded the system’s performance, accessibility and usability.
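The stopping rule described under "Breaking up starting materials" can be illustrated with a toy depth-first search. This Python sketch is not ARChem's search algorithm: the `find_route` function, the dictionary of transforms, and the molecule labels are all hypothetical stand-ins.

```python
def find_route(target, transforms, stock, excluded=frozenset(), max_depth=5):
    """Return one route as a nested dict, or None if no route is found.

    transforms: maps a molecule to a list of alternative precursor lists.
    stock: purchasable starting materials.
    excluded: starting materials the user wants synthesised, not bought.
    """
    # A branch stops as soon as a non-excluded starting material is reached.
    if target in stock and target not in excluded:
        return {"molecule": target, "buy": True}
    if max_depth == 0:
        return None
    for precursors in transforms.get(target, []):
        subroutes = [
            find_route(p, transforms, stock, excluded, max_depth - 1)
            for p in precursors
        ]
        if all(route is not None for route in subroutes):
            return {"molecule": target, "buy": False, "from": subroutes}
    return None


# Made-up data: T is made from A and B; A can itself be made from C and D.
transforms = {"T": [["A", "B"]], "A": [["C", "D"]]}
stock = {"A", "B", "C", "D"}

default_route = find_route("T", transforms, stock)
deeper_route = find_route("T", transforms, stock, excluded={"A"})
print(default_route)
print(deeper_route)
```

By default the branch through A ends at A because it is purchasable; excluding A forces the search one level deeper, to C and D.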
Below is an example of a synthetic route found by ARChem for Maraviroc, an HIV drug that was developed in Pfizer’s labs in Sandwich, UK, and received FDA approval in 2007. ARChem’s solution includes 9 reactions, with 6 steps in the two longest paths. In this case, the retrosynthetic analysis leads all the way back to commercially available starting materials, shown with their corresponding providers and catalog numbers. ARChem supplies a lot more information to complete the experimental details of the synthetic scheme, such as reaction conditions, bibliographic references, and additional starting material providers and catalog numbers.
The synthetic route suggested above was generated completely automatically, with no user intervention. It is a strong demonstration of the huge potential of this concept, and of the accomplishments so far. We look forward to 2010 with plenty of items in the ARChem pipeline, and we are particularly eager to continue the dialogue with our industrial and academic users, a scientific exchange that helps the development process maintain continuous, rigorous and coherent progress.
posted by Orr Ravitz


