Archive for the 'Science' Category

CLiDE for Converting Structure Images to Structure Files

Monday, June 15th, 2009

SimBioSys is a distributor of CLiDE, a software package that converts images of chemical structures into structure files that can be read and interpreted by chemistry software packages such as ChemDraw and ISIS Draw. The package was originally developed by two of our founders: Aniko Simon PhD, a computer scientist who is currently VP of Business Development at SimBioSys, and, over many years, Prof A. Peter Johnson (http://www.chem.leeds.ac.uk/People/Johnson.html), an expert in de-novo structure design, synthetic chemistry and the application of software to chemical structures. CLiDE is installed in organizations around the world and for many years held a unique position. A new publication on CLiDE, by the current development team headed by Aniko T. Valko, came out a few weeks ago; see the full citation at:
SimBioSys scientific publications page or ACS - JCIM page

This recent paper systematically evaluates CLiDE Pro’s performance on a large variety of structures, a set that surpasses our previous validation set for CLiDE. The authors are offering this new, carefully selected test set for base-lining and testing other optical chemical structure recognition (OCSR) tools. They suggest that this test set could be the starting point for a community-based effort to establish a benchmarking test set that would include different categories of images, each of which deals with specific problem types.
This new OCSR baseline test set is available from the publisher of the CLiDE paper as supporting information, and can also be downloaded from our website: http://www.simbiosys.ca/clide/validation.html

A Comprehensive Scoring Evaluation Paper from McGill

Monday, June 1st, 2009

Scoring is undoubtedly the most challenging aspect of docking. A new, comprehensive scoring evaluation paper was recently published by Nicolas Moitessier’s group from McGill in the Journal of Chemical Information and Modeling. The group, which is actively pursuing development of its own docking and scoring methods (Fitted and RankScore), evaluated the effect of protein flexibility and water molecules on the performance of 18 different scoring functions, and placed eHiTS among the top performers.

http://pubs.acs.org/doi/abs/10.1021/ci8004308

Docking Ligands into Flexible and Solvated Macromolecules. 4. Are Popular Scoring Functions Accurate for this Class of Proteins?
Pablo Englebienne and Nicolas Moitessier
Publication Date (Web): May 15, 2009 (Article)
DOI: 10.1021/ci8004308

TOC picture

We greatly value this recognition, and it reassures us as developers that our special approach to scoring is delivering good results. However, we also bear in mind that scoring functions are still not performing at a desirable level, and that the docking paradigm critically depends on the ability to rank poses and to evaluate binding energies in a way that will enhance the predictive capabilities of in-silico models. We are therefore continuously working on improving our algorithms and our scoring function, and we believe that the scoring of our newest release, eHiTS 2009, is already an improvement over the 6.2 version that was used in the comparative study.

Our commitment to our users and to high scientific standards is among our core values, and we trust that the next release of eHiTS will raise the bar of scoring even higher.

eHiTS Lightning in the C&E News Digital Briefs today

Monday, May 25th, 2009

Less than a year ago, C&EN published an interesting article about the world’s fastest computer: “World’s Fastest Computer Debuts” http://pubs.acs.org/cen/news/86/i24/8624notw3.html
Today the subject was revisited by C&EN’s Digital Briefs section (http://pubs.acs.org/isubscribe/journals/cen/87/i21/html/8721sci3.html / mirrored on SimBioSys media page), which featured eHiTS Lightning, the docking and screening product of SimBioSys, running on that same platform, i.e. IBM’s Cell/B.E. chip multiprocessor. This state-of-the-art chip powers Sony’s affordable PlayStation 3 (PS3), making superfast computing a reality for drug discoverers everywhere at a price of only $400 per machine. This rapid response to a technology paradigm shift was achieved through the technical brilliance of our founder Zsolt Zsoldos and the diligent work of our excellent programmers.

Users around the world have already started running eHiTS Lightning on PS3 clusters, making it remarkably fast and economical. See the quote from Trinity College Dublin, Ireland here:
http://www.simbiosys.ca/support/index.html#quotes
A technical note about the speedup achieved on the PS3 can be found here:
http://www.simbiosys.ca/ehits/ehits_technical_notes.html

posted by:
Aniko

Fragment Based Pose Prediction and Affinity Scoring

Monday, May 4th, 2009

Fragment based drug design (FBDD) approaches to inhibitor development often include in silico surveys of the binding preferences of small ligands against the target binding site. FBDD has been applied both in the course of drug development and retrospectively, with truly useful outcomes.[1] High throughput docking approaches that exhaustively probe the binding site can be useful if they are validated for their ability to reproduce known structural biology solutions that delineate preferred fragment poses and binding affinities. A number of careful benchmark problems exist for calibrating docking and scoring approaches to this task. One of them is the series of papers by Shoichet and coworkers, who developed a small, well characterized cavity in T4 Phage Lysozyme via mutations at sequence positions 99 and 102. In particular, the L99A and M102Q mutations were used to create a cavity, fragment libraries were then screened against the target binding site, bound fragments were characterized crystallographically, and predicted and actual binding affinities were compared.[2,3,4] Figure 1A shows a depiction of the binding site with 2-allyl-6-methyl-phenol bound. Hydrophobic contacts complemented by one hydrogen bond determine the x-ray pose. Figure 1B shows some of the characteristic fragments employed in one of the studies.
Figure 1A: 1OV7 Fragment

Figure 1B: Fragments T4L Series

Figure 2A shows the eHiTS pose prediction accuracy for a small fragment series (shown in Figure 1B), with the closest and top-rank poses depicted. Note that four of the six cases gave top-rank docking poses with an RMSD of ~0.5 angstrom. Two of the six cases are not acceptable, with top-rank values of ~2.5 angstroms, but simple family training on the series leads to the pose accuracy profile comparison shown in Figure 2B. The results illustrate two facets: 1) generally speaking, eHiTS pose accuracy for fragment prediction will be sub-1-angstrom, and 2) simple family training (tuning) on 4-5 ligand/fragment bound crystal structures improves the pose prediction. This weight tuning procedure, while often unnecessary, is accomplished in a matter of minutes with the eHiTS Tuning package, which enables the user to develop a customized scoring function.

Figure 2A: top-rank and closest poses

Figure 2B: after family training

One also wants to be able to predict the affinities of fragment libraries for a target binding site. This is a challenging problem and, generally speaking, docking protocols perform better at pose prediction than at affinity scoring. Nevertheless, when one is screening fragments for their probable binding preferences in the cavity, it would be useful to have at least a semiquantitative ranking. A classic example of fragment based screening via x-ray crystallography leading to the discovery of inhibitors lies in the work of Congreve and coworkers, who developed inhibitors of beta-secretase.[5] A series of fragments and refinements on the path of optimization are shown in Figure 3, along with their binding affinities, molecular weights and logP values.

Figure 3: BACE1 fragments and inhibitors

Figure 4, Panel A shows the IC50/score correlation for eHiTS, and Panel B shows a molecular mechanics Poisson-Boltzmann (MM-PBSA) scoring of several of the fragments/ligands. The correlation between eHiTS score and ln(IC50) had an R2 of 0.61 and a Pearson coefficient of 0.78, while the MM-PBSA scoring had an R2 of 0.51 and a Pearson coefficient of 0.72. The eHiTS scoring took on the order of 7 minutes for this set, while the MM-PBSA scoring, including charge assignment, parameterization and simulation, took several hours. While ‘affinity estimation’ via docking scoring functions can only be approximate, the eHiTS scoring function is adequate to the task. Figure 4B highlights the fact that the eHiTS score correlates well with the enthalpic portion of the MM-PBSA free energy. This short synopsis of more detailed in-house studies illustrates how eHiTS docking and scoring provides a good underpinning for a fragment based de novo approach to inhibitor design.

Figure 4A: eHiTS score/affinity correlation

Figure 4B: eHiTS/MM-PBSA correlation
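The correlation statistics quoted above (R2 and the Pearson coefficient) can be reproduced for any set of scores and measured IC50 values. The following is a minimal sketch, not the procedure used in the in-house study: the function name `pearson_r` and the score/IC50 values are hypothetical and are only there to make the example runnable. For a straight-line fit of one variable against another, R2 is the square of the Pearson coefficient, which is consistent with the pairs of values quoted in the text to within rounding.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical docking scores and IC50 values in mol/L.
# These are NOT the data from the study described in the post.
scores = [-2.1, -3.0, -3.4, -4.2, -5.1]
ic50_molar = [2e-3, 3e-4, 4e-4, 2e-5, 1e-6]

# Correlate the score against ln(IC50), as in the text.
ln_ic50 = [math.log(value) for value in ic50_molar]
r = pearson_r(scores, ln_ic50)
r_squared = r * r  # R2 of a simple linear fit equals r squared

print(f"Pearson r = {r:.2f}, R2 = {r_squared:.2f}")
```

With real data, the same two numbers can be compared directly between scoring methods, as is done for eHiTS and MM-PBSA in the text.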

REFERENCES

[1] Congreve, M., Chessari, G., Tisi, D., Woodhead, A.J., “Recent developments in fragment-based drug discovery.” J. Med. Chem. 51:3661-80 (2008).
[2] Wei, B.Q., Baase, W.A., Weaver, L.H., Matthews, B.W., Shoichet, B.K., “A Model Binding Site for Testing Scoring Functions in Molecular Docking.” J. Mol. Biol. 322:339-355 (2002).
[3] Wei, B.Q., Baase, W.A., Weaver, L.H., Ferrari, A.M., Matthews, B.W., Shoichet, B.K., “Testing a Flexible-receptor Docking Algorithm in a Model Binding Site.” J. Mol. Biol. 327:1161-1182 (2004).
[4] Graves, A.P., Shivakumar, D.M., Boyce, S.E., Jacobson, M.P., Case, D.A., Shoichet, B.K., “Rescoring Docking Hit Lead Lists for Model Cavity Sites: Predictions and Experimental Testing.” J. Mol. Biol. 377:914-934 (2008).
[5] Congreve, M., Aharony, D., Albert, J., Callaghan, O., Campbell, J., Carr, R.A., Chessari, G., Cowan, S., Edwards, P.D., Frederickson, M., McMenamin, R., Murray, C.W., Patel, S., Wallis, N., “Application of fragment screening by X-ray crystallography to the discovery of aminopyridines as inhibitors of beta-secretase.” J. Med. Chem. 50:1124-32 (2007).

Posted by Dan Harris

Sharing the presentations from the Spring 2009 ACS and CHI Tri-Conference

Tuesday, March 31st, 2009

The SimBioSys team is back from the ACS Meeting in Salt Lake City. Attendance was a bit low, but the quality of some sessions, such as FBDD and Adaptive Scoring Functions, was high. There were lively discussions in drug discovery, and great events as usual, even with the smaller crowd.

Our Spring ACS 2009 presentations are posted at: http://www.simbiosys.ca/science/presentations/index.html
or you can also look at:
http://www.simbiosys.ca/science/presentations/2009-03-acs/index.html

We have also posted our poster presentation from CHI’s Molecular Medicine Tri-Conference (Feb 25-27 2009, San Francisco):
http://www.simbiosys.ca/science/presentations/2009-02-Tri-Conf/SimBioSys_TriConference_Poster.pdf

You’re welcome to take a look.

by Aniko

SimBioSys science in the spotlight

Monday, March 16th, 2009

We are less than one week away from catching our flights out west to beautiful Salt Lake City for another American Chemical Society meeting. We will of course do what vendors do: meet with our users, spend time in our booth #316, and have friendly exchanges with the other players in our domain. Readers of this blog will be aware that one of our primary areas of research is docking. It is no secret that SimBioSys operates in a highly competitive environment. Over the years many academic and commercial groups have done excellent work in innovating new methods in the world of docking. In parallel we have been delivering our own innovative contributions to this area, and this is reflected in the high visibility of the company in the technical program of the 237th ACS meeting: a total of seven presentations will be delivered by SimBioSys scientists.

If you’ve been watching this blog you will have seen that we were the first to deliver a working docking solution on a Cell processor <http://www.simbiosys.ca/blog/?s=cell> (either the PlayStation PS3 or the IBM Cell Processor Blades). We also developed a novel scoring function that works well with our exhaustive, fragment based eHiTS docking engine <http://www.simbiosys.ca/blog/2009/03/10/fragment-pose-prediction-and-score-rmsd-correlations/>. We delivered the LASSO approach <http://www.simbiosys.ca/ehits_lasso/index.html> to examine 3D ligand activity surface-based similarity. We continue to do fundamental research (e.g. in scoring) to improve the scientific algorithms underlying our software. We are driven to deliver to our users the best algorithms, the most appropriate workflows and simply the best tools for docking.

Driven by our science, we are motivated to present our work at conferences such as the ACS meeting. With this in mind we submitted a number of talks. The list is below and includes our work on fragment based applications of eHiTS and SPROUT, the eHiTS speedup on the Cell platform and the new scoring function of eHiTS, as well as CAESA, the retro-synthetic scoring function. In addition, SimBioSys will exhibit all of these software tools at Booth #316. Our applications scientist Danni Harris, our chief scientist and founder Zsolt Zsoldos, and Peter Johnson will be delivering a total of seven presentations. That’s a lot of presentations even for a company 5 times our size. If you’re at the ACS, please contact us to set up a one-on-one meeting with our scientists for scientific discussion and consultation. If you aren’t there, we will post the presentations on our website, here: http://www.simbiosys.ca/science/presentations/2009-03-acs/

  1. COMP 1
    Session: Advancing Computational Chemistry through High-Performance Computing: From the Workstation to Petascale and Beyond: Michael Dewar Memorial Symposium

    Sunday, March 22, 2009 from 8:30 AM to 9:10 AM; SPCC — 257, Oral
    “Docking performance accelerated 30-50 fold on the Cell/BE processor”
    Presenter: Zsolt Zsoldos, See abstract

  2. CINF 044
    SESSION: Library Design, Search Methods and Applications of Fragment-based Drug Design

    Tuesday, March 24, 2009 from 10:10 AM to 10:40 AM; SPCC — 254 A, Oral
    “Fragment based docking and linking engine of eHiTS”
    Presenter: Zsolt Zsoldos, See abstract

  3. CINF 063
    SESSION: Adaptive Scoring Functions

    Wednesday, March 25, 2009 from 10:00 AM to 10:35 AM; SPCC — 254 A, Oral
    “eHiTS scoring function”
    Presenter: Zsolt Zsoldos, See abstract

  4. COMP 208
    SESSION: Drug Discovery
    Thursday, March 26, 2009 from 8:30 AM to 9:00 AM; SPCC — 258, Oral
    “eHiTS: Docking and scoring ligand/target interactions to give good score-rmsd and ic50 correlations in in silico high throughput screening”
    Presenter: Danni Harris, See abstract

  5. CINF 033
    “Computational tools for fragment based drug design”
    Monday, March 23, 2009 Salt Palace Convention Center — 254 A, Oral; 1:35 PM
    Presenter: A. Peter Johnson, See abstract

  6. COMP 214
    “Computational approaches to antibacterial and antimalarial hit finding”

    Thursday, March 26, 2009 Salt Palace Convention Center — 257, Oral, 1:00 PM
    Presenter: A. Peter Johnson, See abstract

  7. CINF 073
    “Scoring synthetic feasibility: A very different problem”
    Wednesday, March 25, Salt Palace Convention Center — 254 A, Oral, 4:05 PM
    Presenter: A. Peter Johnson, See abstract

One additional exciting presentation will be given by Peter Johnson for our partner Elsevier. Prof. Johnson will be a guest speaker at Elsevier’s ACS launch of Reaxys, a new innovation from CrossFire Beilstein:

where: Special Events Pavilion, ACS exhibition hall,
when: Tues, Mar 24, between 2:00 pm and 3:30pm
title: “An introduction to Reaxys - the workflow solution for synthetic chemistry”

See more about this event at the Elsevier web-site: http://www.info.reaxys.com/event

Fragment Pose Prediction and Score-RMSD Correlations

Tuesday, March 10th, 2009

From the perspective of computational de novo screening, the ability to accurately predict the potential bound configurations of small, low-molecular-weight fragments to a target is essential. The challenge is significant given that the binding site may be large compared to the overall size of the fragment, and the affinity of the fragments for particular regions of the binding site may be on the order of mM or weaker. Finally, it is important not only to predict the poses with highest binding affinity (with small root mean square deviation to known structural biology solutions) but also to predict a diverse ensemble of bound configurations of each fragment with a large distribution of binding free energies to the target. All of these requirements are crucial given that de novo/fragment design aims to obtain a robust nM affinity binding ligand by tethering weakly bound fragments with affinities in the mM to uM range.
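To put the mM, uM and nM affinity ranges on an energy scale, one can use the standard thermodynamic relation between a dissociation constant and the binding free energy, ΔG = RT ln(Kd), with Kd expressed relative to a 1 M standard state. The sketch below is only an illustration of that textbook relation and is not part of eHiTS; the function name `binding_free_energy` and the choice of 298.15 K are assumptions made for the example.

```python
import math

# Gas constant in kcal/(mol*K)
GAS_CONSTANT_KCAL = 1.987204e-3


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation
    constant given in mol/L, relative to a 1 M standard state."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return GAS_CONSTANT_KCAL * temperature_k * math.log(kd_molar)


# Representative affinities for a weak fragment, an improved
# fragment and an optimized ligand (illustrative values only).
for label, kd in (("1 mM", 1e-3), ("1 uM", 1e-6), ("1 nM", 1e-9)):
    print(f"{label}: {binding_free_energy(kd):7.2f} kcal/mol")
```

At this temperature the three cases come out at roughly -4.1, -8.2 and -12.3 kcal/mol, so each 1000-fold gain in affinity corresponds to about 4 kcal/mol of additional binding free energy.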

From our perspective, the requirements for a fragment docking and scoring approach employed in a de novo design paradigm are:
1) a procedure which provides exhaustive sampling of the potential bound fragment configurations in the binding site,
2) a sampling regime in which modest affinity fragment locations are sampled more effectively than low affinity sites,
3) for cases where structural biology solutions of the bound fragment configuration are known, a good correlation between the predicted docking score and a low RMS deviation of the docked pose from the structural biology solution.
A simple example is shown below that exemplifies these characteristics for the particular case of docking the fragment found in PDB entry 1OV7, where a small phenolic fragment (2-allyl-6-methyl phenol) is bound to T4 Phage Lysozyme.

Panel A in Figure 1 shows the overlap of the best scoring pose of the phenolic fragment with the crystal structure pose. The top scoring pose has an RMSD of 0.5 Å with respect to the crystallographic configuration. Panel B shows the correlation of the pose root mean square deviation with the eHiTS docking score. The plot illustrates that good pose scores correlate with low RMSD, particularly in the low RMSD/good score region. Panels C and D show the histograms of the RMS deviations and scores of the docked fragment poses. Note that the distributions peak in the good score/low RMSD regime, analogous to what is seen in the RMSD vs. Score plot.
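For readers who want to compute the pose RMSD used throughout this post, the following is a minimal sketch. It assumes the docked pose and the crystallographic pose list the same atoms in the same order and are already in the same receptor coordinate frame, so no superposition is applied; it does not handle symmetry-equivalent atoms. The function name `pose_rmsd` and the coordinates are hypothetical and are not taken from 1OV7.

```python
import math


def pose_rmsd(pose, reference):
    """Root mean square deviation (same units as the input, e.g. angstrom)
    between two poses given as lists of (x, y, z) atom coordinates.
    Atoms must be listed in the same order in both poses."""
    if len(pose) != len(reference) or not pose:
        raise ValueError("poses must be non-empty and have the same atom count")
    squared_sum = 0.0
    for atom, ref_atom in zip(pose, reference):
        squared_sum += sum((a - b) ** 2 for a, b in zip(atom, ref_atom))
    return math.sqrt(squared_sum / len(pose))


# Hypothetical three-atom fragment: the docked pose is the crystal pose
# shifted by (0.3, 0.4, 0.0) angstrom, i.e. 0.5 angstrom per atom.
crystal_pose = [(1.0, 2.0, 3.0), (2.5, 2.0, 3.0), (3.0, 3.5, 3.0)]
docked_pose = [(x + 0.3, y + 0.4, z) for x, y, z in crystal_pose]

print(f"RMSD = {pose_rmsd(docked_pose, crystal_pose):.2f} angstrom")
```

Computing this value for every docked pose and pairing it with the corresponding docking score gives the kind of score vs. RMSD scatter plot described for Panel B.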

Figure 1: 1OV7

I’ll be back in the coming days to talk about fragment affinity scoring!

Posted by Dan Harris

SimBioSys Inc. Releases a New Version of eHiTS

Thursday, February 26th, 2009

TORONTO, ON - 26th Feb 2009: SimBioSys Inc. announces the release of eHiTS 2009 - a new version of its molecular docking and virtual screening software. The new release builds on eHiTS’ strengths: its fine, systematic and exhaustive search algorithm, its automatic protonation state handling, and its unique knowledge-based scoring function. It delivers the following new features:

One of the greatest strategic advantages of this new release is its enhanced performance: this accurate docking tool has been ported to the Cell platform. Molecular docking is often used as a virtual screening method for large libraries of compounds in an effort to identify potent molecules for pharmaceutical purposes. The substantial computational cost of this process has so far required computer clusters of considerable size, but the level of speedup achieved on the Cell processor allows roughly 10 cluster nodes to be replaced with a single PlayStation 3. “This is a low-cost and green hardware solution that saves on operational costs like cooling, electricity and space,” says Zsolt Zsoldos, SimBioSys’ chief scientist, “it delivers the same high quality results as traditional platforms, and opens up the virtual screening paradigm to small companies who could not afford the IT infrastructure required for the process”.

In addition to the Cell port, eHiTS’ scoring function has undergone a significant overhaul for this release. “Our knowledge-based approach mandates keeping pace with the most recent publicly available experimental data”, says Zsoldos, “the new scoring function was trained on thousands of PDB structures as well as on activity and binding affinity data”. The current release offers score weight-sets that were tuned for 500 new protein classes. eHiTS attempts to classify the user’s targets into one of those families and to use the appropriate scoring scheme, which often provides better correlations of the score with low RMSD ligand poses and with binding affinity. “These changes were shown to produce cutting-edge performance in enrichment studies, and state-of-the-art binding affinity prediction capability, which are essential to structure-based drug design,” Zsoldos adds.

SimBioSys is confident that this release positions the company at the forefront of the molecular docking field. “eHiTS 2009 provides a very powerful drug-discovery tool, and during the development of this version we have laid the foundations for additional improvements that will follow in the coming months”, summarizes Dr. Zsoldos, “In addition, the PlayStation solution directly delivers on two key issues in today’s dire market conditions: significant cost reduction with no compromise to quality, and lower environmental footprint due to lower power consumption.”

About SimBioSys:
Privately owned, SimBioSys is a recognized leader in the field of rational drug discovery software. Providing a wide range of software solutions, the company is focused on the development of scientific tools to facilitate the drug discovery process. It retains a constant focus on the innovation of algorithms to provide improved throughput and accuracy in the fields of flexible docking, virtual screening and de-novo structure design. SimBioSys is also a pioneer in the field of computer-aided retrosynthetic analysis where it supports chemists through the challenges of organic synthesis. With attention to detail, ease-of-use and improved productivity, SimBioSys has built a strong reputation of delivering state-of-the-art scientific solutions to biotechnology and pharmaceutical companies.

eHiTS Lightning 2009

New Paper on ARChem / Route Designer

Friday, February 13th, 2009

SimBioSys started its venture into retrosynthetic analysis almost by chance, when researchers at Pfizer were looking into CAESA and enquired whether the approach for evaluating synthetic accessibility could be expanded and developed enough to provide full synthetic routes for target molecules. Thus began our journey along a path that has been explored by so many others with limited success so far. SimBioSys, with its inherent computer science and computational chemistry expertise, joined forces with Peter Johnson at the University of Leeds - the mind behind CAESA and a well recognized organic chemist - to meet the formidable challenge. Fast forward to 2009: ARChem now offers arguably the most comprehensive solution to the great challenge of computer aided synthesis design.

Given the complexity of chemistry, one cannot but admire and be amazed at the capability of synthetic chemists to build increasingly complex molecules from simple building blocks. ARChem offers chemists an idea-generating tool that can help them jump-start their synthesis design by proposing a manifold of synthetic routes that sometimes utilize less obvious chemistry, and often lead to less frequently used starting materials. This is achieved by ARChem’s exhaustive approach to the retrosynthetic search and, even more importantly, by its automatic mechanism for creating synthetic rules from rich and thorough databases of chemical reactions. The software’s unique way of handling the reaction rule generation process, which is the crux of this endeavour, has been discussed here and in scientific forums such as the ACS national meeting. Now the synthetic chemistry community and the computational chemistry audience can explore the details of the approach and the algorithms in a new article published in the Journal of Chemical Information and Modeling:

SimBioSys scientific publications page or ACS - JCIM page

We are confident that this paper will not only draw attention to ARChem, but will also encourage further research and discussion about the role of computers in synthesis design in the years to come.

Is FAST High Quality Docking Possible? The Data Say Yes…

Thursday, December 18th, 2008

We’ve been having the conversation within our company that the two dials of speed and accuracy work counter to each other. So we’ve been saying that, even with the eHiTS Lightning solution, higher accuracy does take longer. We still stand by that, BUT what we are happy about is the accuracy we can achieve very quickly using the new eHiTS Lightning algorithms. This becomes more obvious when our results are compared to the results of others. There has been a proliferation of arguments for using GPUs as acceleration processors; we actually believe this is simply because of the business driver of “looking for new markets” for the GPU manufacturers. Zsolt has previously discussed his views on the future of High Performance Computing and commented on GPUs. Our belief is that, while GPUs are clearly more “common”, our decision to work with the Cell BE processor can lead to far superior results. Don’t forget that the RoadRunner computer is based on the Cell Processor, not GPUs. Did we make the right decision?

We are always watching for innovative solutions in docking. We acknowledge those scientists pushing towards the edge of performance and excellence. When we saw the recent announcement regarding the DockStar solution from Silicon Informatics, we were interested to see whether they had made some of the promised breakthroughs with their GPU-based solution. Their website promises “With the combined power of the DockStar™ Linux Workstation, NVIDIA’s® Tesla™ GPU’s and our proprietary software kernels, Silicon Informatics’ DockStar™ solution outperforms conventional workstations by 10 - 20+ times.” The system is based on the Autodock 4.0 software platform. As commented in my recent blogpost, we have been doing a lot of work to validate the performance of eHiTS Lightning, gathering validation data for throughput, pose accuracy and enrichment, so we were interested to compare our data with those of the GPU-based DockStar solution. We’ll report the data in much more detail in a Case Study note presently in development, but our observations at present are based on comparing with the information they have on their site.

There are 3 examples posted on the home page of the DockStar site, 1stp, 3ptb and 1hvr, with the results shown below:

Protein   DockStar AutoDock 4.0 - Rigid (secs)   eHiTS Lightning (secs)   Difference Factor
3ptb      120                                    12                       10 x
1stp      180                                    12                       15 x
1hvr      720                                    69                       10 x

The table shows us that, for these three examples at least, we see a performance difference of 10x or more for the Cell processor versus the GPU-based DockStar solution. Now, this is only a comparison based on speed. Accuracy is clearly just as important, so how do we do there?
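The difference factors in the table are simply the ratio of the two run times for each protein. As a quick check, the arithmetic can be sketched as follows, using the timings reported above; the helper name `speedup` is only illustrative.

```python
# Run times in seconds as reported in the table above:
# (DockStar AutoDock 4.0 rigid, eHiTS Lightning)
timings = {
    "3ptb": (120, 12),
    "1stp": (180, 12),
    "1hvr": (720, 69),
}


def speedup(slow_seconds, fast_seconds):
    """Ratio of the slower run time to the faster run time."""
    return slow_seconds / fast_seconds


factors = {name: speedup(*pair) for name, pair in timings.items()}

for name, factor in factors.items():
    print(f"{name}: {factor:.1f} x")
```

The ratios come out at 10.0 for 3ptb, 15.0 for 1stp and about 10.4 for 1hvr, which the table rounds to 10 x.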

We are presently finishing the results for all examples, but one example is shown below, in all its glory! Notice the dramatic performance difference in the plots. eHiTS Lightning shows the expected behavior, i.e. good (low) scores at low RMSD values, whereas the DockStar/AutoDock score shows no such correlation with accuracy. These results show that eHiTS Lightning offers not only dramatic speed advantages but also the accuracy advantages we have been espousing. More detail will be published soon.

Img1: Autodock 4 on 1hvr, 250,000 GA, 45 minutes. Note the resultant RMSD distribution.

Img2: eHiTS Lightning on 1hvr, on the Cell/B.E., 1 minute. Note the nature of the Score/RMSD distribution: most poses are at low RMSD values.