Drug Design on the Playstation
Tuesday, March 10th, 2009
David Bradley posted a great article today regarding “Drug Design on the Playstation”. Read all about it here:
http://www.reactivereports.com/chemistry-blog/drug-design-on-the-playstation.html
TORONTO, ON - 26th Feb 2009: SimBioSys Inc. announces the release of eHiTS 2009, a new version of its molecular docking and virtual screening software. The new release builds on eHiTS’ existing strengths: a fine, systematic and exhaustive search algorithm, automatic protonation state handling, and a unique knowledge-based scoring function. It delivers the following new features:
One of the greatest strategic advantages of this new release is the performance gained by porting this accurate docking tool to the Cell platform. Molecular docking is often used as a virtual screening method for large libraries of compounds in an effort to identify potent molecules for pharmaceutical purposes. The substantial computational cost of this process has so far required computer clusters of considerable size, but the level of speedup achieved on the Cell processor allows replacing roughly 10 cluster nodes with a single PlayStation 3. “This is a low-cost and green hardware solution that saves on operational costs like cooling, electricity and space,” says Zsolt Zsoldos, SimBioSys’ chief scientist. “It delivers the same high quality results as traditional platforms, and opens up the virtual screening paradigm to small companies who could not afford the IT infrastructure required for the process.”
In addition to the Cell port, eHiTS’ scoring function has undergone a significant overhaul for this release. “Our knowledge-based approach mandates keeping pace with the most recent publicly available experimental data,” says Zsoldos. “The new scoring function was trained on thousands of PDB structures as well as on activity and binding affinity data.” The current release offers score weight-sets that were tuned for 500 new protein classes. eHiTS attempts to classify the user’s target into one of those families and to use the appropriate scoring scheme, which often provides better correlation of the score with low-RMSD ligand poses and with binding affinity. “These changes were shown to produce cutting-edge performance in enrichment studies, and state-of-the-art binding affinity prediction capability, which are essential to structure-based drug design,” Zsoldos adds.
SimBioSys is confident that this release positions the company at the forefront of the molecular docking field. “eHiTS 2009 provides a very powerful drug-discovery tool, and during the development of this version we have laid the foundations for additional improvements that will follow in the coming months”, summarizes Dr. Zsoldos, “In addition, the PlayStation solution directly delivers on two key issues in today’s dire market conditions: significant cost reduction with no compromise to quality, and lower environmental footprint due to lower power consumption.”
About SimBioSys:
Privately owned, SimBioSys is a recognized leader in the field of rational drug discovery software. Providing a wide range of software solutions, the company is focused on the development of scientific tools to facilitate the drug discovery process. It retains a constant focus on the innovation of algorithms to provide improved throughput and accuracy in the fields of flexible docking, virtual screening and de-novo structure design. SimBioSys is also a pioneer in the field of computer-aided retrosynthetic analysis where it supports chemists through the challenges of organic synthesis. With attention to detail, ease-of-use and improved productivity, SimBioSys has built a strong reputation of delivering state-of-the-art scientific solutions to biotechnology and pharmaceutical companies.
SimBioSys started its venture into retrosynthetic analysis almost by chance, when researchers at Pfizer were looking into CAESA and enquired whether the approach for evaluating synthetic accessibility could be expanded and developed enough to provide full synthetic routes for target molecules. Thus began our journey along a path that has been explored by so many others with limited success so far. SimBioSys, with its inherent computer science and computational chemistry expertise, joined forces with Peter Johnson at the University of Leeds, the mind behind CAESA and a well-recognized organic chemist, to meet the formidable challenge. Fast forward to 2009: ARChem now offers arguably the most comprehensive solution to the great challenge of computer-aided synthesis design.
Given the complexity of chemistry, one cannot but admire and be amazed at the capability of synthetic chemists to build increasingly complex molecules from simple building blocks. ARChem offers chemists an idea-generating tool that can help them jump-start their synthesis design by proposing a manifold of synthetic routes that sometimes utilize less obvious chemistry, and often lead to less frequently used starting materials. This is achieved by ARChem’s exhaustive approach to the retrosynthetic search and, even more importantly, by its automatic mechanism for creating synthetic rules from rich and thorough databases of chemical reactions. The software’s unique way of handling the reaction rule generation process, which is the crux of this endeavour, has been discussed here and in scientific forums such as the ACS national meeting. Now the synthetic chemistry community and the computational chemistry audience can explore the details of the approach and the algorithms in a new article published in the Journal of Chemical Information and Modeling:
SimBioSys scientific publications page or ACS - JCIM page
We are confident that this paper will not only draw attention to ARChem, but will also encourage further research and discussion about the role of computers in synthesis design in the years to come.
We’ve been having the conversation within our company that the two dials of speed and accuracy work counter to each other. So we’ve been saying that, even with the eHiTS Lightning solution, higher accuracy does take longer. We still stand by that, but we are happy about the level of accuracy we can achieve very quickly using the new eHiTS Lightning algorithms. This becomes more obvious when our results are compared to the results of others. There has been a proliferation of arguments for using GPUs as acceleration processors. We believe this is largely driven by GPU manufacturers “looking for new markets”. Zsolt has previously discussed his views on the future of High Performance Computing and commented on GPUs. Our belief is that while GPUs are clearly more “common”, our decision to work with the Cell BE processor can lead to far superior results. Don’t forget that the RoadRunner computer is based on the Cell processor, not GPUs. Did we make the right decision?
We are always watching for innovative solutions in docking, and we acknowledge those scientists pushing towards the edge of performance and excellence. When we saw the recent announcement regarding the DockStar solution from Silicon Informatics, we were interested to see whether they had made some of the promised breakthroughs with their GPU-based solution. Their website promises “With the combined power of the DockStar™ Linux Workstation, NVIDIA’s® Tesla™ GPU’s and our proprietary software kernels, Silicon Informatics’ DockStar™ solution outperforms conventional workstations by 10 - 20+ times.” The system is based on the AutoDock 4.0 software platform. As I commented in my recent blog post, we have been doing a lot of work to validate the performance of eHiTS Lightning, gathering validation data for throughput, pose accuracy and enrichment, so we were interested to compare our data with those of the GPU-based DockStar solution. We’ll report the data in much more detail in a Case Study note presently in development; our observations for now are based on comparing against the information they have on their site.
There are 3 examples posted on the home page of the DockStar site, 1stp, 3ptb and 1hvr, with the results shown below:
| Protein | DockStar AutoDock 4.0 - Rigid (secs) | eHiTS Lightning (secs) | Difference Factor |
| --- | --- | --- | --- |
| 3ptb | 120 | 12 | 10 x |
| 1stp | 180 | 12 | 15 x |
| 1hvr | 720 | 69 | 10 x |
The table shows that, for these three examples, eHiTS Lightning on the Cell processor is at least 10x faster than the GPU-based DockStar solution. Now, this is only a comparison based on speed. Accuracy is clearly just as important, so how do we do there?
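For readers who want to check the arithmetic, here is a minimal Python sketch that recomputes the difference factor from the timings in the table. The dictionary layout and the function name `difference_factor` are just illustrative choices, not part of any eHiTS or DockStar tooling.

```python
# Timings in seconds, copied from the table above.
timings = {
    "3ptb": {"dockstar": 120, "ehits_lightning": 12},
    "1stp": {"dockstar": 180, "ehits_lightning": 12},
    "1hvr": {"dockstar": 720, "ehits_lightning": 69},
}


def difference_factor(slow_secs, fast_secs):
    """Return how many times longer the slow run took than the fast run."""
    return slow_secs / fast_secs


# One factor per protein, keyed by PDB code.
factors = {
    name: difference_factor(t["dockstar"], t["ehits_lightning"])
    for name, t in timings.items()
}

for name, factor in factors.items():
    print(f"{name}: {factor:.1f}x")
```

The 1hvr case works out to slightly more than 10x (720 / 69), which the table rounds down to 10 x.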
We are presently finishing the results for all examples, but one example is shown below, in all its glory! Notice the dramatic difference between the plots. eHiTS Lightning shows the expected behavior, i.e. good (low) scores at low RMSD values, whereas the DockStar/AutoDock scores show no clear correlation with RMSD. These results show that eHiTS Lightning offers not only dramatic speed advantages but also the accuracy advantages we have been espousing. More detail will be published soon.
Img1: AutoDock 4, 250,000 GA, 45 minutes. Note the resultant RMSD distribution.
Img2: eHiTS Lightning on the Cell B/E, 1 minute. Note the nature of the Score/RMSD distribution: most poses are at low RMSD values.
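To make the score/RMSD comparison concrete, the following is a minimal sketch of how such a plot can be summarized numerically. It assumes poses and the reference ligand are given as matching lists of (x, y, z) coordinates in the receptor frame, so no superposition is done, and that lower scores are better. The scores and RMSD values at the bottom are hypothetical toy numbers, not results from eHiTS, AutoDock or DockStar, and the simple Spearman formula used here assumes there are no tied values.

```python
import math


def rmsd(pose, reference):
    """RMSD between two equal-length lists of (x, y, z) atoms, no superposition."""
    if len(pose) != len(reference) or not pose:
        raise ValueError("pose and reference must be non-empty and the same length")
    squared = sum(
        (a - b) ** 2 for p, r in zip(pose, reference) for a, b in zip(p, r)
    )
    return math.sqrt(squared / len(pose))


def ranks(values):
    """Rank positions (0 = smallest); assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result


def spearman(xs, ys):
    """Spearman rank correlation using the no-ties formula."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical toy data: one docking score and one RMSD (in Angstrom) per pose.
scores = [-9.1, -8.4, -7.0, -5.2]
rmsds = [0.8, 1.5, 3.9, 6.3]

rho = spearman(scores, rmsds)  # near +1 when low scores go with low RMSD
near_native = sum(r <= 2.0 for r in rmsds) / len(rmsds)  # fraction within 2 A
print(rho, near_native)
```

A rank correlation close to +1 corresponds to the "low scores at low RMSD" pattern described above, while a value near 0 corresponds to a score that carries little information about pose quality. The 2 Å cutoff is a commonly used convention for a near-native pose, chosen here only for illustration.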
Yesterday I blogged about how excited we are about the latest version of eHiTS. The graph below is a teaser graph showing the difference observed in enrichment between eHiTS 6.2 and eHiTS Lightning.
The graph below communicates the enormous differences we are seeing between the two versions of eHiTS. The results are simply outstanding, especially at the top 2% of the DB, which is the most important part. The plots compare data taken from the paper entitled “Detailed Analysis of Scoring Functions for Virtual Screening” by Stahl and Rarey (J. Med. Chem. 2001, 44, 1035-1042), executed with eHiTS 6.2 and eHiTS Lightning in accuracy 1 mode. We have used the data contained in this manuscript for a number of years to map our progress from version to version. This is the largest jump we’ve seen, and we acknowledge that it took two years to get here. But all good things come to those who wait. The analysis of these data will be discussed in detail in a Case Study document that will be assembled during the holiday season. It is one of many such case studies. Over the next few weeks we will show how eHiTS performs relative to other tools in the marketplace and make similar historical comparisons of performance. What is coming next? There are many more areas where we know we can tweak out further improvements in speed, enrichment and pose accuracy.
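For readers less familiar with enrichment, here is a minimal sketch of the commonly used enrichment factor at the top x% of a ranked database: the hit rate among the top-ranked compounds divided by the hit rate in the whole database. This is the general textbook definition, and we are assuming it here for illustration; it is not necessarily the exact metric plotted above or used in the Stahl and Rarey paper. The toy database is hypothetical.

```python
def enrichment_factor(ranked_is_active, top_fraction):
    """Enrichment factor at the top `top_fraction` of a ranked database.

    ranked_is_active: booleans ordered from best to worst docking score,
    True where the compound is a known active.
    """
    n_total = len(ranked_is_active)
    n_actives = sum(ranked_is_active)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty database with at least one active")
    n_top = max(1, round(n_total * top_fraction))
    hits_in_top = sum(ranked_is_active[:n_top])
    # (hits_in_top / n_top) / (n_actives / n_total), rearranged to one division
    return (hits_in_top * n_total) / (n_top * n_actives)


# Hypothetical toy database: 1000 compounds, 20 actives,
# 8 of which are ranked within the top 2% (top 20 compounds).
ranked = [True] * 8 + [False] * 12 + [True] * 12 + [False] * 968

ef_2pct = enrichment_factor(ranked, 0.02)
print(ef_2pct)
```

An enrichment factor of 1 means the ranking is no better than random selection. The maximum possible value at the top 2% is 50, reached when every compound in that slice is an active, which is why early enrichment is such a sensitive measure for virtual screening.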

When we announced eHiTS Lightning to our users and at conferences, we were greeted with a balance of scepticism and interest. Colleagues working with FPGAs and GPUs wondered why we would do the work to implement on the Cell processor rather than follow more tried and tested methods. Well, we’re not a company to always follow the rules. Fortunately for us, Zsolt Zsoldos, our Chief Technical Officer, is of the nature to pursue the best solution. He is focused on delivering the best solutions for our users and doing the best science possible. Based on some of the news we are seeing of late (to be expanded upon in later posts), we made an appropriate decision in choosing a processor that is becoming mainstream, is certainly outperforming FPGAs and GPUs, and is allowing us to move from the hype associated with what we promised to truly delivering the outstanding performance we expected. For now we are shipping the latest beta release of eHiTS Lightning to our beta testers. What have we done in the latest release?
During the development process we have performed a lot of testing. These tests will form the basis of a series of blog posts and/or technical notes in the near future. For now I’ll summarize the results here. In comparison to the previous beta-release, we have improved:
Relative to our previous eHiTS release on the Intel platform we have achieved:
In our hands we are seeing improvements across the board. The software is now in the hands of our users for their feedback. We will communicate some of the results summarized here in more detail shortly.

posted by Aniko
Those of you who frequent this blog will know that we are working on enhancing our ARChem (http://www.simbiosys.ca/archem/index.html) retrosynthetic analysis software. For those of you who are not aware of it, we presented on ARChem recently (http://www.simbiosys.ca/science/presentations/2008-acs-08/archem_acs_081508.pdf). We are of course getting increasing interest from chemists in the application of ARChem, but chemical vendors are also interested in how ARChem can help position their chemicals in front of potential clients. We are presently working to add additional chemical vendor catalogues to ARChem and are starting with the catalogue of Alfa Aesar, a company now listed on our partners page (http://www.simbiosys.ca/company/partners.html). Other chemical vendors will be added in the next few months.
As developers of scientific software and algorithms, we of course have our own approaches to validating the performance of our software tools. We spend a lot of time testing, validating and improving our algorithms but, clearly, we apply our tools in a manner that might differ from that of our users, because of our deeper understanding of the software, its option settings and so on. Despite our best efforts to document the software and improve ease of use, different performance in the hands of different users is, in some ways, inevitable. Therefore we see the results of our users as the truer test of the performance of our software and, where appropriate, we coach and mentor our users on how to get improved performance from our tools. One of the common issues is that users treat all docking software in the same way. They assume that because they have to do a lot of preparation of input data for software applications such as GLIDE, they have to do the same preparation work for eHiTS. This is not true. In fact, preparing data in the same way for different software tools can be disastrous, and comparing the results is then invalid.
With this in mind we really appreciate the efforts of independent researchers to perform comparisons of performance of different software tools AND put in the work to understand, in depth, how to use the software tools properly. This includes data preparation, setting the appropriate parameters and examining the output in detail. Our experience is that eHiTS outperforms most other docking software available in the marketplace. This has been validated a number of times by our users in their feedback to us, but unfortunately not always published for public consumption. An example of such performance was shown in our examination of a Merck dataset as discussed here:
http://www.simbiosys.ca/ehits/ehits_enrichment.html and shown below in the image. We’ve seen such performance capabilities many times.

In the past few weeks we have received a number of manuscripts from our users. These have been submitted for peer-reviewed publication and describe the application of eHiTS to their problems and, in some cases, compare the performance of eHiTS with that of other software packages. What we continue to see is eHiTS consistently outperforming other algorithms, both commercial and in-house developments. We are, of course, proud of such achievements and feel validated in our own testing when our users can obtain such results independent of our coaching. These publications should be released in the near future and we will of course point you to them as they are released.
This week we held two design strategy meetings with Life Science companies from the East Coast. One was focused on a new de novo design package that we are working on while the other was for ARChem, our retrosynthetic analysis software platform (http://www.simbiosys.ca/archem/index.html). I’ll comment on the ARChem meeting I led while Zsolt can comment separately on the de novo design meeting he led.
When designing our software solutions we engage our users in providing feedback regarding their needs, their biases in terms of scientific approaches, and their thoughts about improving workflow, usability, algorithms and so on. ARChem has been used in large pharma for about 3 years, and our recent installations in new companies, along with the resulting feedback from the users, have us focused on the next release cycle for the product. With this in mind we chose a different approach to gathering input.
We brought together scientists involved in the original design for ARChem (and therefore experienced users) as well as chemists who had recently trialed the system. We were interested to hear in a public forum, with issues regarding proprietary approaches put to one side, what would they like to see implemented in ARChem to satisfy their needs and move ARChem one step closer to being an ideal platform for chemists to perform computer assisted retrosynthetic analysis. By the end of the meeting we had rank ordered over 30 specific requests that came up in the meeting and the collective attendees had agreed to the primary issues to address for their needs. Some of the requests we had not even considered prior to this meeting and it was definitely one of the best uses of time and a great design session based on user needs…one to be repeated. We are off to work on the outcomes of the meeting and will keep you informed here of our progress.
posted by Aniko
Some of you are likely aware of ARChem, our retrosynthetic analysis software (http://www.simbiosys.ca/archem/index.html). ARChem is the result of 4 years of development and results from a collaborative project with a major pharmaceutical company. Since then we have delivered the system to a number of other companies and we have recently submitted a publication to JCIM (http://pubs.acs.org/journals/jcisd8/index.html). It should be in press shortly. We presented on ARChem recently at the ACS meeting and a copy of the presentation is here.
What we’ve been up to recently is delivering on the needs of some of our users: integration with the latest ChemDraw ActiveX component (http://www.cambridgesoft.com/software/details/?ds=2&dsv=92), expanding the list of starting material databases supported by ARChem and, our most exciting news, working with the entire Beilstein reaction database (http://en.wikipedia.org/wiki/Beilstein_database). I had reported previously that we had been working with the Beilstein database (http://www.simbiosys.ca/blog/2008/05/30/29/). Since then we have reached an agreement with Elsevier to utilize the entire reaction database in order to train our clustering algorithms. More about this in the future, but an example image using the Beilstein database is shown below. Notice that an example reaction from the Beilstein database is displayed on the left of the image.

Over the next few weeks you will see us blogging about the future development of ARChem. We are about to have a roundtable meeting with thought leaders from large pharma regarding the future development of ARChem, and we will use the outcomes of this meeting to guide our development during our next coding cycle.