<?xml version="1.0" encoding="UTF-8"?><!-- generator="wordpress/2.0.11" -->
<rss version="2.0" 
	xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
	<title>Comments on: Floating point errors</title>
	<link>http://www.simbiosys.com/blog/2008/05/28/floating-point-errors/</link>
	<description>Addressing the challenges of computational drug discovery</description>
	<pubDate>Fri, 03 Sep 2010 19:30:16 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.0.11</generator>

	<item>
		<title>by: zsolt</title>
		<link>http://www.simbiosys.com/blog/2008/05/28/floating-point-errors/#comment-1509</link>
		<pubDate>Fri, 05 Sep 2008 17:53:24 +0000</pubDate>
		<guid>http://www.simbiosys.com/blog/2008/05/28/floating-point-errors/#comment-1509</guid>
					<description>Trevor,

No, unfortunately, you are wrong for 2 reasons:

1. First of all, what you say is not true: differences like 4.200000 versus 4.199999 occur in the same form for doubles. The only "advantage" is that doubles give you a longer run of 9s, but the first differing digit is still in the first decimal place.

2. OK, I know that is not what you meant. You mean that the difference is smaller, so if I subtract the two results then I get 0 for at least 6 decimal places. However, that does not "solve" the problem either. The problem is not how accurate the value is, but whether or not a fundamental algebraic identity is broken. ANY difference, however small, between a+b+c and c+a+b can lead to horrible consequences. For example, if you are sorting partial solutions and want to keep the "best 100" for further processing, then your selection set will be different depending on the order in which you added up numbers during the calculations. The same problem arises if you use any kind of threshold decision, e.g. "solutions under a given energy limit are accepted": again you get different solution sets that may even be radically different in terms of conformation. Once you drop a solution it is simply not present, and there is no guarantee that another similar solution made it through the threshold condition.
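
A minimal Python sketch of the threshold case, using doubles. The three terms, the limit of 0.6, and the "does not exceed the limit" rule are hypothetical, chosen only so that the effect is visible; they are not taken from any real docking run:

```python
# Hypothetical energy terms; the values are chosen only for illustration.
a, b, c = 0.1, 0.2, 0.3

sum_abc = (a + b) + c  # terms added in the order a, b, c
sum_bca = (b + c) + a  # same terms, added in the order b, c, a

print(sum_abc == sum_bca)      # False: the identity is broken even for doubles
print(abs(sum_abc - sum_bca))  # the gap is tiny, but it is not zero

# Hypothetical acceptance rule: keep a solution when its energy
# does not exceed the limit.
ENERGY_LIMIT = 0.6


def accepted(energy, limit=ENERGY_LIMIT):
    """Return True when the energy does not exceed the limit."""
    return limit >= energy


accepted_abc = accepted(sum_abc)
accepted_bca = accepted(sum_bca)
print(accepted_abc, accepted_bca)  # False True: same solution, opposite decisions
```

The difference only shows up in the last digit of the sum, yet the same solution is kept under one summation order and dropped under the other.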

ZZ</description>
		<content:encoded><![CDATA[<p>Trevor,</p>
<p>No, unfortunately, you are wrong for 2 reasons:</p>
<p>1. First of all, what you say is not true: differences like 4.200000 versus 4.199999 occur in the same form for doubles. The only &#8220;advantage&#8221; is that doubles give you a longer run of 9s, but the first differing digit is still in the first decimal place.</p>
<p>2. OK, I know that is not what you meant. You mean that the difference is smaller, so if I subtract the two results then I get 0 for at least 6 decimal places. However, that does not &#8220;solve&#8221; the problem either. The problem is not how accurate the value is, but whether or not a fundamental algebraic identity is broken. ANY difference, however small, between a+b+c and c+a+b can lead to horrible consequences. For example, if you are sorting partial solutions and want to keep the &#8220;best 100&#8221; for further processing, then your selection set will be different depending on the order in which you added up numbers during the calculations. The same problem arises if you use any kind of threshold decision, e.g. &#8220;solutions under a given energy limit are accepted&#8221;: again you get different solution sets that may even be radically different in terms of conformation. Once you drop a solution it is simply not present, and there is no guarantee that another similar solution made it through the threshold condition.</p>
<p>ZZ
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: Trevor</title>
		<link>http://www.simbiosys.com/blog/2008/05/28/floating-point-errors/#comment-1508</link>
		<pubDate>Fri, 05 Sep 2008 17:27:19 +0000</pubDate>
		<guid>http://www.simbiosys.com/blog/2008/05/28/floating-point-errors/#comment-1508</guid>
					<description>"This problem cannot be solved by using double precision"

Actually, it can. If you change the floats to doubles, the sums will be identical for at least the first 6 or so decimal places.</description>
		<content:encoded><![CDATA[<p>&#8220;This problem cannot be solved by using double precision&#8221;</p>
<p>Actually, it can. If you change the floats to doubles, the sums will be identical for at least the first 6 or so decimal places.
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: Unilever Centre for Molecular Informatics, Cambridge - petermr&#8217;s blog &#187; Blog Archive &#187; Quality is emerging in chemical software</title>
		<link>http://www.simbiosys.com/blog/2008/05/28/floating-point-errors/#comment-1486</link>
		<pubDate>Wed, 04 Jun 2008 08:30:45 +0000</pubDate>
		<guid>http://www.simbiosys.com/blog/2008/05/28/floating-point-errors/#comment-1486</guid>
					<description>[...] PMR: ZZ addresses this below in reporting a competition and I&#8217;ll continue there Are the docking and QSAR study results reproducible ? With eHiTS and LASSO, the answer is definitely YES! I understand that many tools on the docking/QSAR market use stochastic (read random) methods and therefore their results are inherently unreproducible. Again, I can only speak with authority about our own software, which uses strictly deterministic and reproducible techniques. So if a different researcher in a different location runs our software on the same input they will get the same result. However, I do not see how one could run the “same calculation” using a different software. By definition, if you are using a different software (which embodies the calculation) then you are not running the same calculation. I can assure you the same is true for QM software as well, for the simple floating point error reasons I have explained in a previous blog post. So any different QM implementation will necessarily involve computation steps in different orders (as simple as summation in different order will suffice) and therefore get slightly different results. [...]</description>
		<content:encoded><![CDATA[<p>[&#8230;] PMR: ZZ addresses this below in reporting a competition and I&#8217;ll continue there Are the docking and QSAR study results reproducible ? With eHiTS and LASSO, the answer is definitely YES! I understand that many tools on the docking/QSAR market use stochastic (read random) methods and therefore their results are inherently unreproducible. Again, I can only speak with authority about our own software, which uses strictly deterministic and reproducible techniques. So if a different researcher in a different location runs our software on the same input they will get the same result. However, I do not see how one could run the “same calculation” using a different software. By definition, if you are using a different software (which embodies the calculation) then you are not running the same calculation. I can assure you the same is true for QM software as well, for the simple floating point error reasons I have explained in a previous blog post. So any different QM implementation will necessarily involve computation steps in different orders (as simple as summation in different order will suffice) and therefore get slightly different results. [&#8230;]
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: SimBioSys Blog &#187; Blog Archive &#187; Quality in chemical software - the debate continues</title>
		<link>http://www.simbiosys.com/blog/2008/05/28/floating-point-errors/#comment-1483</link>
		<pubDate>Tue, 03 Jun 2008 22:44:54 +0000</pubDate>
		<guid>http://www.simbiosys.com/blog/2008/05/28/floating-point-errors/#comment-1483</guid>
					<description>[...] Are the docking and QSAR study results reproducible ? With eHiTS and LASSO, the answer is definitely YES! I understand that many tools on the docking/QSAR market use stochastic (read random) methods and therefore their results are inherently unreproducible. Again, I can only speak with authority about our own software, which uses strictly deterministic and reproducible techniques. So if a different researcher in a different location runs our software on the same input they will get the same result. However, I do not see how one could run the &#8220;same calculation&#8221; using a different software. By definition, if you are using a different software (which embodies the calculation) then you are not running the same calculation. I can assure you the same is true for QM software as well, for the simple floating point error reasons I have explained in a previous blog post. So any different QM implementation will necessarily involve computation steps in different orders (as simple as summation in different order will suffice) and therefore get slightly different results. PMR: [...]</description>
		<content:encoded><![CDATA[<p>[&#8230;] Are the docking and QSAR study results reproducible ? With eHiTS and LASSO, the answer is definitely YES! I understand that many tools on the docking/QSAR market use stochastic (read random) methods and therefore their results are inherently unreproducible. Again, I can only speak with authority about our own software, which uses strictly deterministic and reproducible techniques. So if a different researcher in a different location runs our software on the same input they will get the same result. However, I do not see how one could run the &#8220;same calculation&#8221; using a different software. By definition, if you are using a different software (which embodies the calculation) then you are not running the same calculation. I can assure you the same is true for QM software as well, for the simple floating point error reasons I have explained in a previous blog post. So any different QM implementation will necessarily involve computation steps in different orders (as simple as summation in different order will suffice) and therefore get slightly different results. PMR: [&#8230;]
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: ChemSpiderMan</title>
		<link>http://www.simbiosys.com/blog/2008/05/28/floating-point-errors/#comment-1479</link>
		<pubDate>Thu, 29 May 2008 00:31:56 +0000</pubDate>
		<guid>http://www.simbiosys.com/blog/2008/05/28/floating-point-errors/#comment-1479</guid>
					<description>Thanks for the education! And keep this type of post coming. It's likely educational for the majority of the people in the domain who don't understand the specifics.</description>
		<content:encoded><![CDATA[<p>Thanks for the education! And keep this type of post coming. It&#8217;s likely educational for the majority of the people in the domain who don&#8217;t understand the specifics.
</p>
]]></content:encoded>
				</item>
</channel>
</rss>
