Wednesday, Mar 2, 2011

Why code should be published

Nick Barnes has written an interesting article on why scientific code should be published, with particular reference to John Graham-Cumming's work on the Russell review code.

This report included a good algorithmic description, and has been accompanied by source code. We greatly welcome both of these departures from the norm, as setting a good example and following the report’s own recommendation. These facts also allow us to illustrate particular reasons why code release is important, and why science software skills should be improved.

The four separate bugs – in the description, in the code, in the configuration, and in the expectation of the reader – are, in this case, trivial and unimportant – they do not affect the broad results of the report in any way. However, each is characteristic of problems with science software which can be more serious, and which are impossible to discover unless code is released.


Reader Comments (16)

There seems to be a growing consensus that code etc. should be published. I am glad that the issue is now becoming settled.

Why do some scientists continue to deny the obvious? Are they just trying to hide the decline in their beliefs?

Mar 2, 2011 at 11:55 AM | Unregistered Commenter golf charley

According to author Steve McConnell in Code Complete: A Practical Handbook of Software Construction the average number of errors/defects per KLOC (1000 lines of code) is 15-50. And that is delivered or released code. Funny thing is, in the InfoSec industry, one error could be detrimental. In the Climate science industry, one error is no big deal.

This is one damn good reason all publicly funded science related code should be released.

Mar 2, 2011 at 12:40 PM | Unregistered Commenter Kevin

I feel really sad. All the billions of dollars wasted on fake climate research and the global warming scam could have been better used to put the human race into space already. Why does NASA study climate change? I thought NASA's goal was to put humans in space? Eventually we have to move out and colonise other planets, so why delay?

Mar 2, 2011 at 1:14 PM | Unregistered Commenter Buzz Lightyear

Buzz Lightyear

Maybe we should assemble all climate scientists and send them into space, permanently.

Mar 2, 2011 at 1:37 PM | Unregistered Commenter golf charley

Now Buzz, the believers are going to accuse you of making wicked serious threats against them.

Mar 2, 2011 at 1:39 PM | Unregistered Commenter hunter

@ Buzz

Good practical thinking.

If the sun enters a long Grand Minimum and temperatures plummet, climate scientists should be sent there to take direct measurements.

Oh, I forgot - they don't do empirical science ... and even if they did, their computer models would tell them to go at night!

Mar 2, 2011 at 2:27 PM | Unregistered Commenter R2

There are, or at least have been, a number of professional software engineers who have posted to this blog, particularly with regard to Climategate. They did discuss the engineering procedures that they must follow every day in their work: design reviews, design documentation, implementation plans, coding standards, coding, code review, code review, code review, testing, integration, testing, regression testing, and on and on. And as for peer review, just make one little mistake and see what happens at the code review. It can be brutal, and so everyone triple-checks their work beforehand.

The typical "scientific programmer", and believe me I have seen a lot of their work at university, has a book in one hand something like "Fortran for Dummies" and a five line outline of his algorithm, which will change continually.

While some code done at the university can be quite elegant and beautiful, most is sloppy and rushed. Having the code published in its full glory would instantly bring up the quality 1000-fold. Just ask any real software engineer about their peer reviews.

I consider any scientific work based on computer code without full disclosure of that code, either by publishing the full particulars of the commercial code used or the actual purpose-built code for the project, as just "grey" science and suspect.

Mar 2, 2011 at 2:38 PM | Unregistered Commenter Don Pablo de la Sierra

At least if these climate scientists didn't want to release their precious code his precious code (ring)>, they would come up with a better reason like national security. I thought the NSA or the Pentagon was making this argument last year even if it is horsesh!t.

Mar 2, 2011 at 2:43 PM | Unregistered Commenter Kevin

I hate it when I screw up html tags...

should have read:

At least if these climate scientists didn't want to release their precious code [insert Josh comic here with a climate scientist clutching his precious code (ring)], they would come up with a better reason like national security. I thought the NSA or the Pentagon was making this argument last year even if it is horsesh!t.

Mar 2, 2011 at 2:46 PM | Unregistered Commenter Kevin

"The typical "scientific programmer", and believe me I have seen a lot of their work at university, has a book in one hand something like "Fortran for Dummies" and a five line outline of his algorithm, which will change continually. "

Even the grossest code can appear elegant to the superficial gaze when pretty-printed!

Mar 2, 2011 at 3:08 PM | Unregistered Commenter AJC

I hate it when I screw up html tags...

So do I, Kevin. I have fumbly fingers as well so I use the Preview Post option. Just a suggestion. :)

Mar 2, 2011 at 3:28 PM | Unregistered Commenter Don Pablo de la Sierra

AJC
Even the grossest code can appear elegant to the superficial gaze when pretty-printed!

Not if you read it, which software engineers would do.

Mar 2, 2011 at 3:46 PM | Unregistered Commenter Don Pablo de la Sierra

Don Pablo,

The preview option seems to lead to commenting errors. But, otherwise, yes I agree. In reality, I meant to do it on purpose to make the case for how easy it is to make coding errors...lol. :-) Kidding of course.

Mar 2, 2011 at 4:04 PM | Unregistered Commenter Kevin

Perhaps we should have SW engineers do a code review of your postings, Kevin. :)

Mar 2, 2011 at 4:10 PM | Unregistered Commenter Don Pablo de la Sierra

I don't work in software and know nothing about versioning, configurations, etc., yet even my experience demonstrates beyond doubt the desirability of disclosing the code.

In decades of work as a patent lawyer, of which a significant part involved presenting in patent applications the inventions that inventors had described to me in disclosure documents, I found it depressingly common that the arrangements described in the inventors' disclosures could not possibly yield the results claimed--and/or that the actual code, when I could get it, was inconsistent with what the other disclosure documents had said.

In most cases these shortcomings did not result from the inventors' lack of honesty or intelligence. The inventors presumably had an incentive to avoid the patent invalidity in which inaccurate descriptions could result, and I can attest to the fact that many of those inventors were highly intelligent indeed. I think that in most cases the inconsistencies arose from laudable attempts to present the central concepts without imposing upon the reader the need to slog through distracting details. But inconsistencies there were--and those inconsistencies became much easier to detect when the code was provided.

Even when the other description was neither wrong nor inconsistent with the software, moreover, I found that access to the actual code often enabled me to eliminate in a few minutes ambiguities that even days of effort might otherwise have been inadequate to dispel.

And there's another aspect of my experience that confirms Mr. Barnes' analysis: when the disclosure did take the form of computer code, that code itself was often wrong. In one case, for instance, the heart of the invention--made by highly regarded and widely published experts--was embodied in a code snippet consisting of a mere fifteen or twenty lines of C code, yet I had to go back to the inventors repeatedly to show them that the latest version they'd given me still had fatal errors. (Those who find it hard to believe that competent workers could produce such errors repeatedly in so few lines should try to write thread-safe transaction software.)

I don't know whether the problem in that case was that the inventors had never yet actually implemented the invention or that they had but the code they gave me resulted from "cleaning up" the working code to make it more intelligible to an outsider. But I do know that what they were working on was software that is literally used by millions. The point is that software is just going to have errors--it's the nature of the beast--and revealing the software to others makes it much more likely that the errors will be identified and corrected--even when those to whom the software is disclosed are, as was true in my case, non-experts.

Mar 2, 2011 at 4:22 PM | Unregistered Commenter Joe B.

I wonder if the RealClimate/Tamino fans who parroted their anti-open-code talking points are having second thoughts?

Mar 2, 2011 at 4:40 PM | Unregistered Commenter MikeN
