Tuesday, September 26, 2006

Open Notebook Science

Thanks to Beth Ritter-Guth's efforts to clarify the definition of terms relating to Open Source Science, a good discussion has evolved on the Blue Obelisk mailing list. Peter Murray-Rust has made the point that this term may be confused with Open Source Software. However, as Peter notes in a follow-up post, Jamais Cascio from WorldChanging has used this definition of Open Source Science, which is fairly consistent with our use of it in UsefulChem:
...research already in progress is opened up to allow labs anywhere in the world to contribute experiments. The deeply networked nature of modern laboratories, and the brief down-time that all labs have between projects, make this concept quite feasible. Moreover, such distributed-collaborative research spreads new ideas and discoveries even faster, ultimately accelerating the scientific process.

In Open Source Software, the code is made available to anyone to modify and repurpose. What we have been trying to do with UsefulChem is to provide the analogous entity for chemical research, which is raw experimental data along with the researcher's interpretation in a format that anyone can easily re-analyze, re-interpret and re-purpose. A good example of re-purposing is using some results and observations from a failed experiment in a way that was never intended by the original researcher. This just doesn't happen regularly in science because failed experiments are almost never included in publications.

Unfortunately, in addition to the confusion with Open Source Software, others are using the term Open Source Science to mean discussions about pre-prints of regular journal articles.

To clear up confusion, I will use the term Open Notebook Science, which has not yet suffered meme mutation. By this I mean that there is a URL to a laboratory notebook (like this) that is freely available and indexed on common search engines. It does not necessarily have to look like a paper notebook but it is essential that all of the information available to the researchers to make their conclusions is equally available to the rest of the world. Basically, no insider information.

Sunday, September 24, 2006

Tuesday, September 19, 2006

Physics Views on Publishing

Not Even Wrong has a post on the future of scientific publication and open access with a good assessment of the costs involved. What makes this a particularly good read is the collection of over 50 comments mainly from the physics and math communities.

The point is made several times that the arXiv experiment in chemistry, the Elsevier Chemistry Preprint server, was a failure. I actually published multiple times using that vehicle and it was indeed a shame when they shut down. This was a totally free model - for authors and readers - and authors kept their copyright. Here is the explanation of why they shut it down.
Despite the wide readership of CPS preprints, the chemistry community has unfortunately not contributed articles or online comments to this service in sufficient numbers to justify further development. The decision has therefore been taken to stop the processing of any further submissions.

Monday, September 18, 2006

Book Chapter on Open Access

Taken from Knowledge Transfer Innovations:
Benkler addresses the forces affecting Scientific Publication in this chapter of his book, “The Wealth of Networks.” (Click the link then scroll down through the chapter to the subhead ‘Scientific Publication’.)

This is basically a summary of the clash between publishers, scientists and those who judge them.

Saturday, September 09, 2006

Open Source Science in Class

Beth Ritter-Guth is currently running her English class with a very special assignment for her students: research and document the emerging role of Open Source Science. They are tackling this using multiple information sources, including blogs, articles and interviews.

They interviewed me this week and the podcast is now available.

Anyone with an interest in Open Source Science is free to comment on their work in progress posted on this wiki page. Also see Beth's Technical Writing class blog on OSS.

Another part of this assignment will involve understanding the larger context of the UsefulChem research projects on malaria, AIDS and arsenic in drinking water. They will be interviewing people involved in this type of research, including the students working in my lab, mainly on the malaria project.

Wednesday, September 06, 2006

Can Librarians Learn to Love Science Wikis?

The students in my lab have been posting their raw experimental data to a blog since February. Since that time we have evolved this process onto a wiki. This has been very useful for tracking the contributions of multiple students and editors. It has been particularly convenient to annotate the text directly with temporary comments, such as "dead link here" or "redo this analysis with these conditions". On a blog, the previous versions of a post are deleted, which makes comments inapplicable after revision.

However, the great flexibility of such a publication system has made many librarians uneasy, as I have learned through conversations with them.

One of the main issues is referenceability.

One of the advantages of using a wiki is that pages can be updated when new information becomes available. However, if that is the case, then how can morphing information sources be verified? Formal publications have already had to deal with this issue when citing websites for information that can only be found on the web. The current trend seems to be to state the time the website was accessed next to the hyperlink. This is a pretty weak form of referencing since there is no reliable way to verify what was on that site at that time. Of course, there is the Wayback Machine archiving the internet over time but that requires some forethought to submit the website and will only be updated about once a month.

Others have proposed services that freeze a copy of the web page on a third-party site when making a citation. (I don't have a link for this - maybe someone can remind me of an example.) The wiki (specifically Wikispaces) automatically provides both a third-party time stamp and the saved version. If the publication time of a document is known, it should usually be possible to find the version of the wiki page available at that time. To make this explicit, one could just post the link to the version in addition to the current page.
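
The lookup described above, finding the wiki version that was live when a document was published, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the revision history is a hand-written list of (timestamp, link) pairs sorted oldest first, the example.org links are placeholders, and `version_at` is a hypothetical helper, not part of Wikispaces or any other wiki's interface.

```python
from bisect import bisect_right
from datetime import datetime


def version_at(revisions, cited_at):
    """Return the latest (timestamp, link) pair saved at or before cited_at.

    `revisions` must be sorted oldest first. Returns None when the page
    had no saved version yet at the time of citation.
    """
    timestamps = [saved_at for saved_at, _ in revisions]
    # bisect_right counts how many revisions were saved at or before cited_at.
    count = bisect_right(timestamps, cited_at)
    return revisions[count - 1] if count else None


# Hypothetical revision history for one notebook page (placeholder links).
revisions = [
    (datetime(2006, 8, 1, 9, 0), "https://example.org/wiki/Exp025?v=1"),
    (datetime(2006, 8, 15, 14, 30), "https://example.org/wiki/Exp025?v=2"),
    (datetime(2006, 9, 10, 11, 0), "https://example.org/wiki/Exp025?v=3"),
]

# A paper published on September 6 would have seen the second revision.
cited = version_at(revisions, datetime(2006, 9, 6))
print(cited[1])  # prints https://example.org/wiki/Exp025?v=2
```

Posting the versioned link next to the current page link, as suggested above, makes the lookup unnecessary for the reader. The sketch only shows that the version can still be recovered from the time stamps when a citation omits it.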

