Cameron Neylon ONS Talk at Drexel
A Beginner’s Guide to Open Science
(not for beginners but by beginners)
2:00 Friday November 2, 2007
Disque 109, Drexel University
32nd and Chestnut streets, Philadelphia, PA
Cameron Neylon, STFC Rutherford Appleton Laboratory and School of Chemistry, University of Southampton
The modern biochemistry or molecular biology laboratory generates large quantities of data that are generally stored across multiple computers attached to multiple instruments. Much of this data is never published, and the majority languishes on old computers and is ultimately lost. At a local level this frustrates investigators, who often struggle to obtain specific pieces of data produced in their own laboratory. On a larger scale the issue is becoming much more serious: the obligation of researchers to funding bodies both to preserve research data and to make it available to other users is increasingly a formal condition of publicly funded grants. Systems are required that can capture and preserve data along with sufficient information and metadata to make it possible for others to use it.
In parallel, a movement is growing within the research community that advocates greater openness, both in providing the raw data from published studies and in making available the large quantities of data that are never published. The logical extreme of this approach is Open Notebook Science, pioneered at Drexel University, where the researcher’s laboratory notebook is made available on the internet as it is recorded. Achieving the aims of Open Notebook Science also requires systems that can capture data and provide it in a useful format. In addition, these systems must make the data visible to relevant online searches.
We are developing and using an electronic laboratory notebook based on a blog format to capture experimental data in a biochemistry laboratory [3,4]. Within the system each sample is recorded in a single post. Analyses and manipulations of the sample are recorded in separate posts, with links back to the input sample and forward to any products. All the information is made available on the Web as soon as it is recorded. The blog engine was built in-house and has a number of features designed to enable and encourage the effective capture of data and metadata in a biochemistry laboratory. I will describe the blog system and our evolving approach to capturing metadata, as well as the process of integrating it with other web services to provide an open environment for recording laboratory work, laboratory materials, and validated procedures. I will also discuss the challenges encountered in reconciling the twin aims of capturing data and making it available and readable, along with the similarities and differences emerging between different approaches to Open Notebook Science [2,5,6].
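
The one-post-per-sample structure described above can be illustrated with a small sketch. This is not the blog engine itself, whose internals are not described here; it is a minimal model, assuming hypothetical names (`Post`, `Notebook`, `add_sample`, `add_procedure`, `provenance`) and hypothetical sample titles, of how posts linked back to inputs and forward to products allow the history of any sample to be traced.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    """One notebook entry. All field names are illustrative assumptions."""
    post_id: int
    title: str
    kind: str  # "sample" or "procedure" (hypothetical labels)
    metadata: dict = field(default_factory=dict)
    links_back: list = field(default_factory=list)     # ids of input posts
    links_forward: list = field(default_factory=list)  # ids of product posts


class Notebook:
    """In-memory stand-in for the blog: a dictionary of linked posts."""

    def __init__(self):
        self.posts = {}
        self._next_id = 1

    def _new_post(self, title, kind, metadata):
        post = Post(self._next_id, title, kind, dict(metadata))
        self.posts[post.post_id] = post
        self._next_id += 1
        return post

    def add_sample(self, title, **metadata):
        # Each sample gets a single post of its own.
        return self._new_post(title, "sample", metadata)

    def add_procedure(self, title, input_ids, product_titles, **metadata):
        # A manipulation or analysis is a separate post that links back to
        # its input samples and forward to newly created product posts.
        proc = self._new_post(title, "procedure", metadata)
        for sample_id in input_ids:
            proc.links_back.append(sample_id)
            self.posts[sample_id].links_forward.append(proc.post_id)
        for product_title in product_titles:
            product = self.add_sample(product_title)
            product.links_back.append(proc.post_id)
            proc.links_forward.append(product.post_id)
        return proc

    def provenance(self, post_id):
        # Follow links back repeatedly to collect every ancestor post id.
        seen = []
        stack = list(self.posts[post_id].links_back)
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.append(current)
                stack.extend(self.posts[current].links_back)
        return seen


# Usage: one input sample, one procedure, one product.
nb = Notebook()
template = nb.add_sample("Plasmid template", concentration="50 ng/uL")
pcr = nb.add_procedure("PCR amplification", [template.post_id], ["PCR product"])
product_id = pcr.links_forward[0]
history = nb.provenance(product_id)  # ids of the procedure and its input sample
```

Keeping procedures as posts in their own right, rather than as fields on a sample, means a single manipulation can take several inputs and yield several products without duplicating the record.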