Updated on 8 February 2017
Reproducibility is one of the foundations of scientific practice and the bedrock on which scientific validity is established. However, the extent to which reproducibility is being achieved in the sciences is currently in question. Several studies have shown that much peer-reviewed scientific literature is not reproducible [1-3]. One major obstacle to reproducibility is a lack of transparency about original data and methods. Reproducibility, the ability of other parties to independently replicate scientific results and conclusions, potentially using different tools and approaches, can be achieved only if data and methods are fully disclosed.
In the biomedical sciences, the issue is further complicated by the fact that we now typically deal with very large, multi-faceted and highly complex datasets, largely a result of technological advances that have led to rapid growth in data generation. To truly facilitate reproducibility, not only do these data need to be fully disclosed, they also need to be shared in a way that is practical for other scientists to search and scrutinize: they need to be reusable.
Several initiatives have been established to facilitate data reusability and drive improved reproducibility. The outcomes of one such recent initiative, designed and endorsed by a multi-disciplinary group of stakeholders, are known as the ‘FAIR Principles’ [4]. They advise that data be Findable, Accessible, Interoperable and Reusable:
Findable:
(Meta)data are assigned a globally unique and persistent identifier
Data are described with rich metadata (defined in the ‘reusability’ bullet below)
Metadata clearly and explicitly include the identifier of the data it describes
(Meta)data are registered or indexed in a searchable resource
Accessible:
(Meta)data are retrievable by their identifier using a standardized communications protocol
The protocol is open, free, and universally implementable
The protocol allows for an authentication and authorization procedure, where necessary
Metadata are accessible, even when the data are no longer available
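As a rough illustration of the findability and accessibility points above, the following Python sketch models a toy in-memory registry. Everything in it is an assumption made for illustration: the `Record` and `Registry` names, the metadata fields, and the use of a UUID-based URN as a stand-in for a persistent identifier (in practice this would more likely be something like a DOI). It does not model the communications protocol or any authentication step.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    identifier: str          # globally unique identifier (hypothetical URN form)
    metadata: dict           # descriptive metadata, which repeats the identifier
    data: Optional[bytes]    # None once the underlying data have been withdrawn


class Registry:
    """Hypothetical searchable resource that indexes records by identifier."""

    def __init__(self) -> None:
        self._records: dict[str, Record] = {}

    def register(self, title: str, keywords: list[str], data: bytes) -> str:
        # Assign a unique identifier and embed it in the metadata itself.
        identifier = f"urn:uuid:{uuid.uuid4()}"
        metadata = {
            "identifier": identifier,
            "title": title,
            "keywords": list(keywords),
        }
        self._records[identifier] = Record(identifier, metadata, data)
        return identifier

    def search(self, keyword: str) -> list[str]:
        # Look up identifiers of records whose metadata carry the keyword.
        return [
            record.identifier
            for record in self._records.values()
            if keyword in record.metadata["keywords"]
        ]

    def get_metadata(self, identifier: str) -> dict:
        return self._records[identifier].metadata

    def get_data(self, identifier: str) -> Optional[bytes]:
        return self._records[identifier].data

    def withdraw_data(self, identifier: str) -> None:
        # Remove the data but keep the metadata record retrievable.
        self._records[identifier].data = None
```

In this sketch, calling `withdraw_data` on an identifier makes `get_data` return `None`, while `get_metadata` still returns the original description, which mirrors the idea that metadata should outlive the data they describe.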