Open data, and the goal of making data more accessible and reusable, has increasingly come to the forefront of the agenda of funders, institutions and publishers, and it is likely to remain the subject of active discussion and development in the near future. At PLOS we take the view that greater openness in data availability provides huge benefits to science, and we are delighted to see this being discussed by all stakeholders.
To underscore our position on the availability of data, PLOS updated its data policy in March 2014, such that we now require the data underlying the results of all PLOS research articles to be available, with only a few rare exceptions. At the time, and ever since the policy was implemented, we have regularly received and sought comments from researchers and from our contributors. We always appreciate feedback that helps us to develop clearer guidelines for authors and to take a broader perspective on the issues related to implementing the updated policy.
Three years on, our experience with the updated policy has been very positive. We have observed increased availability of data as well as growing awareness of the policy among authors, editors and reviewers. We have also found, as expected, that data sharing is more challenging in some areas than in others; research involving clinical patient data is a clear example, where realizing the benefits of data sharing comes with real challenges. The updated policy reset expectations regarding availability of data at the time of publication, which was not a requirement under the earlier policy, and this is something we have had to consider carefully for submissions involving human-subjects research.
Request for data from PLOS ONE publication related to the PACE trial
While careful consideration of patient privacy has led to the sharing of de-identified patient-level data in most papers that we have published since strengthening the data policy, we have also encountered challenging cases. One of the cases we have followed up on has highlighted both the challenges of data sharing in clinical research and the drawbacks of limited frameworks for data availability. Published in 2012 under our earlier data policy, the PLOS ONE article “Adaptive pacing, cognitive behaviour therapy, graded exercise, and specialist medical care for chronic fatigue syndrome: a cost-effectiveness analysis” reports a cost-effectiveness analysis on results from the PACE (Pacing, graded Activity and Cognitive behaviour therapy; a randomised Evaluation) trial, originally published in The Lancet in 2011.
The PLOS policy in place at the time of publication in PLOS ONE reads as follows:
‘Publication is conditional upon the agreement of the authors to make freely available any materials and information described in their publication that may be reasonably requested by others for the purpose of academic, non-commercial research.’
We received a request for the data underlying the article from a reader who indicated to us that he had previously contacted the authors but the dataset had not been provided, either by them or their institution. Other readers raised concerns about aspects of the cost analyses – open exchanges about methodology and assumptions for the analyses had taken place via comments on the article soon after publication.
Evaluation of data requests and follow-up
The initial steps in our follow-up had two main aims: to establish whether the concerns about the analyses were well founded and, within the framework of the editorial policy, to determine what data would be necessary to replicate the work. We sought advice from two editorial board members, who recommended pursuing access to the dataset and advised that individual-level patient data for Tables 1-5 were necessary to allow replication of the analyses.
In line with the advice received, we approached the authors to request the data and further information about the analyses. The authors indicated that releasing individual-level patient data represented a risk to patient confidentiality and felt that releasing the data would go beyond the stipulations of the consent signed by the participants of the trial. The authors offered to have the data reanalyzed by a commonly agreed adjudicator; however, in light of the concerns raised by multiple readers and the requests for access to the data, we considered that this proposal would fall short of the expectations of the policy in place when the article was published.
We take concerns over patient confidentiality seriously and therefore approached Queen Mary University of London (QMUL), the principal investigator’s institution, to request their support in establishing a mechanism independent of the authors to facilitate other researchers’ data access while preserving patient privacy, for example via a Data Access Committee. The rationale for involving a Data Access Committee is that such a committee can evaluate whether the proposed re-analysis compromises patient privacy in the context of the original consent, while providing independence from the authors in this evaluation.
At the time we approached QMUL, a Freedom of Information (FOI) request for the data from the main PACE trial was ongoing. Upon completion of the evaluation of the FOI request, the Tribunal ruled that the data for the main outcomes of the trial should be released, based on their position that the identification of patients was a remote possibility. Following this decision and in line with the Tribunal’s ruling, QMUL released data for some of the primary outcomes.
Expression of Concern
From our follow-up with the authors and QMUL, we understand that a framework is in place to consider requests for data from the PACE trial. This framework entails direct involvement by the authors in deciding whether the data can be shared, and imposes other restrictions that we view as incompatible with the relevant data sharing policy.
At this point, PLOS has not yet received confirmation that QMUL has established a mechanism, compatible with the relevant data policy, that would allow independent evaluation of requests to access data underlying the PLOS ONE article. Since we feel we have exhausted the options to make the data available responsibly, and considering the questions that were raised about the validity of the article’s conclusions, we have decided to post an Expression of Concern to alert readers that the data are not available in line with the journal’s editorial policy. It is our intention to update this notice when a mechanism is established that allows concerns about the article’s analyses to be addressed while protecting patient privacy.
Current challenges and opportunities ahead
During our follow-up it became clear that there is little consensus on the sharing of this particular dataset. Experts from the Data Advisory Board whom we consulted expressed different views on the stringency of the journal’s reaction. Overall, they agreed on the need to consider the risk to the confidentiality of the trial participants and on the relevance of developing mechanisms for consideration of data requests by an independent body or committee. Interestingly, the ruling of the FOI Tribunal also indicated that the vote did not reflect a consensus among all committee members.
What lies at the heart of the complexity of this case, and the question of data availability for clinical data in general, is the tension between encouraging open science and the duty to protect those who generously contribute towards public benefit by participating in clinical research. There is a need for mechanisms and policies to address the different challenges related to confidentiality and these will require input from all stakeholders to maximize responsible data sharing and reuse.
Funding agencies and institutions can support the development of scalable infrastructure that allows long-term preservation and access to datasets, as well as education and training among researchers to encourage data management plans as one of the steps in the research process. The availability of dedicated committees and services within institutions to evaluate data access requests and provide advice on ethical and legal considerations related to data deposition and access would also support greater data availability, while allowing more consistency in considerations over clinical data.
There is also a need to adequately recognize those researchers who take steps to make their data available and reusable. Publishers can and should help ensure that authors get credit for sharing their data, for example via the development of policies and the implementation of technological infrastructure for machine-readable data citations.
Our Advisory Board noted the need for trialists to consider amendments to consent procedures to cover the eventuality of data being shared. This is also referred to in ICMJE’s proposal for the sharing of clinical trial data, which proposes to require that authors of clinical trials share de-identified individual-patient data (those underlying the results presented in the article) no later than six months after publication. The proposal received a large number of responses and mixed critiques: some supported the request for data to be shared, while others raised concerns over risks of de-identification or inappropriate re-analyses unless sufficient information was provided with the datasets, as well as over the costs involved in anonymizing and preparing datasets for deposition.
Towards more openness in clinical research data
The debate among the research community and relevant stakeholders should continue. At our level, we have already observed benefits from setting clearer expectations for data sharing at the time of publication via our updated policy: for most submissions involving clinical data, we have been able to work with authors to provide clear indications on where the data can be accessed or which restrictions apply and why.
We are very much aware of the challenges involved, and of the fact that new ones will arise as policies for open data and new technologies are developed. Rather than settling on the position that data cannot be shared, however, we feel we should renew efforts to overcome the barriers and find appropriate channels for sharing data from clinical research. It will take time and will be complex, but having an open debate on these matters is already a huge step towards a framework that allows more openness in clinical data. We view this as key to benefiting not only science but also society as a whole.
- Bloom T, Ganley E, Winker M (2014) Data Access for the Open Access Literature: PLOS’s Data Policy. PLoS Biol 12(2): e1001797.
- The PLOS Medicine Editors (2016) Can Data Sharing Become the Path of Least Resistance? PLoS Med 13(1): e1001949.
- McCrone P, Sharpe M, Chalder T, Knapp M, Johnson AL, Goldsmith KA, et al. (2012) Adaptive Pacing, Cognitive Behaviour Therapy, Graded Exercise, and Specialist Medical Care for Chronic Fatigue Syndrome: A Cost-Effectiveness Analysis. PLoS ONE 7(8): e40808.
- White PD, Goldsmith KA, Johnson AL, Potts L, Walwyn R, et al. (2011) Comparison of adaptive pacing therapy, cognitive behaviour therapy, graded exercise therapy, and specialist medical care for chronic fatigue syndrome (PACE): a randomised trial. Lancet 377: 823–836.
- The PLOS ONE Editors (2017) Expression of Concern: Adaptive Pacing, Cognitive Behaviour Therapy, Graded Exercise, and Specialist Medical Care for Chronic Fatigue Syndrome: A Cost-Effectiveness Analysis. PLoS ONE 12(5): e0177037.
- Taichman DB, Backus J, Baethge C, Bauchner H, de Leeuw PW, Drazen JM, et al. (2016) Sharing Clinical Trial Data: A Proposal from the International Committee of Medical Journal Editors. PLoS Med 13(1): e1001950.