- Shouldn’t authors get to maintain control of their materials/data?
- Is the PRO Initiative a “boycott”?
- What about copyright?
- How can I ensure that participants’ identities remain private?
Shouldn’t authors get to maintain control of their materials/data?
There is a tendency to have a sense of ownership of the products of our research. We believe that, although natural, this sense is misguided. None of us could do research without the support of the community of researchers in which we are embedded, or the support of society at large. Openness is a way of acknowledging the debt that we owe others.
We note, however, that nothing in the Initiative prevents a researcher from putting conditions on the use of research products. Although we do not, in general, advocate for this practice to become normative, one could meet the requirements of the Initiative by placing all materials in a data repository and asking that the materials be used only for verification unless written permission for another use is obtained. If possible, it would be wise to state explicitly a date after which the materials can be used without restriction, so that the status of the data is not left uncertain if you cannot be reached for permission. If you release data with restrictions, ensure that the data repository allows licensing with restrictions.
We note that, currently, the American Psychological Association “open access” journal has an exceptionally onerous procedure for requesting data from authors published in their journals. To even begin the process, a long form must be submitted which, among other things, requires offering authorship to the originators of the data and a professional CV. (For critiques of the APA position, see Denny Borsboom’s comment and Jelte Wicherts and Marjan Bakker’s comment on APA policy.) This level of control is contrary to the scientific values of openness, and simply ensures that the originators of data always have complete control over who has access to data, and thus what conclusions are ultimately drawn from it. High-profile examples exist of authors refusing to share data with critics (see here; a “reason” for refusal can always be manufactured). This is unacceptable in a professional scientific setting.
We encourage all authors to be more open than restrictive policies such as the APA’s would require them to be.
Is the PRO Initiative a “boycott”?
No, the PRO Initiative is not a boycott. Under the Initiative, there are no cases where a review is not given. If a manuscript received for review does not include data or materials, PRO reviewers ask the authors either to post them or to justify why they are not posted, and to agree to share their reason with readers. Only if the authors refuse to justify their reporting decisions will the review be any different from what it would otherwise have been.
Even in this case, however, PRO reviewers offer a non-comprehensive review. It is not uncommon for reviewers who believe that something vital is missing from a manuscript (say, an additional experiment, critical interface with relevant literature, or even simply adequate clarity of language) to focus only on the vital missing part. In other cases, this is not considered a “boycott” of the review process, and it should not be in the case of PRO. PRO reviewers, in fact, are more careful than is often typical, because they engage the authors before writing their review.
This brings into focus the difference between a PRO reviewer and a non-PRO reviewer: a PRO reviewer believes that, at the very least, a justification for the lack of data and materials is vital for quality scientific reporting. A non-PRO reviewer places less value on this. This is a judgment call that every reviewer can make for themselves. But it is not, in any sense, a boycott.
What about copyright?
[The contents of this section should not be construed as legal advice. We recommend you read the policy of your own employer. See also “Licensing Issues”]
Copyright is a complicated topic, particularly given the international and multi-media nature of scientific research. Policies differ across countries, universities, and even for different people in the same university. Copyright, by default, lies with one’s employer; but often University policy hands the copyright for written work, and the related materials, back to the author in order to allow them to publish without interference.
Data, on the other hand, typically cannot be copyrighted, because it is a set of facts. It often can be shared without worry about who owns the copyright. Any restrictions on the use of data must arise outside of copyright law (e.g., a contract) or from ethical considerations (e.g., that it would be unethical to use the data without attribution).
Analysis and experimental code is more complicated; often, policy states that the copyright for code is retained by the employer (under the assumption that software is monetizable). In our conversations with University intellectual property liaisons, we have been assured that these policies are not intended to interfere with the release of experimental programs or analysis code, but authors should check with their own organization’s intellectual property office for more details.
How can I ensure that participants’ identities remain private?
The details that may allow an individual to be personally identified differ based on circumstances. It is important to keep in mind that the information in the data file itself is not necessarily the only available information that can lead to identification. For instance, if the description of the data in the published work mentions that all research participants were students of a particular course at a particular university, then a birthdate alone given in the data set may be sufficient to identify an individual. Individual researchers are expected to thoughtfully consider which details may be safely released, and to omit information that compromises participant confidentiality from the available data set. Note, however, that concerns about anonymity alone do not justify failure to disclose an entire data set. [See also “Enlisting help from your institution”]
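As a rough illustration of omitting identifying details before release, the following Python sketch drops some columns and coarsens a birthdate to a birth year. The column names (`name`, `email`, `student_id`, `birthdate`), the ISO date format, and the helper `deidentify` are all hypothetical; which fields are identifying depends on the study, and in a small sample even a birth year may be too much. This is a sketch under those assumptions, not a guarantee of anonymity.

```python
import csv
import io

# Hypothetical column names; which fields identify a person depends on the study.
DIRECT_IDENTIFIERS = {"name", "email", "student_id"}


def deidentify(rows, drop=DIRECT_IDENTIFIERS, birthdate_field="birthdate"):
    """Yield copies of rows without the dropped fields, keeping only the birth year."""
    for row in rows:
        cleaned = {key: value for key, value in row.items() if key not in drop}
        if birthdate_field in cleaned:
            # Assumes ISO dates (YYYY-MM-DD); keep the first four characters (the year).
            cleaned["birth_year"] = cleaned.pop(birthdate_field)[:4]
        yield cleaned


# Made-up example data standing in for a raw data file.
raw = io.StringIO(
    "name,email,birthdate,condition,rt_ms\n"
    "A. Example,a@example.org,1995-03-14,control,512\n"
)
public_rows = list(deidentify(csv.DictReader(raw)))
```

The remaining fields should still be checked against what the published description reveals about the sample, since the combination of the two is what matters.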
One thought on “Frequently Asked Questions”
A strong ethical argument for data sharing applies to data collected from human subjects.
Clearly there must be a strong presumption that most persons agreeing to answer a survey, to release their medical records, or even to contribute biological samples, are doing so with the intent to “advance science” rather than the career or profits of a single scientist or a small group of scientists.
While these volunteers obviously also have a right and expectation to have personal identifying information expunged or obfuscated, this privacy concern does not eviscerate the separate right and expectation that the value of the data contributed by volunteers (or even paid subjects) should be maximized by making it available to as many researchers as possible.
Obviously, if human subjects knowingly agree that they want to make their data available specifically to certain scientists or institutions, and that is part of their consent form, then the above presumption of an intent to help all scientists is void. But even in that case, the consent form should be published along with a mention of the limitation this puts on the published results, namely issues related to potential selection bias . . . for if a subject chooses to allow their data to be analyzed only by specific persons, that limitation itself points to the possibility that the subjects and the researchers share some agenda that is fearful of third-party re-analyses. There may be legitimate reasons for such limited collaborations, perhaps when there is a profit-sharing agreement related to unique cell lines, for example. But even in these cases, the fact that the human subjects chose to limit access to their data or materials to only specific researchers, and the reason for doing so, should be part of the public record.