Reproducibility PUBLISHING: Taking the Pulse
The community meeting on publishing reproducible research took place on June 24, 2021 (see meeting notes, slides, and motivating questions). Over 25 people participated in the conversation, representing researchers, publishers, librarians, and data and information professionals.
These are the main themes and key questions that emerged from the conversation: goals and standards, reforms and initiatives, peer review, and support for conducting reproducible research in the first place.
Goals and standards of publishing reproducible research
Doing reproducible research can be difficult in and of itself, and publishing reproducible research has its own set of challenges. To overcome some of the socio-technical challenges of publishing reproducible research, the community would do well to be very clear about what it means and why it is valuable.
· Why publish reproducible research?
· Who is responsible for ensuring that published materials are reproducible?
There is broad recognition of the rationale for computational reproducibility as a mechanism for achieving the goals of transparency and verifiability. The 2019 report, Reproducibility and Replicability in Science, by the National Academies of Sciences, Engineering, and Medicine (NASEM) recommends that journals “consider ways to ensure computational reproducibility for publications that make claims based on computations, to the extent ethically and legally possible,” recognizing that this poses technological and practical challenges for journals (as well as for researchers). Publishers and scholarly societies are beginning to write policies that reflect their commitment to reproducibility.
For example, the American Economic Association articulated its goals when it revised its data and code policy in 2019: “(1) to encourage and reward incorporating basic principles of replicability into researchers’ workflow; (2) to prioritize linking to existing data and code repositories, as the primary mechanism of providing source materials, with a journal-sanctioned repository as a fallback archive; (3) to require and facilitate proper documentation of restricted-access data; (4) to enforce a limited measure of verification; and (5) to balance the previous goals with the need to reduce the burden on authors, not increase it.”
· What is it that we want to reproduce? What does it mean to publish reproducible research?
Participants indicated that there is no consensus about what should be published in support of reproducibility. Even if there is agreement that the basic elements of computational research include code or software and data (with associated documentation and metadata), each of these is subject to broad variation in usage and interpretation among disciplines and research traditions. This is closely tied to another fundamental question:
· If computational reproducibility is not one-size-fits-all, can we expect (or should we work toward) universal standards for publishing reproducible research? Or is the system we envision a federated one?
In some scenarios, the goal of computational reproducibility is to test that software is executable. In others, the goal is to verify that a particular finding is correct. Accordingly, the source materials that need to be published in order to satisfy reproducibility vary greatly between scenarios. For example, what does it mean to publish executable code? In some disciplines the code requires resource-heavy builds; in others it may be a few lines of script using open-source software. Similarly, requirements and conventions around publishing data may vary depending on the type and source of the data. In some disciplines it might be completely unobjectionable to publish a simulated dataset, while in others access to the original data is critical.
· Given domain-specific checklists, tools, reproduction infrastructure, and guidelines intended to help researchers work reproducibly, is it realistic to expect we can achieve a universal standard for metadata and semantics for publishing this work?
· How should journals approach threats to computational reproducibility over time?
Another question is whether publication is “one and done” or needs to be ongoing. Is it sufficient (i.e., is it in the interest of scientific progress) if the research journals publish is reproducible only on the day of publication? (Note: we’ll explore questions of preservation next month.)
Several participants expressed the opinion that, while active maintenance is an ideal state, transparency is perhaps the most likely, or even the most reasonable, outcome of reproducibility efforts. Keeping track of the data, the software (exact versions), compute logs, etc., enables users to go back later and determine how results were produced and, more importantly, why results might have changed. The sketch below illustrates this kind of record-keeping.
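As a concrete illustration (not something prescribed at the meeting), here is a minimal Python sketch of such record-keeping: it captures the environment, installed package versions, input-data checksums, and the exact code revision. The function name record_provenance, the output file provenance.json, and the example input path are our own hypothetical choices.

```python
import hashlib
import json
import platform
import subprocess
import sys
from datetime import datetime, timezone
from importlib import metadata
from pathlib import Path


def sha256(path: Path) -> str:
    """Checksum an input file so a later run can detect changed data."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def record_provenance(inputs: list[Path], out: Path = Path("provenance.json")) -> None:
    """Write a small provenance record: environment, package versions, input hashes."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "python": sys.version,
        "platform": platform.platform(),
        "packages": {d.metadata["Name"]: d.version for d in metadata.distributions()},
        "inputs": {str(p): sha256(p) for p in inputs},
    }
    # If the project is under git, pin the exact code revision as well.
    try:
        record["git_commit"] = subprocess.check_output(
            ["git", "rev-parse", "HEAD"], text=True
        ).strip()
    except (subprocess.CalledProcessError, FileNotFoundError):
        record["git_commit"] = None
    out.write_text(json.dumps(record, indent=2))


if __name__ == "__main__":
    record_provenance([Path("data/survey.csv")])  # hypothetical input file
```

Comparing two such records from different runs shows exactly which dependency, input, or code revision changed, which is what makes the “why did results change?” question answerable later.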
Scholarly communication reforms in support of reproducibility
Several initiatives in scholarly communication in support of publishing reproducible research were mentioned in the meeting. Many are experimental, having been tested and implemented in a particular domain or journal. These include:
· New roles on journal editorial boards and publishers’ staffs specializing in reproducible and open science. For example, eLife has an Innovation Officer on its executive staff who oversees the eLife Innovation Initiative; another example is the Data Editor at the American Economic Association.
· New formats for published research articles (i.e., beyond PDF), such as executable research articles (see, for example, Stencila).
· New policies. Journals are increasingly requiring not only that data and code be made available (there is great variation in what that actually means) but also, in some cases, full reproductions of results.
· Reproducibility checking. Some journals are investing in efforts to verify that the materials they publish are indeed reproducible. This process may be manual: some journals contract with third parties, such as CASCAD and the Odum Institute, to perform human-based reproducibility checking. In other cases, it might be automated; the software community addresses a similar problem and can be a promising source of information, toolchains, and functionality. Examples include a “software assemblage” for each paper that is continually re-executed using a continuous integration system, such as Travis (more on this from James Howison), and software development practices that create an independently verifiable path from source to binary code, such as reproducible builds. A minimal sketch of what such an automated check might look like follows below.
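As an illustration only (the meeting did not prescribe an implementation), here is a minimal Python sketch of an automated check of this kind, suitable for running in a CI job on each submission: it re-executes the paper’s analysis and compares every declared output against a checksum recorded by the authors. The file names analysis.py and expected_outputs.json are hypothetical, and a production check would also have to contend with sources of nondeterminism (random seeds, floating-point differences across platforms) that this sketch ignores.

```python
import hashlib
import json
import subprocess
import sys
from pathlib import Path


def sha256(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


def check(entrypoint: str = "analysis.py", manifest: str = "expected_outputs.json") -> bool:
    """Re-run the paper's analysis, then compare each declared output
    against the checksum recorded at submission time."""
    subprocess.run([sys.executable, entrypoint], check=True)
    expected = json.loads(Path(manifest).read_text())  # {"results/table1.csv": "<sha256>", ...}
    failures = [path for path, digest in expected.items() if sha256(Path(path)) != digest]
    for path in failures:
        print(f"not reproduced: {path}")
    return not failures


if __name__ == "__main__":
    # CI systems treat a nonzero exit status as a failed build,
    # so a broken reproduction fails the check automatically.
    sys.exit(0 if check() else 1)
```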
· How do journal policies on reproducibility manifest in our communities (e.g., TOP Guidelines)? How well are they followed and enforced?
Participants expressed the sense that this is an evolving space, with a fair degree of experimentation. More is needed, especially in the realm of automation. Automation that reports back to authors any problems with reproducibility at the time of submission can encourage them to deliver higher-quality source materials. Additionally, automatic (and standards-based) packaging and publication of certifiably reproducible research would cut down on publisher costs; one sketch of what such packaging might involve follows.
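To make the packaging idea concrete, here is a companion sketch that produces the expected_outputs.json manifest consumed by the check above, recording a checksum for every result file in a paper’s package. The directory layout and manifest format are our own assumptions; a real standard would also capture environment and dependency metadata alongside the output hashes.

```python
import hashlib
import json
from pathlib import Path


def build_manifest(results_dir: str = "results", out: str = "expected_outputs.json") -> None:
    """Record a checksum for every result file so a journal (or a CI job)
    can later verify that the package reproduces exactly."""
    manifest = {
        str(path): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(Path(results_dir).rglob("*"))
        if path.is_file()
    }
    Path(out).write_text(json.dumps(manifest, indent=2))


if __name__ == "__main__":
    build_manifest()  # run once, at packaging time, before submission
```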
The question of peer review
There is a fair amount of labor involved in verifying reproducibility during the journal publication process, and this topic garnered quite a bit of interest from the group. In particular, labor provided by peers (as opposed to specialized third parties) to verify reproducibility was a topic of conversation. The group identified several problematic aspects of peer review in this context.
· Who reviews reproducibility materials before publication? How is that process integrated with the general publication process? How can or should peer review be updated as a process to align with the goals of promoting reproducible research?
Cultural issues relating to a legacy of free labor by peer (manuscript) reviewers are problematic. Reviewing computational reproducibility is more complex than reviewing a manuscript, requiring more time and access to computational resources. Moreover, reproducibility review by humans is time-consuming, and researchers are currently not adequately incentivized to perform it as part of standard peer review. Participants expressed that this work should be compensated; one suggestion to help encourage a cultural shift is for funders to allow it as a budget item. As reproduction infrastructure matures, the human labor required may decrease.
(Note: a recent article by Willis and Stodden explores how “expanding the peer review process can be used to improve or even ensure the reproducibility of computational research at the time of publication” and is relevant to this conversation.)
Finally, this topic relates to the broader issue of the labor of reproducible research (a topic also addressed in our previous conversation about reproducibility principles). While the actors may differ depending on the stage of the research lifecycle (graduate students during active research vs. peers during journal review, for example), unresolved problems, including proper credit, training, capacity, and resources, are evident.
To publish reproducible research, we need to practice reproducible research
Publishing reproducible research is much more difficult when the research was not conducted in a reproducible way from its inception. To stay on top of fast-evolving policies, technologies, and norms in this space, researchers need support. Importantly, their work needs to be recognized and rewarded.
· How can we align our assessment and metrics to foster an environment that promotes reproducible research?
In many cases, more training is the answer. Some researchers want to do open and reproducible research but do not know how; more training and education are required (see our previous blog post on reproducibility training and education). In other cases, especially in certain fields, researchers must rely on professionals who support their data and computation needs in order to work reproducibly. Funding and developing such a workforce is a growing concern (an Interest Group at the Research Data Alliance is working on this topic).
Finally, two things. First, we note an issue we raised in our provocation post that we did not get to at the meeting: open scholarship, FAIR, and CARE, and how they relate to publishing reproducible research; please let us know if this is of interest for further discussion. Second, we note that several of the themes in this meeting echo previous conversations held in this forum.
We’ll be holding one more topical meeting, on reproducibility and preservation, on July 29, 2021 (join us! RSVP here). Please stay tuned for future blog posts to continue the conversation!
By Limor Peer and Vicky Rampin