NLM Reproducibility Workshop


a hands-on workshop to facilitate reproducible bioinformatics research

About


The National Library of Medicine is excited to announce our three-day workshop on Reproducibility in Bioinformatics, which will take place from May 15 to 17, 2019 (9 a.m. – 5 p.m.) at 6001 Executive Boulevard. This hands-on workshop is geared towards NIH researchers at all career stages who have programming knowledge and would like to better understand how to conduct reproducible research.

Reproducibility can be defined as the ability of a researcher to duplicate the results of a prior study using the same materials (such as data) and procedures (such as analytic approaches) as were used by the original investigator. Reproducibility has become a priority at NIH, requiring that researchers adopt best practices to facilitate reproducible research early on.
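To make the definition above concrete, here is a minimal Python sketch, not part of the workshop materials. It assumes a toy dataset and a hypothetical `analyze` procedure (a seeded bootstrap of the mean), and checks two things: that a rerun uses the same data (via a checksum) and that the same procedure then yields the same result.

```python
import hashlib
import random
import statistics


def fingerprint(data):
    """Return a SHA-256 digest of the values, so a rerun can confirm it used the same data."""
    payload = ",".join(repr(x) for x in data).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def analyze(data, seed):
    """Hypothetical 'procedure': a bootstrap estimate of the mean with a seeded generator."""
    rng = random.Random(seed)  # fixing the seed makes the resampling repeatable
    means = []
    for _ in range(200):
        sample = [rng.choice(data) for _ in data]
        means.append(statistics.mean(sample))
    return statistics.mean(means)


# Toy data, invented for illustration only.
data = [2.1, 3.4, 1.9, 4.2, 3.3]

original = analyze(data, seed=42)       # the "original study"
rerun = analyze(list(data), seed=42)    # a later attempt with the same materials and procedure

same_materials = fingerprint(data) == fingerprint(list(data))
same_result = original == rerun
print(same_materials, same_result)
```

In real projects the materials are files and the procedure is a full pipeline, but the idea is the same: record what went in, fix any sources of randomness, and compare what comes out.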

During this workshop, participants will learn about essential tools for reproducible research by working to reproduce the results of a paper in the bioinformatics literature with available data. Suggestions for papers to reproduce have been provided in the application. Participants may choose to reanalyze the underlying data by either replicating the methods detailed in the paper or using an alternate analysis methodology. At the end of this workshop, participants will have:

  1. A working knowledge of tools for reproducible research and NLM's data resources for bioinformatics
  2. An understanding of how to incorporate those tools into their research practices
  3. A path towards a deliverable, in the form of an executable notebook and/or publication.

Applications to participate in this workshop have closed, but results from the workshop will be shared with those who are interested. This website will be regularly updated with resources and tools covered in the workshop.

If you have any questions, please contact maryam.zaringhalam [at] nih.gov

Important Dates


Friday, April 19: Applications close
Monday, April 22: Participants notified of their acceptance
Wednesday, May 15 to Friday, May 17: Reproducibility in Bioinformatics Workshop, 6001 Executive Boulevard, Room B1/B2

Resources


Webinars

Teaching Resources

Software Downloads

Useful Tutorials

Recommended Reading

Webinars


Containerization for Reproducible Bioinformatics Research

As computational work becomes increasingly embedded in biomedical research practices, computational reproducibility has become an issue of increasing importance. Computational reproducibility requires that other researchers are able to deploy and use software and analysis workflows in their own computing environments. Platforms like Docker and Singularity allow the creation and configuration of software containers, which can be distributed and deployed across a range of systems. This lecture, presented by Steve Tsang, gives an introductory overview of containerization and how containers can facilitate reproducible bioinformatics research, providing examples from the NCI Cloud Resources and various hackathons.

The presentation slide deck can be found here.


Software Downloads


Docker

Docker allows you to build and deploy applications in containers, wrapping software and its dependencies into a standardized unit. You can download and play with Docker for Mac and Windows. During the workshop, we'll be using the Amazon cloud, so you'll need to install Docker for Ubuntu using these instructions. You can get a feel for how Docker works with this interactive tutorial.

Jupyter Notebook

Jupyter Notebook is the executable notebook environment we'll be using during the workshop. Installing it requires Python 2.7 or Python 3.3 or greater, which you can download here. Jupyter's developers strongly recommend installing both Python and Jupyter using the Anaconda Distribution, and the tutorial included here provides useful instructions for doing just that on Mac/Linux and Windows.
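If you are unsure whether your existing Python installation meets the requirement stated above, a quick check along these lines may help. This is a small sketch: the function name `meets_jupyter_requirement` is made up for illustration, and the version rule simply encodes the requirement as described on this page at the time of the workshop.

```python
import sys


def meets_jupyter_requirement(version_info):
    """Return True for Python 2.7, or for Python 3.3 and later (the rule stated on this page)."""
    major, minor = version_info[0], version_info[1]
    if major == 2:
        return minor == 7
    if major == 3:
        return minor >= 3
    return major > 3


# Check the interpreter that is running this script.
if meets_jupyter_requirement(sys.version_info):
    print("This Python version meets the stated requirement.")
else:
    print("This Python version does not meet the stated requirement.")
```

Installing through the Anaconda Distribution, as recommended above, should make this check unnecessary because it provides a suitable Python alongside Jupyter.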

Git

Git is a free and open source version control system that lets you track changes in computer files and coordinate work on those files among multiple collaborators. We'll be covering how to use this tool during the workshop.


Useful Tutorials


Play with Docker

Play with Docker is an interactive Docker playground that guides you through some exercises to familiarize yourself with Docker containers. The training module also includes quizzes on container terminology so you can familiarize yourself with terms commonly used in containerization.

Fundamentals of Git

This tutorial from Bitbucket provides a nice overview of version control and of Git. We will not be teaching Bitbucket during the workshop, but the interactive tutorial is nevertheless useful.

Principles, Statistical and Computational Tools for Reproducible Research

For those of you looking to take a deeper dive into tools that facilitate reproducible research, this HarvardX course is a survey of best practices with lectures covering fundamentals of reproducible research, case studies in reproducible research, data provenance, statistical methods, and computational tools for reproducible science.

Jupyter Notebook for Beginners

Dataquest has put together this tutorial to help beginners get a handle on Jupyter notebooks. It covers the basics of installing Jupyter, learning the important terminology, and sharing and publishing notebooks online.

Introduction to Reproducible Research with Jupyter Notebook and Python

This is another great introduction to Jupyter Notebook, as well as a primer on the importance and fundamentals of reproducible research. It contains interactive modules that get you working in Jupyter Notebook and show you how to navigate Jupyter and access the wider Jupyter community.


Recommended Reading


1,500 scientists lift the lid on reproducibility

In 2016, Nature surveyed 1,500 researchers from different scientific fields on their thoughts around reproducibility in research and found that more than 70 percent of the researchers questioned had tried and failed to reproduce another scientist's experiments.

Enhancing reproducibility for computational methods

Science published this paper in their Policy Forum section detailing recommendations for enhancing reproducibility, which hinge on making data, code, and computational workflows available in open, trusted repositories, as well as making them citable so credit can be given where it's due.

NIH Guidelines on Rigor and Reproducibility

NIH has released guidelines to address rigor and transparency in NIH grant applications and progress reports. This site provides an overview of factors to consider when writing grants and conducting research, as well as examples of how rigor and transparency have been incorporated into existing applications.

Guidelines for Transparency and Openness Promotion

The Center for Open Science developed a set of eight transparency standards to improve transparency of the research process and products. The guidelines were written to guide journals in their policies and practices in supporting research transparency, but the document is a worthwhile read for researchers as well.

A toolkit for data transparency takes shape

This piece from Nature goes through considerations for developing a toolkit for computational reproducibility, which includes version control, scripting, computational notebooks, and containerization — all topics we'll be covering over the course of the workshop.


Teaching Resources


Open and Reproducible Research Lecture Materials

GitHub repository with lecture slides from Burke Squires' opening lecture, which gives an overview of open science and what is meant by reproducible research.

Jupyter and Anaconda Training

GitHub repository with resources for Jupyter notebooks and Anaconda, from the Jupyter breakout session led by Burke Squires.

Git and Version Control Tutorial

GitHub repository with lecture notes and resources covering the fundamentals of version control, Git, and GitHub, from the Version Control breakout session led by Keith Hughitt.

Containerization for Reproducible Bioinformatics Research

GitHub repository with Steve Tsang's presentation on containerization and Docker, with resources on how to get started with Docker.
