Workflow pipelines¶
The importance of running reproducible and automated data analysis and Nextflow is gaining traction by a wide community of Life Sciences researchers and bioinformaticians to do this. Current tutorials often focus on writing your own basic pipelines, however it is not always necessary to reinvent the wheel. This tutorial and set of exercises will guide you through your first steps on how to use pipelines that are available on the internet.
Besides Nextflow, we would also like to stress the importance of other workflow languages like Snakemake and Common Workflow Language (CWL). A list of tutorials and workshop materials on the former is available in the existing materials page, for the latter we refer to the current efforts of the training materials being developed in the Carpentries style by the CWL community to teach workflow thinking to novices.
Learner’s profiles¶
Who are they?
Already have some background with bioinformatics data analysis
Linux users → they know how to use CLI
Familiarity with git / GitHub advantageous but not strictly required. They can take - for example - the dedicated Software Carpentry tutorial.
What problem are they having?
Big data, or multiple experiments with similar data and/or many data files → automate processes
Want to make their analysis reproducible
How will the tutorial/workshop help them?
Understand conceptually, Understand how to run, debugging - know what which
errors might mean what, resuming/retry, workdir, if process fails → where to find fails (.command./* ), .nextflow.pid
Reproduce failing process from workdir (maybe coupled to run from associated container)
Learner’s objectives¶
Prerequisites for the course/tutorial
Have an understanding of bioinformatics data analysis
Experience in CLI/Linux, feeling comfortable moving around in the Terminal
Define exercises with learning outcomes (measurable)*
Describe terminology/different concepts so you can explain to a developer what might need to be changed.
To be able to find and run different existing pipelines.
Do small trouble-shootings yourself (unrelated to the programming of the pipeline).
Navigate the
workdirand find specific output(files).
*Goals: “After following one of these tutorials, learners will be able to …”