Computational work is becoming increasingly important for modern biologists, who need to analyze the massive data sets generated by high-throughput sequencing techniques. Many biologists are now challenged to work with gigabytes of sequencing data generated in only hours by parallel sequencing machines. These data are usually produced as large text files, and Unix operating systems (including Linux) are particularly well suited to processing such files, especially when operated from the command line. Hence, the aim of this workshop is to introduce computational biology work on Linux to anyone who has never worked with the Linux command line before.
During the first weeks of the course, you will learn basic but powerful Linux commands to manage your folder structure, handle large files, and install and execute programs. Once you feel comfortable in the Linux command-line environment, you will apply your new knowledge to assemble high-throughput sequencing data into continuous genomes, verify the integrity of these genomes by sequence mapping, use search methods to identify gene regions, and use these regions for phylogenetic reconstruction, all on the Linux operating system. We’ll finish the course with a basic introduction to concepts of the Python programming language, which will allow you to script your personal bioinformatics routines.
You’ll learn how to:
- Operate Linux from the command line
- Install and execute Linux programs
- Work with the structure of large high-throughput sequencing data formats (e.g. FASTQ, SAM, VCF)
- Assemble and analyze a genome sequence (using SPAdes and/or Velvet)
- Verify genome integrity by sequence mapping (using Bowtie and SAMtools)
- Identify gene regions (using BLAST)
- Use the assembled sequences for phylogenetic reconstruction (using RAxML and/or PhyML)
- Understand basic concepts of programming with Python
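To give a flavor of the kind of short script the Python part of the course aims at, here is a minimal sketch that reads FASTQ records. It assumes the common layout of exactly four lines per record (an `@` header, the sequence, a `+` separator, and a quality string of the same length as the sequence). The names `FastqRecord`, `read_fastq`, and the two example reads are made up for illustration and are not taken from the course material.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    """One sequencing read (hypothetical helper type for this sketch)."""
    name: str
    sequence: str
    quality: str


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ stream, assuming four lines per record."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        # Basic sanity checks on the record layout
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


# Two invented reads, held in memory instead of a real file
EXAMPLE = "@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\nFFFF\n"

records = list(read_fastq(io.StringIO(EXAMPLE)))
for record in records:
    print(record.name, len(record.sequence))
```

Because the function yields one record at a time, the same loop would also work on a file opened with `open(...)` without loading the whole file into memory, which matters for files of several gigabytes.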
What you need to know about Linux:
- Nothing. Absolute beginners are more than welcome. (If you already know your way around the Linux command line or the Python programming language, there’s no need to attend this class.)
What you need to bring and prepare:
- Please bring a personal laptop with the CBFM Workshop Appliance installed, since this class will be more workshop than lecture
- Installation instructions for the CBFM Workshop Appliance and the course material will be made available on our website.
File location:
All files that we used in this class can be downloaded here.