NSURP Research Project 2020

The materials in this repository are designed to help you learn bioinformatics techniques while working through a metagenomics project using publicly available data. If you see a mistake or something is unclear, please submit an issue.

During this project, you will learn how to:

  • keep a detailed lab notebook
  • interact with a high-performance computing (HPC) cluster (we'll use Farm)
  • install and manage software environments using conda
  • download sequencing data and other files from the internet and public databases
  • interpret and use different file formats in bioinformatics and computing
  • conduct quality analysis and control for sequencing data
  • determine the taxonomic composition of sequencing reads
  • quickly compare large sequencing datasets
  • build reproducible workflows using snakemake
  • document workflows using git and GitHub
  • troubleshoot errors during your analysis

Most of the work in this project will be completed on Farm, which you will access from your own computer using an SSH client. If you are using a Mac or running Linux, your computer already comes with a program (e.g. Terminal on Mac) that can serve as an SSH client. If you are on a computer running Windows 10, you can install the Ubuntu subsystem (Windows Subsystem for Linux). Otherwise, please follow the instructions for Windows found at this link.