Eel Pond Protocol Workflow

The Eel Pond protocol (which inspired the elvers name) consisted of line-by-line commands that users could follow with a test dataset provided in the instructions. We have re-implemented the protocol here so that de novo transcriptome assembly, annotation, and quick differential expression analysis of short-read Illumina data can be run with a single command. See more about this protocol here.

The "Eel Pond" Protocol for RNAseq consists of:

  • trimmomatic adapter and read quality trimming
  • fastqc read qc evaluation
  • khmer k-mer trimming and (optional) digital normalization
  • trinity de novo assembly
  • dammit annotation
  • salmon read quantification to the trinity assembly
  • deseq2 differential expression analysis

Running Test Data

This is the default workflow. To run:

elvers examples/nema.yaml

(You can be explicit and run the full default workflow with elvers examples/nema.yaml default)

This will run a small set of Nematostella vectensis test data (from Tulin et al., 2013).

Running Your Own Data

Set sample info and build a configfile first (see Understanding and Configuring Workflows).

To build a config, run:

elvers ep.yaml --build_config

The resulting ep.yaml configfile for this workflow will look something like this. The parameters may appear in a different order; this does not affect the order in which steps are run. See the documentation for each individual program (linked above) for the parameters you can modify.

  ####################  Eelpond Pipeline Configfile  ####################
basename: elvers
experiment: _experiment1
samples: samples.tsv ### PATH TO YOUR SAMPLE FILE GOES HERE

  ####################  assemble  ####################
get_data:
  download_data: false
khmer:
  C: 3
  Z: 18
  coverage: 20
  diginorm: true
  extra: ''
  ksize: 20
  memory: 4e9
trimmomatic:
  adapter_file:
    pe_name: TruSeq3-PE.fa
    pe_url: https://raw.githubusercontent.com/timflutre/trimmomatic/master/adapters/TruSeq3-PE-2.fa
    se_name: TruSeq3-SE.fa
    se_url: https://raw.githubusercontent.com/timflutre/trimmomatic/master/adapters/TruSeq3-SE.fa
  extra: ''
  trim_cmd: ILLUMINACLIP:{}:2:40:15 LEADING:20 TRAILING:20 SLIDINGWINDOW:4:15 MINLEN:35
trinity:
  add_single_to_paired: false
  extra: ''
  input_kmer_trimmed: true
  input_trimmomatic_trimmed: false
  max_memory: 30G
  seqtype: fq

  ####################  annotate  ####################
dammit:
  busco_group:
  - metazoa
  - eukaryota
  db_dir: databases
  db_extra: ''
sourmash:
  extra: ''

  ####################  quantify  ####################
salmon:
  input_trimmomatic_trimmed: true
  index_params:
    extra: ''
  quant_params:
    extra: ''
    libtype: A

  ####################  diffexp  ####################
deseq2:
  contrasts:
    time0-vs-time6:
    - time0
    - time6
  gene_trans_map: true
  pca:
    labels:
    - condition
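
When adapting this configfile to your own data, the deseq2 contrasts need to name conditions that actually occur in your sample file. The sketch below is one way to check that before launching a run. It is an illustration and not part of elvers: the samples.tsv layout (a tab-separated table with `sample` and `condition` columns), the sample names, and the helper functions `read_conditions` and `missing_conditions` are all assumptions made for this example.

```python
import csv
import io

# Hypothetical samples.tsv content. The column names (sample, condition) and
# the sample names are assumptions for illustration only.
SAMPLES_TSV = (
    "sample\tcondition\n"
    "0Hour_1\ttime0\n"
    "0Hour_2\ttime0\n"
    "6Hour_1\ttime6\n"
    "6Hour_2\ttime6\n"
)

# The deseq2 "contrasts" section of the configfile above, as a plain dict.
CONTRASTS = {"time0-vs-time6": ["time0", "time6"]}


def read_conditions(tsv_text, column="condition"):
    """Return the set of distinct values in one column of a TSV table."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row[column] for row in reader}


def missing_conditions(contrasts, conditions):
    """Map each contrast name to the condition labels absent from the samples."""
    problems = {}
    for name, levels in contrasts.items():
        absent = [level for level in levels if level not in conditions]
        if absent:
            problems[name] = absent
    return problems


conditions = read_conditions(SAMPLES_TSV)
# An empty dict means every contrast refers only to known conditions.
print(missing_conditions(CONTRASTS, conditions))
```

A contrast that names a label missing from the sample file, for example `time12` here, would be reported under its contrast name, which makes a typo in either file easy to spot.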