Trinity¶

The Eel Pond protocol uses the Trinity de novo transcriptome assembler to take short, trimmed/diginorm Illumina reads data and assemble (predict) full-length transcripts into a single fasta file output. Each contig in the fasta assembly file represents one unique transcript. Trinity is a single-ksize assembler, with a default of k = 25.

We recommend using kmer-trimmed reads (output of khmer) as input into Triniity to reduce dataset complexity without losing valuable kmers. The resulting output assembly fasta file can then be used to align the trimmed (not diginorm) short Illumina reads and quantify expression per transcript.

Note, the current version of Trininty (after 2.3.2) is configured to diginorm the input reads before assembly begins. Since we have already applied diginorm to our reads, the result will be a negligible decrease in read counts prior to the assembly. We provide options to disable this digital normalization via the config file, but applying diginorm twice is not really a problem. For data sets with large numbers of reads, applying diginorm as a separate step as we have via khmer may decrease the memory requirements needed by the Trinity pipeline.

The ID for each transcript is output (version 2.2.0 to current) as follows, where the TRINITY is constant, the DN2202 is an example of a variable contig/transcript ID, c stands for component, g gene and i isoform:

TRINITY_DN2202_c0_g1_i1

Trinity Command¶

On the command line, the command elvers runs is approximately:

Trinity --left left.fq \
  --right right.fq --seqType fq --max_memory 10G \
  --CPU 4

But we highly recommend you modify max_memory and CPU to fit your data and compute resources.

Quickstart¶

Run Trinity via the "default" Eel Pond workflow or via the assemble subworkflow. To run Trinity as a standalone program, see "Advanced Usage" section below.

Output files:¶

Your main output directory will be determined by your config file: by default it is BASENAME_out (you specify BASENAME).

Trinity will output files in the assembly subdirectory of this output directory. The fasta file will be BASENAME_trinity.fasta and the gene-trans map will be BASENAME_trinity.fasta.gene_trans_map.

Modifying Params for Trinity:¶

Be sure to set up your sample info and build a configfile first (see Understanding and Configuring Workflows).

To see the available parameters for the trinity rule, run

elvers config trinity --print_params

This will print the following:

  ####################  trinity  ####################
trinity:
  input_kmer_trimmed: true
  input_trimmomatic_trimmed: false
  add_single_to_paired: false # would you like to add the orphaned reads to the trinity assembly?
  max_memory: 30G
  seqtype: fq
  extra: ''
  #####################################################

In addition to changing parameters we've specifically enabled, you can modify the extra param to pass any extra trinity parameters, e.g.:

  extra: '--no_normalize_reads' # to turn off Trinity's digital normalization steps

Within the "default" Eel Pond workflow or the assemble subworkflow, these options enable you to choose kmer-trimmed, quality-trimmed, or raw sequencing data as input. We recommend using kmer-trimmed reads as input. If both input_kmer_trimmed and input_trimmomatic_trimmed are False, we will just use raw reads from the samples.tsv file.

See the Trinity documentation to learn more about these parameters. Be sure the modified lines go into the config file you're using to run elvers (see Understanding and Configuring Workflows).

Advanced Usage: Running Trinity as a standalone rule¶

You can run trinity as a standalone rule, instead of withing a larger elvers workflow. However, to do this, you need to make sure the input files are available.

For trinity, the default input files are kmer-trimmed input data (e.g. output of khmer).

If you've already done this, you can run:

elvers my_config trinity

If not, you can run the prior steps at the same time to make sure khmer can find these input files:

elvers my_config get_data trimmomatic khmer trinity

Snakemake rule¶

We wrote a Trinity snakemake wrapper to run Trinity.

For snakemake afficionados, see the Trinity rule on github.