Get Reference Utility Rule

For elvers workflows that start with or otherwise use a reference, the get_reference keyword can be used to provide that reference (and, for transcriptomes, an optional tab-separated gene-to-transcript map). If you provide the get_reference keyword and its parameters in your configuration file, the get_reference utility rule will either download or softlink your reference into your elvers directory so that it can be used by any specified workflows.

Specify Input Reference

The default get_reference parameters are as follows:

get_reference:
  reference: examples/nema.assembly.fasta
  gene_trans_map: examples/nema.assembly.fasta.gene_trans_map
  reference_extension: _input
  download_ref: false # download the reference using http (or ftp)
  use_ftp: false # switch download method from http to ftp

(To see these, you can run elvers config get_reference --print_params.)
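
As a rough sketch of the download case, the fragment below reuses only the parameter names from the defaults above. It assumes that the reference parameter takes the URL when download_ref is true, and the URL itself is a hypothetical placeholder rather than a real location.

```yaml
get_reference:
  # hypothetical URL (assumption): replace with the real location of your reference
  reference: https://example.org/path/to/my_transcriptome.fasta
  reference_extension: _input
  download_ref: true   # download the reference instead of softlinking a local file
  use_ftp: false       # keep false for http; set to true for an ftp location
```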

The default filenames correspond to a test Trinity transcriptome we provide for the nema test data. To use your own files, specify the path to your fasta file via the reference parameter. If you have a gene-to-transcript map (transcriptomes only), please specify it as well via gene_trans_map. If not, do not include the gene_trans_map line in your config.

The reference_extension parameter is optional and allows elvers to work with multiple references or assemblies. For example, you can build a de novo transcriptome with Trinity, input a reference transcriptome, and run all downstream steps (e.g. quantification and differential expression) on both assemblies. All assemblies generated via elvers rules will have an extension corresponding to the program they were generated with. If you choose to add a reference_extension, feel free to use _input, as specified above, or pick something equally simple yet more informative, but please don't use any underscores (_) beyond the leading one in this extension!
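
For illustration, a minimal config for a local reference with no gene-to-transcript map might look like the sketch below. The file path and the _ref extension are hypothetical. The extension is only an example of a value with a single leading underscore.

```yaml
get_reference:
  reference: /path/to/my_reference.fasta   # hypothetical local path to your fasta file
  reference_extension: _ref                # hypothetical extension; no additional underscores
  # gene_trans_map is omitted because no gene-to-transcript map is available
```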

Output Files

The output of the get_reference step is your reference (and optional gene-transcript map) copied into a subdirectory (reference) within your output directory (BASENAME_out).