Short paired-end reads trump long single-end reads for gene expression analyses.

The “gold standard” for gene-level expression analysis in genome-enabled species has been almost exclusively moderate-length single-end (SE) reads; just see how many mouse studies have used 1×75 as an RNA-seq strategy. @timsackton and I had an intuition that short paired-end (PE) reads should achieve greater alignment specificity than SE reads, and that this should lead to more robust expression estimates. Because sequencing costs on current Illumina instruments are per base, PE 2×40 reads could be obtained at a cost identical to SE 1×75. So, we pulled down sequencing data from the SRA for 12 different projects with long paired-end reads, spanning a variety of model organisms with well-annotated genomes. We then evaluated the ability of 1×75 and 2×40 (as well as 1×125) to reproduce expression estimates and downstream differential expression results relative to the 2×125 “gold standard.” Lo and behold, 2×40 consistently outperformed 1×75, and in many cases outperformed 1×125. This paper came out at the end of 2020 in BMC Bioinformatics, with help from our co-author John Gaspar. [link]
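For readers who want to try a comparison like this on their own data, here is a minimal sketch of how shorter read designs could be derived in silico from long paired-end reads by keeping only the first N bases of each read. This is an assumption about one plausible approach, not the pipeline used in the paper; the function names (`parse_fastq`, `trim_record`, `bases_per_fragment`) are hypothetical, and the sketch assumes simple four-line FASTQ records. It also makes the per-base arithmetic explicit: 2×40 yields 80 bases per fragment versus 75 for 1×75.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical record type for this sketch: (header, sequence, quality)
FastqRecord = Tuple[str, str, str]


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield records from plain four-line FASTQ text (assumed format)."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # skip blank lines, e.g. at the end of a file
        seq = next(it).rstrip("\n")
        next(it)  # '+' separator line, ignored
        qual = next(it).rstrip("\n")
        yield header.rstrip("\n"), seq, qual


def trim_record(rec: FastqRecord, length: int) -> FastqRecord:
    """Keep only the first `length` bases and their quality scores."""
    header, seq, qual = rec
    return header, seq[:length], qual[:length]


def bases_per_fragment(reads_per_fragment: int, read_length: int) -> int:
    """Total sequenced bases per fragment for a given read design."""
    return reads_per_fragment * read_length


if __name__ == "__main__":
    # Toy 125-base read standing in for one mate of a 2x125 library.
    toy = ["@read1/1\n", "A" * 125 + "\n", "+\n", "I" * 125 + "\n"]
    for rec in parse_fastq(toy):
        short = trim_record(rec, 40)  # one mate of a simulated 2x40 design
        print(short[0], len(short[1]))
    print(bases_per_fragment(2, 40), bases_per_fragment(1, 75))
```

To simulate 2×40, both mate files would be trimmed to 40 bases; to simulate 1×75 or 1×125, only the first mate file would be trimmed and used.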

Adam H. Freedman

Evolutionary Genomics, Bioinformatics, Data Wrangling
