Short paired-end reads trump long single-end reads for gene expression analyses.

The “gold standard” for gene-level expression analysis in genome-enabled species has been almost exclusively moderate-length single-end (SE) reads; just see how many mouse studies have used 1×75 as an RNA-seq strategy. @timsackton and I had an intuition that short paired-end (PE) reads should achieve greater alignment specificity than SE reads, and that this should lead to more robust expression estimates. Because sequencing on current Illumina instruments is priced per base, PE 2×40 reads can be obtained at a cost identical to SE 1×75. So, we pulled down sequencing data from SRA for 12 different projects with long paired-end reads, spanning a variety of model organisms with well-annotated genomes. We then evaluated the ability of 1×75 and 2×40 (as well as 1×125) to reproduce expression estimates and downstream differential expression results relative to the 2×125 “gold standard.” Lo and behold, 2×40 consistently outperformed 1×75 and in many cases outperformed 1×125. This paper came out at the end of 2020 in BMC Bioinformatics, with help from our co-author John Gaspar. [link]
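
To make the comparison concrete, here is a minimal Python sketch of how shorter layouts can be simulated from long paired-end reads by truncation. This is an illustration only, not the pipeline used in the paper: the names `trim_record`, `simulate_layout`, and `bases_per_fragment` are hypothetical, and it assumes each read is already parsed into a (header, sequence, quality) tuple and that shorter reads are made by keeping the first N bases.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical in-memory FASTQ record: (header, sequence, quality string).
FastqRecord = Tuple[str, str, str]


def trim_record(record: FastqRecord, length: int) -> FastqRecord:
    """Keep only the first `length` bases (and matching qualities) of a read."""
    header, seq, qual = record
    return header, seq[:length], qual[:length]


def simulate_layout(
    pairs: Iterable[Tuple[FastqRecord, FastqRecord]],
    length: int,
    paired: bool,
) -> Iterator[Tuple[FastqRecord, ...]]:
    """Derive a shorter layout from long paired-end reads.

    paired=True  yields (read1, read2), both trimmed (e.g. 2x40).
    paired=False yields (read1,) only, trimmed (e.g. 1x75).
    """
    for read1, read2 in pairs:
        trimmed1 = trim_record(read1, length)
        if paired:
            yield trimmed1, trim_record(read2, length)
        else:
            yield (trimmed1,)


def bases_per_fragment(length: int, paired: bool) -> int:
    """Sequenced bases per fragment, the quantity that per-base pricing scales with."""
    return length * (2 if paired else 1)


if __name__ == "__main__":
    # One made-up 2x125 read pair, for illustration only.
    long_pair = (
        ("@frag1/1", "A" * 125, "I" * 125),
        ("@frag1/2", "C" * 125, "I" * 125),
    )
    pe40 = next(simulate_layout([long_pair], 40, paired=True))
    se75 = next(simulate_layout([long_pair], 75, paired=False))
    print(len(pe40), len(pe40[0][1]), bases_per_fragment(40, True))
    print(len(se75), len(se75[0][1]), bases_per_fragment(75, False))
```

Under these assumptions, 2×40 and 1×75 come out at a comparable number of sequenced bases per fragment (80 versus 75), which is the sense in which the two strategies can be compared at similar cost.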
