Thursday, December 26, 2013

TruSeq library with a long and broad fragment length distribution

Recently, I constructed an Illumina RNA-Seq library with a non-standard fragment length. I used the TruSeq RNA Sample Prep Kit v2 and followed the protocol with one modification: I reduced the fragmentation time so that the fragments would be longer. The investigator who submitted the RNA sample preferred a longer library because they wanted to do de novo assembly of the transcriptome and hoped that the longer read length of the MiSeq platform (2x250 paired end) would help.

In the standard TruSeq protocol, mRNA is fragmented by incubation at 94 °C for 8 min. This time I used 1 min and got a Bioanalyzer profile like the figure below. Library length ranged from 250 to 1500 bp.
The TruSeq manual actually describes what such profiles look like, and my library profile matched. But I was still curious how this library would perform. I was not sure what size to use as the mean library length, which is needed to calculate the dilution when loading the library on the sequencer. Overloading makes the clusters too dense and lowers base-call quality, but enough library has to be loaded to generate enough data.

Fortunately, the run was great, although a little overloaded: cluster density above 1100 K/mm², 2x20M reads, 20 Gb of data in total.

After sequencing, I mapped the reads to a transcriptome reference using BWA-MEM. Since I did not know what species the sample came from, I ran several BLAST searches and picked a species with very high similarity. The library insert size (the TLEN field in the SAM output) was then calculated and plotted. Note that adapter ligation adds an extra ~120 bp to the insert fragment.
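As a rough illustration of the insert-size step described above, here is a minimal Python sketch that pulls TLEN out of SAM text. It is not the script used for the plots in this post; the function names, the choice to keep only properly paired primary alignments, and the use of the median are all my assumptions. The 120 bp adapter length is the approximate figure quoted above.

```python
import statistics

ADAPTER_BP = 120  # approximate adapter length added by ligation, as quoted in the post


def insert_sizes(sam_lines):
    """Collect one positive TLEN per properly paired, primary alignment pair."""
    sizes = []
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # SAM has 11 mandatory columns
            continue
        flag = int(fields[1])
        tlen = int(fields[8])  # column 9 is TLEN
        if not flag & 0x2:  # keep properly paired reads only
            continue
        if flag & 0x900:  # skip secondary (0x100) and supplementary (0x800)
            continue
        if tlen > 0:  # the leftmost mate carries the positive value, so each pair counts once
            sizes.append(tlen)
    return sizes


def summarize(sizes):
    """Median insert size, and the matching library size including adapters."""
    median_insert = statistics.median(sizes)
    return {
        "pairs": len(sizes),
        "median_insert": median_insert,
        "median_library": median_insert + ADAPTER_BP,
    }
```

To run it on a real alignment, pass an open file handle, for example insert_sizes(open("aln.sam")), where aln.sam is a hypothetical file name for the BWA-MEM output. The returned list can then be fed to any plotting tool for a histogram.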
Code:

What these results tell us:
(1) The mapping results show that the majority of library inserts are shorter than 500 bp. Shorter library fragments (insert + adapter) have a higher chance of forming clusters and getting sequenced. This has been reported before.
(2) For a library with such a long and broad fragment length distribution, 2x250 PE sequencing does not seem necessary, because 2x150 PE is enough to cover the whole insert for most of the library.
(3) Does it still make sense to prepare a long RNA-Seq library? What are the pros and cons? Hmm... a comparison with a short library is needed to reach a conclusion.
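Point (2) can be checked with simple arithmetic: a paired-end run reads an insert end to end only if the insert is no longer than twice the read length. The sketch below, in Python, computes that fraction for a list of insert sizes. The function name and the example sizes are made up for illustration and are not data from this run.

```python
def fraction_fully_covered(insert_sizes, read_length):
    """Fraction of inserts no longer than two reads laid end to end."""
    if not insert_sizes:
        raise ValueError("insert_sizes is empty")
    limit = 2 * read_length  # longest insert that two reads can span without a gap
    covered = sum(1 for size in insert_sizes if size <= limit)
    return covered / len(insert_sizes)


# Hypothetical insert sizes in bp, for illustration only.
example_sizes = [200, 280, 300, 420, 650]
covered_150 = fraction_fully_covered(example_sizes, 150)  # inserts up to 300 bp
covered_250 = fraction_fully_covered(example_sizes, 250)  # inserts up to 500 bp
```

Running this on the TLEN values from the real alignment would show how much 2x250 actually gains over 2x150 for this library.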
To end this post, let me make it a bit longer by dumping a few more ugly plots here ...
Code:
