The research, conducted by scientists from Applied Biosystems (ABI) and Australia's University of Queensland, was published in the latest issue of the journal Nature Methods and used ABI's SOLiD (supported oligo ligation detection) gene sequencer to analyse the vast collection of RNAs transcribed from a mouse genome.
According to the authors, the results will help researchers to identify the distinguishing genetic features of stem cells and to better understand how breakdowns in molecular pathways lead to complex diseases such as cancer.
Nearly the entire mammalian genome is transcribed into RNA, either as RNA molecules from genes that encode proteins or as non-coding RNAs that regulate the activity of genes.
The study profiled the RNA transcripts generated from the genomes of mouse embryoid body (EB) cells and embryonic stem cells (ESC), producing more than 10 billion bases of sequence data and revealing thousands of previously unknown RNA transcripts.
"For the first time we are starting to accumulate data sets that allow us to look at that entire complexity of all of the RNA present in a mammalian cell," said Dr Sean Grimmond, associate professor at the University of Queensland and senior author of the study.
"This finding demonstrates that a digital gene expression methodology performed with the SOLiD System is far superior to array profiling approaches in terms of having a higher sensitivity and being able to see more RNAs in a transcriptome."
The researchers used a method developed at the University of Queensland to construct short quantitative random RNA libraries (SQRL), which enabled them to distinguish RNAs transcribed from the coding (sense) strand from non-coding RNAs that reside on the anti-sense strand of double-stranded DNA.
Using the SQRL method, the researchers created random cDNA (complementary DNA) libraries that yielded sequence tags 25 to 35 base pairs long, each representing a particular RNA transcript.
The ability of the SOLiD System to accurately detect minute quantities of RNA transcripts and generate up to 240 million sequence tags per run enabled the researchers to rapidly perform a digital RNA expression analysis and obtain the exact number of RNA sequence tags generated from the genome of the different cell lines.
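The idea behind digital expression analysis, counting how many sequence tags map to each transcript and strand, can be illustrated with a minimal Python sketch. This is not the study's actual pipeline: the tag list, the transcript names (`geneA`, `geneB`) and the helper functions `count_tags` and `tags_per_million` are hypothetical, and the sketch assumes the tags have already been aligned to transcripts.

```python
from collections import Counter

# Hypothetical aligned tags as (transcript_id, strand) pairs.
# "+" stands for the sense strand, "-" for the anti-sense strand.
# These names and values are illustrative only, not data from the study.
aligned_tags = [
    ("geneA", "+"),
    ("geneA", "+"),
    ("geneA", "-"),
    ("geneB", "+"),
    ("geneA", "+"),
]


def count_tags(tags):
    """Count sequence tags per (transcript, strand) pair."""
    return Counter(tags)


def tags_per_million(counts):
    """Scale raw tag counts to tags per million so runs of different depth can be compared."""
    total = sum(counts.values())
    return {key: n * 1_000_000 / total for key, n in counts.items()}


counts = count_tags(aligned_tags)
tpm = tags_per_million(counts)

for (transcript, strand), n in sorted(counts.items()):
    print(transcript, strand, n, tpm[(transcript, strand)])
```

Because each tag is counted directly, expression is read off as a whole number per transcript and strand rather than inferred from a hybridisation signal, which is the contrast with array profiling drawn in the article.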
"Using the SQRL approach allowed us to discover RNA molecules that could not have been discovered using alternative methods such as array profiling," said Kevin McKernan, Applied Biosystems' senior director of scientific operations, and one of the co-authors of the study.
"For example, this method allowed us to discover thousands of new splice variants. Also, being able to capture information about which DNA strand, sense or anti-sense, contains specific RNA transcripts provides us with an important detail for gaining a better understanding of anti-sense regulation and how non-coding RNAs function."