RNA regulation plays important roles in shaping cell-specific gene expression patterns and organismal development. It diversifies genetic output and provides multiple regulatory layers at which gene expression can be finely controlled. On one hand, alternative splicing and RNA editing greatly expand the number and range of functions encoded by protein-coding genes; more importantly, evolutionary changes in alternative splicing and RNA editing at the transcriptome level underlie structural and regulatory differences associated with species-specific characteristics. On the other hand, the multifaceted transcriptome has become even more complex with the discovery of pervasive transcription of noncoding RNAs (ncRNAs), leading to a new appreciation of transcriptional regulation, especially from noncoding regions. With technological improvements and the application of integrated methodologies, significant progress has been made in uncovering new noncoding molecules and gaining new insights into their functions. Specifically, our group combines high-throughput sequencing with genetic, biochemical, and bioinformatic strategies to identify and characterize new isoforms of regulatory long noncoding RNAs (lncRNAs), developing novel computational pipelines/algorithms and high-throughput sequencing technologies at the whole-transcriptome level.
Recently, a variety of lncRNAs with clear characteristics distinguishing them from coding RNAs have been systematically identified in different tissues and species by massive transcriptome analyses. These analyses rely on high-throughput technologies (including tiling arrays and RNA-seq) that offer high coverage, high sensitivity, and high efficiency, marking a major advance in the methodology for lncRNA characterization. Our recent studies have identified and characterized new types of poly(A)– lncRNAs, including sno-lncRNAs and circular RNAs derived from excised introns or back-spliced exons, and these findings have opened new directions in the study of lncRNAs. Despite this progress, a number of important issues remain to be elucidated for a better understanding of lncRNAs. For example, what are the detailed mechanisms of poly(A)– lncRNA biogenesis? What are the landscapes of poly(A)– lncRNA expression in specific tissues/species, and how are specific expression repertoires connected to tissue- or species-specific functions? In addition, what kinds of miRNAs/proteins associate with specific structures/motifs of these poly(A)– lncRNAs to regulate their function?
To address these questions, we are carrying out a series of studies aimed at decoding the regulatory network of poly(A)– lncRNAs across species. First, we are further improving computational algorithms for poly(A)– lncRNA identification and function prediction in different species. Second, by applying high-throughput poly(A)– RNA sequencing and the aforementioned algorithms, we are in the process of systematically profiling various poly(A)– lncRNAs from different species and cell lines. In addition, we plan to knock down or knock out key enzymes known to be involved in lncRNA biogenesis in order to identify more poly(A)– lncRNAs across species. Moreover, we will predict and validate possible conserved/consensus motifs in poly(A)– lncRNAs across species to decipher their importance for lncRNA biogenesis during evolution. Furthermore, we plan to identify possible protein, RNA, and DNA partners that bind to specific poly(A)– lncRNAs by performing state-of-the-art binding assays, such as CHART (Capture Hybridization Analysis of RNA Targets) and ChIRP (Chromatin Isolation by RNA Purification). Finally, we have obtained human embryonic stem cells in which specific poly(A)– lncRNAs are repressed, and we are ready to address their potential functions by examining genome-wide changes in gene expression.
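To make the identification step above more concrete, the following is a minimal sketch of one way a transcript could be assigned to the poly(A)– fraction by comparing its normalized expression in paired poly(A)+ and poly(A)– libraries. It is an illustration under stated assumptions, not the pipeline used by the group: the `Transcript` class, the `classify` function, the category labels, and the thresholds (`min_expr`, `min_fold`, `pseudocount`) are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Transcript:
    name: str
    polya_plus: float   # normalized expression (e.g. FPKM) in the poly(A)+ library
    polya_minus: float  # normalized expression in the poly(A)- library


def classify(t: Transcript, min_expr: float = 1.0, min_fold: float = 2.0,
             pseudocount: float = 0.1) -> str:
    """Assign a transcript to a fraction; all thresholds are illustrative assumptions."""
    # Skip transcripts that are too lowly expressed in both libraries to call.
    if max(t.polya_plus, t.polya_minus) < min_expr:
        return "not_expressed"
    # Log2 ratio of poly(A)- to poly(A)+ signal; the pseudocount avoids division by zero.
    log_ratio = math.log2((t.polya_minus + pseudocount) / (t.polya_plus + pseudocount))
    threshold = math.log2(min_fold)
    if log_ratio >= threshold:
        return "polyA_minus"
    if log_ratio <= -threshold:
        return "polyA_plus"
    # Comparable signal in both libraries.
    return "both_fractions"


if __name__ == "__main__":
    # Made-up expression values, for illustration only.
    examples = [
        Transcript("candidate_1", polya_plus=0.5, polya_minus=20.0),
        Transcript("candidate_2", polya_plus=30.0, polya_minus=1.0),
        Transcript("candidate_3", polya_plus=10.0, polya_minus=12.0),
    ]
    for t in examples:
        print(t.name, classify(t))
```

In practice such a rule would only be a first filter; candidates would still need coding-potential assessment, replicate support, and experimental validation before being treated as poly(A)– lncRNAs.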