RNA-sequencing (RNA-seq) is a powerful technology for transcriptome profiling. While most RNA-seq projects focus on gene-level quantification and analysis, there is growing evidence that most mammalian genes are alternatively spliced to generate different isoforms, which can subsequently be translated into proteins with diverse or even opposing biological functions. Quantifying the expression levels of these isoforms is key to understanding the genes' biological functions in healthy tissues and in the progression of diseases. Among the open source tools developed for isoform quantification, Salmon, Kallisto, and RSEM are recommended on the basis of previous systematic evaluations using both experimental and simulated RNA-seq datasets. However, isoform quantification in practical RNA-seq data analysis must deal with many QC issues, such as the abundance of rRNAs in mRNA-seq, the efficiency of globin RNA depletion in whole blood samples, and potential sample swapping. To overcome these practical challenges, QuickIsoSeq was developed for large-scale RNA-seq isoform quantification along with QC. In this chapter, we describe the pipeline and detail the steps required to deploy it and use it to analyze RNA-seq datasets in practice. The QuickIsoSeq package can be downloaded from https://github.com/shanrongzhao/QuickIsoSeq.

Statistical modeling of count data from RNA sequencing (RNA-seq) experiments is important for proper interpretation of results. Here I will describe how count data can be modeled using count distributions, or alternatively analyzed using nonparametric methods. I will focus on basic routines for performing data input, scaling/normalization, visualization, and statistical testing to determine sets of features where the counts reflect differences in gene expression across samples.
Finally, I discuss limitations and possible extensions to the models presented here.

RNA-Seq has become the de facto standard technique for characterization and quantification of transcriptomes, and a large number of methods and tools have been proposed to model and detect differential gene expression based on the comparison of transcript abundances across different samples. However, state-of-the-art methods for this task are usually designed for pairwise comparisons; that is, they can identify significant variation of expression only between two conditions or samples. We describe the use of RNentropy, a methodology based on information theory, devised to overcome this limitation. RNentropy can detect significant variations of gene expression in RNA-Seq data across any number of samples and conditions, and can be applied downstream of any analysis pipeline for the quantification of gene expression from raw sequencing data. RNentropy takes as input gene (or transcript) expression values, defined with any measure suitable for the comparison of transcript levels across samples and conditions. The output consists of the genes (or transcripts) exhibiting significant variation of expression across the conditions studied, together with the samples in which they are found to be over- or underexpressed. RNentropy is implemented as an R package and is freely available from the CRAN repository. We provide a detailed guide to the functions and parameters of the package, with usage examples that demonstrate the software's capabilities and show how it can be applied to the analysis of single-cell RNA sequencing data.

RNA structure is a key player in regulating a plethora of biological processes. A large part of the functions carried out by RNA is mediated by its structure. For this reason, considerable effort has gone over the last decade into the development of new RNA probing methods based on Next-Generation Sequencing (NGS), aimed at the rapid transcriptome-scale interrogation of RNA structures.
In this chapter we describe RNA Framework, to date the most comprehensive toolkit for the analysis of NGS-based RNA structure probing experiments. Using two published datasets, we illustrate how to use the different components of RNA Framework and how to choose the analysis parameters according to the experimental setup.

RNA molecules play important roles in almost every cellular process, and their functions are mediated by their sequence and structure. Determining the secondary structure of RNAs is central to understanding RNA function and evolution. RNA structure probing techniques coupled to high-throughput sequencing make it possible to determine structural features of RNA molecules at transcriptome-wide scales. Our group recently developed a novel Illumina-based implementation of in vitro parallel probing of RNA structures called nextPARS. Here, we describe a protocol for the computation of the nextPARS scores and their use to obtain the structural profile (single- or double-stranded state) of an RNA sequence at single-nucleotide resolution.

RNA primary and secondary motif discovery is an important step in the annotation and characterization of unknown interaction dynamics between RNAs and RNA-Binding Proteins, and several methods have been developed to meet the need for fast and efficient discovery of interaction motifs. Recent advances have increased the amount of data produced by experimental assays, and no available method is suitable for the analysis of all types of results. Here we present a simple workflow to help choose the most appropriate method, depending on the starting situation, among the three algorithms that best cover the landscape of approaches. A detailed analysis is presented to highlight the need for different algorithms in different working settings.
In conclusion, the proposed workflow depends on the nature of the starting data and on the availability of RNA annotations.

Modeling the three-dimensional structure of RNAs is a milestone toward better understanding and prediction of the molecular functions of nucleic acids. Physics-based approaches and molecular dynamics simulations are not tractable on large molecules with all-atom models. To address this issue, coarse-grained models of RNA three-dimensional structures have been developed. In this chapter, we describe a graphical modeling approach based on the Leontis-Westhof extended base pair classification. This representation of RNA structures enables us to identify highly conserved structural motifs with complex nucleotide interactions in structure databases. We show how to take advantage of this knowledge to quickly predict three-dimensional structures of large RNA molecules and present the RNA-MoIP web server (http://rnamoip.cs.mcgill.ca), which streamlines the computational and visualization processes. Finally, we show recent advances in the prediction of local 3D motifs from sequence data with the BayesPairing software and discuss its impact on complete 3D structure prediction.
