Transcriptome analysis using RNA-Seq from experiments with and without biological replicates: a review
Abstract
The discovery of nucleic acids opened new frontiers of knowledge, enabling researchers to access an enormous amount of data through large-scale sequencing methodologies and bioinformatics tools. Among these new possibilities, RNA-Seq has been used to identify and quantify RNA molecules. To obtain more accurate biological answers from RNA-Seq data, several points should be considered, such as experimental design, type of library synthesized, size of the fragments generated, number of biological replicates, sequencing depth and coverage, availability of the species genome, and the choice of software to properly perform the computational analyses. Accurate bioinformatics analyses allow the selection of genes with a lower error rate, increasing the success rate of validation via RT-qPCR and thus reducing costs. The objective of this review was to present the stages of RNA-Seq data analysis, from experimental design to systems biology, considering relevant points, as well as to point out some of the software currently available to carry out these analyses. In addition, with this review we aimed to help the academic community understand all steps and biases involved in RNA-Seq data analysis, from experiments with or without biological replicates.
Copyright (c) 2021 Mayla Daiane Correa Molinari, Renata Fuganti-Pagliarini, Jéssika Angelotti Mendonça, Daniel de Amorim Barobosa, Daniel Rockenbach Marin, Liliane Mertz-Henning, Alexandre Lima Nepomuceno
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Authors retain copyright and grant the Journal the right of first publication. Authors are encouraged to self-archive a version of their article in their institutional repository, or as a book chapter, as long as acknowledgement is given to the original source of publication. As the Journal provides open access to its publications, articles may not be used for commercial purposes. The contents published are the sole and exclusive responsibility of their authors; however, the publishers may make textual adjustments, adapt the text to publishing standards, and correct spelling and grammar to maintain the standards of the language and the journal. Failure to comply with this commitment will subject offenders to sanctions and penalties under Brazilian legislation (Law of Copyright Protection, No. 9,610, 19 February 1998).