FADU: A Feature Counting Tool for Prokaryotic RNA-Seq Analysis

bioRxiv (2018)

Abstract
Motivation: The major algorithms for quantifying transcriptomics data for differential gene expression analysis were designed for data from human or human-like genomes, specifically those with single-gene transcripts and distinct transcriptional boundaries that extend beyond the coding sequence (CDS), as identified through expressed sequence tags (ESTs) or EST-like sequence data. Some eukaryotic genomes and all, or nearly all, bacterial genomes require alternate methods of quantification because they lack annotation of transcriptional boundaries with EST or EST-like data, have overlapping transcriptional boundaries, and/or have polycistronic transcripts.

Results: An algorithm was developed and tested that better quantifies transcriptomics data for differential gene expression analysis in organisms with overlapping transcriptional units and polycistronic transcripts. Using data from standard libraries originating from Escherichia coli and Ehrlichia chaffeensis, and strand-specific libraries from the Wolbachia endosymbiont wBm, FADU can derive counts for genes that are missed by HTSeq and featureCounts. Using the default parameters with the E. coli data, FADU can detect transcription of 51 more genes than HTSeq in union mode and 21 more genes than featureCounts, with 42 and 18 of these features being ≤ 300 bp, respectively. Because FADU derives counts for otherwise unrepresented genes without overstating their abundance, we believe it to be an improved tool for quantifying transcripts in prokaryotic systems for RNA-Seq analyses.

Availability and implementation: FADU is available at https://github.com/adkinsrs/FADU. FADU was implemented using Python3 and requires the PySAM module (version 0.12.0.1 or later).
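The comparison reported in the Results (genes detected by one tool but missed by another, and how many of those are ≤ 300 bp) can be illustrated with a minimal Python sketch. This is not FADU's algorithm or any tool's API. The function names, the toy count tables, the gene lengths, and the detection threshold of one count are all hypothetical and chosen only for illustration, since the abstract does not state how "detected" was defined.

```python
def genes_detected_only_by(counts_a, counts_b, min_count=1):
    """Return genes with at least min_count in counts_a but not in counts_b.

    min_count is an assumed detection threshold, not one taken from the paper.
    """
    detected_a = {gene for gene, count in counts_a.items() if count >= min_count}
    detected_b = {gene for gene, count in counts_b.items() if count >= min_count}
    return detected_a - detected_b


def short_genes(genes, lengths, max_len=300):
    """Return the subset of genes whose annotated length is <= max_len bp."""
    return {gene for gene in genes if lengths[gene] <= max_len}


# Hypothetical per-gene counts from two quantification tools.
tool_a_counts = {"geneA": 12.5, "geneB": 3.0, "geneC": 0.0, "geneD": 7.0}
tool_b_counts = {"geneA": 10, "geneB": 0, "geneC": 0, "geneD": 6}

# Hypothetical gene lengths in base pairs.
gene_lengths = {"geneA": 1200, "geneB": 240, "geneC": 900, "geneD": 300}

only_a = genes_detected_only_by(tool_a_counts, tool_b_counts)
short_only_a = short_genes(only_a, gene_lengths)

print(f"Detected only by tool A: {sorted(only_a)}")
print(f"Of these, <= 300 bp: {sorted(short_only_a)}")
```

With real data, the two dictionaries would be filled from each tool's output table for the same annotation, and the threshold would need to match whatever definition of detection the analysis uses.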