mRNA Multiple Subcellular Localization Prediction by Employing Random Forest in Conjunction with Feature Selection via Elastic Net

Rupak Chandra Bhowmick, Nahin Ul Sadad, Md. Nazrul Islam Mondal, Md. Al Mehedi Hasan

2023 26th International Conference on Computer and Information Technology (ICCIT) (2023)

Abstract
The localization of messenger RNAs (mRNAs) is of utmost importance in cellular growth and development. Specifically, it plays a significant role in the regulation of spatio-temporal gene expression. In situ hybridization is an experimental technique that holds promise for determining mRNA localization, but it is costly and labor-intensive. We developed a method for predicting the localization of mRNA sequences, covering data from nine distinct cellular localizations. First, each sequence was converted into a numerical feature vector of dimension 5460 using k-mer features for k from 1 to 6 (4 + 16 + 64 + 256 + 1024 + 4096 = 5460). The Elastic Net statistical model was then employed to identify the significant features among these 5460 k-mer features. Finally, Random Forest, a supervised learning method, was used to predict the localizations from the selected features. The cytoplasm, cytosol, endoplasmic reticulum, exosome, mitochondrion, nucleus, pseudopodium, posterior, and ribosome exhibited accuracies of 88.61%, 87.78%, 96.24%, 95.29%, 99.60%, 83.30%, 99.34%, 99.26%, and 88.87%, respectively. On test set-1, the respective localizations achieved accuracies of 72.90%, 80.27%, 76.23%, 68.09%, 88.93%, 80.27%, 94.85%, 54.60%, and 74.73%. On test set-2, they achieved accuracies of 88.47%, 87.43%, 86.05%, 88.74%, 93.88%, 87.13%, 97.54%, 89.66%, and 84.60%. The developed method also showed higher accuracies than currently available localization prediction tools.
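The k-mer encoding described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the names `KMERS`, `INDEX`, and `kmer_features` are hypothetical, and the sketch assumes an A/C/G/T alphabet (with U mapped to T), per-k frequency normalization, and skipping of windows that contain ambiguous bases, none of which the abstract specifies. It only shows how k-mers of sizes 1 to 6 yield a 5460-dimensional vector.

```python
from itertools import product

ALPHABET = "ACGT"  # assumption: cDNA-style alphabet; U is mapped to T below
MAX_K = 6

# All k-mers for k = 1..6 in a fixed order:
# 4 + 16 + 64 + 256 + 1024 + 4096 = 5460 features.
KMERS = ["".join(p) for k in range(1, MAX_K + 1)
         for p in product(ALPHABET, repeat=k)]
INDEX = {kmer: i for i, kmer in enumerate(KMERS)}


def kmer_features(seq):
    """Return a 5460-dimensional list of k-mer frequencies for k = 1..6.

    Assumption: frequencies are normalized separately for each k.
    """
    seq = seq.upper().replace("U", "T")
    vec = [0.0] * len(KMERS)
    for k in range(1, MAX_K + 1):
        counts = {}
        # Slide a window of width k over the sequence.
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if kmer in INDEX:  # skip windows with ambiguous bases such as N
                counts[kmer] = counts.get(kmer, 0) + 1
        total = sum(counts.values())
        for kmer, c in counts.items():
            vec[INDEX[kmer]] = c / total
    return vec


if __name__ == "__main__":
    features = kmer_features("AUGGCCAUUGUAAUGGGCCGC")
    print(len(features))
```

Vectors produced this way could then be passed to a feature selector and a classifier, as the abstract outlines; those steps are omitted here because the abstract does not give their settings.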
Keywords
Bioinformatics, Computational biology, Subcellular Localization, mRNA, SMOTE