Normalization techniques for PARAFAC modeling of urine metabolomic data

Metabolomics (2016)

Abstract
Introduction: Urine is one of the body fluids most often used in metabolomics studies. Metabolite concentrations in urine depend on an individual's hydration status, which leads to dilution differences between samples. The data must therefore be normalized to correct for these differences. Two normalization techniques are commonly applied to urine samples before further statistical analysis. The first, AUC normalization, standardizes the area under the curve (AUC) of a group of signals with peaks within a sample to the median, mean, or another suitable representation of the amount of dilution. The second uses a specific end-product metabolite such as creatinine: all intensities within a sample are expressed relative to the creatinine intensity.

Objectives: Another way of looking at urine metabolomics data is to recognize that the ratios between peak intensities are the information-carrying features. This opens up the possibility of using a class of data analysis techniques designed for such ratios: compositional data analysis. The aim of this paper is to develop PARAFAC modeling of three-way urine metabolomics data in the context of compositional data analysis and to compare it with standard normalization techniques.

Methods: In the compositional data analysis approach, special coordinate systems are defined to deal with the ratio problem. In essence, this comes down to using distance measures other than the Euclidean distance used in the conventional analysis of metabolomic data.

Results: We illustrate this approach in combination with three-way methods (i.e., PARAFAC) on a longitudinal urine metabolomics study and two simulations. In both cases, the compositional approach shows an advantage in terms of improved interpretability of the scores and loadings of the PARAFAC model.

Conclusion: For urine metabolomics studies, we advocate the use of compositional data analysis approaches. They are easy to use, well established, and proven to give reliable results.
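
To make the three preprocessing ideas named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It rests on several assumptions that the abstract does not state: the total intensity of a sample stands in for its AUC, the centered log-ratio (clr) transform stands in for the "special coordinate systems" of compositional data analysis, and all intensities are strictly positive. The function names, the `creatinine_idx` parameter, and the toy matrix are hypothetical.

```python
import numpy as np

def auc_normalize(X, reference=np.median):
    """Scale each sample (row) so its total intensity equals a common reference.

    Assumption: the row sum is used as a stand-in for the area under the curve.
    """
    totals = X.sum(axis=1, keepdims=True)
    return X / totals * reference(totals)

def creatinine_normalize(X, creatinine_idx):
    """Express every intensity relative to the creatinine intensity of the same sample."""
    return X / X[:, [creatinine_idx]]

def clr(X, axis=-1):
    """Centered log-ratio transform along the metabolite axis.

    Assumption: clr is one common compositional coordinate system; the abstract
    does not say which one the paper uses. Requires strictly positive values.
    """
    log_x = np.log(X)
    return log_x - log_x.mean(axis=axis, keepdims=True)

# Hypothetical toy data: 2 samples x 3 metabolites.
# The second sample is the first one diluted twofold.
X = np.array([[4.0, 2.0, 2.0],
              [2.0, 1.0, 1.0]])

X_auc = auc_normalize(X)               # rows become identical after scaling
X_crea = creatinine_normalize(X, 0)    # column 0 plays the role of creatinine
X_clr = clr(X)                         # dilution factor cancels in the log-ratios
```

In this toy case all three transforms map the two samples to the same profile, because a pure dilution factor leaves the ratios between intensities unchanged. For a three-way array (for example subjects by metabolites by time points), the same `clr` call could be applied along the metabolite mode through its `axis` argument before fitting a PARAFAC model.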
Keywords
Parallel factor analysis (PARAFAC), Compositional data, Metabolomics, Creatinine, Area under the curve