Guided visual exploration of genomic stratifications in cancer

Nature Methods (2014)

Abstract
To the Editor: Cancer is a heterogeneous disease, and molecular profiling of tumors from large cohorts has enabled characterization of new tumor subtypes. This is a prerequisite for improving personalized treatment and ultimately achieving better patient outcomes. Potential tumor subtypes can be identified with methods such as unsupervised clustering [1] or network-based stratification [2], which assign patients to sets based on high-dimensional molecular profiles. Detailed characterization of identified sets and their interpretation, however, remain a time-consuming exploratory process. To address these challenges, we combined 'StratomeX' [3], an interactive visualization tool that is freely available at http://www.caleydo.org/, with exploration tools to efficiently compare multiple patient stratifications, to correlate patient sets with clinical information or genomic alterations and to view the differences between molecular profiles across patient sets. Although we focus on cancer genomics here, StratomeX can also be applied in other disease cohorts.
Thousands of patient stratifications can be derived from large cancer genomics datasets. This space of patient stratifications, which we call the 'stratome', contains stratifications based on, for example, clustering of mRNA, microRNA or protein expression matrices; the mutation or copy number status of genes; or clinical variables. Owing to the size of the stratome and the heterogeneity of the underlying datasets, integration of computational and visual approaches is indispensable to the analyst in identifying biologically or clinically meaningful stratifications as well as clinical parameters and pathways that together provide a comprehensive view of each patient set. StratomeX complements the network viewers, heat maps and genome browsers typically used in cancer genomics [4] (Supplementary Discussion and Supplementary Table 1).
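As an illustration only, and not a description of how StratomeX stores its data, a stratification can be thought of as a partition of patient identifiers into named sets, and the stratome as a collection of such partitions. The sketch below assumes hypothetical patient IDs, a hypothetical `stratify` helper and toy values for a gene's mutation status and a clinical variable.

```python
from collections import defaultdict


def stratify(patient_values):
    """Group patient IDs into named sets by a categorical value (hypothetical helper)."""
    sets = defaultdict(set)
    for patient, value in patient_values.items():
        sets[value].add(patient)
    return dict(sets)


# Toy, made-up data: mutation status of one gene and one clinical variable.
mutation_status = {"P1": "mutated", "P2": "wild type", "P3": "mutated", "P4": "wild type"}
tumor_stage = {"P1": "I", "P2": "II", "P3": "II", "P4": "I"}

# The "stratome" here is simply a dictionary of stratifications keyed by name.
stratome = {
    "gene mutation status": stratify(mutation_status),
    "tumor stage": stratify(tumor_stage),
}
```

Each entry maps a stratification name to its patient sets, so every patient appears in exactly one set per stratification.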
To visualize the relationships between multiple patient stratifications as well as other data (Fig. 1 and Supplementary Fig. 1), stratifications are represented as columns of stacked blocks where each block corresponds to a patient set. Blocks contain visualizations of the data associated with those patients, such as heat maps, pathway maps overlaid with expression data or survival plots (Supplementary Fig. 2). Bands connecting the blocks show the pairwise overlap of sets in adjacent stratifications, with the width of the bands representing the size of the overlap relative to the size of the patient sets (Supplementary Fig. 3). This visualization is an efficient tool to confirm hypotheses about gene functions or subtypes defined by molecular profiles.
StratomeX also integrates a computational framework for query-based guided exploration of the stratome directly into the visualization (Fig. 1), which enables discovery of novel relationships between patient sets and efficient generation and refinement of hypotheses about tumor subtypes. A 'query wizard' provides step-by-step instructions (Supplementary Figs. 1 and 4) for defining queries, and a range of computational methods are used to generate rankings (Supplementary Methods). The queries yield a score for each stratification, for example, based on their overlap with a particular patient set or based on their overall similarity to a selected stratification. Furthermore, the analyst can query the collection for stratifications that contain patient sets that exhibit differences in survival or differential regulation of pathways. We use 'LineUp' [5], a multi-attribute ranking technique, to visualize the results of these queries and to show which stratifications or pathways score highly (Fig. 1 and Supplementary Fig. 5).
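The band widths and the overlap-based query described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the scoring used by StratomeX (those methods are given in the Supplementary Methods): the Jaccard index is assumed here as one plausible overlap score, and the function names, patient IDs and stratification names are all hypothetical.

```python
def band_widths(set_a, set_b):
    """Overlap of two patient sets, expressed relative to the size of each set."""
    shared = len(set_a & set_b)
    return (
        shared / len(set_a) if set_a else 0.0,
        shared / len(set_b) if set_b else 0.0,
    )


def jaccard(set_a, set_b):
    """Jaccard index: shared patients divided by all patients in either set."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def rank_by_overlap(query_set, stratome):
    """Score each stratification by its best-matching patient set, highest first."""
    scores = {
        name: max((jaccard(query_set, s) for s in sets.values()), default=0.0)
        for name, sets in stratome.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy, made-up data: a query patient set and two candidate stratifications.
query = {"P1", "P2", "P3"}
stratome = {
    "mRNA clusters": {"c1": {"P1", "P2", "P3", "P4"}, "c2": {"P5", "P6"}},
    "gene X mutation": {"mutated": {"P1", "P5"}, "wild type": {"P2", "P3", "P4", "P6"}},
}

ranking = rank_by_overlap(query, stratome)
widths = band_widths(query, stratome["mRNA clusters"]["c1"])
```

In this toy case the clustering-based stratification ranks first because one of its sets contains all three query patients; a ranked list of this kind is what a view such as LineUp would then display.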
The tight integration between the StratomeX and LineUp views, as well as the dynamic computation of scores, is essential for rapid identification of meaningful relationships between stratifications, clinical parameters and pathways. We demonstrate the effectiveness of StratomeX in a case study (Supplementary Note, Supplementary Figs. 6–18, Supplementary Tables 2 and 3, Supplementary Data 1–4 and Supplementary Video 1) in which we explored molecular and clinical data to characterize tumor subtypes in a cohort of over 400 clear cell renal cell carcinoma cases reported by The Cancer Genome Atlas consortium [6].
M.S., A.L. and N.G. jointly conceived the project and designed the methods. M.S., A.L. and N.G. wrote the manuscript with contributions from S.G. and guidance by P.J.P., H.P. and D.S. M.S. and A.L. led the implementation of the software. S.G. and C.P. contributed to the design and implementation. P.J.P. and N.G. developed requirements and use cases, and oversaw the project.
This work was supported by the US National Institutes of Health (U24 CA144025, U24 CA143845 and K99 HG007583), the Austrian Science Fund (J 3437-N15, P 22902), and the Air Force Research Laboratory and Defense Advanced Research Projects Agency grant FA8750-12-C-0300.
Keywords
Software, Cancer genomics, Data integration, Translational research