Data use and access behavior in eScience: exploring data practices in the new data-intensive science paradigm (2011)

Abstract
Science may be entering its fourth paradigm, "data-intensive science". Relatively little attention has been paid to the users of scientific data, particularly their data practices. This dissertation endeavors to improve our understanding of data behavior in the new paradigm. Within the scope of the Sloan Digital Sky Survey (SDSS) project, I conducted two major lines of research: a content analysis of SDSS-related scientific publications to investigate astronomers' data use behavior, and a visual exploration analysis (VEA) of SDSS SQL query logs, supported by the design, implementation, and evaluation of an interactive visualization tool, SDSS Log Viewer. By integrating results from VEA and statistics, I conducted three case studies of SDSS log data to investigate users' data seeking behavior.

For astronomers' data usage behavior, I found that: (1) while a large volume of scientific data is produced in SDSS, researchers who rely on SDSS only intended to leverage the large number and use more data; (2) studies that leveraged a large volume of data from multiple data sources are relatively rare in the SDSS research domain; (3) using data collected by others, both data collection projects and other researchers, is a common data behavior in the SDSS research community; and (4) the results on the possibility of data reconstruction suggest that scientific publications by themselves are insufficient for linking scientific data with the data sources.

For users' data seeking behavior, I found that: (1) a small number of automatic query generators accounted for most of the query traffic (in terms of the number of queries) to the SDSS data archive, and six common categories of queries were identified; the number of query templates used by automatic query generators is small; (2) academic researchers, who are the target users of the SDSS data archive, issued a relatively large number of queries manually. Compared with the queries generated by automatic data requestors, the query templates used by this type of user are rather diverse in terms of both the sophistication of condition strings and the complexity of query structures. A possible learning hierarchy is observed in this user group; and (3) occasional passing-by users are large in number, but their behavior is still unclear.
Keywords
data behavior, common data behavior, data practice, new data-intensive science paradigm, SDSS data, data collection project, scientific data, data reconstruction, automatic data requestors, data source, access behavior, SDSS log data