Connecting family trees to construct a population-scale and longitudinal geo-social network for the US

INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE (2021)

Abstract
We collected 92,832 user-contributed and publicly available family trees from rootsweb.com, including 250 million individuals who were born in North America and Europe between 1630 and 1930. We cleaned and connected the family trees to create a population-scale and longitudinal family tree dataset using a workflow of data collection and cleaning, geocoding, fuzzy record linkage, and a relation-based iterative search for connecting trees and deduplicating records. With a largest connected component of nearly 40 million individuals and a total of 80 million individuals, the result is, to date, the largest population-scale and longitudinal geo-social network spanning centuries. We evaluated the representativeness of the family tree dataset for historical population demography and mobility by comparing the data to the 1880 Census. Our results showed that the family trees were biased towards males, the elderly, farmers, and native-born white segments of the population. Individuals were highly mobile: in our 1880 sample of parent-child pairs where both were born in the U.S., 47% were born in different states. Our findings agreed with prior studies that people migrated from East to West in horizontal bands, and that this trend was reflected in the dialects and regional structure of the U.S.
Keywords
Spatial social networks, family trees, record linkage, migration, historical GIS
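
The linkage and tree-connection steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' workflow: the toy records, the `is_match` rule, its thresholds, and the `DisjointSet` helper are all assumptions made for illustration, and a plain pairwise comparison stands in for the paper's relation-based iterative search. The sketch merges near-duplicate person records across trees with a fuzzy name comparison, finds the largest connected component of the resulting parent-child network, and computes the share of parent-child pairs born in different states.

```python
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical toy records from three trees: (record id, name, birth year, birth state).
records = [
    ("t1:1", "John Smith", 1820, "Ohio"),
    ("t1:2", "Mary Smith", 1845, "Iowa"),
    ("t2:1", "Jon Smith", 1821, "Ohio"),
    ("t2:2", "William Smith", 1850, "Ohio"),
    ("t3:1", "Anna Keller", 1830, "New York"),
]
# Parent-child edges inside each tree: (parent id, child id).
parent_child = [("t1:1", "t1:2"), ("t2:1", "t2:2")]


class DisjointSet:
    """Union-find structure used to merge records and to connect people."""

    def __init__(self, items):
        self.parent = {x: x for x in items}

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)


def is_match(a, b, name_threshold=0.85, year_tolerance=2):
    """Assumed matching rule: similar name, close birth year, same birth state."""
    name_sim = SequenceMatcher(None, a[1].lower(), b[1].lower()).ratio()
    return (
        name_sim >= name_threshold
        and abs(a[2] - b[2]) <= year_tolerance
        and a[3] == b[3]
    )


# Step 1: fuzzy record linkage. Merge records that likely describe one person.
dedup = DisjointSet(r[0] for r in records)
for i, a in enumerate(records):
    for b in records[i + 1:]:
        if is_match(a, b):
            dedup.union(a[0], b[0])
persons = {dedup.find(r[0]) for r in records}

# Step 2: connect the deduplicated people through parent-child edges
# and measure the largest connected component.
network = DisjointSet(persons)
for parent_id, child_id in parent_child:
    network.union(dedup.find(parent_id), dedup.find(child_id))
component_sizes = Counter(network.find(p) for p in persons)
largest_component = max(component_sizes.values())

# Step 3: share of parent-child pairs born in different states.
by_id = {r[0]: r for r in records}
different = sum(by_id[p][3] != by_id[c][3] for p, c in parent_child)
different_state_share = different / len(parent_child)

print(len(persons), largest_component, different_state_share)
```

In this toy input the two "Smith" trees become connected only because the duplicate parent record is merged, which is the same mechanism by which deduplication joins separately contributed trees into one network. At the scale described in the abstract, comparing all pairs would be infeasible, so some form of blocking or candidate search would be needed in place of the nested loop.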