job2vec: Learning a Representation of Jobs∗


Job postings provide unique insights into the demand for skills, tasks, and occupations. Using the full text of millions of online job postings, we train and evaluate a natural language processing (NLP) model with over 100 million parameters to predict job postings’ occupation labels and salaries. To derive additional insights from the model, we develop a method of injecting deliberately constructed text snippets reflecting occupational content into postings. We apply this text injection technique to understand the returns to several information technology skills, including machine learning itself. We further extract measurements of the topology of the labor market, building a “jobspace” using the relationships learned in the text structure. Our measurements of the jobspace imply an expansion of the types of work available in the U.S. labor market from 2010 to 2019. We also demonstrate that this technique can be used to construct indices of occupational technology exposure, with an application to remote work. Moreover, our analysis shows that data-driven hierarchical taxonomies can be constructed from job postings to augment existing occupational taxonomies like the SOC (Standard Occupational Classification) system. Exploring the model structure further, we find that between 2010 and 2019, occupations became increasingly distinct from each other in their language, suggesting a rise in specialization of tasks in the economy. This trend is strongest for managerial, computer science, and sales occupations.

∗ 1 Stanford Digital Economy Lab, USA; 2 National Bureau of Economic Research, USA; 3 Wharton School, University of Pennsylvania, USA; 4 MIT Sloan, USA. We thank Google Cloud Platform (GCP) and Stanford’s Institute for Human-Centered Artificial Intelligence for supporting the computing necessary for this project. We also thank Nabeel Gillani, Lindsey Raymond, Dokyun Lee, Yonadav Shavit, participants of the Workshop on Information Systems and Economics (WISE), and the seminar participants at CMU Tepper, the University of Nebraska–Lincoln, and the Bureau of Labor Statistics for helpful comments. We thank Bledi Taska and Burning Glass Technologies for providing the job postings data, and Cary Sparrow and Greenwich.HR for providing the salary data that made this work possible. The Stanford Digital Economy Lab provided generous financial support.
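
The text injection idea described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not say where in a posting the snippet is placed (here it is appended), the names `injection_effect` and `toy_predict_log_salary` are hypothetical, and the toy keyword-based predictor merely stands in for the trained NLP model, which it does not resemble.

```python
from statistics import mean
from typing import Callable, Sequence


def injection_effect(
    predict: Callable[[str], float],
    postings: Sequence[str],
    snippet: str,
) -> float:
    """Mean change in the prediction when `snippet` is appended to each posting."""
    deltas = [predict(text + " " + snippet) - predict(text) for text in postings]
    return mean(deltas)


# Toy stand-in for a trained salary model (purely illustrative):
# a base log salary plus a fixed bump per keyword occurrence.
TOY_WEIGHTS = {"machine learning": 0.5, "python": 0.25}


def toy_predict_log_salary(text: str) -> float:
    lowered = text.lower()
    return 10.0 + sum(w * lowered.count(k) for k, w in TOY_WEIGHTS.items())


postings = [
    "Seeking a data analyst to build dashboards and reports.",
    "Warehouse associate needed for evening shifts.",
]
effect = injection_effect(
    toy_predict_log_salary,
    postings,
    "Experience with machine learning required.",
)
print(effect)
```

With a real model, `predict` would wrap the model's salary output, and the averaged difference would be read as the model-implied return to the injected skill across the chosen set of postings.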