A review of 1250 articles from Physical Review Physics Education Research using Natural Language Processing
ORAL · Invited
Abstract
Physics Education Research (PER) has changed a great deal over the last 20 years, both in scope and in scale. These changes are reflected in the development of Physical Review Physics Education Research (PRPER), which has become the leading journal for the field. To quantify and analyze these trends, we performed a deductive, automated review of the PRPER literature across its entire publication history using Natural Language Processing (NLP) techniques. First, we downloaded all articles published in PRST-PER (2005-2015) and PRPER (2016-present), approximately 1250 articles in total, and converted the text of each article into a high-dimensional vector known as an embedding. These embeddings encode the meaning of the text, so that texts with similar meanings lie closer to one another in the vector space. We then defined topic vectors by taking the mean of the embeddings of representative articles that all focus on a distinct research area in PER, such as problem-solving or conceptual understanding. We measured the distance between each article and every topic vector, then inverted, scaled, and normalized these distances to convert them into percentages of each topic per article. Finally, we summed these percentages for each year to quantify the prevalence of each topic in the literature as a function of time. This analysis shows how different research topics have risen and fallen over the last 20 years, including a large-scale shift in the field from studies on conceptual understanding towards studies on how identity and equity impact physics learning. We compare these trends to prior NLP-driven literature reviews of the Physics Education Research Conference Proceedings, which show similar trends over time but differences in topical focus, and unpack what these trends might mean for the future of the journal and the field.
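The pipeline described above (topic vectors as means of embeddings, distances inverted and normalized into per-article shares, shares summed per year) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy 2-D vectors stand in for real embeddings, cosine distance is an assumed metric, the `power` exponent is a hypothetical stand-in for the unspecified scaling step, and the function names `topic_shares` and `yearly_prevalence` are invented for illustration.

```python
import numpy as np

def topic_shares(article_emb, topic_vecs, power=1.0, eps=1e-9):
    """Turn article-to-topic distances into per-article topic shares.

    Assumptions (not stated in the abstract): cosine distance as the
    metric, and `power` as a simple way to scale the inverted distances.
    """
    a = article_emb / np.linalg.norm(article_emb, axis=1, keepdims=True)
    t = topic_vecs / np.linalg.norm(topic_vecs, axis=1, keepdims=True)
    dist = 1.0 - a @ t.T                 # shape: (n_articles, n_topics)
    inv = (1.0 / (dist + eps)) ** power  # invert so closer topics score higher
    return inv / inv.sum(axis=1, keepdims=True)  # each row sums to 1

def yearly_prevalence(shares, years):
    """Sum the per-article topic shares within each publication year."""
    return {int(y): shares[years == y].sum(axis=0) for y in np.unique(years)}

# Toy 2-D "embeddings" of representative articles for two hypothetical topics.
representatives = {
    "conceptual_understanding": np.array([[1.0, 0.0], [0.9, 0.1]]),
    "identity_and_equity": np.array([[0.0, 1.0], [0.1, 0.9]]),
}
# A topic vector is the mean of its representative embeddings.
topic_vecs = np.stack([reps.mean(axis=0) for reps in representatives.values()])

# Three toy articles with their publication years.
article_emb = np.array([[1.0, 0.1], [0.2, 1.0], [0.9, 0.2]])
years = np.array([2005, 2005, 2020])

shares = topic_shares(article_emb, topic_vecs)
prevalence = yearly_prevalence(shares, years)
```

Because each article's shares sum to one, the yearly totals across all topics equal the number of articles published that year, so dividing by that count would give a per-year topic fraction if one wanted to compare years with different publication volumes.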
Publication: Odden, T. O. B., Tyseng, H., Mjaaland, J. T., Kreutzer, M. F., & Malthe-Sørenssen, A. (2024). Using Text Embeddings for Deductive Qualitative Research at Scale in Physics Education (No. arXiv:2402.18087). arXiv. https://doi.org/10.48550/arXiv.2402.18087
Presenters
- Tor Ole B Odden, University of Oslo
Authors
- Tor Ole B Odden, University of Oslo
- Halvor Tyseng, University of Oslo
- Jonas T Mjaaland, University of Oslo
- Markus F Kreutzer, University of Oslo
- Helene Lane, University of Oslo
- Anders Malthe-Sørenssen, University of Oslo