Using natural language processing to extract features from clinical notes for medical physics quality assurance

ORAL

Abstract

Missing data are a problem for data-driven models that take clinical data as input. Imputation methods are often inadequate because they provide only estimates of the missing values. We found that physicians’ clinical notes are information-rich and can be used to fill in missing features; however, they are in free-text form, which makes feature extraction a challenge.

Our database contains radiotherapy treatment records for 457 prostate cancer patients, in which 83% of Gleason scores and 100% of prostate-specific antigen (PSA) levels are missing. These features are critical in determining the appropriate radiotherapy dosage. We developed an algorithmic tool that uses basic natural language processing (NLP) to extract the missing features from physicians’ notes. The methods included tokenization, stop-word filtering, chunking, lemmatization, and tagging to capture the missing data.
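As a rough illustration of the kind of extraction described above, the following is a minimal sketch and not the authors' tool. It assumes notes are plain English strings, uses only the Python standard library, and substitutes simple regular-expression tokenization and keyword windows for the chunking, lemmatization, and tagging steps. The stop-word list, the function names, and the sample note are all hypothetical.

```python
import re

# Hypothetical stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "is", "was", "with", "and", "to", "his", "on"}


def tokenize(text):
    """Lowercase the note and split it into word, number, and '+'/'=' tokens."""
    return re.findall(r"\d+(?:\.\d+)?|[a-z]+|[+=]", text.lower())


def filter_stop_words(tokens):
    """Drop tokens that carry no information for this task."""
    return [t for t in tokens if t not in STOP_WORDS]


def is_number(token):
    """True for integer or decimal tokens such as '7' or '7.2'."""
    return re.fullmatch(r"\d+(?:\.\d+)?", token) is not None


def extract_psa(tokens):
    """Return the first number within three tokens after 'psa', or None."""
    for i, tok in enumerate(tokens):
        if tok == "psa":
            for nxt in tokens[i + 1 : i + 4]:
                if is_number(nxt):
                    return float(nxt)
    return None


def extract_gleason(tokens):
    """Return (primary, secondary) for a pattern like 'gleason ... 3 + 4', or None."""
    for i, tok in enumerate(tokens):
        if tok == "gleason":
            window = tokens[i + 1 : i + 6]
            for j in range(len(window) - 2):
                a, plus, b = window[j : j + 3]
                if a.isdigit() and plus == "+" and b.isdigit():
                    return int(a), int(b)
    return None


# Invented example note, not taken from the study's data.
note = "The patient was seen with a PSA of 7.2 ng/mL and Gleason score 3+4=7 on biopsy."
tokens = filter_stop_words(tokenize(note))
psa = extract_psa(tokens)
gleason = extract_gleason(tokens)
```

A keyword-plus-window rule like this is easy to audit by hand, which suits manual validation, but it would miss phrasings that a tagger or chunker could catch, such as values that appear before the keyword.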

Our prototype analysis used NLP to query the clinical notes of 100 prostate cancer patients for missing PSA values and Gleason scores, and the output was validated manually. The results show that 64% of missing PSA values and 37% of missing Gleason scores were successfully restored. The sensitivity and specificity were 76% and 56%, respectively, for PSA and 48% and 28%, respectively, for Gleason scores.
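For reference, sensitivity and specificity can be computed from a manual validation as sketched below. The abstract does not state how positives and negatives were counted, so this sketch assumes one label pair per note: whether manual review found the value in the note, and whether the tool reported one. The function names and the five-note example are hypothetical and are not the study's data.

```python
def confusion_counts(truth, predicted):
    """Count (tp, fp, tn, fn) from paired booleans.

    truth[i]     -- manual review found the value in note i (assumed definition)
    predicted[i] -- the tool reported a value for note i (assumed definition)
    """
    pairs = list(zip(truth, predicted))
    tp = sum(t and p for t, p in pairs)
    fp = sum((not t) and p for t, p in pairs)
    tn = sum((not t) and (not p) for t, p in pairs)
    fn = sum(t and (not p) for t, p in pairs)
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    """Fraction of notes that contain the value for which the tool found it."""
    return tp / (tp + fn) if (tp + fn) else float("nan")


def specificity(tn, fp):
    """Fraction of notes without the value for which the tool reported nothing."""
    return tn / (tn + fp) if (tn + fp) else float("nan")


# Invented labels for five notes, for illustration only.
truth = [True, True, True, False, False]
predicted = [True, True, False, False, True]
tp, fp, tn, fn = confusion_counts(truth, predicted)
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
```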

The restored database, with substantial data filled in by the NLP methods described, will be used to train and deploy an anomaly detection algorithm [1] that detects potentially erroneous radiotherapy prescriptions and assists with medical physics quality assurance.

[1] Li, Qiongge, et al. "A novel data-driven algorithm to predict anomalous prescription based on patient's feature set." arXiv preprint arXiv:2111.15101 (2021).

Presenters

  • Connor Thropp

    Brown University

Authors

  • Connor Thropp

    Brown University

  • Laura Buchanan

    Lifespan Cancer Institute; Warren Alpert Medical School of Brown University

  • Timothy Leech

    Lifespan Cancer Institute

  • Qiongge Li

    Lifespan Cancer Institute; Warren Alpert Medical School of Brown University

  • Eric Klein

    Lifespan Cancer Institute; Warren Alpert Medical School of Brown University