March 27, 2024

HILS PhD student presents at Malta workshop on challenges facing patient de-identification systems

The Department of Learning Health Sciences and the Health Infrastructures and Learning Systems (HILS) team congratulate HILS Ph.D. student Dalton Simancek, MSI, on his recent presentation at the 2024 Workshop on Computational Approaches to Language Data Pseudonymization (CALD-pseudo 2024) in St Julian’s, Malta.

Simancek presented on Thursday, March 21, on why “mitigating name recognition errors is essential to minimizing the risk of patient re-identification.” His presentation was titled “Handling Name Errors of a BERT-Based De-Identification System: Insights from Stratified Sampling and Markov-based Pseudonymization.”

The HILS Ph.D. student told DLHS Communications that failing to recognize named entities when de-identifying clinical narratives poses a critical challenge to protecting sensitive patient health information. His paper highlighted the need for “stratified sampling and enhanced contextual considerations concerning Name tokens using a fine-tuned Longformer BERT model for clinical text de-identification.” Simancek explained that “experimental results underscore the impact of addressing name recognition challenges in BERT-based de-identification systems for heightened privacy protection in electronic health records.”

Simancek is mentored by VD Vinod Vydiswaran, Ph.D.