Jump ARCHES funds method to better diagnose rare diseases

1/29/2025 Lilli Bresnahan


Illinois’ Grainger College of Engineering Jump ARCHES Research Program has awarded funds to a project that aims to improve the diagnosis of rare diseases in primary care settings. The project is co-led by Jimeng Sun, health innovation professor in the computer science department and the Carle Illinois College of Medicine at Illinois; Adam Cross, a clinical informatics specialist and pediatrics hospitalist with OSF HealthCare; and John Wu, a computer science PhD student at Illinois researching AI applications in healthcare.

The Jump Applied Research in Community Health through Engineering and Simulation (ARCHES) Research Program was established in 2014. The program is an endowment partnership between the Jump Trading Simulation and Education Center at OSF HealthCare and The Grainger College of Engineering at Illinois. It aims to give engineers and physicians direct access to resources and grants to tackle problems in health care.

Phase I of the research was funded with money left over from a COVID-19 project, which the researchers proposed redirecting to the rare disease task. Phase II was granted $200,000 and involves using machine learning and knowledge graphs to better detect rare diseases.

The knowledge graphs show how different medical signs and symptoms connect to specific rare diseases, according to Sun. The graphs are built from established medical databases to ensure the rare diseases in patients are “holistically” captured, Sun said.
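As a rough illustration of the idea, a knowledge graph of this kind can be thought of as a set of links between phenotypes and diseases. The sketch below is not the researchers' system; the disease names, phenotype pairs and the `build_graph` helper are hypothetical placeholders chosen only to show the data structure.

```python
from collections import defaultdict

# Hypothetical (phenotype, disease) links; illustrative only, not clinical data.
EDGES = [
    ("muscle weakness", "Disease A"),
    ("vision loss", "Disease A"),
    ("muscle weakness", "Disease B"),
    ("seizures", "Disease B"),
]

def build_graph(edges):
    """Return two adjacency maps: phenotype -> diseases and disease -> phenotypes."""
    by_phenotype = defaultdict(set)
    by_disease = defaultdict(set)
    for phenotype, disease in edges:
        by_phenotype[phenotype].add(disease)
        by_disease[disease].add(phenotype)
    return by_phenotype, by_disease

by_phenotype, by_disease = build_graph(EDGES)

# Which diseases are linked to a given symptom?
print(sorted(by_phenotype["muscle weakness"]))
```

Storing the links in both directions lets the same graph answer "which diseases involve this symptom" and "which symptoms characterize this disease."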

In particular, graph neural networks, which are designed to analyze the kinds of connected relationships present in rare diseases, allow the machine learning system to learn patterns from confirmed rare disease cases. This helps identify similar patterns in other patients who might have undiagnosed conditions. From there, the researchers can better predict which patients are at higher risk of having a rare disease that hasn’t been diagnosed yet.
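The core operation behind a graph neural network is often described as message passing, in which each node updates its representation using those of its neighbors. The sketch below shows one heavily simplified round that averages plain feature vectors. A real graph neural network would add learned weights and nonlinearities, and nothing here reflects the researchers' actual model; the node names and feature values are hypothetical.

```python
def message_passing_round(features, neighbors):
    """Average each node's feature vector with the vectors of its neighbors."""
    updated = {}
    for node, vec in features.items():
        group = [vec] + [features[n] for n in neighbors.get(node, [])]
        # Column-wise mean across the node and its neighbors.
        updated[node] = [sum(col) / len(group) for col in zip(*group)]
    return updated

# Hypothetical graph: one patient node linked to two phenotype nodes.
neighbors = {
    "patient": ["phenotype_1", "phenotype_2"],
    "phenotype_1": ["patient"],
    "phenotype_2": ["patient"],
}
features = {
    "patient": [0.0, 0.0],
    "phenotype_1": [1.0, 0.0],
    "phenotype_2": [0.0, 1.0],
}

updated = message_passing_round(features, neighbors)
print(updated["patient"])
```

After one round, the patient node's vector carries information from both linked phenotypes, which is the intuition behind learning patterns from connected data.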

Through machine learning and knowledge graphs, the project addresses the critical issue of undiagnosed rare diseases, with the aim of improving diagnostic precision and patient outcomes. Undiagnosed rare diseases present many critical challenges in healthcare.

“Patients often endure a lengthy 'diagnostic odyssey,' consulting multiple specialists over years without receiving accurate diagnoses,” Sun said. 

This can lead to serious consequences, such as incorrect treatments that fail to address the underlying condition and that create financial stress.

“Without a correct diagnosis, they continue paying for ineffective treatments while their symptoms persist or worsen,” Sun said. 

Additionally, the diagnostic process is challenging because the symptoms of rare diseases frequently overlap with those of common conditions. This, too, can lead to misdiagnoses and repeated referrals between specialists, leaving patients frustrated by the uncertainty.

“This project aims to develop an innovative system that leverages these technologies to identify patterns and correlations within EMRs, enabling early detection of rare diseases,” the executive summary said.

Because rare diseases are so uncommon, little data is available for developing accurate detection methods, and the project's system is designed to address that gap. The researchers’ solution has two stages. First, they will extract and structure patient phenotypes from clinical documentation, then use knowledge graphs to screen patient profiles against known rare disease phenotype patterns, eliminating cases that don’t match these patterns. Second, they will develop a neural network trained on validated cases and carefully generated synthetic data to identify patients most likely to have specific rare diseases.
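The screening stage described above can be pictured as a filter that compares a patient's phenotypes with each disease's known pattern. The sketch below is a minimal illustration under stated assumptions: the overlap measure, the `min_overlap` cutoff of 0.5, the `screen` function and the disease patterns are all hypothetical, and the article does not say how the researchers score a match.

```python
def screen(patient_phenotypes, disease_patterns, min_overlap=0.5):
    """Keep diseases whose known phenotypes overlap enough with the patient's.

    Overlap is the fraction of a disease's known phenotypes seen in the patient.
    """
    candidates = {}
    for disease, pattern in disease_patterns.items():
        if not pattern:
            continue  # skip diseases with no known phenotypes
        overlap = len(patient_phenotypes & pattern) / len(pattern)
        if overlap >= min_overlap:
            candidates[disease] = overlap
    return candidates

# Hypothetical phenotype patterns; illustrative only, not clinical data.
disease_patterns = {
    "Disease A": {"muscle weakness", "vision loss"},
    "Disease B": {"seizures", "hearing loss", "rash"},
}
patient = {"muscle weakness", "vision loss", "fatigue"}

candidates = screen(patient, disease_patterns)
print(candidates)
```

Diseases that fall below the cutoff are dropped before any neural network sees the case, which mirrors the idea of eliminating profiles that don't match known patterns.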

“The system will include two critical features: a confidence threshold to ensure we only flag high-risk cases for further investigation, and machine learning interpretability methods to analyze phenotype-disease relationships, helping us understand which phenotypes are most predictive of specific rare diseases,” Sun said. 
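A confidence threshold of the kind Sun describes can be sketched as a simple cutoff on predicted probabilities. The example below is an assumption-laden illustration: the `flag_high_risk` function, the 0.9 threshold and the patient scores are hypothetical, and the article does not give the value the researchers will use.

```python
def flag_high_risk(scores, threshold=0.9):
    """Return patient IDs whose predicted probability meets the threshold, highest first."""
    flagged = [(pid, p) for pid, p in scores.items() if p >= threshold]
    flagged.sort(key=lambda item: item[1], reverse=True)
    return [pid for pid, _ in flagged]

# Hypothetical model outputs: probability that each patient has a given rare disease.
scores = {"patient_1": 0.97, "patient_2": 0.42, "patient_3": 0.91}

flagged = flag_high_risk(scores)
print(flagged)
```

Raising the threshold flags fewer patients but with higher confidence, which is the trade-off between missed cases and unnecessary follow-up screening.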

After Phase II, Phase III will focus on validating and implementing the machine learning system in clinics. The researchers will conduct real-world validation to assess the system’s reliability and accuracy in identifying potential rare diseases. They will also create a “user-friendly software pipeline” that integrates into clinical workflows, according to Sun. Finally, they will establish clear criteria for what evidence is required to justify additional rare disease screening, ensuring that the system’s recommendations are both clinically meaningful and cost-effective.

 



This story was published January 29, 2025.