Projects

Artificial Intelligence in Health Care

We design self-learning software and cognitive systems to improve medical care and drive clinical research.

Big Data, Computing, & Security

We design systems to efficiently retrieve, compute, and store data while researching AI-based controls to ensure privacy and security.

Smart & Connected Communities

We're working to extend healthcare service from hospitals into homes through data analytics and smart health technologies.

Select Projects

The Health Data Analytics Initiative works with investigators from throughout the UIUC campus, University of Illinois College of Medicine Peoria, OSF HealthCare, and local external agencies to improve medicine and well-being for all. Scroll for an in-depth look at select projects that represent the breadth of our research.


Secure Federated Learning for Clinical Informatics with Applications to the COVID-19 Pandemic

Sanmi Koyejo (PI - CS at UIUC), Dakshita Khurana (UIUC), George Heintz (UIUC), Bill Bond (OSF), & Roopa Foulger (OSF) | Funded by C3.ai Digital Transformation Institute 

Enabling health care providers to respond faster and with greater precision to pandemics requires both advanced machine learning and quickly accessible data, yet the necessary data is often inaccessible across hospitals due to privacy and intellectual property concerns. This project leverages distributed machine learning and modern cryptography to introduce a computational protocol and software tools for securely training machine learning models with data spread over several medical establishments while preserving privacy and IP rights.

Scientific contributions include innovative techniques that trade off computation and communication to improve the predictive performance of federated learning in clinical settings, and novel cryptographic techniques that trade off computation and robustness to enhance security.
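As a rough illustration of the federated setting described above, the sketch below shows plain federated averaging: each site trains on its own records and shares only a model parameter, which a coordinator averages by sample count. This is a generic textbook scheme, not the project's protocol. The function names, the one-parameter linear model, and the toy data are all assumptions, and the cryptographic protections the project describes are omitted entirely.

```python
# Minimal federated-averaging sketch (illustrative only; no cryptography).
# Each "hospital" keeps its (x, y) records local and shares only a model weight.

def local_update(w, data, lr=0.01, epochs=20):
    """Run a few gradient-descent steps for y = w * x on one site's data."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def federated_average(site_weights, site_sizes):
    """Combine site models, weighting each by its number of records."""
    total = sum(site_sizes)
    return sum(w * n for w, n in zip(site_weights, site_sizes)) / total

def train(sites, rounds=30):
    """Alternate local training and coordinator-side averaging."""
    w = 0.0
    for _ in range(rounds):
        local = [local_update(w, data) for data in sites]
        w = federated_average(local, [len(d) for d in sites])
    return w

# Hypothetical toy data: both sites follow y = 3x but never pool raw records.
site_a = [(1.0, 3.0), (2.0, 6.0)]
site_b = [(3.0, 9.0), (4.0, 12.0), (5.0, 15.0)]
global_w = train([site_a, site_b])
```

In this sketch, `epochs` (local computation per round) and `rounds` (number of communication exchanges) are the two knobs that trade computation against communication.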

To complement these technical aims, the project will develop open source software. The project's approach for COVID-19 diagnosis will be evaluated using data available on the C3.ai data lake combined with clinical data from OSF HealthCare to illustrate how private data can significantly improve prediction quality compared to public data alone. The project also proposes to serve as a hub for other C3.ai projects to enable the secure use of privately held clinical datasets, which will improve results from other teams.

The broader vision and objective of this project is to provide Secure Federated Learning as a Service (FLaaS) freely available to any hospital during a declared crisis. The PIs envision a robust, secure federated learning system that will enable fast responses to minimize the impact of diseases in their earliest stages.

Digitizing the Neurological Screening Examination

Minh Do (PI - ECE at UIUC), George Heintz (UIUC), & Chris Zallek (OSF) | Funded by Jump ARCHES 

This project attempts to materialize the novel concept of using multiple sensors to digitize and quantify the neurological screening examination to aid in providing primary neurological care. Using cutting-edge sensors, processes will be engineered to record, quantify, integrate, and present clinically useful exam information. Sensor and algorithm technologies to quantify neurological exam subsections will continually evolve, and this project provides a testing ground for a framework to integrate information from different sources. 

The primary objective is to generate summary assessments for each exam subsection. These assessments will cover the ability to obtain consistent recordings of a subject and exam subsection, a listing of the recorded features that allow the exam subsection to be quantified, the software and algorithms used to quantify exam subsections, and data presentation options. Logistical and technical difficulties in recording, as well as limitations of the analysis, will be identified. Through this, the feasibility of approaches and needs for obtaining neurological exam screening information will be appraised to guide subsequent research efforts.

The rationale for the long-term vision of this program is driven by the needs of clinicians and health care systems. Forcing clinicians to work faster or for more hours will not effectively reduce the neurological health care crisis. New tools and methods of care delivery are needed to focus on better exam data generation and clinical decision support. The rationale for this specific proposal is that while many tools are being developed to quantify the neurological exam, the need is already present to organize them into a toolbox for primary evaluations. Neurological examination technologies will move from research study-centric applications to neurology subspecialty clinics, and finally to providing primary care for neurological patients. The application of the team's engineering expertise and knowledge will accelerate this transformation.

Towards a Uniform Inspection Program

ChengXiang Zhai (UIUC), George Heintz (UIUC), & Jim Roberts (C-UPHD) | Funded by the FDA 

This project aims to increase the consistency and effectiveness of free text in food inspection reports. Inspectors use free text fields to describe violations and to educate food establishments on specific matters to increase food safety. Using Natural Language Processing (NLP), the project demonstrates that free text fields have a significant influence on food code compliance and the resulting inspection score. Based on those results, the team will develop a tool with additional analytical capabilities that monitors both free text consistency across cuisines and inspectors and the effectiveness of writing styles. Big Data analysis methods will be leveraged to gain deep insights and provide an interface for performing bias analysis, optimizing inspector assignment, and uncovering patterns through exploratory analysis. The following strategies will be applied:

Text preprocessing: Since the data is very structured, focus will be on processing observations made by each individual inspector. Stemming, stop word removal, and lemmatization will be applied to normalize the text. The goal is to obtain a vocabulary which can be used as a reference for machine learning algorithms.
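A minimal sketch of such a normalization step, using only the Python standard library, might look like the following. The stop-word list, the suffix-stripping rule, and the sample observations are hypothetical placeholders. The crude stemmer stands in for an established stemmer or lemmatizer and is not the project's actual pipeline.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"the", "a", "an", "in", "on", "of", "and", "was", "were", "is", "at", "to"}

def crude_stem(token):
    """Strip a few common suffixes. A toy stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def normalize(text):
    """Lowercase, keep alphabetic tokens, drop stop words, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]

def build_vocabulary(observations):
    """Count normalized terms across all observations."""
    vocab = Counter()
    for obs in observations:
        vocab.update(normalize(obs))
    return vocab

# Hypothetical inspector observations.
observations = [
    "Raw chicken was stored above cooked foods in the cooler.",
    "Food stored on the floor; cooler leaking.",
]
vocab = build_vocabulary(observations)
```

Here "foods" and "Food" collapse to the same vocabulary entry, which is the kind of normalization the reference vocabulary relies on.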

Text/data similarity: One method is to use lexical similarity, considering two pieces of text as similar if they use similar words. This approach, however, is limited since it does not consider semantic similarity. Another method is to consider clustering approaches, which can identify similar words based on co-occurrence. Recent approaches use representation learning to compare texts based on embedded representations.
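To make the lexical approach and its limitation concrete, the sketch below computes two common word-overlap measures, Jaccard similarity and bag-of-words cosine similarity. The function names and example sentences are illustrative assumptions; embedding-based comparison is not shown.

```python
import math
import re
from collections import Counter

def tokens(text):
    """Lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def jaccard(a, b):
    """Shared distinct words divided by all distinct words."""
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)

def cosine(a, b):
    """Cosine similarity between word-count vectors."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical observations: the first two mean nearly the same thing.
obs_1 = "floor is dirty"
obs_2 = "floor is unclean"
obs_3 = "grease on the hood"

sim_12 = jaccard(obs_1, obs_2)  # 2 shared words out of 4 distinct words
sim_13 = jaccard(obs_1, obs_3)  # no shared words
```

Because "dirty" and "unclean" are different strings, the near-synonymous pair scores only 0.5, which is the gap that semantic or embedding-based measures aim to close.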

Topic modeling: To cluster observations on a co-occurrence basis and gain insight into how observation data is structured, topic modeling can be applied. This is an unsupervised, exploratory method that can be useful in analyzing inspectors. It can produce a concrete prediction based on analysis of the clusters.

Clustering: Similar to topic modeling, clustering allows for in-depth data exploration. For example, inspectors can be clustered based on their topic distributions and this data can be used to find outliers. This could be especially useful in identifying inspectors who deviate from the norm.
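One simple way to surface such outliers is sketched below: measure how far each inspector's topic distribution lies from the group average and flag the unusually distant ones. This distance-to-centroid check is a stand-in for a full clustering algorithm. The inspector labels, the topic distributions, and the "twice the median distance" cutoff are all hypothetical.

```python
import math
from statistics import median

def centroid(vectors):
    """Component-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def flag_outliers(topic_dists, factor=2.0):
    """Return names whose distance to the centroid exceeds factor * median distance."""
    center = centroid(list(topic_dists.values()))
    distances = {name: math.dist(vec, center) for name, vec in topic_dists.items()}
    cutoff = factor * median(distances.values())
    return sorted(name for name, d in distances.items() if d > cutoff)

# Hypothetical per-inspector topic distributions over three topics.
topic_dists = {
    "inspector_A": [0.6, 0.3, 0.1],
    "inspector_B": [0.5, 0.4, 0.1],
    "inspector_C": [0.6, 0.2, 0.2],
    "inspector_D": [0.1, 0.1, 0.8],
}
outliers = flag_outliers(topic_dists)
```

With these toy numbers only `inspector_D`, who writes mostly about the third topic, is flagged. The cutoff rule would need tuning on real data.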

Predictive modeling: Using outcomes such as overall score, we can analyze the importance of terms and the success of inspections, including recommended corrective actions and their priority, via analysis of keywords and semantic relationships. This tool will be able to perform those analyses conditioned on one or more cuisines, so the food program manager will be able to detect text inconsistency across different cuisines and identify the most effective writing style. The results will be applied to the training of food inspection staff, with the goal of establishing a more uniform inspection program and protocol.
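As a first approximation of term importance, the sketch below compares the mean inspection score of reports that contain a term with the mean score of those that do not. It is a descriptive difference of means rather than a fitted predictive model. The report texts, scores, and the `min_count` threshold are hypothetical.

```python
import re
from statistics import mean

def term_score_gaps(reports, min_count=2):
    """For each term: mean score of reports with it minus mean score without it."""
    tokenized = [
        (set(re.findall(r"[a-z]+", text.lower())), score) for text, score in reports
    ]
    vocab = set().union(*(toks for toks, _ in tokenized))
    gaps = {}
    for term in vocab:
        with_term = [s for toks, s in tokenized if term in toks]
        without = [s for toks, s in tokenized if term not in toks]
        # Skip rare terms and terms that appear in every report.
        if len(with_term) >= min_count and without:
            gaps[term] = mean(with_term) - mean(without)
    return gaps

# Hypothetical (free text, overall score) pairs.
reports = [
    ("handwashing sink blocked", 70),
    ("sink blocked by boxes", 75),
    ("thermometer provided", 95),
    ("thermometer calibrated and provided", 90),
]
gaps = term_score_gaps(reports)
```

Conditioning on a cuisine, as described above, would amount to filtering `reports` to that cuisine before calling the function.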