April 20, 2023

Perspectives: Responsible Data Science Gets MIDAS Touch

Evan Reynolds, Ph.D., was part of a very select group of data scientists invited to the annual Future Leaders Summit by the Michigan Institute for Data Science (MIDAS). This year's theme was "Responsible Data Science and AI," and Dr. Reynolds shared his perspective on the hot-button topic and how it affects his study of diabetes complications.

by Evan Reynolds, Ph.D.

a photo of the 2023 MIDAS Future Leaders Summit

The 2023 Future Leaders Summit, hosted by the University of Michigan's Michigan Institute for Data Science (MIDAS), was held April 12-14, 2023, in Ann Arbor, MI. As an emerging diabetes data scientist, I was thrilled to be nominated to attend on behalf of the NeuroNetwork for Emerging Therapies. The theme for this year’s summit was “Responsible Data Science and AI,” a very important and of-the-moment topic.

All of my work revolves around data science, but one study I am involved in is particularly relevant to artificial intelligence (AI): it aims to predict complications in people with diabetes.

Evan Reynolds, Ph.D., speaking about diabetes data science

This work, funded by the National Institute of Diabetes and Digestive and Kidney Diseases, applies machine learning algorithms to metabolic profiles taken over time to predict which patients with diabetes are most likely to develop complications, such as peripheral neuropathy.

Machine learning is a subset of artificial intelligence that uses statistical models and algorithms to aid in decision-making. In our work, these predictive algorithms get better as the “machine” learns from more metabolic risk data.

For healthcare and many other industries, AI has incredible potential for prediction. In the work I described above, reliable predictions of who is most likely to develop diabetes complications would allow us to direct interventions to those patients, which might lessen the severity of the complications or even prevent them completely.

However, any type of AI, machine learning or otherwise, can have inherent biases based on the data that is fed to it. This is why I was very much looking forward to the Summit. The fields of machine learning and data science are evolving at an exceedingly rapid pace, so it is equally important to understand the new methodologies that can help ensure “responsible application.”

Dr. Ellie Sakhaee of Microsoft preparing to speak

The collaborative nature of this meeting allowed me to gain new perspectives on emerging machine learning approaches, and insight into best practices for more equitable machine learning. I was able to meet data scientists from a wide range of fields, including environmental studies, data privacy, and human language, to name a few. Dr. Andrew J. Connolly, Director of the eScience Institute at the University of Washington, and Dr. Ellie Sakhaee, Senior Program Manager in the Office of Responsible AI at Microsoft, were particularly helpful.

While they might have been applying data science approaches outside of healthcare, gaining insight into their work allowed me to see how their cutting-edge approaches could be translated into my own work of predicting diabetes complications.

For example, I was interested in tactics for developing machine learning algorithms that are interpretable to people other than data scientists. For our work to influence clinical diabetes decision-making and ultimately patient care, both doctors and patients need to understand these algorithms well enough to buy into their conclusions.

The Future Leaders Summit also significantly increased my awareness of the harmful impact that biases in machine learning algorithms can have, particularly on vulnerable populations. This is an especially challenging problem because the field is changing at such a rapid rate. However, implementing values of fairness in our studies will help ensure that the results of our research benefit all individuals equitably, in our case, all people with diabetes. How do we do that? I learned that the key is engaging all stakeholders. Whether they are patient advisory boards, clinicians from different health systems, or representatives of different groups themselves, it is critical to gain their perspectives to understand what biases might exist in the data we use.

My biggest takeaway from this meeting is that we must consider ourselves long-term stewards of these algorithms and their output, long after we implement them.