Inflammatory bowel diseases (Crohn’s disease, ulcerative colitis) affect over two million North Americans, with prevalence increasing across the world. While colonoscopy is the primary procedure for assessing these diseases, the way its findings are interpreted remains limited. The Mayo score is the most commonly used measure of the severity of ulcerative colitis (UC) and has been used to assess new medications in clinical trials since the 1980s. It is also used to evaluate response to treatment and to inform prognosis. However, the Mayo score is a subjective assessment of disease severity by physicians, which potentially introduces inconsistency, bias, and error. In addition, reviewing video from an entire colonoscopy is a time-consuming, tedious, and expensive task that is impractical in most clinical settings.
To address the need for a more reliable and informative method to measure inflammatory bowel diseases, a team of clinicians led by Dr. Ryan Stidham, a gastroenterologist in the Department of Internal Medicine and a member of the Center for Computational Medicine and Bioinformatics (CCMB), entered into a collaboration with a team of data scientists led by Dr. Kayvan Najarian from the Department of Computational Medicine and Bioinformatics (DCMB).
As a first step, the team aimed to determine whether they could reliably replicate experts' grading of UC severity in colonoscopy. Between 2017 and 2019, the teams collected 16,000 endoscopic still images from over 3,000 patients with UC and designed a neural network model to automatically reproduce the Mayo score. After a series of algorithmic refinements, the model performed very well at determining disease severity on still images, matching the performance of expert gastroenterologists. However, more work was needed to go beyond grading still images and determine the overall Mayo score from the entire colonoscopy video. By designing additional models that also assessed image quality and identified debris or other confounders that could be mistaken for disease, the team produced an updated method that closely replicated blinded human experts in determining the overall Mayo score for UC patients in clinical trials. This data “filter” greatly increased the reliability of the model. With this first phase of the collaboration, the teams achieved greater reliability, efficiency, speed, and uniformity in grading UC disease severity. Automating IBD assessment also lowers the cost of colonoscopy analysis.
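The video-level grading described above can be illustrated with a minimal Python sketch. It is not the team's implementation: the `Frame` fields, the thresholds, and the aggregation rule (report the highest score seen in at least a set fraction of usable frames) are all assumptions chosen only to show why filtering low-quality or debris-covered frames before aggregation matters.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Frame:
    """One video frame with hypothetical model outputs (all fields are assumptions)."""
    mayo_score: int   # per-frame severity grade from a still-image model, 0 to 3
    quality: float    # image-quality estimate in [0, 1]
    has_debris: bool  # True if a confounder such as debris was detected


def video_mayo_score(
    frames: Iterable[Frame],
    min_quality: float = 0.5,    # assumed cutoff, not taken from the source
    min_fraction: float = 0.1,   # assumed share of usable frames needed to report a grade
) -> Optional[int]:
    """Aggregate per-frame grades into one video-level grade after filtering.

    Returns None when no frame passes the quality and debris filter.
    """
    # The "filter": drop frames that are low quality or show confounders.
    usable = [f for f in frames if f.quality >= min_quality and not f.has_debris]
    if not usable:
        return None

    # Report the highest grade reached by at least `min_fraction` of usable frames.
    for score in (3, 2, 1):
        share = sum(f.mayo_score >= score for f in usable) / len(usable)
        if share >= min_fraction:
            return score
    return 0


if __name__ == "__main__":
    clean = [Frame(0, 0.9, False)] * 8 + [Frame(2, 0.9, False)] * 2
    confounded = [Frame(3, 0.9, True), Frame(3, 0.2, False)]
    print(video_mayo_score(clean + confounded))
```

In this toy example, the two frames graded 3 are excluded because one shows debris and the other is low quality, so they do not inflate the video-level grade.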
Replicating the Mayo score is powerful, but many clinicians feel that the Mayo score does not capture all of the information needed to fully describe inflammatory bowel disease. UC affects the colon unevenly, and the appearance of disease differs greatly between patients, even those with similar Mayo scores. This complexity makes defining and predicting treatment success particularly challenging. The team was able to refine the previously trained neural network and develop novel methods for tracking and spatially mapping disease severity along the colon. This has enabled a completely new AI-based method for describing UC that better captures the individual patient’s unique disease characteristics.
The project, now in partnership with Johnson & Johnson, is currently being tested using data from multiple clinical trials. So far, the results are very encouraging: the new spatial measurements of disease significantly outperform the Mayo score and other standard disease measurement scores. The new AI-powered CDS score can detect when a new medication is going to be effective in a clinical trial with approximately half the patients traditionally needed. In addition, the new CDS score was significantly better associated with patients’ symptoms and overall experience than the existing Mayo score.
This collaboration between physicians and data scientists shows how new advancements in medicine can be achieved. In the case of inflammatory bowel disease, computational measures have the advantage of being more quantitative, explainable, and repeatable than any measure previously available. By measuring disease better, physicians can better understand their treatment decisions and gain new insights into better pathways for providing care. In addition, this is an example of data science technology that can improve the quality and reduce the cost of new drug development. This new technology can realistically be implemented in clinics and will help our patients and physicians very soon.
Dr. Stidham presented this collaborative research with the Najarian lab at the Department of Computational Medicine and Bioinformatics and the Center for Computational Medicine and Bioinformatics kick-off meeting on August 24, 2023.
Yao H, Najarian K, Gryak J, Bishu S, Rice MD, Waljee AK, Wilkins HJ, Stidham RW. Fully automated endoscopic disease activity assessment in ulcerative colitis. Gastrointest Endosc. 2021 Mar;93(3):728-736.e1. doi: 10.1016/j.gie.2020.08.011. Epub 2020 Aug 15. PMID: 32810479.
Stidham RW, Cai L, Cheng S, Rajaei F, Hiatt T, Wittrup E, Rice MD, Bishu S, Wehkamp J, Schultz W, Khan N, Stojmirovic A, Ghanem LR, Najarian K. Using Computer Vision to Improve Endoscopic Disease Quantification in Therapeutic Clinical Trials of Ulcerative Colitis. Gastroenterology. 2023 Oct 11:S0016-5085(23)05086-2. doi: 10.1053/j.gastro.2023.09.049. Epub ahead of print. PMID: 37832924.