Radiologist validation of a multi-tissue breast segmentation convolutional neural net ensemble

Jun 16, 2023


Magnetic resonance imaging (MRI), used in concert with computer-aided methods, can detect, diagnose, and characterize invasive breast cancers. To this end, fully automated segmentation of breast tissues is important for quantitative breast imaging analysis, and for use in spatially resolved biophysical models of breast cancer. To ensure the accuracy of segmentations, tissue label maps from such models must be validated by domain experts such as radiologists.  

We developed an ensemble of convolutional neural networks (CNNs; core components of SimBioSys TumorSight) that segments tumor and other tissues in and around the breast (chest, adipose, gland, vasculature, skin). We sought to validate model results against the expertise of two breast-specialized radiologists. A ground truth dataset was created based on the radiologists’ assessments of tumor longest dimensions (LD), tumor segmentation, multi-tissue segmentation, and background parenchymal enhancement (BPE). This allowed us to quantify both the agreement between the CNNs and the radiologists and the observed variability between the radiologists themselves. The inter-radiologist variability can be used as a lower bound for the expected variability between CNN results and radiologists’ assessments, providing a benchmark to evaluate current and future models.



CNN-generated tumor and multi-tissue segmentations were created for 100 early-stage breast cancer cases based on dynamic contrast-enhanced (DCE) MRI from the University of Alabama Birmingham (UAB) Hospital (e.g., Figure 1). Each case underwent an internal review and, if necessary, the tumor segmentation was manually edited (“manual segmentation”) to more accurately reflect the underlying tumor characteristics. These cases were then independently assessed by two board-certified radiologists (Reviewer 1 and Reviewer 2) for the following: LD (measurements of primary tumor and total extent of disease), tumor segmentation (approve/reject), multi-tissue segmentation (approve/reject), and BPE category per BI-RADS¹. The reviewers were required to make their LD measurements before viewing the tumor and multi-tissue segmentations to minimize bias. CNN-generated segmentations without any manual edits were used for comparison (“CNN segmentations”). A workflow summary is shown in Figure 2.

Percent approval was calculated for each reviewer, and the lower bound (LB) of a one-sided 95% exact confidence interval (CI) is given for both tumor and multi-tissue segmentations. Inter-rater reliability (IRR) between the two reviewers was measured using Gwet’s AC1² for tumor and multi-tissue segmentation calls, as well as BPE. Each reviewer’s LD measurements were used to create a 3D bounding box around the tumor. Intra-class correlation³ (ICC) was used to assess the reliability of these measurements between the reviewers, the CNN segmentations, and the manual segmentations.
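As an illustration of the one-sided exact lower bound described above, here is a minimal Python sketch. It assumes the "exact" interval is the Clopper-Pearson (binomial) interval, which the abstract does not state explicitly. The function names `binom_upper_tail` and `exact_lower_bound` are hypothetical and are not part of any SimBioSys software.

```python
from math import comb


def binom_upper_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def exact_lower_bound(k: int, n: int, alpha: float = 0.05) -> float:
    """One-sided Clopper-Pearson lower bound for k approvals out of n reviews.

    Assumption: the bound is the proportion p at which observing k or more
    approvals has probability exactly alpha. It is found here by bisection.
    """
    if k == 0:
        return 0.0
    lo, hi = 0.0, k / n  # the bound always lies below the observed proportion
    for _ in range(100):
        mid = (lo + hi) / 2
        if binom_upper_tail(k, n, mid) < alpha:
            lo = mid  # tail probability too small: the bound is higher
        else:
            hi = mid
    return (lo + hi) / 2


# Example call with approval counts of the kind reported in this study
print(f"{exact_lower_bound(67, 91):.3f}")
```

A pure standard-library bisection is used so the sketch runs without extra dependencies. A statistics package that offers a beta-quantile function would give the same bound in closed form.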



Reviewer 1 approved 67 of 91 tumor segmentations (73.6%, LB 95% CI=65.0%) and 87 of 90 multi-tissue segmentations (96.7%, LB 95% CI=91.6%). Reviewer 2 approved 84 of 96 tumor segmentations (87.5%, LB 95% CI=80.5%) and 88 of 93 multi-tissue segmentations (94.6%, LB 95% CI=89.0%). Overall, 87 of the 97 reviewed tumor segmentations (89.7%) were approved by at least one reviewer, along with 80 of 86 multi-tissue segmentations (93.0%). Reasons for incomplete reviews included issues loading or visualizing the images and issues locating the lesion. A Gwet’s AC1 of 0.70 was observed between reviewers for tumor segmentation calls, indicating substantial reliability, and a Gwet’s AC1 of 0.93 was observed for multi-tissue segmentation calls, indicating strong reliability. High concordance was observed on BPE categorization (71.0% agreement, Gwet’s AC1=0.62). Cohort demographics and clinical characteristics are found in Table 1.
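For readers unfamiliar with Gwet’s AC1, the following Python sketch shows the standard two-rater form of the statistic: observed agreement corrected by a chance-agreement term built from the raters’ average category proportions. The function name `gwet_ac1` and the toy ratings are hypothetical. This is not the code used in the study.

```python
from collections import Counter


def gwet_ac1(ratings_a, ratings_b, categories):
    """Gwet's AC1 for two raters scoring the same cases.

    `categories` lists every category on the rating scale (for example
    ("approve", "reject")), whether or not each one was actually used.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must score the same non-empty set of cases")
    q = len(categories)
    if q < 2:
        raise ValueError("at least two categories are required")
    n = len(ratings_a)

    # Observed agreement: fraction of cases given the same label by both raters
    p_a = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: based on the mean of the two raters' marginal proportions
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    p_e = 0.0
    for c in categories:
        pi_c = (counts_a[c] + counts_b[c]) / (2 * n)
        p_e += pi_c * (1 - pi_c)
    p_e /= q - 1

    return (p_a - p_e) / (1 - p_e)


# Toy example (made-up calls, not study data)
r1 = ["approve", "approve", "approve", "reject"]
r2 = ["approve", "approve", "reject", "reject"]
print(gwet_ac1(r1, r2, ("approve", "reject")))
```

Unlike Cohen’s kappa, AC1 stays stable when one category dominates, which makes it a common choice when most calls are approvals.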

The ICC for the log of the bounding box volume around the primary tumor was 0.82 (95% CI: [0.75, 0.87]) between the two reviewers and the CNN segmentations, indicating good reliability. ICC was 0.92 (95% CI: [0.89, 0.94]) between the two reviewers and the manual segmentations, indicating excellent reliability. Strong Pearson correlations (r range = 0.77 to 0.90) between maximum LD measurements were observed for all six comparisons (Figure 3). Summary statistics for the maximum LD measurements and bounding box volumes are found in Table 2.
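The ICC computation on log bounding box volumes can be sketched as follows. The abstract does not say which ICC form was used, so this example assumes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1). It also assumes the bounding box volume is the product of three orthogonal extents. The names `log_bbox_volume` and `icc_2_1` are hypothetical.

```python
from math import log


def log_bbox_volume(dx: float, dy: float, dz: float) -> float:
    """Log of a 3D bounding box volume (assumed: product of three extents)."""
    return log(dx * dy * dz)


def icc_2_1(table):
    """ICC(2,1) for table[i][j] = value for case i from rater j.

    Raters here could be Reviewer 1, Reviewer 2, and the CNN segmentation.
    Uses the usual two-way ANOVA mean squares for rows (cases), columns
    (raters), and residual error.
    """
    n, k = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(row[j] for row in table) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-case mean square
    msc = ss_cols / (k - 1)              # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Toy example: three raters measuring four tumors (made-up extents in mm)
extents = [
    [(20, 15, 12), (22, 15, 13), (21, 16, 12)],
    [(35, 30, 28), (33, 31, 27), (36, 29, 30)],
    [(10, 8, 7), (11, 9, 7), (10, 9, 8)],
    [(50, 42, 40), (48, 44, 41), (52, 40, 39)],
]
table = [[log_bbox_volume(*box) for box in case] for case in extents]
print(round(icc_2_1(table), 3))
```

Because ICC(2,1) penalizes systematic offsets between raters, it is stricter than a Pearson correlation on the same measurements.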



We observed a high level of acceptance overall for both tumor and multi-tissue segmentations. In addition, we observed high levels of agreement between Reviewers 1 and 2 on tumor segmentation acceptance and multi-tissue segmentation acceptance. These findings strongly support the validity of our CNN for multi-tissue segmentation purposes and facilitate the use of the accepted segmentations in downstream segmentation model fine-tuning.

We also found a high level of concordance between Reviewers 1 and 2 in their measurements of the main mass LD and the total extent of disease LD. Additionally, strong correlation was observed between the CNN segmentations and the reviewers, as well as between the manual segmentations and the reviewers (Figure 3). Importantly, these concordance measurements help contextualize CNN model output by establishing the ceiling of performance that our model may reach in the future. A high level of concordance was also observed for BPE categorization (71.0% agreement); this will be used as an internal benchmark in the development of future BPE-detection models. These results are consistent with recent studies.



Input and evaluation from domain experts (e.g., radiologists) is an invaluable step in validating the results of CNN-generated segmentations and creating benchmarks for future work. By completing this exercise, we’ve assessed the acceptability of our segmentations, quantified reliability between CNN, internal reviewers, and external radiologists, and generated a ground truth dataset that can be used for future validation and research/development efforts.  

TumorScope™ Breast

SimBioSys TumorScope™ currently aids the identification of the safest and most efficacious drug regimens for breast cancer patients.

It provides quantitative and qualitative analysis of a patient’s potential response to therapy, generated with a 3D computational model incorporating previously acquired diagnostic data.

The results from TumorScope™ are intended to be used in conjunction with the oncologist’s professional judgment, patient’s clinical history, symptoms, and other diagnostic tests.

With hundreds of retrospective patients validated, our results speak for themselves – a 95% correlation between simulated final volume and actual clinical volume post-therapy.

TumorScope™ Lung

SimBioSys is developing TumorScope™ Lung, with the goal of having a positive impact on quality of life, clinical decision-making, and healthcare costs associated with lung cancer.

Though lung cancer is the leading cause of cancer-related deaths worldwide, it is amongst the few solid tumors for which immunotherapeutics have shown great promise.

The structure of lung tissue is dissimilar to that of other tissues we have studied, as the lungs are highly vascularized, oxygenated, and composed of numerous branching sets of airways.

These factors create a need for accurate 3D models of the lung tumor microenvironment, and require nuanced optimization of our image analysis and segmentation methods.

The Future
TumorScope™ Bladder

Accounting for approximately 81,000 new cases in the US each year, bladder cancer is the sixth most-frequently diagnosed solid tumor.

The primary goal of neoadjuvant chemo for advanced bladder cancer is not to enable bladder-conserving treatment, but to downstage the tumor before radical cystectomy.

Bladder cancer staging is strongly dependent on the cancer’s invasion into the bladder wall and surrounding perivesical tissue.

Because of this, the SimBioSys TumorScope™ is poised to offer healthcare providers new methods to predict the degree of downstaging under different treatment regimens, and thereby optimize therapy for patients.

The Future
TumorScope™ Prostate

Affecting approximately 165,000 men in the United States each year, prostate cancers tend to occur in older men, and are often slow to progress.

As a result, management of the disease frequently includes watchful waiting and active surveillance.

SimBioSys TumorScope™ is capable of predicting tumor growth and progression, both with and without intervention.

There exists an obvious application in weighing the risks and benefits of less aggressive approaches to prostate cancer management.

The Future
TumorScope™ Ovary

Known as the “silent killer,” early-stage ovarian cancer often presents with symptoms similar to those of other common gynecological or gastroenterological issues.

Approximately 70% of epithelial ovarian cancers are not diagnosed until stage III or IV.

Ovarian cancer represents a natural next step for SimBioSys, allowing us to leverage the knowledge and modeling expertise we’ve accumulated.

This will allow us to target a cancer with high morbidity and mortality, for which neoadjuvant therapy is becoming an increasingly important option.

Tumor Microenvironment

The tumor microenvironment is understood as a complex space where cancer cells adapt their metabolic behavior, competing and cooperating with nearby healthy cells in order to grow.

Understanding the complex ways in which cancer cells interact with other nearby cell types—competing for some resources, sharing others, and eliciting molecular signals that reshape their surroundings—is critical for understanding tumor progression and response to therapy.

SimBioSys TumorScope™ offers a computational window to these interactions, enabling patients and healthcare providers to explore how different treatment regimens can influence tumor response, and ultimately, patient survival.

Virtual Trials

The logistical and financial requirements of clinical drug trials are burdensome in the context of developing novel cancer therapeutics.

Additionally, there is inherent risk for the participants of these trials, both human and animal.

Building on the aforementioned technology, SimBioSys plans to create software to virtually test the efficacy of a drug on our library of patients.

The goal is to use this technology for planning and selecting the most appropriate cohorts, using computational methods, before a trial begins.

Additionally, this technology will be used for testing the effects of various forms of a drug on virtual patients, as opposed to humans or animals.

This technology will provide a deeper understanding of the mechanisms underlying treatment non-response, and will aid in drug development efforts.

Drug Delivery Modeling

After the SimBioSys platform has been extended to nearly the full range of solid mass tumors, pharmaceutical companies will be able to test their numerous therapies against a range of simulated tumors to discover new uses and delivery methods for drugs.

Studies show a salient relationship between sub-optimal drug delivery and acquired drug-resistance, leading to increased risk of mortality.

TumorScope™ provides an opportunity to reduce the likelihood of this occurrence.

Tushar Pandey
Chief Executive Officer MBA University of Chicago, BS Engineering University of Illinois at Urbana-Champaign

Passionate about supporting the fight against cancer, Tushar focuses on ensuring the company delivers on its mission to empower precision medicine. In his prior role as VP of Decision Support at Strata Decision Technology, he worked with over 150 health systems across the country, including Kaiser Permanente, Cleveland Clinic, MD Anderson, Intermountain Healthcare, and Dana-Farber, among others. Under his leadership, Strata Decision received the prestigious “Best in KLAS” recognition for five consecutive years. With over a decade of healthcare experience, Tushar has been one of the key thought leaders in the healthcare analytics and cost of care space.

Joseph R. Peterson
Chief Technical Officer PhD Chemistry University of Illinois at Urbana-Champaign

Driven by an interest in computing, Joseph has spent 10 years in scientific research that has ranged from investigating combustion and explosion, to analyzing the role of the environment in microbes’ behavior, to examining individual differences in breast tumors. He is passionate about developing software for the health and scientific R&D sectors. His goal as Chief Technical Officer at SimBioSys, Inc. is not merely to develop enterprise technologies that enable new clinical action, but to foster lasting relationships between key players in cancer treatment.

John A. Cole, Jr.
Chief Scientific Officer PhD Physics University of Illinois at Urbana-Champaign

John is a biophysicist specializing in stochastic models and systems biology. Equally comfortable with pencil-and-paper mathematical modeling and high-performance computational simulation, John’s “whatever works” approach to problem solving and friendly, collaborative demeanor have allowed him to contribute significantly to a range of projects in basic science and health. As Chief Scientific Officer of SimBioSys, Inc., he is excited to extend this line of research to enable transformative cancer treatment.

Tyler Earnest
Director of Computational Medicine PhD Physics University of Illinois at Urbana-Champaign

Tyler has a long history of mathematical modeling as applied to biological systems. He is also well-versed in software development, 3D visualization, and GPU programming as applied to computational biology. His primary focus is on conceiving, constructing, and validating new cancer and drug models.

Michael Hallock
VP, Software & IT
MS Bioinformatics University of Illinois Urbana-Champaign

Michael has more than 10 years of experience in the software development and information technology fields. He has extensive experience developing software for scientific computing, high-performance computing, and cloud computing. He applies this knowledge to advanced analytics software, focusing on back-end technologies (database development, server/client communication, IT infrastructure, etc.) and working closely with full-stack developers. Additionally, he will provide software support for scientific development.

Anu Antony
Chief Medical Officer MBA Kellogg School of Management at Northwestern University, MPH Harvard School of Public Health, MD University of North Carolina-Chapel Hill School of Medicine, Stanford University Medical Center, Memorial Sloan-Kettering Cancer Center

Dr. Antony is a Harvard, Stanford, and Memorial Sloan-Kettering Cancer Center-trained surgeon with 20 years of experience in breast cancer, including multiple leadership positions in Chicago as Professor and Vice-Chair of the Department of Surgery at Rush University, Co-Director of the Breast Cancer Service Line, and Chief of Breast Reconstruction at the Rush University Cancer Center, and Vice-Chair of the Breast Cancer Center at the University of Illinois at Chicago Hospital and Health Services. She is passionate about innovation in precision oncology and commercializing cutting-edge technology to bring it directly into the hands of physicians and patients. Her interest in science and medicine began at UNC-Chapel Hill, where she graduated with distinction in Chemistry. After graduating with honors from the UNC-Chapel Hill School of Medicine, she became intrigued with medical device innovation during her general surgery and plastic surgery training in Silicon Valley at Stanford University Medical Center. She furthered her education and training with an oncologic reconstructive surgery fellowship at Memorial Sloan-Kettering Cancer Center, a Master’s in Biostatistics and Clinical Outcomes at the Harvard School of Public Health, and additional research fellowship training at Massachusetts General Hospital/Harvard Medical School. Recognizing the benefits of dovetailing science, medicine, and business, she completed an MBA at the Northwestern-Kellogg School of Management. Dr. Antony has worked in government and private sectors, where she has treated cancer patients and co-led a multimillion-dollar NIH program grant as co-PI studying stem cells in a primate model. She actively publishes, lectures nationally and internationally, and has served as Chair and President of several regional and national professional societies and conferences.

Tricia Carrigan, PhD
SVP, Precision Medicine

Dr. Tricia Carrigan is an accomplished biopharmaceutical and diagnostic executive with over 24 years of experience across the spectrum of biomarker discovery, companion diagnostics, drug development, and commercialization. She specializes in Companion Diagnostics (CDx) Strategy & Commercialization, drug development programs, early- and late-stage drug licensing, Oncology, Women’s Health, Cardiovascular, and Hematology. She has international experience in assay implementation and development for Phase I-III trials, external innovation, and business development and partnering in EU and Asia-Pacific markets.

Eduardo Braun, MD
Head of Clinical Affairs
MD, Rio de Janeiro School of Medicine, RUSH University Medical Center
Eduardo Braun, MD earned his medical degree at the Federal University of Rio de Janeiro School of Medicine. He completed his Fellowship in Hematology/Oncology, and his Residency in Internal Medicine at RUSH University Medical Center in Chicago. He is board certified in internal medicine, medical oncology and hematology.
Dr. Braun actively participates in lung cancer, breast cancer and lymphoma research and his work has been published. He is an active member of the American Society of Clinical Oncology, American Society of Hematology and the International Association for the Study of Lung Cancer.
Dr. Braun practices in Valparaiso, Chesterton, Hobart and Westville, Indiana.

Connect With Us

SimBioSys is providing solutions for clinicians, patients, researchers and pharmaceutical companies. If you are interested in our product & services, or would like to collaborate, please contact us using the form below or email us at



180 N LaSalle Street,
Suite 3250,
Chicago IL 60601


60 Hazelwood Drive,
Suite 230D,
Champaign IL 61820

Hilary Ann Baldwin
VP, Regulatory & Quality

Hilary Ann Baldwin has over 20 years’ experience in regulatory and quality in the pharmaceutical, diagnostic, and medical device industries. She started in the pharmaceutical and toxicology industry at Eli Lilly on their early development team, where she built significant relationships with the FDA and other regulatory bodies. She then moved on to Roche, where she began working on assay development and validation for the diagnostics division while also taking over management of the regulatory submissions. After Roche, Hilary went to Covance, where she partnered with several pharmaceutical, diagnostic, and medical device companies on US and OUS submissions. During this time, she also took oversight of the companion diagnostic management team. Hilary worked at Stryker as a Staff Regulatory Specialist, eventually managing the global sustainability team and focusing on OUS submissions. As the Vice President of Regulatory at Caris Life Sciences, Hilary focused on domestic and global strategy. Currently, Hilary is the Vice President of Regulatory and Quality at SimBioSys. She recently worked with FDA on the VALID Act in addition to the SaMD pilot program. Hilary has also partnered with several OUS regulatory bodies for first-of-kind products and assisted in writing the guidance with PMDA for remanufacturing.