Mayo Clinic AI researchers present a machine learning-based method that leverages diffusion models to build a multitask brain tumor inpainting algorithm

The number of publications on AI and, in particular, machine learning (ML) in medical imaging has increased significantly in recent years. A recent PubMed search using the MeSH keywords “artificial intelligence” and “radiology” yielded 5,369 articles for 2021, more than five times the count for 2011. ML models are continually being developed to improve efficiency and healthcare outcomes, spanning classification, semantic segmentation, object detection, and image generation. Many published reports in diagnostic radiology, for example, indicate that ML models can perform as well as or better than medical experts in specific tasks, such as detecting abnormalities and screening for pathologies.

It is therefore undeniable that, when used correctly, AI can assist radiologists and drastically reduce their workload. Despite the growing interest in developing ML models for medical imaging, significant challenges can limit the practical applicability of these models or even predispose them to substantial bias. Data scarcity and data imbalance are two such challenges. On the one hand, medical imaging datasets are often much smaller than natural-image datasets such as ImageNet, and institutional datasets often cannot be shared or published because of patient confidentiality. On the other hand, even the medical imaging datasets that data scientists can access are frequently imbalanced.

In other words, the volume of imaging data for patients with less common pathologies is significantly lower than for patients with common pathologies or for healthy people. Training or evaluating an ML model on datasets that are too small or imbalanced can introduce systemic biases into model performance. Generating synthetic images is one of the main strategies for combating data scarcity and imbalance, alongside publicly releasing anonymized medical imaging datasets and endorsing approaches such as federated learning, which enables ML models to be developed across multi-institutional datasets without sharing the data itself.

Generative ML models can learn to produce realistic medical imaging data that does not belong to any actual patient and can therefore be shared publicly without compromising patient privacy. Various generative models capable of synthesizing high-quality data have been introduced since the emergence of generative adversarial networks (GANs). Most of these models produce unlabeled image data, which can be useful in specific applications, such as self-supervised or semi-supervised downstream models. Additionally, some models are capable of conditional generation, which allows an image to be generated based on predetermined clinical, textual, or imaging variables.

Despite their enormous success in generating synthetic medical imaging data, GANs are frequently criticized for their limited output diversity and unstable training. Autoencoder-based deep learning models are a more traditional alternative to GANs, as they are easier to train and produce more diverse outputs; however, their synthetic results do not match the image quality of GANs. Denoising diffusion probabilistic models (DDPMs), also known simply as diffusion models, are a newer class of image generation models that outperform GANs in both synthetic image quality and output diversity. This class of generative models also enables the generation of labeled synthetic data, which advances ML research, medical imaging quality, and patient care.

Diffusion models, grounded in Markov chain theory, learn to generate their synthetic outputs by gradually denoising an initial image filled with random Gaussian noise. This iterative denoising makes inference with diffusion models much slower than with other generative models, but it allows them to extract more representative features from their input data, which is why they outperform competing approaches. In this methodological article, the researchers present a proof-of-concept diffusion model that can be used for multitask brain tumor inpainting on multi-sequence brain magnetic resonance imaging (MRI) studies.
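To make that iterative denoising concrete, here is a minimal sketch of generic DDPM sampling in PyTorch. It is not the authors' released code: `eps_model` is a hypothetical trained noise-prediction network, and the linear beta schedule is a standard default assumption.

```python
# Minimal sketch of the DDPM reverse (denoising) process.
# Assumes a trained noise-prediction network `eps_model` (hypothetical);
# the linear beta schedule below is a common default, not taken from the paper.
import torch

T = 1000                                    # number of diffusion steps
betas = torch.linspace(1e-4, 0.02, T)       # noise schedule beta_t
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)   # cumulative products alpha_bar_t

@torch.no_grad()
def sample(eps_model, shape=(1, 1, 256, 256)):
    """Start from pure Gaussian noise and iteratively denoise to an image."""
    x = torch.randn(shape)                               # x_T ~ N(0, I)
    for t in reversed(range(T)):
        z = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        eps = eps_model(x, torch.tensor([t]))            # predicted noise at step t
        # Standard DDPM posterior-mean update (Ho et al., 2020)
        x = (x - betas[t] / (1 - alpha_bars[t]).sqrt() * eps) / alphas[t].sqrt()
        x = x + betas[t].sqrt() * z                      # add sampling noise
    return x                                             # synthetic image
```

The loop over all T steps is exactly what makes diffusion sampling slower than a single GAN forward pass.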

They created a diffusion model that accepts a two-dimensional (2D) axial slice from a T1-weighted (T1), contrast-enhanced T1-weighted (T1CE), T2-weighted (T2), or FLAIR sequence of a brain MRI and inpaints a user-defined cropped area of that slice with a realistic, controllable image of either a high-grade glioma and its corresponding components (e.g., surrounding edema) or tumor-free (apparently normal) brain tissue.
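A common way to turn such a sampler into an inpainter is to re-noise the real slice at every reverse step and paste it over the pixels outside the user's crop, so the model only synthesizes the masked region. The sketch below (reusing `T`, `betas`, `alphas`, and `alpha_bars` from the previous snippet) follows this generic RePaint-style scheme; `eps_model`, `slice_2d`, and `mask` are assumed inputs, and the code is illustrative rather than the authors' implementation.

```python
# Hedged sketch of mask-guided diffusion inpainting (RePaint-style, without
# resampling). Reuses T, betas, alphas, alpha_bars from the sampling sketch.
import torch

@torch.no_grad()
def inpaint(eps_model, slice_2d, mask):
    """slice_2d: (1,1,H,W) MRI slice; mask: 1 inside the region to repaint."""
    x = torch.randn_like(slice_2d)
    for t in reversed(range(T)):
        z = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        # Forward-noise the real slice to the current step t
        known = alpha_bars[t].sqrt() * slice_2d \
            + (1 - alpha_bars[t]).sqrt() * torch.randn_like(x)
        x = mask * x + (1 - mask) * known       # keep real tissue outside the crop
        eps = eps_model(x, torch.tensor([t]))
        x = (x - betas[t] / (1 - alpha_bars[t]).sqrt() * eps) / alphas[t].sqrt()
        x = x + betas[t].sqrt() * z
    return mask * x + (1 - mask) * slice_2d     # paste back the untouched tissue
```

Conditioning the same network on the desired output (tumor with chosen components versus normal tissue) is what makes the inpainting "multitask": the user picks what should fill the cropped area.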

In the United States, the incidence of high-grade glioma is 3.56 per 100,000 people, and only a few public brain tumor MRI datasets are available. Given such limited data, their model allows ML researchers to modify brain MRI slices by inducing synthetic tumor tissue with configurable features or by suppressing existing tumors to produce tumor-free tissue. The tool has been made available online for anyone to use, and the model has been open-sourced with its documentation on GitHub.

This article is written as a research summary by Marktechpost staff based on the research paper 'Multitask Brain Tumor Inpainting with Diffusion Models: A Methodological Report'. All credit for this research goes to the researchers on this project. Check out the paper, code, and tool.


Aneesh Tickoo is an intern consultant at Marktechpost. He is currently pursuing his undergraduate studies in Data Science and Artificial Intelligence at the Indian Institute of Technology (IIT), Bhilai. He spends most of his time working on projects aimed at harnessing the power of machine learning. His research interest is image processing, and he is passionate about building solutions around it. He enjoys connecting with people and collaborating on interesting projects.

