A deep learning tool identifies mosaic mutations that cause disease

Genetic mutations cause hundreds of unsolved and untreatable disorders. Among them, mosaic mutations are especially difficult to detect because they are present in only a small percentage of cells.

While examining the 3 billion bases of the human genome, current DNA mutation detectors are not well suited to distinguishing mosaic mutations hiding among normal DNA sequences. As a result, medical geneticists must often review DNA sequences by eye to try to identify or confirm mosaic mutations, a time-consuming endeavor fraught with the potential for error.

Writing in the January 2, 2023 issue of Nature Biotechnology, researchers from the University of California San Diego School of Medicine and the Rady Children’s Institute for Genomic Medicine describe a way to teach a computer how to detect mosaic mutations using an artificial intelligence approach called “deep learning.”

Study: Detection of a control-independent mosaic single-nucleotide variant using DeepMosaic. Image credit: Laurent T/Shutterstock

Deep learning, sometimes referred to as artificial neural networks, is a machine learning technique that teaches computers to do what comes naturally to humans: learn by example, especially from large amounts of information. Compared to traditional statistical models, deep learning models use artificial neural networks to process data represented visually. As a result, the models function similarly to human visual processing, with greater precision and attention to detail, leading to major advances in computational capabilities, including mutation detection.
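To make the idea of "data represented visually" concrete, here is a minimal sketch in Python of how sequencing reads covering a candidate site might be turned into an image-like grid that a neural network could take as input. It is an illustration only, not DeepMosaic's actual encoding: the function names `pileup_to_grid` and `alt_fraction`, the one-hot layout, and the toy reads are all assumptions made for this example.

```python
from typing import List

BASES = "ACGT"


def pileup_to_grid(reads: List[str]) -> List[List[List[int]]]:
    """Encode equal-length aligned reads as a rows x columns x 4 one-hot grid.

    Hypothetical encoding for illustration; each read is one image row and
    each base becomes a 4-channel "pixel".
    """
    width = len(reads[0])
    grid = []
    for read in reads:
        if len(read) != width:
            raise ValueError("all reads must cover the same window")
        row = []
        for base in read:
            channel = [0, 0, 0, 0]
            if base in BASES:  # gaps or unknown bases ("-", "N") stay all-zero
                channel[BASES.index(base)] = 1
            row.append(channel)
        grid.append(row)
    return grid


def alt_fraction(reads: List[str], column: int, ref_base: str) -> float:
    """Fraction of informative reads whose base at `column` differs from the reference."""
    observed = [r[column] for r in reads if r[column] in BASES]
    if not observed:
        return 0.0
    return sum(b != ref_base for b in observed) / len(observed)


# Toy window (invented data): 10 reads over 5 positions. The centre column
# (index 2) has reference base "G", and 2 of the 10 reads carry an "A".
reads = ["ACGTA"] * 8 + ["ACATA"] * 2
grid = pileup_to_grid(reads)
fraction = alt_fraction(reads, column=2, ref_base="G")

print(len(grid), len(grid[0]), len(grid[0][0]))  # 10 5 4
print(fraction)  # 0.2
```

In this toy window only a minority of reads carry the alternate base, which is the kind of faint signal that makes mosaic mutations hard to spot by eye.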

“One example of an unresolved disorder is focal epilepsy,” said study senior author Joseph Gleeson, UC San Diego School of Medicine professor of neurology and director of neuroscience research at the Rady Children’s Institute for Genomic Medicine.

“Epilepsy affects 4% of the population, and about a quarter of focal seizures fail to respond to standard medications. These patients often require surgical removal of the short-circuiting focal portion of the brain to stop seizures. Among these patients, mosaic mutations within the brain can cause an epileptic focus.

“We’ve had several epilepsy patients where we couldn’t pinpoint the cause, but once we applied our method called ‘DeepMosaic’ to the genomic data, the mutation became apparent. This has allowed us to improve the sensitivity of DNA sequencing in certain forms of epilepsy, and has led to discoveries that point to new ways to treat brain diseases.”

Accurate detection of mosaic mutations, Gleeson said, is the first step in medical research toward developing treatments for many diseases.

Co-first and co-corresponding author Xiaoxu Yang, PhD, a postdoctoral researcher in Gleeson’s lab, said DeepMosaic had been trained on approximately 200,000 biological and simulated variants across the genome until “finally, we were satisfied with its ability to detect variants from data it had never encountered before.”

To train the computer, the authors provided examples of trustworthy mosaic mutations in addition to several normal DNA sequences and taught the computer to tell the difference. By training and retraining iteratively using more complex datasets and choosing between dozens of models, the computer was eventually able to identify mosaic mutations much better than human eyes and previous methods could. DeepMosaic has also been tested on numerous independent, never-before-seen large-scale sequencing datasets, outperforming previous methods.
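The step of choosing among candidate models and then checking them on unseen data can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the authors' procedure: a simple cut-off on the alternate-allele fraction stands in for a trained neural network, and the names `make_cutoff_model`, `accuracy` and `select_model`, the cut-off values, and the labeled examples are all hypothetical.

```python
from typing import Callable, List, Tuple

# (alternate-allele fraction, is_mosaic label); toy values invented for illustration
Example = Tuple[float, bool]


def make_cutoff_model(cutoff: float) -> Callable[[float], bool]:
    """Hypothetical stand-in for a trained network: calls a site mosaic below `cutoff`."""
    return lambda fraction: fraction < cutoff


def accuracy(model: Callable[[float], bool], examples: List[Example]) -> float:
    """Share of examples on which the model's call matches the label."""
    correct = sum(model(fraction) == label for fraction, label in examples)
    return correct / len(examples)


def select_model(cutoffs: List[float], held_out: List[Example]) -> Tuple[float, float]:
    """Return (cutoff, accuracy) of the candidate that scores best on held-out examples."""
    scored = [(accuracy(make_cutoff_model(c), held_out), c) for c in cutoffs]
    best_accuracy, best_cutoff = max(scored)
    return best_cutoff, best_accuracy


# Held-out examples the candidates were not tuned on (toy data).
held_out = [
    (0.05, True), (0.12, True), (0.30, True),
    (0.48, False), (0.52, False), (0.45, False),
]
best_cutoff, best_accuracy = select_model([0.10, 0.40, 0.50], held_out)
print(best_cutoff, best_accuracy)  # 0.4 1.0
```

The point of scoring on held-out examples is the same one the researchers make: a model is only trusted once it performs well on data it has not seen during training.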

“DeepMosaic surpassed traditional tools in detecting mosaicism from genomic and exonic sequences,” said co-first author Xin Xu, a former research assistant at UC San Diego School of Medicine and now a research data scientist at Novartis. “The salient visual features captured by deep learning models are very similar to what experts focus on when examining variants manually.”

DeepMosaic is freely available to scientists. It’s not a single computer program, the researchers said, but an open source platform that could enable other researchers to train their own neural networks to achieve more targeted detection of mutations using a similar image-based setup.

Co-authors include Martin W. Breuss, Danny Antaki, Laurel L. Ball, Changuk Chung, Jiawei Shen, Chen Li and Renee D. George, University of California, San Diego, and the Rady Children’s Institute for Genomic Medicine; Yifan Wang, Taejeong Bae and Alexej Abyzov, Mayo Clinic; Yuhe Cheng, Ludmil B. Alexandrov and Jonathan L. Sebat, University of California, San Diego; Liping Wei, Peking University; and the NIMH Brain Somatic Mosaicism Network.

Funding for this research came in part from the National Institutes of Health (grants U01MH108898 and R01MH124890), the San Diego Supercomputer Center, and the UC San Diego Institute for Genomic Medicine.

