Deep Genomics’ machine learning platform continually improves by combining automation, biomedical knowledge, and data – the ingredients that have helped it secure pharmaceutical partnerships quickly and position itself for long-term success.
Dr. Albi Celaj doesn’t have much time during the week to read scientific papers, so he uses his weekends to catch up. That is how he discovered details that would prove vital to one company.
Artificial Intelligence
Deep Genomics was established in 2015 by a team with deep expertise in both machine learning (particularly deep neural networks) and genomics, with the goal of using AI to accelerate drug discovery and the development of genetic medicines. Companies such as BenevolentAI and Alphabet’s Calico also apply deep learning to pharmaceutical research.
The aim is to use AI to significantly shorten the years and billions of dollars spent developing new medicines. AI can help researchers shortlist the compounds most likely to work against particular diseases and predict the biological effects those drugs would have in people. As a result, fewer experimental molecules need to be tested and guesswork plays a much smaller role.
Deep Genomics’ platform uses sequence-to-function analysis – a form of genomics-informed predictive modeling – to examine genomic data for patterns that could contribute to disease or be exploited therapeutically. By combining automation, biomedical knowledge, data, and machine learning, it identifies potential disease mechanisms and generates novel drug targets.
The platform searches for mutations that might alter gene function and lead to disease, including variants that disrupt splice sites so that portions of normally produced proteins are spliced out. Such mutations could potentially be corrected with splice-modulating drugs, which could help treat certain neuromuscular conditions like Duchenne muscular dystrophy.
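To make the idea concrete, here is a minimal sketch – not Deep Genomics’ actual model – of how a variant might be flagged for disrupting a canonical splice site. The toy genome, exon coordinates, and the simple GT/AG rule are all assumptions made purely for illustration.

```python
# Minimal sketch (illustrative only): flag single-base variants that disrupt
# the canonical GT/AG dinucleotides at the exon-intron boundaries of a toy
# transcript. Real splice-effect models are learned from data, not rule-based.

GENOME = "ATGGCCAAGTTTTTCCCCAGCTGACAGTAAAATTTTAGGATTA"  # invented sequence
EXONS = [(0, 8), (20, 26), (38, 43)]  # half-open [start, end) exon intervals


def splice_motifs(exons):
    """Return genomic positions of canonical donor (GT) and acceptor (AG) sites."""
    sites = []
    for i, (start, end) in enumerate(exons):
        if i < len(exons) - 1:      # donor: first two intronic bases after the exon
            sites.append(("donor", end, end + 2))
        if i > 0:                   # acceptor: last two intronic bases before the exon
            sites.append(("acceptor", start - 2, start))
    return sites


def disrupts_splicing(pos, alt):
    """True if substituting `alt` at `pos` changes a canonical splice dinucleotide."""
    for kind, s, e in splice_motifs(EXONS):
        if s <= pos < e:
            ref_motif = GENOME[s:e]
            mut_motif = ref_motif[:pos - s] + alt + ref_motif[pos - s + 1:]
            expected = "GT" if kind == "donor" else "AG"
            return ref_motif == expected and mut_motif != expected
    return False


if __name__ == "__main__":
    print(disrupts_splicing(8, "A"))   # hits the first donor GT -> True
    print(disrupts_splicing(3, "A"))   # deep inside an exon -> False
```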
The company claims its platform can turn a disease-causing mutation into a drug target in 18 months or less, with roughly a 70% success rate, compared with the roughly four years such work typically takes at conventional pharmaceutical companies. Its goal is to use the platform to develop medicines for Mendelian disorders, which are caused by mutations in a single gene.
Biomedical Knowledge
Biomedical knowledge, or Relational Data Management (RDM), provides the foundation for big-data integration in clinical decision support, drug discovery, and other applications. RDM is also at the core of data science – an emerging discipline that combines data mining, machine learning, and analytics to transform unstructured information into structured knowledge. Through RDM, models and datasets such as those found in genomics can be accessed and integrated more easily.
At present, biomedical knowledge is scattered across scientific publications and databases, making it difficult to acquire efficiently. One frequently accessed database is PubMed, which indexes articles with unique PubMed identifiers and Medical Subject Headings (MeSH). A common approach to turning this unstructured data into structured knowledge is to build biomedical knowledge graphs (KGs), which represent relationships among entities such as genes, diseases, and drugs by drawing on existing resources like the NCBI genome assembly database, the DISEASES database, and ChEMBL.
In addition, several RDM tools, such as the QIAGEN Biomedical Relationships Knowledge Base (BRKB), have been created to aid this work. These curated, machine-actionable sources of relationships give bioinformatics tools access to the atomic-level details they require and enable researchers to identify and prioritize hypotheses within minutes.
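As a rough illustration of the idea – not the structure of any of the databases named above – a knowledge graph can be reduced to subject–relation–object triples that are then queried for paths from a disease to a candidate drug. The entities and relations below are invented placeholders.

```python
# A toy biomedical knowledge graph as (subject, relation, object) triples.
# Entity names are illustrative placeholders, not drawn from actual
# NCBI, DISEASES, or ChEMBL records.
from collections import defaultdict

TRIPLES = [
    ("ATP7B", "associated_with", "Wilson disease"),
    ("DMD", "associated_with", "Duchenne muscular dystrophy"),
    ("drug_X", "targets", "ATP7B"),   # hypothetical compound
    ("drug_Y", "targets", "DMD"),     # hypothetical compound
]

# Index triples by subject and by object for quick lookups in either direction.
by_subject = defaultdict(list)
by_object = defaultdict(list)
for s, r, o in TRIPLES:
    by_subject[s].append((r, o))
    by_object[o].append((s, r))


def drugs_for_disease(disease):
    """Walk disease -> gene -> drug over the triples."""
    genes = [s for s, r in by_object[disease] if r == "associated_with"]
    return [s for g in genes for s, r in by_object[g] if r == "targets"]


if __name__ == "__main__":
    print(drugs_for_disease("Wilson disease"))  # ['drug_X']
```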
Deep Genomics, a spinout of the University of Toronto, applies machine learning to genomic and medical data to find genetic causes of disease and potential treatments. The company works alongside Wave Life Sciences to help identify therapies for genetic neuromuscular disorders such as Duchenne muscular dystrophy.
Deep Genomics is led by Brendan Frey, a University of Toronto professor with expertise in machine learning and genomic medicine. The company’s initial focus is early-stage drug development for Mendelian diseases caused by single faulty genes; its team scours genomes for mutations that could be targeted with medicines called oligonucleotides – short strands of nucleic acid designed to bind specific sequences of DNA or RNA.
Machine Learning
Applying machine learning to genomic data analysis reveals patterns that humans could never spot by eye, saving both time and resources during drug discovery.
Recent estimates put the average cost of developing a single pharmaceutical drug at roughly $2.6 billion over about 10 years [1]. Much of this cost stems from traditional R&D’s trial-and-error process, which relies on large sample sizes to find promising drug candidates. Recent advances in genomic sequencing technology, big-data analytics, and machine learning give the pharmaceutical industry ways to cut these costs through more targeted approaches that narrow the number of potential targets under investigation.
Genomic sequencers can detect DNA variants that contribute to disease, including variants in genes encoding proteins associated with particular conditions. Their output is then processed with secondary genomic analysis software, which typically aligns the test sample’s DNA fragments against a reference genome and reassembles them. Deep learning can enhance this workflow: neural network models that translate image and signal data into genomic information enable base calling within the sequencing instrument itself, speeding up processing and allowing faster comparison of test samples against reference genomes.
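For a sense of what the reference-comparison step involves, here is a simplified k-mer seeding sketch. Real aligners and the neural base callers described above are far more sophisticated; the reference, read, and k-mer size here are arbitrary toy choices.

```python
# Minimal sketch of the seed step most short-read aligners use: index the
# reference genome by k-mers, then look up where a read's k-mers occur.
from collections import defaultdict

K = 5


def index_reference(reference, k=K):
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def candidate_positions(read, index, k=K):
    """Suggest alignment start positions by voting with the read's k-mers."""
    votes = defaultdict(int)
    for offset in range(len(read) - k + 1):
        for pos in index[read[offset:offset + k]]:
            votes[pos - offset] += 1
    return sorted(votes, key=votes.get, reverse=True)


if __name__ == "__main__":
    reference = "ACGTACGTTAGCCTAGGATCCGATTACA"   # toy reference sequence
    read = "AGCCTAGGAT"                           # matches the reference at position 9
    index = index_reference(reference)
    print(candidate_positions(read, index)[0])    # 9
```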
Filtering out variants unlikely to cause disease during secondary genomic analysis can drastically shorten processing times, making disease-mutation discovery much quicker. Combined with GPU-accelerated software such as NVIDIA Parabricks for secondary analysis, variant callers can be fine-tuned for improved accuracy across genomic platforms or retrained to identify specific genetic patterns faster and more accurately.
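The filtering step can be pictured as a simple score-and-threshold pass over candidate variants. The sketch below uses made-up variant records and an arbitrary pathogenicity cutoff; it is not how Parabricks or any particular variant caller is configured.

```python
# Sketch of the filtering idea: keep only variants whose predicted
# pathogenicity score clears a threshold. The records and scores below are
# invented; a production pipeline would read VCF output from a variant caller.
VARIANTS = [
    {"chrom": "chr13", "pos": 52520000, "ref": "T", "alt": "G", "score": 0.94},
    {"chrom": "chrX",  "pos": 31800000, "ref": "C", "alt": "T", "score": 0.12},
    {"chrom": "chr1",  "pos": 1100000,  "ref": "A", "alt": "C", "score": 0.55},
]

PATHOGENICITY_THRESHOLD = 0.5  # arbitrary cutoff for this illustration


def prioritize(variants, threshold=PATHOGENICITY_THRESHOLD):
    """Drop variants unlikely to matter and rank the rest by score."""
    kept = [v for v in variants if v["score"] >= threshold]
    return sorted(kept, key=lambda v: v["score"], reverse=True)


if __name__ == "__main__":
    for v in prioritize(VARIANTS):
        print(v["chrom"], v["pos"], v["ref"], ">", v["alt"], v["score"])
```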
Deep Genomics is taking advantage of these technologies to streamline its drug development process, using a platform that combines automation, biomedical knowledge, data, and machine learning to discover therapeutic targets. Its primary focus is Mendelian disorders – conditions caused by single genetic mutations that affect 350 million people globally – and it has partnered with Wave Life Sciences to evaluate WVE-210201, which uses exon skipping as its treatment approach.
Target Discovery
Deep Genomics has pioneered and patented systems that take DNA or RNA sequence as input and compare it against variants known to cause disease. Its automated platform then generates on-target and off-target effect data quickly and iteratively, enabling scientists to evaluate compounds rapidly.
Deep Genomics’ long-term goal is to develop medicines for hereditary diseases that afflict 350 million people globally, and it relies on its machine learning engine to identify hard-to-detect disease triggers. In a bioRxiv preprint published on September 17, Deep Genomics scientists reported the discovery of the Wilson disease drug candidate DG12P1, showing that the Met645Arg variant in ATP7B leads to a frameshift and stop gain that disrupt the protein’s function, and predicting the chemical properties of compounds that could target it effectively.
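A toy version of the comparison step might look like the following, where an input sequence is checked against a small table of annotated variants. The reference sequence, positions, and annotations are hypothetical and are not taken from Deep Genomics’ data or patents.

```python
# Sketch of the comparison step only: check an input coding sequence against a
# small table of variants annotated as disease-causing.
KNOWN_PATHOGENIC = {
    # (position, reference_base, alternate_base): annotation
    (5, "C", "T"): "hypothetical missense variant",
    (11, "G", "A"): "hypothetical splice-region variant",
}

REFERENCE = "ATGGTCTTAAAGCTTGACCTG"  # invented reference sequence


def flag_known_variants(sample):
    """Report positions where the sample differs from the reference in a known, annotated way."""
    hits = []
    for pos, (ref_base, sample_base) in enumerate(zip(REFERENCE, sample)):
        if sample_base != ref_base:
            note = KNOWN_PATHOGENIC.get((pos, ref_base, sample_base))
            if note:
                hits.append((pos, ref_base, sample_base, note))
    return hits


if __name__ == "__main__":
    sample = "ATGGTTTTAAAACTTGACCTG"   # differs from REFERENCE at positions 5 and 11
    print(flag_known_variants(sample))
```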
Deep Genomics used its predictive capabilities to screen over 100,000 compounds against the target, creating an initial shortlist of candidate drugs. It then evaluated both the on-target and off-target effects of these compounds in a biological context. The results indicated that all candidates could alter RNA or DNA, but only a select few were likely to be therapeutically effective and therefore suitable to be developed into oligonucleotides for human testing.
The oligonucleotide candidates were then ranked by their ability to alter RNA and DNA, and those showing the most promise were handed to experimental biology teams for further investigation. These teams were charged with creating compounds that either specifically target the regions responsible for disease or serve as controls, without adverse side effects.
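As a caricature of on-target versus off-target evaluation, the sketch below counts exact complementary matches of each candidate oligonucleotide in an intended transcript and in a set of other transcripts. The sequences and candidates are invented, and a real platform would model binding, chemistry, and downstream biology far more richly.

```python
# Sketch of an on-target vs off-target tally for antisense oligo candidates:
# count exact complementary matches in the intended transcript and in other
# transcripts. Sequences are toy examples, not real transcripts.

def reverse_complement(seq):
    """Reverse-complement an RNA sequence."""
    return seq.translate(str.maketrans("ACGU", "UGCA"))[::-1]


TARGET_TRANSCRIPT = "AUGGCUUACGGAUCCGUUAACGGAUCC"
OTHER_TRANSCRIPTS = [
    "CCGUUAACGGAUGGCUAACGG",
    "AUGGAUCCGUUAAUGGCUUAC",
]


def score_oligo(oligo):
    """Return (on_target_hits, off_target_hits) for an RNA-binding oligo."""
    site = reverse_complement(oligo)   # the sequence the oligo would base-pair with
    on = TARGET_TRANSCRIPT.count(site)
    off = sum(t.count(site) for t in OTHER_TRANSCRIPTS)
    return on, off


if __name__ == "__main__":
    candidates = ["GGAUCCGU", "UUAACGGA"]   # hypothetical candidate oligos
    for oligo in candidates:
        on, off = score_oligo(oligo)
        print(oligo, "on-target:", on, "off-target:", off)
```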
Deep Genomics will use this funding round to move four of its leading programs toward clinical trials by 2023 and to expand the AI Workbench – a suite of machine learning tools that biopharma companies use to turn RNA biology into potential therapeutic compounds. Proceeds will also support a large effort to generate biological data on the 100 genes on its hit list, feeding the target- and mechanism-prediction algorithms in Deep Genomics’ system.