Major Pharma Companies, Including Novartis And Merck, Build Federated Learning Platform For Drug Discovery

Ten pharmaceutical companies, namely Amgen, Astellas, AstraZeneca, Bayer, Boehringer Ingelheim, GSK, Institut De Recherches Servier, Janssen, Merck, and Novartis, inked an agreement to build a shared platform called MELLODDY (Machine Learning Ledger Orchestration for Drug Discovery). In partnership with Nvidia, Owkin, and others, the group sought to leverage techniques like federated learning to collectively train AI on datasets without having to share any proprietary data.

Today, contributors to the three-year MELLODDY project announced they have reached their first-year objective: successfully deploying the platform. Marking a larger milestone, they say they have also completed the platform’s first federated learning runs.

Fewer than 12% of all drugs entering clinical trials end up in pharmacies, and it takes at least 10 years for medicines to complete the journey from discovery to the marketplace. Clinical trials alone take six to seven years, on average, putting the cost of R&D at roughly $2.6 billion, according to the Pharmaceutical Research and Manufacturers of America.

The MELLODDY project’s cofounders assert that federated learning can accelerate this process. In machine learning, federated learning entails training algorithms across decentralized devices holding data samples without exchanging those samples. A centralized server might be used to orchestrate the steps of the algorithm and act as a reference clock, or the arrangement might be peer-to-peer. Regardless, local algorithms are trained on local data samples, and the weights (the learnable parameters of the algorithms) are exchanged between the algorithms at some frequency to generate a global model.
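To make the weight-exchange idea concrete, here is a minimal sketch of one common federated scheme, federated averaging with a central coordinator. It is an illustration only and not MELLODDY's actual training code: the toy linear model, the synthetic data, and every function name (local_update, federated_average, train_federated, make_partner_data) are assumptions made for the example.

```python
import random

def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few gradient-descent epochs on one partner's private data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y  # prediction error of the toy linear model
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return (w, b)

def federated_average(updates, sizes):
    """Combine local weights, weighting each partner by its sample count."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return (w, b)

def train_federated(partner_data, rounds=200):
    """Coordinator loop: only weights cross the boundary, never raw samples."""
    global_weights = (0.0, 0.0)
    sizes = [len(d) for d in partner_data]
    for _ in range(rounds):
        updates = [local_update(global_weights, d) for d in partner_data]
        global_weights = federated_average(updates, sizes)
    return global_weights

def make_partner_data(n, seed):
    """Hypothetical private dataset: n noise-free points on y = 2x + 1."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 2.0 * x + 1.0) for x in xs]

# Three hypothetical partners with datasets of different sizes.
partners = [make_partner_data(n, seed) for seed, n in enumerate([20, 50, 30])]
w, b = train_federated(partners)
print(f"global model: y = {w:.3f} * x + {b:.3f}")
```

In this sketch each partner's data never leaves local_update; the coordinator sees only the returned weights. Real deployments typically add further protections on top of this basic loop, and a peer-to-peer arrangement would replace the central averaging step.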

To build the MELLODDY platform, tech vendors BME, Iktos, and Nvidia implemented machine learning solutions for drug discovery, ensuring privacy and optimizing training on Nvidia graphics cards. Owkin supplied Owkin Connect, its privacy-preserving framework designed for multitasking federated learning, while KU Leuven provided an open source library called SparseChem for training machine learning models specific to drug discovery. Kubermatic deployed its Kubernetes platform to build the infrastructure for each pharmaceutical partner. And Substra Foundation managed the technical operations, monitored workloads, and hosted the open source code that’s part of Owkin Connect.

Within the MELLODDY platform, a portion of which is hosted on Amazon Web Services, partners securely register their datasets in a local instance of the architecture. (A spokesperson told VentureBeat the platform passed “extensive” security audits conducted by an external company and by IT teams from each pharmaceutical partner.) A private blockchain provides traceability, with a ledger distributed in a decentralized way across pharma partners.
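The traceability a ledger offers can be illustrated with a hash chain, in which each entry commits to the one before it so that later tampering is detectable. The sketch below is a simplified, single-process illustration under assumed names (entry_hash, append_entry, verify, and the partner and event strings are all hypothetical); it does not reflect how MELLODDY's private blockchain is implemented or distributed.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry

def entry_hash(body):
    """Hash an entry's contents deterministically with SHA-256."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def append_entry(ledger, partner, event):
    """Append an event that commits to the hash of the previous entry."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS_HASH
    body = {
        "index": len(ledger),
        "partner": partner,
        "event": event,
        "prev_hash": prev_hash,
    }
    ledger.append({**body, "hash": entry_hash(body)})

def verify(ledger):
    """Return True only if every entry is intact and correctly chained."""
    prev_hash = GENESIS_HASH
    for i, entry in enumerate(ledger):
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["index"] != i:
            return False
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(body):
            return False
        prev_hash = entry["hash"]
    return True

# Hypothetical events; the ledger records what happened, not the data itself.
ledger = []
append_entry(ledger, "partner_a", "registered dataset")
append_entry(ledger, "partner_b", "submitted model update, round 1")
print(verify(ledger))
```

Because each copy of such a chain can be re-verified independently, partners holding their own copies could notice if any past entry were altered.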

During the initial runs, all participating pharmaceutical companies managed to simultaneously train their predictive models in a de-identified, aggregated fashion without exposing private research, data, or information, according to a spokesperson.

MELLODDY project partners say they have begun scientific and business case assessments of the results of the first cycle of modeling runs. They will look at publishing those results in a scientific paper, and over the next two years the MELLODDY project will focus on improving the performance of the common predictive model by exposing it to an increasing amount of data.

The MELLODDY project builds on other efforts to uncover health insights through the use of federated learning. In May, Intel revealed the details of a National Institutes of Health-funded program that will leverage AI to identify brain tumors while preserving privacy. With the Perelman School of Medicine at the University of Pennsylvania (Penn Medicine), the company will coordinate a federation of 29 international medical centers in the U.S., Canada, the U.K., Germany, Switzerland, and India to train AI models using federated learning.

Beyond this, the American College of Radiology, Brazilian Imaging Center Diagnosticos da America, Partners HealthCare, Ohio State University, and Stanford Medicine collaborated to develop a federated learning model using more than 130,000 images from 33,000 mammography studies. And Nvidia began working with collaborators to release COVID-19-related models trained with federated learning through the company’s Clara Imaging Software platform, following a collaboration with King’s College London on a federated learning neural network for brain tumor segmentation.

The MELLODDY project has an estimated budget of €18.4 million ($21.76 million) and receives funding from the Innovative Medicines Initiative, a public-private partnership between the European Union and the European pharmaceutical industry. (Pharma partners contributed €10 million and Nvidia contributed €120,000, with the rest coming from public grants.) Johnson & Johnson subsidiary Janssen Pharmaceutica NV serves as the pharmaceutical industry lead, with coordination from Owkin.