Generation of realistic synthetic data using Multimodal Neural Ordinary Differential Equations

Abstract

Individual organizations, such as hospitals, pharmaceutical companies, and health insurance providers, are currently limited in their ability to collect data that are fully representative of a disease population. This can, in turn, negatively impact the generalization ability of statistical models and scientific insights. However, sharing data across different organizations is highly restricted by legal regulations. While federated data access concepts exist, they are technically and organizationally difficult to realize. An alternative approach would be to exchange synthetic patient data instead. In this work, we introduce the Multimodal Neural Ordinary Differential Equations (MultiNODEs), a hybrid, multimodal AI approach, which allows for generating highly realistic synthetic patient trajectories on a continuous time scale, hence enabling smooth interpolation and extrapolation of clinical studies. Our proposed method can integrate both static and longitudinal data, and implicitly handles missing values. We demonstrate the capabilities of MultiNODEs by applying them to real patient-level data from two independent clinical studies and simulated epidemiological data of an infectious disease.

Source: https://www.nature.com/articles/s41746-022-00666-x#Abs1