Richard Khusial, PhD: No financial relationships to disclose
Objectives: Pharmacometrics and the use of population pharmacokinetics (PPK) play a critical role in model-informed drug discovery and development (MIDD). Recently, there has been growing interest in applying deep learning to support various activities within MIDD. In this study, a novel deep learning model combining a long short-term memory (LSTM) network with an artificial neural network (ANN), LSTM-ANN, was implemented to predict drug concentrations in a real-world case study. We hypothesized that this deep learning approach would capture the relationships within the pharmacokinetic (PK) data and provide accurate predictions.
Methods: A time-series deep learning LSTM-ANN model with multiple inputs was constructed. A total of 5,176 drug concentrations from 224 subjects, along with 10 patient-specific covariates, were used in the LSTM-ANN model development. The PK data were divided into time-dependent and time-independent variables. The time-dependent variables were fed to the LSTM layer to learn the time-dependent patterns. These patterns were then combined with their respective time-independent variables and fed into an ANN layer. To determine the optimal data splitting strategy, three strategies were tested: splitting based on time, random shuffling, and splitting based on unique subject identifier. The splitting percentage was kept consistent across all three strategies, with 70 percent of the data used for training and 30 percent for validation. The LSTM-ANN model hyperparameters were optimized through Bayesian optimization with a tree-structured Parzen estimator surrogate model and a Hyperband pruner. The performance evaluation criteria included visual goodness-of-fit (GOF) plots and validation root mean square error (RMSE). Additionally, a permutation analysis was conducted to evaluate the impact of subject-specific covariates on the LSTM-ANN model performance.
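The three splitting strategies described above can be illustrated with a minimal Python sketch. This is not the study's code: the record fields (`subject_id`, `time`, `conc`), the function names, the toy data, and the reading of "splitting based on time" as a single global cutoff on observation time are all assumptions made for illustration.

```python
import random

def split_by_time(records, train_frac=0.7):
    """Assumed reading: earliest observations train, latest validate."""
    ordered = sorted(records, key=lambda r: r["time"])
    cut = int(round(train_frac * len(ordered)))
    return ordered[:cut], ordered[cut:]

def split_random(records, train_frac=0.7, seed=0):
    """Shuffle individual observations, ignoring subject and time."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_frac * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def split_by_subject(records, train_frac=0.7, seed=0):
    """Assign whole subjects to one side so no subject appears in both."""
    subjects = sorted({r["subject_id"] for r in records})
    random.Random(seed).shuffle(subjects)
    cut = int(round(train_frac * len(subjects)))
    train_ids = set(subjects[:cut])
    train = [r for r in records if r["subject_id"] in train_ids]
    valid = [r for r in records if r["subject_id"] not in train_ids]
    return train, valid

# Hypothetical toy data: 10 subjects with 5 samples each (not study data).
records = [
    {"subject_id": s, "time": float(t), "conc": 100.0 / (1 + t)}
    for s in range(10)
    for t in range(5)
]

time_train, time_valid = split_by_time(records)
rand_train, rand_valid = split_random(records)
subj_train, subj_valid = split_by_subject(records)
```

With random shuffling, observations from the same subject can fall on both sides of the split, whereas the subject-based split keeps each subject entirely in training or entirely in validation; this difference is one reason validation RMSE may not be directly comparable across strategies.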
Results: A preliminary LSTM-ANN model was generated and optimized for each of the splitting strategies. The LSTM-ANN model optimized under the random shuffling strategy achieved the lowest validation RMSE (252.833), whereas splitting based on time gave the highest validation RMSE (1092.003). Permutation analysis revealed that age, sex, baseline body weight, and fed status were the covariates with the greatest impact on LSTM-ANN model performance. A separate PPK analysis was performed in NONMEM, and the covariates identified as having a significant impact on model parameters were baseline body weight, sex, age, fed status, and dose.
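As a rough illustration of how a permutation analysis can rank covariates by their effect on RMSE, the sketch below shuffles one covariate at a time and records the average increase in RMSE over a baseline. It is a generic version of the technique under stated assumptions, not the procedure used in the study: `toy_predict` is a hypothetical stand-in for a trained model, and the covariate names, coefficients, and data are invented for the example.

```python
import math
import random

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def permutation_importance(predict, rows, y_true, covariates, n_repeats=20, seed=0):
    """Mean increase in RMSE when each covariate is shuffled across rows."""
    rng = random.Random(seed)
    baseline = rmse(y_true, [predict(r) for r in rows])
    importance = {}
    for name in covariates:
        increases = []
        for _ in range(n_repeats):
            values = [r[name] for r in rows]
            rng.shuffle(values)
            # Copy each row, replacing only the covariate being permuted.
            permuted = [dict(r, **{name: v}) for r, v in zip(rows, values)]
            increases.append(rmse(y_true, [predict(r) for r in permuted]) - baseline)
        importance[name] = sum(increases) / n_repeats
    return baseline, importance

def toy_predict(row):
    """Hypothetical model: depends on weight and fed status only."""
    return 10.0 * row["weight"] + 5.0 * row["fed"]

# Hypothetical data (not study data); "unused" has no effect on predictions.
data_rng = random.Random(1)
rows = [
    {"weight": data_rng.uniform(50, 100), "fed": data_rng.choice([0, 1]), "unused": data_rng.random()}
    for _ in range(200)
]
y = [toy_predict(r) for r in rows]

baseline, importance = permutation_importance(toy_predict, rows, y, ["weight", "fed", "unused"])
ranking = sorted(importance, key=importance.get, reverse=True)
```

A covariate the model ignores produces no change in RMSE when permuted, while covariates the model relies on produce larger increases, which is the basis for ranking them.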
Conclusions: A novel LSTM-ANN framework was used to analyze and predict PK data from a real-world case study. The splitting strategy had a substantial impact on the learning and performance of the model. Permutation analysis of covariate influence, applied to the best-performing LSTM-ANN model, identified covariates similar to those selected by stepwise covariate modeling using NONMEM.