(M-103) Evaluation of ChatGPT and Gemini large language models for pharmacometrics with NONMEM
Monday, November 11, 2024
7:00 AM – 5:00 PM MST
Yifan Yu – PhD Candidate, University at Buffalo School of Pharmacy and Pharmaceutical Sciences; Robert Bies, PharmD, PhD – Professor, University at Buffalo School of Pharmacy and Pharmaceutical Sciences; Murali Ramanathan, PhD – Professor, University at Buffalo School of Pharmacy and Pharmaceutical Sciences
PharmD/MS Student, University at Buffalo School of Pharmacy and Pharmaceutical Sciences, United States
Disclosure(s):
Euibeom Shin: No financial relationships to disclose
The objective was to assess the ChatGPT 4.0 (ChatGPT) and Gemini Ultra 1.0 (Gemini) large language models on NONMEM coding tasks relevant to pharmacometrics and clinical pharmacology.

ChatGPT and Gemini were assessed on tasks mimicking real-world applications of NONMEM. The tasks ranged from providing a curriculum for learning NONMEM and an overview of NONMEM control stream structure to generating executable code. Lay-language prompts were used to elicit NONMEM code for a linear pharmacokinetic (PK) model with first-order oral absorption and for a more complex model with two parallel first-order absorption mechanisms. Reproducibility and the impact of the "temperature" hyperparameter setting were assessed. The generated code was reviewed by two NONMEM experts.

ChatGPT and Gemini provided NONMEM curriculum structures that combined foundational knowledge with advanced concepts (e.g., covariate modeling and Bayesian approaches) and practical skills, including NONMEM code structure and syntax. ChatGPT provided an informative summary of the NONMEM control stream structure and outlined the key NONMEM Translator (NM-TRAN) records needed. Both models generated code blocks for the NONMEM control stream from the lay-language prompts for the two coding tasks. However, the control streams contained focal structural and syntax errors that required revision before they could be executed without errors or warnings. The code output from ChatGPT and Gemini was not reproducible across repeated runs, and varying the temperature hyperparameter did not substantively reduce the errors and omissions.

Large language models may be useful in pharmacometrics for efficiently generating an initial coding template for modeling projects. However, the output can contain errors and omissions that require expert correction.
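For context, a minimal NONMEM control stream for the first coding task (a one-compartment linear PK model with first-order oral absorption) might look like the sketch below. This is an illustrative example only, not the code generated by either model nor the experts' corrected version; the dataset name, column order, and initial estimates are assumptions.

$PROBLEM ONE-COMPARTMENT LINEAR PK MODEL, FIRST-ORDER ORAL ABSORPTION
$INPUT ID TIME AMT DV MDV EVID       ; hypothetical column order
$DATA pkdata.csv IGNORE=@            ; hypothetical dataset name
$SUBROUTINES ADVAN2 TRANS2           ; library one-compartment model with oral depot
$PK
 KA = THETA(1)*EXP(ETA(1))           ; first-order absorption rate constant (1/h)
 CL = THETA(2)*EXP(ETA(2))           ; apparent clearance, CL/F (L/h)
 V  = THETA(3)*EXP(ETA(3))           ; apparent central volume, V/F (L)
 S2 = V                              ; scale central amounts to concentrations
$ERROR
 IPRED = F
 Y = IPRED*(1+EPS(1))                ; proportional residual error
$THETA (0, 1) (0, 5) (0, 50)         ; assumed initial estimates for KA, CL, V
$OMEGA 0.1 0.1 0.1
$SIGMA 0.04
$ESTIMATION METHOD=1 INTERACTION MAXEVAL=9999 PRINT=5
$COVARIANCE
$TABLE ID TIME IPRED CWRES NOPRINT ONEHEADER FILE=sdtab001

The second task, a model with two parallel first-order absorption mechanisms, can be coded in several ways; one common approach uses the ADVAN5 general linear model with two depot compartments and dose records duplicated into each depot. The sketch below is likewise a hedged illustration with an assumed parameterization, dataset, and initial estimates, not the output evaluated in the study.

$PROBLEM TWO PARALLEL FIRST-ORDER ABSORPTION PATHWAYS
$INPUT ID TIME AMT CMT DV MDV EVID   ; dose records duplicated with CMT=1 and CMT=2
$DATA pkdata_dual.csv IGNORE=@       ; hypothetical dataset name
$SUBROUTINES ADVAN5                  ; general linear model
$MODEL
 COMP=(DEPOT1 DEFDOSE)               ; first (fast) absorption depot
 COMP=(DEPOT2)                       ; second (lagged) absorption depot
 COMP=(CENTRAL DEFOBS)
$PK
 KA1 = THETA(1)*EXP(ETA(1))          ; absorption rate constant, pathway 1
 KA2 = THETA(2)*EXP(ETA(2))          ; absorption rate constant, pathway 2
 CL  = THETA(3)*EXP(ETA(3))          ; apparent clearance
 V   = THETA(4)*EXP(ETA(4))          ; apparent central volume
 FR  = THETA(5)                      ; fraction of dose absorbed via pathway 1
 F1  = FR                            ; bioavailability assigned to depot 1
 F2  = 1 - FR                        ; remainder assigned to depot 2
 ALAG2 = THETA(6)                    ; lag time for the second pathway
 K13 = KA1                           ; depot 1 -> central
 K23 = KA2                           ; depot 2 -> central
 K30 = CL/V                          ; elimination from the central compartment
 S3  = V
$ERROR
 IPRED = F
 Y = IPRED*(1+EPS(1))
$THETA (0, 2) (0, 0.3) (0, 5) (0, 50) (0, 0.5, 1) (0, 1)   ; assumed initial estimates
$OMEGA 0.1 0.1 0.1 0.1
$SIGMA 0.04
$ESTIMATION METHOD=1 INTERACTION MAXEVAL=9999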