Machine Learning and Pharmacometrics for Prediction of Pharmacokinetic Data: Differences, Similarities and Challenges Illustrated with Rifampicin

Bibliographic Details
Title:	Machine Learning and Pharmacometrics for Prediction of Pharmacokinetic Data: Differences, Similarities and Challenges Illustrated with Rifampicin
Authors:	Lina Keutzer, Huifang You, Ali Farnoud, Joakim Nyberg, Sebastian G. Wicha, Gareth Maher-Edwards, Georgios Vlasakakis, Gita Khalili Moghaddam, Elin M. Svensson, Michael P. Menden, Ulrika S. H. Simonsson, on behalf of the UNITE4TB Consortium
Source:	Pharmaceutics, Vol 14, Iss 8, p 1530 (2022)
Publisher Information:	MDPI AG, 2022.
Publication Year:	2022
Collection:	LCC:Pharmacy and materia medica
Subject Terms:	machine learning, pharmacometrics, population pharmacokinetics, rifampicin, pharmacokinetics, simulation, Pharmacy and materia medica, RS1-441
More Details:	Pharmacometrics (PM) and machine learning (ML) are both valuable for drug development to characterize pharmacokinetics (PK) and pharmacodynamics (PD). Pharmacokinetic/pharmacodynamic (PKPD) analysis using PM provides mechanistic insight into biological processes but is time- and labor-intensive. In contrast, ML models can be trained much faster but offer less mechanistic insight. Feeding ML predictions of drug PK into a PKPD model could substantially accelerate analysis. Using rifampicin, a widely used antibiotic, as an example, we explore the ability of different ML algorithms to predict drug PK. Based on simulated data, we trained linear regressions (LASSO), Gradient Boosting Machines, XGBoost and Random Forest to predict the plasma concentration-time series and rifampicin area under the concentration-versus-time curve from 0–24 h (AUC0–24h) after repeated dosing. XGBoost performed best for prediction of the entire PK series (R2: 0.84, root mean square error (RMSE): 6.9 mg/L, mean absolute error (MAE): 4.0 mg/L) for the scenario with the largest data size. For AUC0–24h prediction, LASSO showed the highest performance (R2: 0.97, RMSE: 29.1 h·mg/L, MAE: 18.8 h·mg/L). Increasing the number of plasma concentrations per patient (0, 2 or 6 concentrations per occasion) improved model performance. For example, for AUC0–24h prediction using LASSO, the R2 was 0.41, 0.69 and 0.97 when using predictors only (no plasma concentrations), 2 or 6 plasma concentrations per occasion as input, respectively. Run times for the ML models ranged from 1.0 s to 8 min, while the run time for the PM model was more than 3 h. Furthermore, building a PM model is more time- and labor-intensive than training an ML model. ML predictions of drug PK could thus be used as input into a PKPD model, enabling time-efficient analysis.
Document Type:	article
File Description:	electronic resource
Language:	English
ISSN:	1999-4923
Relation:	https://www.mdpi.com/1999-4923/14/8/1530; https://doaj.org/toc/1999-4923
DOI:	10.3390/pharmaceutics14081530
Access URL:	https://doaj.org/article/25a790d93aeb473e9a0152c76324f8f0
Accession Number:	edsdoj.25a790d93aeb473e9a0152c76324f8f0
Database:	Directory of Open Access Journals

<anid>AN0158912702;[b74k]01aug.22;2022Sep07.06:18;v2.2.500</anid> <title id="AN0158912702-1">Machine Learning and Pharmacometrics for Prediction of Pharmacokinetic Data: Differences, Similarities and Challenges Illustrated with Rifampicin</title> <p>Pharmacometrics (PM) and machine learning (ML) are both valuable for drug development to characterize pharmacokinetics (PK) and pharmacodynamics (PD). Pharmacokinetic/pharmacodynamic (PKPD) analysis using PM provides mechanistic insight into biological processes but is time- and labor-intensive. In contrast, ML models can be trained much faster but offer less mechanistic insight. Feeding ML predictions of drug PK into a PKPD model could substantially accelerate analysis. Using rifampicin, a widely used antibiotic, as an example, we explore the ability of different ML algorithms to predict drug PK. Based on simulated data, we trained linear regressions (LASSO), Gradient Boosting Machines, XGBoost and Random Forest to predict the plasma concentration-time series and rifampicin area under the concentration-versus-time curve from 0–24 h (AUC<subs>0–24h</subs>) after repeated dosing. XGBoost performed best for prediction of the entire PK series (R²: 0.84, root mean square error (RMSE): 6.9 mg/L, mean absolute error (MAE): 4.0 mg/L) for the scenario with the largest data size. 
For AUC<subs>0–24h</subs> prediction, LASSO showed the highest performance (R²: 0.97, RMSE: 29.1 h·mg/L, MAE: 18.8 h·mg/L). Increasing the number of plasma concentrations per patient (0, 2 or 6 concentrations per occasion) improved model performance. For example, for AUC<subs>0–24h</subs> prediction using LASSO, the R² was 0.41, 0.69 and 0.97 when using predictors only (no plasma concentrations), 2 or 6 plasma concentrations per occasion as input, respectively. Run times for the ML models ranged from 1.0 s to 8 min, while the run time for the PM model was more than 3 h. Furthermore, building a PM model is more time- and labor-intensive than training an ML model. ML predictions of drug PK could thus be used as input into a PKPD model, enabling time-efficient analysis.</p> <p>Keywords: machine learning; pharmacometrics; population pharmacokinetics; rifampicin; pharmacokinetics; simulation; feature selection</p> <hd id="AN0158912702-2">1. Introduction</hd> <p>Pharmacometrics (PM) and machine learning (ML) are promising approaches used in drug discovery and development, regulatory decision making and personalized medicine. In pharmacokinetic/pharmacodynamic (PKPD) analysis, pharmacokinetic (PK) data are commonly used as input to drive the exposure–response relationship; these data can be either continuous drug concentrations or derived PK parameters (e.g., area under the concentration-time curve, AUC) [[<reflink idref="bib1" id="ref1">1</reflink>]]. When data are sparse, classical methodologies such as noncompartmental analysis (NCA) are not appropriate. In this case, PKPD modelling using the population approach is commonly applied. This requires the availability or development of a population PK model, which can be time- and labor-intensive. Growing access to big data in recent years has increased interest in using ML for predictions. 
ML, with its high computational efficiency, has significant potential and is starting to be used in drug development [[<reflink idref="bib3" id="ref2">3</reflink>]], but is not yet applied as widely as in drug discovery [[<reflink idref="bib4" id="ref3">4</reflink>]].</p> <p>While population PK modelling is very frequently used for drug PK predictions, ML is less commonly utilized. There are, however, a few examples in the literature where ML has been applied to predict PK data. Poynton et al. [[<reflink idref="bib5" id="ref4">5</reflink>]], for example, have evaluated the performance of several ML algorithms for prediction of remifentanil PK and compared the performance to population PK [[<reflink idref="bib5" id="ref5">5</reflink>]]. The authors show that an ensemble model integrating both an artificial neural network (ANN) and a population PK model best described remifentanil plasma concentration over time [[<reflink idref="bib5" id="ref6">5</reflink>]]. In addition, Woillard et al. [[<reflink idref="bib6" id="ref7">6</reflink>]] have shown that the XGBoost algorithm is appropriate for predictions of tacrolimus and mycophenolate mofetil exposure. Woillard et al. [[<reflink idref="bib6" id="ref8">6</reflink>]] and Poynton et al. [[<reflink idref="bib5" id="ref9">5</reflink>]] were some of the first authors to apply ML to PK datasets. While ML has the advantage of being fast and efficient, as well as being able to handle large datasets, PM models are based on biological mechanisms, contributing to mechanistic understanding, biological interpretability of the results and the potential to simulate in silico experiments from the model. ML carries an inherent risk of returning outputs that are not clinically relevant. This risk is a further reason, besides time savings, for the two disciplines to work together. 
Due to the multiple challenges faced in PM and ML, there is a continuous interest in both communities to identify possible ways of combining expertise from both fields [[<reflink idref="bib4" id="ref10">4</reflink>], [<reflink idref="bib8" id="ref11">8</reflink>], [<reflink idref="bib10" id="ref12">10</reflink>]]. Attempts to combine PM and ML were made by others previously. Current work includes improved methods for covariate selection [[<reflink idref="bib8" id="ref13">8</reflink>]], as well as research focused on combining ML algorithms with the compartmental structure of PM models [[<reflink idref="bib11" id="ref14">11</reflink>]] aiming to improve efficiency in model development. One way of combining ML and PM could be to use PK predicted from an ML model as input for a pharmacometrics PKPD model, provided that the predictive performance is acceptable (illustrated in Figure 1). In order to evaluate whether ML could support PKPD modelling by fast and accurate prediction of PK data, the aim of this case study was to investigate the ability of different ML algorithms to accurately and precisely predict both plasma concentration over time and derived PK parameters (such as AUC), using rifampicin as an example. Rifampicin is an antibiotic commonly used to treat drug-susceptible tuberculosis as part of an approved four-drug regimen [[<reflink idref="bib13" id="ref15">13</reflink>]]. It is a drug known for its complex PK, including autoinduction of elimination, concentration-dependent nonlinear clearance and dose-dependent bioavailability [[<reflink idref="bib14" id="ref16">14</reflink>], [<reflink idref="bib16" id="ref17">16</reflink>]], as well as a high inter-occasion variability (IOV) [[<reflink idref="bib17" id="ref18">17</reflink>]]. Nonlinear PK sometimes leads to a complex modelling process, increasing labor efforts and time needed. 
This is the key reason why we believe that using an ML algorithm can actually assist PM in being more efficient.</p> <p>When merging both methods, it is critical to first of all understand similarities and differences between the two fields, to uncover gaps in either methodology and to establish a common terminology, enabling both communities to communicate with each other.</p> <hd id="AN0158912702-3">1.1. Pharmacometrics and Machine Learning</hd> <p></p> <hd id="AN0158912702-4">1.1.1. Pharmacometrics</hd> <p>PM is the science of mathematical and statistical modelling with the aim of quantifying a drug's PK, pharmacodynamics (PD), PKPD or disease progression. PM models can be utilized within the entire spectrum of drug development, from discovery all the way to life-cycle management and label updates [[<reflink idref="bib18" id="ref19">18</reflink>], [<reflink idref="bib20" id="ref20">20</reflink>], [<reflink idref="bib22" id="ref21">22</reflink>], [<reflink idref="bib24" id="ref22">24</reflink>]]. According to the FDA, model-informed drug development (MIDD) is an integral component in the development of a drug [[<reflink idref="bib26" id="ref23">26</reflink>]]. A common method used in PM is nonlinear mixed effects (NLME) modelling, first defined by Lindstrom and Bates [[<reflink idref="bib27" id="ref24">27</reflink>]]. It describes simultaneous model fitting to PK or PD data from all individuals within a population. NLME models consist of fixed-effects parameters describing the population as a whole and random-effects parameters describing the variability within the population and in the individual [[<reflink idref="bib27" id="ref25">27</reflink>]]. A population can, e.g., be a group of patients, healthy volunteers, but even a set of in vitro or in vivo data. 
The aim of a PM analysis is typically to quantify the variability in the PK or PD response both between patients and within a patient, as well as to identify predictors (= covariates) that explain the source of this variability (see Figure S1), which can be endogenous (data-driven) or exogenous (e.g., between study sites). NLME models describe (1) the general tendency within the population at hand, (2) the variability between different patients (inter-individual variability (IIV)), (3) the variability within the same patient on different occasions (inter-occasion variability (IOV)), (4) the residual unexplained variability (RUV) and (5) predictors descriptive of the variability between patients, termed covariates. NLME models are defined by ordinary differential equations (ODEs) and stochastic differential equations or analytical solutions [[<reflink idref="bib29" id="ref31">29</reflink>]]. They are usually expressed as compartmental models describing the absorption, distribution, metabolism and elimination of a drug, as illustrated in Figure 2. Since all data are fit simultaneously, model building with sparse or imbalanced data is possible [[<reflink idref="bib1" id="ref32">1</reflink>], [<reflink idref="bib29" id="ref33">29</reflink>]]. Model building is performed in a step-wise manner and PM models contain biological or pharmacological mechanisms in their structure, thus allowing the derivation of parameters which are biologically sound and interpretable. 
The final model parameter estimates can be used to perform simulations of "what if" scenarios, answering research questions in order to inform future in vitro/in vivo experiments or clinical trials regarding their chances of success [[<reflink idref="bib31" id="ref34">31</reflink>], [<reflink idref="bib33" id="ref35">33</reflink>], [<reflink idref="bib35" id="ref36">35</reflink>]], which can greatly reduce the cost of an experiment or trial [[<reflink idref="bib36" id="ref37">36</reflink>]].</p> <p>During and after model development, PM modellers utilize numerous diagnostics, both quantitative and graphical, in order to select the final model that best describes the observed data. For quantitative model comparison, modellers in the PM community commonly turn to the objective function value (OFV). Model parameters are obtained by maximum likelihood estimation, often for models expressed as differential equation systems: the likelihood, i.e., the probability of the observed data given the model parameters, is maximized by minimizing the OFV, which is proportional to −2 times the log-likelihood. For comparison of nested models, the likelihood ratio test is used [[<reflink idref="bib37" id="ref38">37</reflink>]].</p> <p>Graphical model evaluation is performed most commonly using visual predictive checks (VPCs) [[<reflink idref="bib39" id="ref39">39</reflink>], [<reflink idref="bib41" id="ref40">41</reflink>]], basic goodness-of-fit (GOF) plots and individual plots [[<reflink idref="bib42" id="ref41">42</reflink>]]. The VPC is a tool to investigate the predictive performance of the model and allows for comparison between alternative models, evaluation of model fit and visualization of how the model could be improved [[<reflink idref="bib39" id="ref42">39</reflink>]]. The observed data are compared to data simulated from the model parameters. 
Commonly, the 90th percentile (sometimes lower) of the observed data is compared to the 95% (or lower) confidence interval (CI) of the simulated 90th percentile data. GOF plots include, for example, evaluation of population predictions or individual predictions versus observations (see Table 1 for details regarding terminology). In individual plots, individual observed data are compared to individual predictions. Lastly, precision in the parameter estimates, clinical relevance and scientific plausibility are considered during model evaluation.</p> <hd id="AN0158912702-5">1.1.2. Machine Learning</hd> <p>Machine learning (ML) has been defined as "a field of statistical research for training computational algorithms that split, sort and transform a set of data to maximize the ability to classify, predict, cluster or discover patterns in a target dataset" [[<reflink idref="bib44" id="ref43">44</reflink>]]. ML is commonly divided into supervised, unsupervised and semi-supervised methods [[<reflink idref="bib45" id="ref44">45</reflink>]]. Supervised ML aims to predict human-assigned labels or experimentally determined outputs based on independent variables, i.e., these models are trained to predict established and expected outcomes based on a loss function. Supervised ML algorithms solve either regression or classification problems; examples include logistic regression, neural networks, support vector machines and decision trees [[<reflink idref="bib46" id="ref45">46</reflink>]]. In contrast, unsupervised machine learning does not require prior knowledge regarding the outcomes, and aims to reveal unexpected patterns in the data, e.g., by clustering [[<reflink idref="bib47" id="ref46">47</reflink>]]. Semi-supervised learning combines labelled and unlabelled data and is applied in situations where only part of the data has been labelled [[<reflink idref="bib48" id="ref47">48</reflink>]]. 
ML has frequently been used in drug discovery and is now increasingly being applied to drug development. Applications in drug development include biomarker identification, prediction of clinical outcomes and planning of clinical trials [[<reflink idref="bib9" id="ref48">9</reflink>], [<reflink idref="bib45" id="ref49">45</reflink>], [<reflink idref="bib49" id="ref50">49</reflink>], [<reflink idref="bib51" id="ref51">51</reflink>]].</p> <p>Here, we focus on supervised ML, in which the dataset is separated into training, validation and test datasets. Common methodologies to split the dataset are either n-fold cross-validation or bootstrapping (for terminology, see Table 1). The training dataset is used for learning patterns in the data, while the validation dataset is used to determine the optimal parameterization, for instance, the number of trees in a tree-based ensemble or the number of epochs in a neural network. The loss function is calculated and optimized on the validation dataset to avoid under- or overfitting on the training dataset. 
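</p> <p>As an illustration of the n-fold cross-validation mentioned above, the following minimal Python sketch generates training and validation index sets. It is not the implementation used in this work: the function name k_fold_indices, the seed argument and the choice to treat each index as one patient (so that all concentrations from one individual stay in the same fold) are assumptions made for this example.</p>

```python
import random


def k_fold_indices(n_samples, n_folds, seed=0):
    """Yield (train_idx, validation_idx) pairs for n-fold cross-validation.

    Hypothetical helper: each index is assumed to identify one patient.
    """
    if not 2 <= n_folds <= n_samples:
        raise ValueError("n_folds must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Shuffle reproducibly so that folds do not follow the data order.
    random.Random(seed).shuffle(indices)
    # Deal indices round-robin; fold sizes then differ by at most one.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for held_out in range(n_folds):
        validation_idx = sorted(folds[held_out])
        train_idx = sorted(
            idx
            for fold_number, fold in enumerate(folds)
            if fold_number != held_out
            for idx in fold
        )
        yield train_idx, validation_idx


# Example: 83 simulated individuals split into 5 folds.
for train_idx, validation_idx in k_fold_indices(83, 5):
    print(len(train_idx), len(validation_idx))
```

<p>Each index appears in exactly one validation fold, so performance can be averaged across folds; a separate test dataset would still be set aside before any such splitting.</p> <p>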
Different methods for computation of the loss function exist, such as <emph>L</emph>1 (Least Absolute Deviations) (Equation (1)) or <emph>L</emph>2 (Least Square Errors) (Equation (2)).</p> <p>(1) $$L1 = \sum_{i=1}^{n} \left| Y_{\mathrm{observed}} - Y_{\mathrm{predicted}} \right|$$</p> <p>(2) $$L2 = \sum_{i=1}^{n} \left( Y_{\mathrm{observed}} - Y_{\mathrm{predicted}} \right)^{2}$$</p> <p>In Equations (1) and (2), <emph>L</emph>1 is the norm associated with Lasso regression, <emph>L</emph>2 the norm associated with Ridge regression, <emph>Y<subs>observed</subs></emph> the observed data and <emph>Y<subs>predicted</subs></emph> the ML prediction. To estimate how well the model generalizes to novel data points [[<reflink idref="bib45" id="ref58">45</reflink>]], the test dataset is exclusively used for the final model evaluation, never for any training or parameterization [[<reflink idref="bib52" id="ref59">52</reflink>], [<reflink idref="bib54" id="ref60">54</reflink>]]. Leakage of test data into training usually leads to severe overestimation of the model's predictive performance, i.e., overfitting [[<reflink idref="bib45" id="ref61">45</reflink>]].</p> <hd id="AN0158912702-6">1.1.3. Terminology</hd> <p>Different terminologies are used in PM and ML, often describing either the same or a similar part of the analysis. While in PM one "builds" or "fits" a model, in ML a model is "trained". 
Both terms essentially describe the process of developing a model, its structure and parameters of interest. When it comes to predictors used in the model, there is a difference between ML and PM. In ML, the word "features" is used to describe all input variables used to train the model in order to predict the desired outcome [[<reflink idref="bib54" id="ref62">54</reflink>]]. This could in drug development, for example, include time, dose and patient characteristics such as bodyweight, age or creatinine clearance. In PM, on the other hand, predictors are usually termed "covariates", which are not directly comparable to features. Covariates describe predictors that aim to explain the sources of the PK and PD variability between patients (IIV). Since certain variables such as time and dose are normally already part of the structural model, they are not considered covariates [[<reflink idref="bib55" id="ref63">55</reflink>]]. Only the predictors explaining variability between patients in addition to the variables already included in the structural model are considered covariates, which would in this example be bodyweight, age or creatinine clearance. In this work, we use the word "predictor" as an umbrella term for both feature and covariate. In PM, only scientifically plausible and clinically relevant predictors should be included in the model, which are identified in a preselection step conducted by an expert. This can also be beneficial for ML models; however, if the dataset is large enough, preselection of predictors is not necessary. ML can then be used to identify unexpected associations in a data-driven manner.</p> <p>The parameters of a PM model are often not comparable to those in ML. In PM, parameters are descriptive of PK and PD processes, such as drug clearance, distribution volume or drug absorption, as well as drug effect (e.g., the maximal drug effect (E<subs>max</subs>) or the drug's potency, such as EC<subs>50</subs>). 
ML parameters, however, are more mathematical and from a biological point of view less interpretable. Parameters are configuration variables of the model and are estimated during the model training process enabling predictions from the final model. They are determined automatically and include, for example, weights, coefficients and support vectors. Hyperparameters, on the other hand, are variables defined by the modeller as they cannot be estimated from the data, but are tuned during the learning process [[<reflink idref="bib56" id="ref64">56</reflink>]], such as the regularization parameter λ and the number of trees <emph>k</emph> in eXtreme Gradient Boosting (XGBoost) [[<reflink idref="bib57" id="ref65">57</reflink>]]. Determination of hyperparameters is called "hyperparameter tuning" and is often achieved by testing different hyperparameters and then choosing the ones providing the best model fit [[<reflink idref="bib58" id="ref66">58</reflink>]]; however, the process can also be automated [[<reflink idref="bib59" id="ref67">59</reflink>]].</p> <p>As mentioned above, in supervised ML the data are split into a "training dataset" used for learning general rules, a "validation dataset" for internal validation and a "test dataset" for unbiased evaluation using cross-validation [[<reflink idref="bib52" id="ref68">52</reflink>], [<reflink idref="bib54" id="ref69">54</reflink>]]. 
Separating the data is not common practice in pharmacometrics model building, where, instead, all available data are used for model building and only external validation is performed with new data.</p> <p>The overall workflows for both PM and ML are illustrated in Figure 3.</p> <p>Table 1 provides a comprehensive overview of differences and similarities in terminology used by the PM and ML community.</p> <p>Table 1 Overview of terminology commonly used by the pharmacometrics and/or machine learning community.</p> <p> <ephtml> &lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th colspan="2"&gt;Term&lt;/th&gt;&lt;th&gt;Description&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;PM&lt;/th&gt;&lt;th&gt;ML&lt;/th&gt;&lt;th /&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Covariates&lt;/td&gt;&lt;td&gt;Features&lt;/td&gt;&lt;td&gt;Both terms describe predictors. Features are all input variables used to train a model. Covariates are predictors explaining variability between patients in addition to the variables already included in the structural pharmacometrics model.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Objective function value (OFV)&lt;/td&gt;&lt;td&gt;Loss&lt;/td&gt;&lt;td&gt;The OFV is one of the main metrics for model evaluation in pharmacometrics model building. It is proportional to &amp;#8722;2log likelihood that the model parameter values occur from the data [&lt;xref ref-type="bibr" rid="bibr37"&gt;37-38&lt;/xref&gt;].&lt;break /&gt;In ML, the loss is used as a goodness of fit. 
It represents the distance between predictions and observations which can be computed in different ways, such as &lt;italic&gt;L&lt;/italic&gt;1, &lt;italic&gt;L&lt;/italic&gt;2 or MAPE.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Build/Fit a model&lt;/td&gt;&lt;td&gt;Train a model&lt;/td&gt;&lt;td&gt;Both terms define the process of developing a model by determining model parameters that describe the input data in order to reach a predefined objective.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Validation dataset&lt;/td&gt;&lt;td&gt;Validation dataset&lt;/td&gt;&lt;td&gt;In PM, the term validation dataset is often used for external validation. In ML, the term is commonly used for the data that are held back for internal validation to evaluate model performance during training.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Overparameterization&lt;/td&gt;&lt;td&gt;Overfitting&lt;/td&gt;&lt;td&gt;In PM, a model can be overparameterized, meaning too many parameters are estimated in relation to the amount of information, leading to minimization issues. Overfitting in ML describes a phenomenon where the model has been trained to fit the training data too well. The model is forced to predict in a very narrow direction, which may result in poor predictive ability.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Model parameters&lt;/td&gt;&lt;td&gt;Model parameters&lt;/td&gt;&lt;td&gt;Even though both communities use the same term, model parameters in PM are different from parameters in ML. Model parameters in PM describe biological or pharmacological processes, such as drug clearance, drug distribution volume or rate of absorption. These parameters are directly interpretable. In ML, on the other hand, model parameters are mathematical parameters learnt during the model training process and are part of the final model describing the data. 
They do not provide biological interpretation in the first instance at least.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Model averaging&lt;/td&gt;&lt;td&gt;Ensemble model&lt;/td&gt;&lt;td&gt;An ensemble model combines multiple ML algorithms, which in most cases leads to better predictive performance compared to single algorithms [&lt;xref ref-type="bibr" rid="bibr60"&gt;60&lt;/xref&gt;]. There is a similar method used in PM called model averaging [&lt;xref ref-type="bibr" rid="bibr61"&gt;61&lt;/xref&gt;], where several models are combined using weights determined by their individual fit to the data. &lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Shrinkage&lt;/td&gt;&lt;td&gt;Shrinkage&lt;/td&gt;&lt;td&gt;The term "shrinkage" has a different meaning in the PM and ML communities. In PM, shrinkage describes overparameterization, where 0 indicates very informative data and no overfit, and 1 uninformative data and overfitting. In ML, shrinkage methods in different ML models reduce the possibility of overfitting or underfitting by providing a trade-off between bias and variance. &lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Bootstrapping&lt;/td&gt;&lt;td&gt;Bootstrapping&lt;/td&gt;&lt;td&gt;Describes a random resampling method with replacement. In PM, it is used during model development and evaluation for estimation of the model performance. In ML, bootstrapping is part of some algorithms, such as XGBoost or Random Forest, and is also used to estimate the model's predictive performance. &lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Cross-validation&lt;/td&gt;&lt;td&gt;Cross-validation&lt;/td&gt;&lt;td&gt;In PM, cross-validation is used occasionally, for example, in covariate selection procedures in order to assess the true alpha error. In ML, cross-validation is commonly applied to prevent overfitting and to obtain robust predictions. Cross-validation describes the process of splitting the data into a training dataset and a test dataset. 
The training dataset is used for model development and the test dataset for external model evaluation. In n-fold cross-validation, the data are split into n non-overlapping subsets, where n &amp;#8722; 1 subsets are used for training and the left-out subset for evaluation. This procedure is repeated until all subsets have been used for model evaluation. Model performance is then computed across all test sets [&lt;xref ref-type="bibr" rid="bibr45"&gt;45&lt;/xref&gt;]. &lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;-&lt;/td&gt;&lt;td&gt;Holdout/test dataset&lt;/td&gt;&lt;td&gt;Describes the test/unseen dataset used for external validation. It is of great importance that the holdout/test data is not used for model training or hyperparameter tuning in order not to overestimate the model's predictive performance [&lt;xref ref-type="bibr" rid="bibr45"&gt;45&lt;/xref&gt;]. &lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;-&lt;/td&gt;&lt;td&gt;Oversampling/Upsampling&lt;/td&gt;&lt;td&gt;Oversampling is an approach used to deal with highly imbalanced data. Data in areas with sparse data are resampled or synthesized using different methods, for example, Synthetic Minority Oversampling Technique (SMOTE) [&lt;xref ref-type="bibr" rid="bibr62"&gt;62&lt;/xref&gt;].&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Empirical Bayes Estimates (EBEs)&lt;/td&gt;&lt;td&gt;Bayesian optimization&lt;/td&gt;&lt;td&gt;EBEs in PM are the model parameter estimates for an individual, estimated based on the final model parameters as well as observed data using Bayesian estimation [&lt;xref ref-type="bibr" rid="bibr63"&gt;63&lt;/xref&gt;]. In artificial intelligence (AI), Bayesian optimization is used to tune artificial neural networks (ANNs), particularly in deep learning.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Typical value&lt;/td&gt;&lt;td&gt;Typical value&lt;/td&gt;&lt;td&gt;The typical value in PM is the most likely parameter estimate for the whole population given a set of covariates. 
It could, e.g., be the drug clearance estimate that best summarizes the clearance of the whole population. In ML, the typical value in unsupervised learning, for example, is the center of a cluster (e.g., k-means).&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Inter-individual variability (IIV)&lt;/td&gt;&lt;td&gt;-&lt;/td&gt;&lt;td&gt;Variability between individuals in a population. Describes the difference between typical and individual PK parameters. Often assumed to be log-normally distributed. &lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Inter-occasion variability (IOV)&lt;/td&gt;&lt;td&gt;-&lt;/td&gt;&lt;td&gt;Variability within an individual on different occasions (e.g., sampling or dosing occasions). Often assumed to be log-normally distributed.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Residual error variability (RUV)&lt;/td&gt;&lt;td&gt;-&lt;/td&gt;&lt;td&gt;Remaining random unexplained variability. Describes the difference between individual prediction and observed value. &lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Population prediction&lt;/td&gt;&lt;td&gt;-&lt;/td&gt;&lt;td&gt;The population prediction is the most likely representation of the population given a set of covariates.&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Individual prediction&lt;/td&gt;&lt;td&gt;-&lt;/td&gt;&lt;td&gt;Predictions for an individual using the population estimates in combination with the observed data for this individual, computed in a Bayesian posthoc step.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt; </ephtml> </p> <p>1 <emph>L</emph>1, least absolute deviations; <emph>L</emph>2, least squares errors; MAPE, mean absolute prediction error; ML, machine learning; PM, pharmacometrics.</p> <hd id="AN0158912702-7">2. Materials and Methods</hd> <p></p> <hd id="AN0158912702-8">2.1. 
Data</hd> <p>Rifampicin PK data were simulated from a previously published population PK model describing rifampicin plasma concentrations over time in tuberculosis patients [[<reflink idref="bib15" id="ref70">15</reflink>]], which has been shown to perform best compared to other population PK models for rifampicin and has undergone external validation [[<reflink idref="bib64" id="ref71">64</reflink>]]. The population PK model consists of a 1-compartment distribution model with nonlinear elimination (described as a Michaelis–Menten equation), autoinduction of elimination and dose-dependent bioavailability [[<reflink idref="bib15" id="ref72">15</reflink>]]. The simulated scenario mirrored the original HIGHRIF1 phase 2 clinical trial [[<reflink idref="bib65" id="ref73">65</reflink>]] in order to create clinically relevant data. PK in 83 individuals following 10, 20, 25, 30, 35 or 40 mg/kg oral rifampicin once a day for 2 weeks was simulated. The dose group was randomly assigned to a simulated individual. The simulated individual's bodyweight defined the actual dose. The dose was then rounded to the next 150 mg increment, owing to the available tablet strengths. In total, 14 doses, which are 450, 600, 900, 1050, 1200, 1350, 1500, 1650, 1800, 1950, 2250, 2400, 2550 and 2700 mg, were randomly assigned to 8, 9, 1, 7, 8, 3, 17, 2, 13, 5, 3, 4, 1 and 2 individuals, respectively. IIV, IOV and RUV were included in the simulations. PK sampling time-points were at pre-dose and 0.5, 1, 1.5, 2, 3, 4, 6, 8, 12 and 24 h after dose at two different sampling occasions (day 7 and 14 of treatment). Patient covariates were randomly sampled from the parametric covariate distribution in the HIGHRIF1 trial [[<reflink idref="bib65" id="ref74">65</reflink>]], taking into account the correlation between bodyweight (WT), fat-free mass (FFM) and gender, as described previously [[<reflink idref="bib17" id="ref75">17</reflink>]]. 
The covariates were sampled from either a truncated normal distribution (WT, FFM, age, body height (HT)) or a binomial distribution (gender, HIV-coinfection, race). Body mass index (BMI) was calculated from WT and HT.</p> <p>Predictors included in the simulations to create the dataset were time after dose (TAD), treatment week (OCC) and dose, which are part of the structural PM model, as well as the covariate FFM. In total, 1826 (83 patients × 2 occasions × 11 samples per occasion) rifampicin plasma concentrations were simulated. The simulated rifampicin concentrations and the covariates included in the simulation are considered to be the true observed concentrations and predictors, respectively. The final simulated dataset is provided in Supplementary Materials Data S1.</p> <hd id="AN0158912702-9">2.2. ML Model Training</hd> <p>In this work, we evaluate ML model performance for prediction of PK data using rifampicin as an example drug. Since either a rifampicin plasma concentration-time series or exposure indices can be used as input for a PKPD model, it is of interest to investigate the predictive ability of ML for either outcome. As an exposure index, the area under the plasma concentration-time curve from 0 to 24 h (AUC<subs>0–24h</subs>) was chosen here, since the AUC<subs>0–24h</subs>/MIC has been shown to be the best predictor of rifampicin efficacy [[<reflink idref="bib66" id="ref76">66</reflink>]]. The individual AUC<subs>0–24h</subs> values were calculated using noncompartmental analysis (NCA) based on rich simulated profiles (20 observations per sampling occasion). For AUC<subs>0–24h</subs> calculation, the trapezoidal rule implemented in the pmxTools R package [[<reflink idref="bib67" id="ref77">67</reflink>]] was utilized. The derived AUC<subs>0–24h</subs> values were considered the true values.</p> <p>For ML model training, features included in the training dataset were TAD, dose, OCC, BMI, age, gender, race, WT, HT, HIV co-infection and FFM. 
The target was either the rifampicin plasma concentration or the rifampicin AUC<subs>0–24h</subs>. From the whole dataset, 5 datasets each containing 80% of the data for training and 20% for testing were created using patient identifier (ID) as a grouping variable, enabling 5-fold cross-validation. When splitting the data, it was ensured that each of the 5 test datasets contained different IDs, making sure that every ID was left out once, thus avoiding overlap and bias. The test set was solely used to evaluate final performance, never for training or for fitting the parameters of any model. The training dataset was again split into 80% training and 20% validation for 5-fold internal cross-validation. Utilizing the training datasets as input, different ML algorithms were trained for prediction of rifampicin PK. The predictive performance for each of the 5 test datasets was averaged in order to compute the overall predictive performance.</p> <p>The current study deals with a regression problem; therefore, different supervised ML algorithms included in the SuperLearner R package [[<reflink idref="bib68" id="ref78">68</reflink>]] were tested. The three top-performing algorithms were chosen for further evaluation, which were GBM, XGBoost and Random Forest. In addition to these three nonlinear models, a linear model (LASSO) was evaluated for comparison.</p> <p>The algorithms were optimized by testing different model parameters (hyperparameter tuning). Hyperparameter tuning was performed in a grid search for LASSO, GBM and Random Forest, where all possible parameter combinations were explored. For XGBoost, a sequential, heuristic search was performed due to very long run times for the full grid search. 
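</p> <p>The ID-grouped splitting and the exhaustive grid described above can be sketched as follows. This is a minimal illustration in Python rather than the R code used in this study; the function name grouped_kfold, the random seeds, the toy table layout and the grid values are all assumptions made for illustration and are not taken from the study's code or from Table S1.</p>

```python
import random
from itertools import product


def grouped_kfold(patient_ids, k=5, seed=0):
    """Split unique patient IDs into k disjoint test folds.

    Returns a list of (train_ids, test_ids) pairs of sets. Every ID is
    held out exactly once, so no patient appears on both sides of a split.
    """
    unique_ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(unique_ids)
    folds = [set(unique_ids[i::k]) for i in range(k)]
    all_ids = set(unique_ids)
    return [(all_ids - test_ids, test_ids) for test_ids in folds]


# Toy long-format table (assumed layout): 83 patients x 2 occasions x 11 samples.
rows = [(pid, occ, sample)
        for pid in range(1, 84) for occ in (1, 2) for sample in range(11)]

# Outer 5-fold split by patient ID (roughly 80% training, 20% testing).
outer_folds = grouped_kfold([row[0] for row in rows], k=5)

# Inner 5-fold split of the first training set, used only for tuning.
inner_folds = grouped_kfold(outer_folds[0][0], k=5, seed=1)

# Exhaustive grid: every combination of the candidate values is enumerated.
# The parameter names and values are hypothetical, not those of Table S1.
grid = {"max_depth": [2, 4, 6], "learning_rate": [0.05, 0.1, 0.3]}
candidates = [dict(zip(grid, values)) for values in product(*grid.values())]
```

<p>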
The ranges of investigated hyperparameters of the final models are presented in Table S1.</p> <p>For comparison of run times, as well as creation of a VPC, the population PK model [[<reflink idref="bib15" id="ref79">15</reflink>]] was re-estimated in NONMEM [[<reflink idref="bib37" id="ref80">37</reflink>]] using the simulated dataset.</p> <hd id="AN0158912702-10">2.3. Feature Ranking</hd> <p>GBM, XGBoost, Random Forest and LASSO were compared based on their capability to identify features. Since the dataset was created by simulating from the population PK model, including TAD, OCC, dose and FFM, which were previously shown to influence rifampicin PK [[<reflink idref="bib15" id="ref81">15</reflink>]], the data include a correlation between rifampicin plasma concentrations and these predictors. Other predictors included in the dataset, which were not used in the simulation of the plasma concentrations, can therefore be considered noise. In contrast to a real-world dataset, where the true predictors are unknown, in this simulation-based study the true predictors are known. In this work, we explored whether the four different algorithms were able to identify these true predictors, i.e., TAD, OCC, dose and FFM. Evaluated predictors were TAD, dose, OCC, BMI, age, gender, race, WT, HT, HIV co-infection and FFM. The machine learning model evaluates all features and assigns weights to them in the training process, which will determine the prediction of the outcome variable.</p> <p>The importance score of GBM is based on the amount by which each split on a feature improves the performance measure within a single decision tree. The importance scores are then averaged across all decision trees. Therefore, the greater the improvement attributable to the splits on a feature (typically those closer to the root node), the greater its weight. This means that the more often a feature is selected for splits in the boosted trees, the higher its importance. 
XGBoost, as an implementation of GBM, ranks features in the same way, but uses a more regularized model formalization to control overfitting, which can affect the importance scores. Random Forest calculates importance scores by evaluating the decrease in node impurity weighted by the probability of reaching that node. LASSO adds the <emph>L</emph>1 norm of the coefficients as a penalty term to the loss function. Because this penalty can shrink coefficients to exactly zero, the less important features are effectively discarded.</p> <hd id="AN0158912702-11">2.4. PK Predictions</hd> <p>The amount of available clinical PK data differs widely across the stages of drug development; an additional objective was therefore to explore ML model performance across varying data sizes. To assess the amount of information required by the ML algorithms to predict well, scenarios including varying numbers of observed rifampicin concentrations as input variables in addition to the weighted features included in the model (TAD, dose, OCC, BMI, age, gender, race, WT, HT, HIV co-infection and FFM) were investigated (see Table 2). Based on this input, a full plasma concentration-time series (at pre-dose and 0.5, 1, 1.5, 2, 3, 4, 6, 8, 12 and 24 h post-dose) and the AUC<subs>0–24h</subs> at treatment days 7 and 14 were predicted (see Table 2). As an example, in scenario 2 (Table 2), the abovementioned features, as well as 2 observed rifampicin plasma concentrations at 2 and 4 h post-dose are used as input variables to the model to predict the rifampicin plasma concentration at pre-dose and 0.5, 1, 1.5, 2, 3, 4, 6, 8, 12 and 24 h post-dose, i.e., a full pharmacokinetic profile.</p> <hd id="AN0158912702-12">2.5. 
Model Evaluation</hd> <p>Model performance was evaluated using the <emph>R</emph><sups>2</sups> (Equation (<reflink idref="bib3" id="ref82">3</reflink>)) between observations and predictions, the root mean square error (<emph>RMSE</emph>) describing precision (Equation (<reflink idref="bib4" id="ref83">4</reflink>)) and the mean absolute error (<emph>MAE</emph>) describing bias (Equation (<reflink idref="bib5" id="ref84">5</reflink>)).</p> <p>(<reflink idref="bib3" id="ref85">3</reflink>) <ephtml> &lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;msup&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;&amp;#8722;&lt;/mo&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi&gt;S&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;S&lt;/mi&gt;&lt;mrow&gt;&lt;mi&gt;r&lt;/mi&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mi&gt;s&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi&gt;S&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;S&lt;/mi&gt;&lt;mrow&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;mi&gt;o&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;/mrow&gt;&lt;/semantics&gt;&lt;/math&gt; </ephtml></p> <p>In Equation (<reflink idref="bib3" id="ref86">3</reflink>), <emph>SS<subs>res</subs></emph> is the sum of squared residuals, reflecting the fit between observed and predicted values, and <emph>SS<subs>tot</subs></emph> is the total sum of squares, reflecting the total variance. 
<emph>SS<subs>res</subs></emph> is defined as</p> <p> <ephtml> &lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mi&gt;S&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;S&lt;/mi&gt;&lt;mrow&gt;&lt;mi&gt;r&lt;/mi&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mi&gt;s&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;&amp;#8721;&lt;/mo&gt;&lt;msup&gt;&lt;mrow&gt;&lt;mrow&gt;&lt;msub&gt;&lt;mrow&gt;&lt;mi mathvariant="italic"&gt;Observed&lt;/mi&gt;&lt;/mrow&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;&amp;#8722;&lt;/mo&gt;&lt;mi&gt;P&lt;/mi&gt;&lt;mi&gt;r&lt;/mi&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mi&gt;c&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;/mrow&gt;&lt;/mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mrow&gt;&lt;/semantics&gt;&lt;/math&gt; </ephtml> </p> <p>where <emph>Observed<subs>i</subs></emph> is the individual plasma concentration or AUC<subs>0–24h</subs> value simulated from the population PK model, and <emph>Predicted<subs>i</subs></emph> is the individual ML model-based prediction.</p> <p> <emph>SS<subs>tot</subs></emph> is defined as</p> <p> <ephtml> &lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mi&gt;S&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;S&lt;/mi&gt;&lt;mrow&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;mi&gt;o&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;&amp;#8721;&lt;/mo&gt;&lt;msup&gt;&lt;mrow&gt;&lt;mrow&gt;&lt;msub&gt;&lt;mrow&gt;&lt;mi mathvariant="italic"&gt;Observed&lt;/mi&gt;&lt;/mrow&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;&amp;#8722;&lt;/mo&gt;&lt;mi&gt;mean&lt;/mi&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mrow&gt;&lt;mi&gt;O&lt;/mi&gt;&lt;mi&gt;b&lt;/mi&gt;&lt;mi&gt;s&lt;/mi&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mi&gt;r&lt;/mi&gt;&lt;mi&gt;v&lt;/mi&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;/mrow&gt;&lt;mo 
stretchy="false"&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;/mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mrow&gt;&lt;/semantics&gt;&lt;/math&gt; </ephtml> </p> <p>where <emph>Observed<subs>i</subs></emph> is the individual plasma concentration or AUC<subs>0–24h</subs> value simulated from the population PK model.</p> <p>(<reflink idref="bib4" id="ref87">4</reflink>) <ephtml> &lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;mi&gt;S&lt;/mi&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msqrt&gt;&lt;mrow&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/mfrac&gt;&lt;msubsup&gt;&lt;mstyle mathsize="70%" displaystyle="true"&gt;&lt;mo&gt;&amp;#8721;&lt;/mo&gt;&lt;/mstyle&gt;&lt;mrow&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/mrow&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/msubsup&gt;&lt;msup&gt;&lt;mrow&gt;&lt;mrow&gt;&lt;msub&gt;&lt;mrow&gt;&lt;mi mathvariant="italic"&gt;Observed&lt;/mi&gt;&lt;/mrow&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;&amp;#8722;&lt;/mo&gt;&lt;mi&gt;P&lt;/mi&gt;&lt;mi&gt;r&lt;/mi&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mi&gt;c&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;/mrow&gt;&lt;/mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mrow&gt;&lt;/msqrt&gt;&lt;/mrow&gt;&lt;/semantics&gt;&lt;/math&gt; </ephtml></p> <p>(<reflink idref="bib5" id="ref88">5</reflink>) <ephtml> &lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;mi&gt;A&lt;/mi&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/mfrac&gt;&lt;mo /&gt;&lt;msubsup&gt;&lt;mstyle mathsize="70%" 
displaystyle="true"&gt;&lt;mo&gt;&amp;#8721;&lt;/mo&gt;&lt;/mstyle&gt;&lt;mrow&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/mrow&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/msubsup&gt;&lt;mo&gt;|&lt;/mo&gt;&lt;msub&gt;&lt;mrow&gt;&lt;mi mathvariant="italic"&gt;Observed&lt;/mi&gt;&lt;/mrow&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;&amp;#8722;&lt;/mo&gt;&lt;mi&gt;P&lt;/mi&gt;&lt;mi&gt;r&lt;/mi&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mi&gt;c&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;|&lt;/mo&gt;&lt;/mrow&gt;&lt;/semantics&gt;&lt;/math&gt; </ephtml></p> <p>In Equations (<reflink idref="bib4" id="ref89">4</reflink>) and (<reflink idref="bib5" id="ref90">5</reflink>), RMSE stands for root mean squared error describing precision, <emph>MAE</emph> stands for mean absolute error describing bias, <ephtml> &lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;msub&gt;&lt;mrow&gt;&lt;mi mathvariant="italic"&gt;Observed&lt;/mi&gt;&lt;/mrow&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;/mrow&gt;&lt;/semantics&gt;&lt;/math&gt; </ephtml> is the individual plasma concentration or AUC<subs>0–24h</subs> value simulated from the population PK model and <emph>Predicted<subs>i</subs></emph> is the individual ML model-based prediction.</p> <hd id="AN0158912702-13">2.6. Software</hd> <p>For simulation of rifampicin plasma concentrations, NONMEM version 7.3.0 (Icon Development Solutions, Hanover, MD, USA) [[<reflink idref="bib37" id="ref91">37</reflink>]] assisted by PsN version 5.0.0 (Department of Pharmacy, Uppsala University, Uppsala, Sweden) [[<reflink idref="bib43" id="ref92">43</reflink>]] was used. 
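</p> <p>As an illustration of how the evaluation metrics in Equations (3)–(5) can be computed in software, a minimal sketch is given here. It is written in Python rather than the R environment used in this study, and the function names and toy values are hypothetical; they are not taken from the study's code or results.</p>

```python
import math


def r_squared(observed, predicted):
    """Equation (3): 1 - SS_res / SS_tot."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def rmse(observed, predicted):
    """Equation (4): square root of the mean squared difference."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Equation (5): mean of the absolute differences."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


# Toy values for illustration only (mg/L); not data from this study.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 4.0, 5.0, 8.0]

print(r_squared(observed, predicted), rmse(observed, predicted), mae(observed, predicted))
```

<p>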
Dataset manipulation, data visualization and ML model training were performed in R statistical software version 4.0.3 (R Foundation for Statistical Computing, Vienna, Austria) [[<reflink idref="bib69" id="ref93">69</reflink>]]. The computations were performed on resources provided by the Swedish National Infrastructure for Computing (SNIC) through Uppsala Multidisciplinary Center for Advanced Computational Science (UPPMAX).</p> <hd id="AN0158912702-14">3. Results</hd> <p></p> <hd id="AN0158912702-15">3.1. Feature Ranking</hd> <p>In scenario 1, i.e., predicting a rifampicin plasma concentration-time series based on features only, XGBoost, Random Forest and GBM selected TAD and dose as the most important predictors as expected, since these were included in the creation of the data. Instead of the true covariate FFM, all three algorithms selected WT and BMI as the third and fourth most important predictors, which can be explained by the direct correlation between WT, BMI and FFM. The linear LASSO algorithm showed poor feature selection performance, which was not better than random. Therefore, feature importance was not evaluated for LASSO. The importance score of each feature for the different ML algorithms (scenario 1) is presented in Figure 4. The feature importance for the five other scenarios is graphically presented in Figure S2.</p> <hd id="AN0158912702-16">3.2. Predictions of Rifampicin Plasma Concentration over Time</hd> <p>XGBoost, Random Forest, GBM and LASSO were trained and optimized. The final, well-trained models were used to predict rifampicin plasma concentration over time (11 time-points per sampling occasion) using the features in the test dataset as input. The final model predictions and run times are presented in Table 3. 
For all algorithms, the run time was shorter (range across all algorithms: 1.0 (scenario 5 in LASSO)–508.7 s (scenario 3 in Random Forest)) compared to the population PK model (11,479 s) using 1 core of the UPPMAX cluster.</p> <p>With regard to prediction of plasma concentration over time using features as input only, XGBoost showed the highest <emph>R</emph><sups>2</sups> (0.60) and precision (RMSE: 10.6 mg/L). GBM had an <emph>R</emph><sups>2</sups> of 0.57 and RMSE of 10.9 mg/L and Random Forest had an <emph>R</emph><sups>2</sups> of 0.54 and RMSE of 11.3 mg/L. The linear model LASSO had an <emph>R</emph><sups>2</sups> of 0.25. Using 2 rifampicin plasma concentrations per sampling occasion as input for prediction of the whole time series (11 time-points per sampling occasion) substantially improved model performance compared to features only. In this scenario, XGBoost and GBM had the highest predictive performance with an <emph>R</emph><sups>2</sups> of 0.76, closely followed by Random Forest with an <emph>R</emph><sups>2</sups> value of 0.75. The use of 6 rifampicin plasma concentrations led to the best predictive performance in all algorithms. XGBoost had the highest <emph>R</emph><sups>2</sups> (0.84) and precision (RMSE: 6.9 mg/L). Random Forest and GBM showed comparable performances, with <emph>R</emph><sups>2</sups> values of 0.82 and 0.83, respectively. LASSO exhibited poor performance (<emph>R</emph><sups>2</sups>: 0.39). Model performance across all algorithms and scenarios is summarized in Table 3. The results clearly show that increasing the amount of data per simulated patient improves the predictive performance of all four ML algorithms, as shown in Table 3 and Figure 5. The prediction interval-based VPC for the best performing algorithm (XGBoost) (Figure 6) shows accurate prediction of the median but underprediction of the true variability in the population. 
The rifampicin concentrations simulated from the population PK model, considered to be observations in this study, for 15 randomly selected IDs in the test dataset were compared across the dose groups with the model predictions from the best-performing ML model (XGBoost); the comparison is presented in Figure 7 and Figure S3. Even though XGBoost showed the best predictive performance in scenarios 1, 2 and 3 (see Table 3), all three nonlinear algorithms exhibited acceptable predictive performance, considering the small dataset. A VPC for the re-estimated population PK model using the simulated data is shown in Figure 8.</p> <p>Using the best performing model (XGBoost) and 6 rifampicin concentrations as input (scenario 3), the 95% prediction interval (PI) was −0.2–62.4 mg/L (median: 10.9 mg/L) compared to the data simulated from the population PK model with a 95% PI of 0–69.4 mg/L (median: 10.2 mg/L). Imprecision and bias were comparable between treatment week 1 (RMSE: 7.1 mg/L, MAE: 3.9 mg/L) and week 2 (RMSE: 6.6 mg/L, MAE: 4.1 mg/L), indicating that the ML model can identify the difference in exposure between week 1 and week 2 caused by rifampicin autoinduction.</p> <p>The R code for the final models is provided in Supplementary Materials Text S2.</p> <hd id="AN0158912702-17">3.3. Predictions of Rifampicin AUC 0–24h</hd> <p>The different ML models were trained to predict rifampicin AUC<subs>0–24h</subs> using varying plasma concentrations of rifampicin as input (see Table 2). The <emph>R</emph><sups>2</sups>, imprecision (RMSE), bias (MAE) and run time for each scenario are summarized in Table 4. Graphical exploration revealed good performance across all four algorithms (Figure 9), but LASSO was superior in regard to precision and accuracy (Table 4).</p> <hd id="AN0158912702-18">4. 
Discussion</hd> <p>When comparing PM and ML methodology in drug development in terms of physiological plausibility and clinical relevance, the difference that probably stands out the most is that PKPD models are based on statistical models as well as underlying biological mechanisms where the models can vary from compartmental models to full mechanistic as in physiology-based pharmacokinetic models, while ML is often a purely data-driven approach, though not in all cases. In PM, the parameters are directly biologically interpretable, which can aid in identifying underlying mechanisms. ML parameters, however, are mathematically interpretable albeit to a lesser extent from a biological perspective. There are, however, some artificial intelligence (AI) methodologies, such as causal AI, that provide a better interpretability. In PM, the previous mechanistic knowledge that is used as input can also introduce a bias into the analysis if imputed inappropriately, which is avoided in ML. On the other hand, the previous knowledge used in PM models is crucial for analysis of sparse datasets. There are also differences in the model-building process. When already-available ML models are used, the modeller is less involved in the model building but facilitates the selection of an appropriate algorithm, hyperparameter tuning and model evaluation. In PM, a model is manually built in an iterative manner, constantly evaluating the model's performance at each step, based on which the modeller takes the decision on how to proceed (Figure 3). The process is thus very time- and labor-intensive, and can lead to different modellers developing slightly different models based on the same dataset. This way of developing a model ensures that modelling is guided by a physiological intent, i.e., it is easily understandable if a PK profile follows a one- or a two-compartment model structure and how that potentially relates to the distribution of the drug in the body and site of action. 
ML cannot account for that in the first instance at least, but is a faster, more time- and labor-efficient method compared to PM. Challenges with purely data-driven ML algorithms certainly lie in the need for rich data, as well as the "Garbage in–Garbage out" problem, which describes the need for high-quality and diverse data in order to build robust models [[<reflink idref="bib45" id="ref94">45</reflink>]]. One of the major advantages of PM modelling is the ability to perform simulations of different scenarios that were not part of the data used for model building. An example is clinical trial simulations where a drug's PK and/or effect is predicted in a virtual population for different scenarios, for example, to support dose selection for a future clinical trial or for prediction of optimal doses in renal impairment patients or pediatric populations [[<reflink idref="bib70" id="ref95">70</reflink>]]. While this can also be achieved using ML, e.g., with reinforcement methodologies, larger amounts of data are needed to train robust models.</p> <p>In this work, we have investigated the identification of key features (Figure 4), described the variability in the PK predictions (Figure 6) and predicted PK using time as a continuous variable (Table 3, Figure 6 and Figure 7).</p> <p>Accurate selection of informative predictors is crucial both to obtain a high model performance and for clinical use, as they inform about potential differences in dose adjustments in special populations. This work suggests that all three nonlinear algorithms are able to correctly identify TAD and dose as the most important predictors. Since the linear LASSO algorithm performed very poorly, feature selection was not evaluated. Despite the correct identification of TAD and dose, none of the nonlinear algorithms selected FFM in third place, which is the true covariate. All three algorithms, however, selected WT and BMI as third and fourth most important, which are directly correlated with FFM. 
We hypothesize that WT was assigned a high importance instead of FFM due to the high correlation between the two features. The ML model assigns a high feature importance to WT and a low importance to FFM, because the contribution of FFM is no longer needed after inclusion of WT. In the PM model, on the other hand, FFM was selected as more important than WT. This might be due to the fact that PM and ML are two very different methods, which select features in a different way. However, this should not have a large impact on the results since FFM and WT are directly correlated and assigning one a high and the other one a low importance would likely describe the data very similarly. In PM, the true covariate FFM would be added as a covariate on, e.g., drug clearance or drug volume of distribution, indicating which process the covariate influences. This information, however, is not easily obtainable from ML models, and it thus remains unknown how exactly the exposure is influenced by a feature. Another difference is that in PM TAD, OCC and dose are a part of the structural model, and are always used for prediction of concentration over time in population PK modelling, while in ML these are considered features, just like FFM.</p> <p>Using ML instead of PM could be promising for prediction of PK due to its time- and labor-efficiency in situations when the gold standard population PK cannot be used, or there is no need for a more mechanistic understanding of the PK, or to develop a model that can perform simulations. All the tested ML algorithms in this study outperformed the PM model with respect to run time, being at least 22 times faster. Considering the fact that PM model development is a manual, stepwise procedure, it is also important to note that a PM model has to be run repeatedly, especially when performing covariate analysis. It is often necessary to run the model 20–100 times until the final model is reached. 
In ML, however, many algorithms have an embedded feature selection, i.e., the model only has to be run once. A fairer comparison would thus be a multiple of 11,479 s (191 min) for the PM model, depending on how many developmental steps are involved and how many covariates are explored, versus one run time for the ML model plus the run time for hyperparameter tuning. However, both pure run times and the model development itself are faster using ML. Due to the stepwise, manual model building process in PM, it usually takes several weeks for an experienced modeller to develop a population PK model, while an ML model can most often be developed in less time. With large amounts of diverse data, the time ratio would change even more. This increased efficiency is especially beneficial in light of the good predictive performance of the nonlinear ML models explored in this study when at least two plasma concentrations are used as input. In this work, we demonstrated a good precision and accuracy in the ML predictions for both longitudinal data (XGBoost: RMSE: 6.9 mg/L, MAE: 4.0 mg/L) and exposure indices (AUC<subs>0–24h</subs>) (LASSO: RMSE: 29.1 h·mg/L, MAE: 18.8 h·mg/L), using six concentrations as input. In both cases, the inclusion of observed rifampicin plasma concentrations as features considerably improved the model performance.</p> <p>For prediction of the plasma concentration-time series, the predictions (Figure 5) indicated that the nonlinear ML algorithms, especially XGBoost, were in good accordance with the data simulated from the population PK model, and the shape of the concentration-time profiles was predicted accurately (Figure 6 and Figure 7). 
Despite the high complexity in rifampicin PK, including autoinduction of elimination, concentration-dependent nonlinear clearance, dose-dependent bioavailability [[<reflink idref="bib14" id="ref96">14</reflink>], [<reflink idref="bib16" id="ref97">16</reflink>]] and high IOV [[<reflink idref="bib17" id="ref98">17</reflink>]], the nonlinear ML models performed well when at least two plasma concentrations were used as input. The results showed that the best-performing model (XGBoost) was able to predict accurately at treatment weeks 1 and 2 (Figure 7), suggesting that the model can handle autoinduction [[<reflink idref="bib14" id="ref99">14</reflink>]]. The good performance across all dose groups (Figure 7) indicated that dose-dependent bioavailability and concentration-dependent elimination are handled properly. However, the XGBoost predictions may have a weakness in identifying the true upper range of high exposure (see Figure 6). When comparing the individual predicted plasma concentration-time profiles to the data simulated from the population PK model, here considered to be observed data, it becomes obvious that there is a larger variability in the observed data (95% PI predictions: 62.6 mg/L, 95% PI observations: 69.4 mg/L) (see Figure 6 and Figure 8). This could be due to the fact that the population PK model incorporates different sources of variability, such as IIV, IOV and residual error, and is thus able to describe variability between patients and within a patient well. The ML algorithms explored in this study, however, only use the features as a source of variability and cannot account for IIV that is not caused by any of the features. This could be important to appreciating/predicting risks of individual patients reaching a safety threshold of exposure.</p> <p>For prediction of AUC<subs>0–24h</subs>, all four algorithms performed well when at least two plasma concentrations were used as input. 
The results clearly show a correlation between the amount of data used for training and model performance. Using features only, without rifampicin plasma concentrations as input, resulted in weak model performance with an <emph>R</emph><sups>2</sups> of 0.41, RMSE of 117.9 h·mg/L and MAE of 74.2 h·mg/L (LASSO). Using two plasma concentrations at 2 h and 4 h post-dose as input, resembling a limited sampling strategy [[<reflink idref="bib64" id="ref100">64</reflink>]], led to higher model performance (LASSO, <emph>R</emph><sups>2</sups>: 0.69, RMSE: 86.8 h·mg/L, MAE: 54.5 h·mg/L). The best predictive performance was achieved when using six plasma concentrations as input, representing richer sampling, where LASSO performed very well with an <emph>R</emph><sups>2</sups> of 0.97 (RMSE: 29.1 h·mg/L, MAE: 18.8 h·mg/L) (see also Table 4). This indicates that predicting AUC<subs>0–24h</subs> accurately and precisely without drug plasma concentrations is challenging. At least two concentrations are needed to reach acceptable model performance.</p> <p>The ability of the algorithms to predict well across all test sets is important for evaluating overfitting and identifying outliers. The generalizability of the models across the test sets obtained from the five-fold cross-validation was acceptable, as the range of RMSE and MAE was in general spread evenly around the average value (see Table 3 and Table 4), with the exception of GBM in scenario 6.</p> <p>Due to the different nature of the evaluated algorithms (linear versus nonlinear), they performed very differently for prediction of longitudinal data and AUC<subs>0–24h</subs>. While the linear model LASSO showed excellent performance for AUC<subs>0–24h</subs> prediction using six concentrations (<emph>R</emph><sups>2</sups>: 0.97), it was not able to predict longitudinal data (<emph>R</emph><sups>2</sups>: 0.39). This is likely due to the different nature of the longitudinal data and the AUC<subs>0–24h</subs>. 
A concentration-time series has a distinct shape (see, e.g., Figure 6), which a linear algorithm such as LASSO is not able to describe. The AUC<subs>0–24h</subs>, however, is a summary variable, describing the whole time series in one value. This simplifies the problem and enables even a linear algorithm to predict well. The nonlinear models performed well for prediction of both longitudinal data and AUC<subs>0–24h</subs> when at least two plasma concentrations were included as input. However, the PM model using NLME methodology still performs better for prediction of longitudinal data, as shown in Figure 8. NLME models are better able to capture the variability between and within patients compared to the ML models investigated here (Figure 6 and Figure 8). There is thus a need for further studies investigating how the variability could be better captured using ML.</p> <p>PK data are naturally imbalanced because plasma PK sampling is often performed very sparsely. With ML, this can lead to poor performance in areas of sparse information, e.g., around the C<subs>max</subs>. One approach to address this issue is oversampling, a method that increases the data size in sparse areas. Because PK data are not assumed to be normally distributed, we did not apply oversampling in this work, but we believe that oversampling methods appropriate for sparse PK data should be investigated.</p> <p>This work used a simplified, exemplary case study, and the time efficiencies provide an example of a proof-of-value study. ML is a purely data-driven approach and should thus be regarded as an assistive tool rather than a decision maker at this stage. Richer, more diverse data will help produce a more accurate tool, but in this study, we exemplified the potential of ML. The model performance of PM and ML could not be directly compared in this study, since the data were simulated from the PM model, which might bias PM model performance. 
There is thus a need for further studies comparing the performance of PM and ML directly using real patient data. In addition, the generalizability of the ML methods should be investigated further. A limitation of this work is that data simulated from a population PK model, instead of real patient data, were used as input for the ML model training. Although real patient data would have reflected the real-world scenario more faithfully, we believe that using simulated data did not impact the results significantly, since the population PK model used to perform the simulations [[<reflink idref="bib15" id="ref101">15</reflink>]] has been validated externally and shown to predict real patient data accurately and precisely [[<reflink idref="bib64" id="ref102">64</reflink>]]. Moreover, in order to be able to make the simulated dataset in Supplementary Materials Data S1 public, the patient covariates were sampled from a parametric covariate distribution of the HIGHRIF1 trial [[<reflink idref="bib65" id="ref103">65</reflink>]] and not bootstrapped. Even though bootstrap is the gold standard method, it is assumed that the results do not vary much, since the correlations between covariates were retained and the sampling was performed from a truncated distribution using the reported ranges as minimum and maximum values. One drawback of the ML models is that some plasma concentrations were predicted to be negative. This does not occur in the simulated data, where the lowest possible value is restricted to 0 mg/L for scientific plausibility. In future work, restricting the ML model predictions or working with log-transformed data could be one way to avoid this issue. In addition, we see a need for more work on inflation of classifier accuracy due to data crosstalk. Furthermore, the time-points resembling sparse PK sampling (2 and 4 h post-dose) are based on a PM study [[<reflink idref="bib64" id="ref104">64</reflink>]]. 
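</p> <p>Returning to the negative concentration predictions noted above, the two remedies mentioned (log-transformed data or restricting the predictions) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the approach used in this study: the function names and the small offset that allows zero concentrations to be log-transformed are hypothetical.</p>

```python
import math

# Hypothetical small offset (mg/L) so that concentrations of exactly 0
# can be log-transformed; the value is an assumption for illustration.
OFFSET_MG_PER_L = 1e-3


def to_log_scale(concentrations):
    """Transform non-negative concentrations (mg/L) to the log scale.

    A model would then be trained on these values instead of the raw ones.
    """
    if any(c < 0 for c in concentrations):
        raise ValueError("concentrations must be non-negative")
    return [math.log(c + OFFSET_MG_PER_L) for c in concentrations]


def from_log_scale(log_predictions):
    """Back-transform log-scale predictions to mg/L.

    exp() is strictly positive; after removing the offset the result is
    floored at 0 mg/L, so no prediction can be negative.
    """
    return [max(math.exp(v) - OFFSET_MG_PER_L, 0.0) for v in log_predictions]


def clip_non_negative(predictions):
    """Alternative: restrict raw predictions to a minimum of 0 mg/L."""
    return [max(p, 0.0) for p in predictions]
```

<p>In this sketch, any real-valued prediction on the log scale maps back to a concentration of at least 0 mg/L, whereas clipping leaves the model untouched and only corrects its output afterwards. Which option is preferable for rifampicin PK data would need to be evaluated.</p> <p>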
A sparse sampling strategy in PM is often based on gaining information on exposure during the absorption and elimination phases. This might not be the case for ML models, and thus there is a need to investigate an optimal sampling strategy appropriate for ML.</p> <p>There are few studies investigating how ML and PM could be combined to further improve drug development [[<reflink idref="bib5" id="ref105">5</reflink>], [<reflink idref="bib8" id="ref106">8</reflink>], [<reflink idref="bib10" id="ref107">10</reflink>], [<reflink idref="bib12" id="ref108">12</reflink>]]. This work is one of the most comprehensive illustrative examples of a side-by-side comparison of the two methods. Here, we exemplify in a case study the use of ML for PK predictions, which could then be used as input into a PKPD model, enabling faster but still accurate PKPD analysis. We demonstrate that ML could be a useful tool for PK analysis, providing fast predictions of both the full concentration-time series and PK exposure indices (e.g., AUC<subs>0–24h</subs>) with acceptable precision. Further work is needed to investigate this tool using real patient data. Bridging the gap between PM and ML seems promising considering that ML can add value to PM workflows through increased time and labor efficiency.</p> <hd id="AN0158912702-19">Figures and Tables</hd> <p>Graph: Figure 1 Overall proposed workflow. Blue panels indicate pharmacometrics and yellow panels indicate machine learning.</p> <p>Graph: Figure 2 Illustration of a two-compartment pharmacokinetic model for a fictitious drug. AGI, amount of drug in the gastrointestinal tract; k01, absorption rate constant; k10, elimination rate constant; k12, rate constant describing distribution from central to peripheral compartment; k21, rate constant describing distribution from peripheral to central compartment; V1, volume of central compartment (e.g., blood); V2, volume of peripheral compartment (e.g., brain tissue). 
Drug clearance is expressed as k10×V1.</p> <p>Graph: Figure 3 Comparison of the general model development workflow between pharmacometrics and machine learning. The different colors represent different steps of model development. Green: data preparation, blue: model building, red: model evaluation, orange: finalizing the model.</p> <p>Graph: Figure 4 Importance scores for evaluated features shown for the different machine learning algorithms. (A) GBM, (B) Random Forest and (C) XGBoost using features only (scenario 1) as input for prediction of plasma concentration versus time. The error bars represent the standard deviation. AGE, age (years); BMI, body mass index (kg/m<sups>2</sups>); DOSE, daily rifampicin dose (mg); FFM, fat-free mass (kg); HIV, HIV-coinfection; HT, body height (cm); OCC, treatment week; RACE, race; SEX, gender; TAD, time after dose (h); WT, bodyweight (kg).</p> <p>Graph: Figure 5 Predictions of rifampicin plasma concentration-time series from the different ML algorithms compared to the simulations from the population PK model, considered to be observations in this study. Panel (A) is the scenario where the model was trained to predict the rifampicin plasma concentration-time series using features only as input. In panel (B), the models were trained to predict the rifampicin plasma concentration-time series based on features and 2 plasma concentrations at time-points 2 and 4 h post-dose at days 7 and 14. In panel (C), the models were trained to predict the rifampicin plasma concentration-time series based on features and 6 plasma concentrations at time-points 0.5, 1, 2, 4, 8 and 24 h post-dose at days 7 and 14. The red dashed line represents a trendline through the data. 
The black solid line is the line of identity, indicating 100% agreement between true and predicted values.</p> <p>Graph: Figure 6 Prediction interval visual predictive check for the best-performing model (XGBoost) trained using 6 plasma concentrations as input (scenario 3) shown for the whole population. Open circles are the rifampicin plasma concentrations simulated from the population PK model, considered to be observed data in this study. The shaded area is the 95% prediction interval of the machine learning model predictions (XGBoost) and the solid blue line is the median of the model predictions. The upper and lower red dashed lines are the 97.5th and 2.5th percentiles of the observed data, respectively, and the solid red line is the median of the observed data.</p> <p>Graph: Figure 7 Individual rifampicin plasma concentrations predicted from the XGBoost model (solid line and open circles) compared to the concentrations simulated from the population PK model, considered to be observations in this study (black closed circles), shown for scenario 3 (features and 6 plasma concentrations used for prediction) for 15 randomly selected IDs. Panel (A) represents the predictions for each individual in the test dataset at day 7. Panel (B) represents the predictions for each individual in the test dataset at day 14. The different colors indicate the different daily rifampicin doses.</p> <p>Graph: Figure 8 Visual predictive check for the re-estimated population PK model based on the simulated data. Open blue circles are the rifampicin plasma concentrations simulated from the population PK model, considered to be observed data in this study. The upper and lower dashed lines are the 95th and 5th percentiles of the observed data, respectively, and the solid line is the median of the observed data. 
The shaded areas (top to bottom) are the 95% confidence intervals of the 95th (blue shaded area), median (red shaded area) and 5th (blue shaded area) percentiles of the simulated data.</p> <p>Graph: Figure 9 Predictions of rifampicin AUC<subs>0–24h</subs> at days 7 and 14 from the different ML algorithms compared to the NCA-derived AUC<subs>0–24h</subs>, considered to be observations in this study. Panel (A) is the scenario where the model was trained using features only as input. In panel (B), the models were trained to predict rifampicin AUC<subs>0–24h</subs> based on features and 2 plasma concentrations at time-points 2 h and 4 h post-dose at days 7 and 14. In panel (C), the models were trained to predict rifampicin AUC<subs>0–24h</subs> based on features and 6 plasma concentrations at time-points 0.5 h, 1 h, 2 h, 4 h, 8 h and 24 h post-dose at days 7 and 14. The red dashed line represents a trendline through the data. The black solid line is the line of identity, indicating 100% agreement between true and predicted values.</p> <p>Table 2 Different scenarios of data sizes used for model training and predicted outcome.</p> <p> <ephtml> &lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th align="center" style="border-top:solid thin;border-bottom:solid thin"&gt;Scenario&lt;/th&gt;&lt;th align="center" style="border-top:solid thin;border-bottom:solid thin"&gt;Model&lt;/th&gt;&lt;th align="center" style="border-top:solid thin;border-bottom:solid thin"&gt;Predictions&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;1&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;Features only&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;Rifampicin concentration-time series &lt;sup&gt;c&lt;/sup&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="center" valign="middle" style="border-bottom:solid 
thin"&gt;2&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;Features + 2 observed rifampicin concentrations &lt;sup&gt;a&lt;/sup&gt;&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;Rifampicin concentration-time series &lt;sup&gt;c&lt;/sup&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;3&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;Features + 6 observed rifampicin concentrations &lt;sup&gt;b&lt;/sup&gt;&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;Rifampicin concentration-time series &lt;sup&gt;c&lt;/sup&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;4&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;Features only&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;AUC&lt;sub&gt;0&amp;#8211;24h&lt;/sub&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;5&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;Features + 2 observed rifampicin concentrations &lt;sup&gt;a&lt;/sup&gt;&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;AUC&lt;sub&gt;0&amp;#8211;24h&lt;/sub&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;6&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;Features + 6 observed rifampicin concentrations &lt;sup&gt;b&lt;/sup&gt;&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;AUC&lt;sub&gt;0&amp;#8211;24h&lt;/sub&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt; </ephtml> <sups>a</sups> Time-points of rifampicin concentrations are at 2 and 4 h post-dose at days 7 and 14, 
representing a sparse sampling schedule. <sups>b</sups> Time-points of rifampicin concentrations are at 0.5, 1, 2, 4, 8 and 24 h post-dose at days 7 and 14, representing a richer sampling schedule. <sups>c</sups> At pre-dose and 0.5, 1, 1.5, 2, 3, 4, 6, 8, 12, and 24 h post-dose at days 7 and 14. AUC<subs>0–24h</subs>, area under the rifampicin plasma concentration-time curve up to 24 h.</p> <p>Table 3 Model performance for prediction of plasma concentration over time using varying amounts of information as input.</p> <p> <ephtml> &lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th align="center" style="border-top:solid thin;border-bottom:solid thin" /&gt;&lt;th colspan="3" align="center" style="border-top:solid thin;border-bottom:solid thin"&gt;GBM&lt;/th&gt;&lt;th colspan="3" align="center" style="border-top:solid thin;border-bottom:solid thin"&gt;XGBoost&lt;/th&gt;&lt;th colspan="3" align="center" style="border-top:solid thin;border-bottom:solid thin"&gt;Random Forest&lt;/th&gt;&lt;th colspan="3" align="center" style="border-top:solid thin;border-bottom:solid thin"&gt;LASSO&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th align="center" style="border-bottom:solid thin" /&gt;&lt;th align="center" style="border-bottom:solid thin"&gt;Scenario 1&lt;/th&gt;&lt;th align="center" style="border-bottom:solid thin"&gt;Scenario 2&lt;/th&gt;&lt;th align="center" style="border-bottom:solid thin"&gt;Scenario 3&lt;/th&gt;&lt;th align="center" style="border-bottom:solid thin"&gt;Scenario 1&lt;/th&gt;&lt;th align="center" style="border-bottom:solid thin"&gt;Scenario 2&lt;/th&gt;&lt;th align="center" style="border-bottom:solid thin"&gt;Scenario 3&lt;/th&gt;&lt;th align="center" style="border-bottom:solid thin"&gt;Scenario 1&lt;/th&gt;&lt;th align="center" style="border-bottom:solid thin"&gt;Scenario 2&lt;/th&gt;&lt;th align="center" style="border-bottom:solid thin"&gt;Scenario 3&lt;/th&gt;&lt;th align="center" style="border-bottom:solid thin"&gt;Scenario 1&lt;/th&gt;&lt;th align="center" 
style="border-bottom:solid thin"&gt;Scenario 2&lt;/th&gt;&lt;th align="center" style="border-bottom:solid thin"&gt;Scenario 3&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;&lt;italic&gt;R&lt;/italic&gt;&lt;sup&gt;2&lt;/sup&gt;&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.57&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.76&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.83&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.60&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.76&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.84&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.54&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.75&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.82&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.25&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.36&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.39&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;Pearson correlation&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.77&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.87&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.90&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.78&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.87&lt;/td&gt;&lt;td align="center" valign="middle" 
style="border-bottom:solid thin"&gt;0.91&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.75&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.86&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.90&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.52&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.62&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.65&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;RMSE (mg/L)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;10.9 (8.9&amp;#8211;13.3)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;8.3&lt;break /&gt;(6.8&amp;#8211;8.6)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;7.1&lt;break /&gt;(5.1&amp;#8211;7.3)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;10.6&lt;break /&gt;(8.9&amp;#8211;13.5)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;8.3&lt;break /&gt;(6.7&amp;#8211;12.4)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;6.9&lt;break /&gt;(5.1&amp;#8211;11.1)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;11.3&lt;break /&gt;(9.8&amp;#8211;14.1)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;8.5&lt;break /&gt;(6.9&amp;#8211;12.7)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;7.2&lt;break /&gt;(5.3&amp;#8211;11.8)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;14.5&lt;break /&gt;(13.4&amp;#8211;19.1)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid 
thin"&gt;13.3&lt;break /&gt;(11.5&amp;#8211;16.6)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;12.9&lt;break /&gt;(11.3&amp;#8211;15.3)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;MAE (mg/L)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;7.1&lt;break /&gt;(6.0&amp;#8211;7.1)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;5.2&lt;break /&gt;(4.3&amp;#8211;6.8)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;4.1&lt;break /&gt;(3.3&amp;#8211;5.7)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;7.0&lt;break /&gt;(6.0&amp;#8211;8.0)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;5.1&lt;break /&gt;(4.2&amp;#8211;6.7)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;4.0&lt;break /&gt;(3.2&amp;#8211;5.4)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;7.0&lt;break /&gt;(6.4&amp;#8211;8.0)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;4.9&lt;break /&gt;(4.2&amp;#8211;6.4)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;3.8&lt;break /&gt;(2.8&amp;#8211;5.3)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;10.2&lt;break /&gt;(9.9&amp;#8211;12.2)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;9.6&lt;break /&gt;(8.4&amp;#8211;11.1)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;9.3&lt;break /&gt;(8.1&amp;#8211;10.5)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;Run time (s)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;6.8&lt;/td&gt;&lt;td align="center" 
valign="middle" style="border-bottom:solid thin"&gt;8.2&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;11.1&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;1.4&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;1.2&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;4.7&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;309.9&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;362.6&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;508.7&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;1.1&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;1.3&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;1.1&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt; </ephtml> In scenario 1, the models were trained to predict the rifampicin plasma concentration-time series (11 time-points at days 7 and 14) based only on features (no plasma concentrations). In scenario 2, the models were trained to predict the rifampicin plasma concentration-time series (11 time-points at days 7 and 14) based on features and 2 plasma concentrations at time-points 2 and 4 h post-dose at days 7 and 14. In scenario 3, the models were trained to predict the rifampicin plasma concentration-time series (11 time-points at days 7 and 14) based on features and 6 plasma concentrations at time-points 0.5, 1, 2, 4, 8 and 24 h post-dose at days 7 and 14. 
MAE, mean absolute error averaged across the n-folds (range); RMSE, root mean square error averaged across the n-folds (range).</p> <p>Table 4 Model performance for prediction of rifampicin AUC<subs>0-24h</subs> using varying amounts of information as input.</p> <p> <ephtml> &lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th align="center" style="border-top:solid thin;border-bottom:solid thin" /&gt;&lt;th colspan="3" align="center" style="border-top:solid thin;border-bottom:solid thin"&gt;GBM&lt;/th&gt;&lt;th colspan="3" align="center" style="border-top:solid thin;border-bottom:solid thin"&gt;XGBoost&lt;/th&gt;&lt;th colspan="3" align="center" style="border-top:solid thin;border-bottom:solid thin"&gt;Random Forest&lt;/th&gt;&lt;th colspan="3" align="center" style="border-top:solid thin;border-bottom:solid thin"&gt;LASSO&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th align="center" style="border-bottom:solid thin" /&gt;&lt;th align="center" style="border-bottom:solid thin"&gt;Scenario 4&lt;/th&gt;&lt;th align="center" style="border-bottom:solid thin"&gt;Scenario 5&lt;/th&gt;&lt;th align="center" style="border-bottom:solid thin"&gt;Scenario 6&lt;/th&gt;&lt;th align="center" style="border-bottom:solid thin"&gt;Scenario 4&lt;/th&gt;&lt;th align="center" style="border-bottom:solid thin"&gt;Scenario 5&lt;/th&gt;&lt;th align="center" style="border-bottom:solid thin"&gt;Scenario 6&lt;/th&gt;&lt;th align="center" style="border-bottom:solid thin"&gt;Scenario 4&lt;/th&gt;&lt;th align="center" style="border-bottom:solid thin"&gt;Scenario 5&lt;/th&gt;&lt;th align="center" style="border-bottom:solid thin"&gt;Scenario 6&lt;/th&gt;&lt;th align="center" style="border-bottom:solid thin"&gt;Scenario 4&lt;/th&gt;&lt;th align="center" style="border-bottom:solid thin"&gt;Scenario 5&lt;/th&gt;&lt;th align="center" style="border-bottom:solid thin"&gt;Scenario 6&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td align="center" valign="middle" style="border-bottom:solid 
thin"&gt;&lt;italic&gt;R&lt;/italic&gt;&lt;sup&gt;2&lt;/sup&gt;&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.27&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.61&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.73&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.44&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.71&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.84&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.22&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.62&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.78&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.41&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.69&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.97&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;Pearson correlation&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.59&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.73&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.83&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.63&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.75&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.83&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.55&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.73&lt;/td&gt;&lt;td 
align="center" valign="middle" style="border-bottom:solid thin"&gt;0.83&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.67&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.84&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.98&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;RMSE (h&amp;#183;mg/L)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;131.7&lt;break /&gt;(86.9&amp;#8211;246.6)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;103.0&lt;break /&gt;(49.8&amp;#8211;233.1)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;88.2&lt;break /&gt;(41.7&amp;#8211;218.2)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;121.0&lt;break /&gt;(57.7&amp;#8211;262.8)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;92.6&lt;break /&gt;(38.9&amp;#8211;250.1)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;69.6&lt;break /&gt;(21.0&amp;#8211;238.3)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;137.1&lt;break /&gt;(76.8&amp;#8211;252.8)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;103.5&lt;break /&gt;(48.5&amp;#8211;238.7)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;79.9&lt;break /&gt;(30.2&amp;#8211;208.5)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;117.9&lt;break /&gt;(76.0&amp;#8211;238.5)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;86.8&lt;break /&gt;(48.3&amp;#8211;175.5)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;29.1&lt;break 
/&gt;(20.7&amp;#8211;57.3)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;MAE (h&amp;#183;mg/L)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;85.5&lt;break /&gt;(74.4&amp;#8211;121.1)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;61.3&lt;break /&gt;(43.2&amp;#8211;105.8)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;47.6&lt;break /&gt;(21.0&amp;#8211;238.3)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;76.7&lt;break /&gt;(47.1&amp;#8211;122.3)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;52.6&lt;break /&gt;(30.5&amp;#8211;110.5)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;30.4&lt;break /&gt;(13.3&amp;#8211;82.8)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;84.6&lt;break /&gt;(63.1&amp;#8211;118.3)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;59.4&lt;break /&gt;(39.6&amp;#8211;102.6)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;38.3&lt;break /&gt;(22.5&amp;#8211;74.8)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;74.2&lt;break /&gt;(56.5&amp;#8211;119.7)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;54.5&lt;break /&gt;(38.4&amp;#8211;87.2)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;18.8&lt;break /&gt;(15.2&amp;#8211;29.2)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;Run time (s)&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;1.3&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;1.6&lt;/td&gt;&lt;td align="center" 
valign="middle" style="border-bottom:solid thin"&gt;1.8&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;0.7&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;4.7&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;4.1&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;20.5&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;21.9&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;22.8&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;1.1&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;1.0&lt;/td&gt;&lt;td align="center" valign="middle" style="border-bottom:solid thin"&gt;1.1&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt; </ephtml> In scenario 4, the models were trained to predict rifampicin AUC<subs>0–24h</subs> based only on features (no plasma concentrations) at days 7 and 14. In scenario 5, the models were trained to predict rifampicin AUC<subs>0–24h</subs> based on features and 2 plasma concentrations at time-points 2 and 4 h post-dose at days 7 and 14. In scenario 6, the models were trained to predict rifampicin AUC<subs>0–24h</subs> based on features and 6 plasma concentrations at time-points 0.5, 1, 2, 4, 8 and 24 h post-dose at days 7 and 14. AUC<subs>0–24h</subs>, Area under the rifampicin plasma concentration-time curve from 0 to 24 h; MAE, mean absolute error averaged across the n-folds (range); RMSE, root mean square error averaged across the n-folds (range).</p> <hd id="AN0158912702-20">Author Contributions</hd> <p>Conceptualization, L.K., H.Y., A.F., J.N., S.G.W., G.M.-E., G.V., G.K.M., E.M.S., M.P.M. and U.S.H.S.; Data curation, L.K., H.Y., A.F., M.P.M. and U.S.H.S.; Formal analysis, L.K., H.Y., A.F., M.P.M. 
and U.S.H.S.; Methodology, L.K., H.Y., A.F., J.N., S.G.W., G.M.-E., G.V., G.K.M., E.M.S., M.P.M. and U.S.H.S.; Supervision, M.P.M. and U.S.H.S.; Validation, L.K., H.Y., A.F., J.N., S.G.W., G.M.-E., G.V., G.K.M., E.M.S., M.P.M. and U.S.H.S.; Visualization, L.K. and H.Y.; Writing – original draft, L.K. and H.Y.; Writing – review and editing, L.K., H.Y., A.F., J.N., S.G.W., G.M.-E., G.V., G.K.M., E.M.S., M.P.M. and U.S.H.S. This work was carried out within the UNITE4TB consortium. All authors have read and agreed to the published version of the manuscript.</p> <hd id="AN0158912702-21">Institutional Review Board Statement</hd> <p>Not applicable.</p> <hd id="AN0158912702-22">Informed Consent Statement</hd> <p>Not applicable.</p> <hd id="AN0158912702-23">Data Availability Statement</hd> <p>The simulated dataset can be found in Supplementary Materials Data S1.</p> <hd id="AN0158912702-24">Conflicts of Interest</hd> <p>Conflict of interest applicable to GSK authors: G.M.-E. and G.V. are employed by GlaxoSmithKline, Uxbridge, Middlesex, UK. The views and opinions presented in this manuscript do not reflect the company's position. The authors declare no conflict of interest.</p> <hd id="AN0158912702-25">Acknowledgments</hd> <p>This communication reflects the views of the UNITE4TB Consortium and neither IMI nor the EU and EFPIA are liable for any use that may be made of the information contained herein. The computations were enabled by resources in project [snic2020-5-524] provided by the Swedish National Infrastructure for Computing (SNIC) at UPPMAX, partially funded by the Swedish Research Council through grant agreement no. 2018-05973.</p> <hd id="AN0158912702-26">Supplementary Materials</hd> <p>The following supporting information can be downloaded at: https://<ulink href="http://www.mdpi.com/article/10.3390/pharmaceutics14081530/s1">www.mdpi.com/article/10.3390/pharmaceutics14081530/s1</ulink>, Data S1: Simulated dataset; Text S2: Final model code; Figure S1: 
Illustration of the different types of variability in a nonlinear mixed-effects model. The plasma concentration-time curve for a hypothetical drug is shown. The blue line represents the model predictions for the typical individual and the orange line the model predictions for one specific individual. The blue dots represent the observations of all individuals in the population and the orange dots the observations for that specific individual. The blue and orange lines share the same covariate values. The difference between the typical pharmacokinetic (PK) parameters (population PK parameters) and the individual PK parameters is expressed as inter-individual variability (IIV) and is reflected by the difference in drug exposure between the typical individual (blue line) and the specific individual (orange line). The remaining difference between the individual model predictions (orange line) and the observed concentrations (orange dots) is described by residual unexplained variability (RUV). Figure S2: Importance scores for the evaluated features, shown for the machine learning algorithms GBM, Random Forest, XGBoost and LASSO for (A) AUC<subs>0–24h</subs> predictions using features only as input (scenario 4), (B) AUC<subs>0–24h</subs> predictions using 2 plasma concentrations as input (scenario 5), (C) AUC<subs>0–24h</subs> predictions using 6 plasma concentrations as input (scenario 6), (D) prediction of the plasma concentration-time series using 2 plasma concentrations as input (scenario 2), (E) prediction of the plasma concentration-time series using 6 plasma concentrations as input (scenario 3). 
AGE, age (years); BMI, body mass index (kg/m<sups>2</sups>); DOSE, daily rifampicin dose (mg); FFM, fat-free mass (kg); HIV, HIV-coinfection; HT, body height (cm); OCC, treatment week; RACE, race; SEX, gender; TAD, time after dose (h); WT, bodyweight (kg); TAD_0.5, rifampicin plasma concentration at 0.5 h post-dose; TAD_1, rifampicin plasma concentration at 1 h post-dose; TAD_2, rifampicin plasma concentration at 2 h post-dose; TAD_4, rifampicin plasma concentration at 4 h post-dose; TAD_8, rifampicin plasma concentration at 8 h post-dose; TAD_24, rifampicin plasma concentration at 24 h post-dose. Figure S3: Individual rifampicin plasma concentrations predicted from the eXtreme Gradient Boosting (XGBoost) model (solid line and open circles) compared to the true concentrations (black closed circles). Panel (A) represents the predictions for each individual in the test dataset at the first week of rifampicin treatment for scenario 1 (predictions based on features only). Panel (B) represents the predictions for each individual in the test dataset at the second week of rifampicin treatment for scenario 1 (predictions based on features only). Panel (C) represents the predictions for each individual in the test dataset at the first week of rifampicin treatment for scenario 2 (predictions based on features and 2 rifampicin plasma concentrations). Panel (D) represents the predictions for each individual in the test dataset at the second week of rifampicin treatment for scenario 2 (predictions based on features and 2 rifampicin plasma concentrations). The different colors indicate the different daily rifampicin doses. Table S1: 
Final model hyperparameters for the different machine learning models.</p> <ref id="AN0158912702-27"> <title> Footnotes </title> <blist> <bibl id="bib1" idref="ref1" type="bt">1</bibl> <bibtext> Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.</bibtext> </blist> </ref> <ref id="AN0158912702-28"> <title> References </title> <blist> <bibtext> Upton R.N., Mould D.R. Basic Concepts in Population Modeling, Simulation, and Model-Based Drug Development: Part 3—Introduction to Pharmacodynamic Modeling Methods. CPT Pharmacomet. Syst. Pharmacol. 2014; 3: e88. 10.1038/psp.2013.71. 24384783</bibtext> </blist> <blist> <bibl id="bib2" idref="ref27" type="bt">2</bibl> <bibtext> Meibohm B., Derendorf H. Basic concepts of pharmacokinetic/pharmacodynamic (PK/PD) modelling. Int. J. Clin. Pharmacol. Ther. 1997; 35: 401-413. 9352388</bibtext> </blist> <blist> <bibl id="bib3" idref="ref2" type="bt">3</bibl> <bibtext> Réda C., Kaufmann E., Delahaye-Duriez A. Machine learning applications in drug development. Comput. Struct. Biotechnol. J. 2020; 18: 241-252. 10.1016/j.csbj.2019.12.006. 33489002</bibtext> </blist> <blist> <bibl id="bib4" idref="ref3" type="bt">4</bibl> <bibtext> McComb M., Bies R., Ramanathan M. Machine learning in pharmacometrics: Opportunities and challenges. Br. J. Clin. Pharmacol. 2021; 88: 1482-1499. 10.1111/bcp.14801. 33634893</bibtext> </blist> <blist> <bibl id="bib5" idref="ref4" type="bt">5</bibl> <bibtext> Poynton M.R., Choi B., Kim Y., Park I., Noh G., Hong S., Boo Y., Kang S. Machine Learning Methods Applied to Pharmacokinetic Modelling of Remifentanil in Healthy Volunteers: A Multi-Method Comparison. J. Int. Med. Res. 2009; 37: 1680-1691. 10.1177/147323000903700603</bibtext> </blist> <blist> <bibl id="bib6" idref="ref7" type="bt">6</bibl> <bibtext> Woillard J.-B., Labriffe M., Prémaud A., Marquet P. 
Estimation of drug exposure by machine learning based on simulations from published pharmacokinetic models: The example of tacrolimus. Pharmacol. Res. 2021; 167: 105578. 10.1016/j.phrs.2021.105578</bibtext> </blist> <blist> <bibl id="bib7" type="bt">7</bibl> <bibtext> Woillard J., Labriffe M., Debord J., Marquet P. Tacrolimus Exposure Prediction Using Machine Learning. Clin. Pharmacol. Ther. 2020; 110: 361-369. 10.1002/cpt.2123</bibtext> </blist> <blist> <bibl id="bib8" idref="ref11" type="bt">8</bibl> <bibtext> Koch G., Pfister M., Daunhawer I., Wilbaux M., Wellmann S., Vogt J.E. Pharmacometrics and Machine Learning Partner to Advance Clinical Data Analysis. Clin. Pharmacol. Ther. 2020; 107: 926-933. 10.1002/cpt.1774</bibtext> </blist> <blist> <bibl id="bib9" idref="ref48" type="bt">9</bibl> <bibtext> Bies R.R., Muldoon M.F., Pollock B.G., Manuck S., Smith G., Sale M.E. A Genetic Algorithm-Based, Hybrid Machine Learning Approach to Model Selection. J. Pharmacokinet. Pharmacodyn. 2006; 33: 195-221. 10.1007/s10928-006-9004-6</bibtext> </blist> <blist> <bibtext> Sherer E.A., Sale M.E., Pollock B.G., Belani C., Egorin M.J., Ivy P.S., Lieberman J.A., Manuck S.B., Marder S.R., Muldoon M.F. Application of a single-objective, hybrid genetic algorithm approach to pharmacokinetic model building. J. Pharmacokinet. Pharmacodyn. 2012; 39: 393-414. 10.1007/s10928-012-9258-0</bibtext> </blist> <blist> <bibtext> Janssen A., Leebeek F., Cnossen M., Mathôt R. The Neural Mixed Effects Algorithm: Leveraging Machine Learning for Pharmacokinetic Modelling. Available online: https://<ulink href="http://www.page-meeting.org/print%5fabstract.asp?abstract%5fid=9826">www.page-meeting.org/print%5fabstract.asp?abstract%5fid=9826</ulink> (accessed on 21 March 2022)</bibtext> </blist> <blist> <bibtext> Lu J., Bender B., Jin J.Y., Guan Y. Deep learning prediction of patient response time course from early data via neural-pharmacokinetic/pharmacodynamic modelling. Nat. Mach. Intell. 
2021; 3: 696-704. 10.1038/s42256-021-00357-4</bibtext> </blist> <blist> <bibtext> World Health Organization. Guidelines for Treatment of Drug-Susceptible Tuberculosis and Patient Care; World Health Organization: Geneva, Switzerland. 2017</bibtext> </blist> <blist> <bibtext> Smythe W., Khandelwal A., Merle C., Rustomjee R., Gninafon M., Lo M.B., Sow O.B., Olliaro P.L., Lienhardt C., Horton J. A Semimechanistic Pharmacokinetic-Enzyme Turnover Model for Rifampin Autoinduction in Adult Tuberculosis Patients. Antimicrob. Agents Chemother. 2012; 56: 2091-2098. 10.1128/AAC.05792-11. 22252827</bibtext> </blist> <blist> <bibtext> Svensson R.J., Aarnoutse R.E., Diacon A.H., Dawson R., Gillespie S., Boeree M.J., Simonsson U.S.H. A Population Pharmacokinetic Model Incorporating Saturable Pharmacokinetics and Autoinduction for High Rifampicin Doses. Clin. Pharmacol. Ther. 2017; 103: 674-683. 10.1002/cpt.778. 28653479</bibtext> </blist> <blist> <bibtext> Chirehwa M.T., Rustomjee R., Mthiyane T., Onyebujoh P., Smith P., McIlleron H., Denti P. Model-Based Evaluation of Higher Doses of Rifampin Using a Semimechanistic Model Incorporating Autoinduction and Saturation of Hepatic Extraction. Antimicrob. Agents Chemother. 2016; 60: 487-494. 10.1128/AAC.01830-15</bibtext> </blist> <blist> <bibtext> Keutzer L., Simonsson U.S.H. Individualized Dosing with High Inter-Occasion Variability Is Correctly Handled With Model-Informed Precision Dosing—Using Rifampicin as an Example. Front. Pharmacol. 2020; 11: 794. 10.3389/fphar.2020.00794</bibtext> </blist> <blist> <bibtext> Barrett J.S., Fossler M.J., Cadieu K.D., Gastonguay M.R. Pharmacometrics: A Multidisciplinary Field to Facilitate Critical Thinking in Drug Development and Translational Research Settings. J. Clin. Pharmacol. 2008; 48: 632-649. 10.1177/0091270008315318</bibtext> </blist> <blist> <bibtext> Trivedi A., Lee R., Meibohm B. Applications of pharmacometrics in the clinical development and pharmacotherapy of anti-infectives. 
Expert Rev. Clin. Pharmacol. 2013; 6: 159-170. 10.1586/ecp.13.6</bibtext> </blist> <blist> <bibtext> Meibohm B., Derendorf H. Pharmacokinetic/Pharmacodynamic Studies in Drug Product Development. J. Pharm. Sci. 2002; 91: 18-31. 10.1002/jps.1167</bibtext> </blist> <blist> <bibtext> Romero K., Corrigan B., Tornoe C.W., Gobburu J.V., Danhof M., Gillespie W.R., Gastonguay M.R., Meibohm B., Derendorf H. Pharmacometrics as a discipline is entering the "industrialization" phase: Standards, automation, knowledge sharing, and training are critical for future success. J. Clin. Pharmacol. 2010; 50: 9S. 10.1177/0091270010377788</bibtext> </blist> <blist> <bibtext> Marshall S., Madabushi R., Manolis E., Krudys K., Staab A., Dykstra K., Visser S.A. Model-Informed Drug Discovery and Development: Current Industry Good Practice and Regulatory Expectations and Future Perspectives. CPT Pharmacomet. Syst. Pharmacol. 2019; 8: 87-96. 10.1002/psp4.12372</bibtext> </blist> <blist> <bibtext> Van Wijk R.C., Ayoun Alsoud R., Lennernäs H., Simonsson U.S.H. Model-Informed Drug Discovery and Development Strategy for the Rapid Development of Anti-Tuberculosis Drug Combinations. Appl. Sci. 2020; 10: 2376. 10.3390/app10072376</bibtext> </blist> <blist> <bibtext> Stone J.A., Banfield C., Pfister M., Tannenbaum S., Allerheiligen S., Wetherington J.D., Krishna R., Grasela D.M. Model-Based Drug Development Survey Finds Pharmacometrics Impacting Decision Making in the Pharmaceutical Industry. J. Clin. Pharmacol. 2010; 50: 20S-30S. 10.1177/0091270010377628</bibtext> </blist> <blist> <bibtext> Pfister M., D'Argenio D.Z. The Emerging Scientific Discipline of Pharmacometrics. J. Clin. Pharmacol. 2010; 50: 6S. 10.1177/0091270010377789</bibtext> </blist> <blist> <bibtext> Wang Y., Zhu H., Madabushi R., Liu Q., Huang S., Zineh I. Model-Informed Drug Development: Current US Regulatory Practice and Future Considerations. Clin. Pharmacol. Ther. 2019; 105: 899-911. 
10.1002/cpt.1363</bibtext> </blist> <blist> <bibtext> Lindstrom M.J., Bates D. Nonlinear Mixed Effects Models for Repeated Measures Data. Biometrics. 1990; 46: 673-687. 10.2307/2532087. 2242409</bibtext> </blist> <blist> <bibtext> Sheiner L.B., Rosenberg B., Melmon K.L. Modelling of individual pharmacokinetics for computer-aided drug dosage. Comput. Biomed. Res. 1972; 5: 441-459. 10.1016/0010-4809(72)90051-1</bibtext> </blist> <blist> <bibtext> Bauer R.J. NONMEM Tutorial Part II: Estimation Methods and Advanced Examples. CPT Pharmacomet. Syst. Pharmacol. 2019; 8: 538-556. 10.1002/psp4.12422</bibtext> </blist> <blist> <bibtext> Bauer R.J. NONMEM Tutorial Part I: Description of Commands and Options, With Simple Examples of Population Analysis. CPT Pharmacomet. Syst. Pharmacol. 2019; 8: 525-537. 10.1002/psp4.12404</bibtext> </blist> <blist> <bibtext> Gieschke R., Steimer J.-L. Pharmacometrics: Modelling and simulation tools to improve decision making in clinical drug development. Eur. J. Drug Metab. Pharmacokinet. 2000; 25: 49-58. 10.1007/BF03190058</bibtext> </blist> <blist> <bibtext> Rajman I. PK/PD modelling and simulations: Utility in drug development. Drug Discov. Today. 2008; 13: 341-346. 10.1016/j.drudis.2008.01.003</bibtext> </blist> <blist> <bibtext> Chien J.Y., Friedrich S., Heathman M.A., De Alwis D.P., Sinha V. Pharmacokinetics/pharmacodynamics and the stages of drug development: Role of modeling and simulation. AAPS J. 2005; 7: E544-E559. 10.1208/aapsj070355. 16353932</bibtext> </blist> <blist> <bibtext> Svensson R.J., Svensson E.M., Aarnoutse R.E., Diacon A.H., Dawson R., Gillespie S., Moodley M., Boeree M.J., Simonsson U.S.H. Greater Early Bactericidal Activity at Higher Rifampicin Doses Revealed by Modeling and Clinical Trial Simulations. J. Infect. Dis. 2018; 218: 991-999. 10.1093/infdis/jiy242. 29718390</bibtext> </blist> <blist> <bibtext> Maloney A., Karlsson M.O., Simonsson U.S.H. 
Optimal Adaptive Design in Clinical Drug Development: A Simulation Example. J. Clin. Pharmacol. 2007; 47: 1231-1243. 10.1177/0091270007308033. 17906158</bibtext> </blist> <blist> <bibtext> Bonate P.L. Clinical Trial Simulation in Drug Development. Pharm. Res. 2000; 17: 252-256. 10.1023/A:1007548719885</bibtext> </blist> <blist> <bibtext> Beal S., Sheiner L., Boeckmann A., Bauer R. NONMEM 7.4 Users Guides [Internet]; ICON plc: Gaithersburg, MD, USA. 1989. Available online: https://nonmem.iconplc.com/nonmem743/guides (accessed on 21 March 2022)</bibtext> </blist> <blist> <bibtext> Beal S.L., Sheiner L.B. Estimating population kinetics. Crit. Rev. Biomed. Eng. 1982; 8: 195-222</bibtext> </blist> <blist> <bibtext> Karlsson M.O., Holford N.H. A Tutorial on Visual Predictive Checks. Available online: <ulink href="http://www.page-meeting.org/?abstract=1434">www.page-meeting.org/?abstract=1434</ulink> (accessed on 21 March 2022)</bibtext> </blist> <blist> <bibtext> Holford N.H. The Visual Predictive Check – Superiority to Standard Diagnostic (Rorschach) Plots. Available online: <ulink href="http://www.page-meeting.org/?abstract=738">www.page-meeting.org/?abstract=738</ulink> (accessed on 21 March 2022)</bibtext> </blist> <blist> <bibtext> Post T.M., Freijer J.I., Ploeger B.A., Danhof M. Extensions to the Visual Predictive Check to facilitate model performance evaluation. J. Pharmacokinet. Pharmacodyn. 2008; 35: 185-202. 10.1007/s10928-007-9081-1</bibtext> </blist> <blist> <bibtext> Nguyen T.H.T., Mouksassi M.-S., Holford N., Al-Huniti N., Freedman I., Hooker A.C., John J., Karlsson M.O., Mould D.R., Pérez Ruixo J.J. Model Evaluation of Continuous Data Pharmacometric Models: Metrics and Graphics. CPT Pharmacomet. Syst. Pharmacol. 2017; 6: 87-109. 10.1002/psp4.12161</bibtext> </blist> <blist> <bibtext> Keizer R.J., Karlsson M.O., Hooker A. Modeling and Simulation Workbench for NONMEM: Tutorial on Pirana, PsN, and Xpose. CPT Pharmacomet. Syst. Pharmacol. 
2013; 2: e50. 10.1038/psp.2013.24</bibtext> </blist> <blist> <bibtext> Reichstein M., Camps-Valls G., Stevens B., Jung M., Denzler J., Carvalhais N., Prabhat. Deep learning and process understanding for data-driven Earth system science. Nature. 2019; 566: 195-204. 10.1038/s41586-019-0912-1</bibtext> </blist> <blist> <bibtext> Talevi A., Morales J.F., Hather G., Podichetty J.T., Kim S., Bloomingdale P.C., Kim S., Burton J., Brown J.D., Winterstein A.G. Machine Learning in Drug Discovery and Development Part 1: A Primer. CPT Pharmacomet. Syst. Pharmacol. 2020; 9: 129-142. 10.1002/psp4.12491</bibtext> </blist> <blist> <bibtext> Nasteski V. An overview of the supervised machine learning methods. Horizons B. 2017; 4: 51-62. 10.20544/HORIZONS.B.04.1.17.P05</bibtext> </blist> <blist> <bibtext> Lee S., El Naqa I. Machine Learning Methodology. In Machine Learning in Radiation Oncology: Theory and Applications; El Naqa I., Li R., Murphy M.J., Eds.; Springer International Publishing: Cham, Switzerland. 2015: 21-39. 10.1007/978-3-319-18305-3_3</bibtext> </blist> <blist> <bibtext> Van Engelen J.E., Hoos H.H. A survey on semi-supervised learning. Mach. Learn. 2020; 109: 373-440. 10.1007/s10994-019-05855-6</bibtext> </blist> <blist> <bibtext> Vamathevan J., Clark D., Czodrowski P., Dunham I., Ferran E., Lee G., Li B., Madabhushi A., Shah P., Spitzer M. Applications of machine learning in drug discovery and development. Nat. Rev. Drug Discov. 2019; 18: 463-477. 10.1038/s41573-019-0024-5</bibtext> </blist> <blist> <bibtext> Koromina M., Pandi M.-T., Patrinos G.P. Rethinking Drug Repositioning and Development with Artificial Intelligence, Machine Learning, and Omics. Omics J. Integr. Biol. 2019; 23: 539-548. 10.1089/omi.2019.0151. 31651216</bibtext> </blist> <blist> <bibtext> Ekins S., Puhl A.C., Zorn K.M., Lane T.R., Russo D.P., Klein J.J., Hickey A.J., Clark A.M. Exploiting machine learning for end-to-end drug discovery and development. Nat. Mater. 2019; 18: 435-441. 
10.1038/s41563-019-0338-z. 31000803</bibtext> </blist> <blist> <bibtext> Russell S., Norvig P. Artificial Intelligence: A Modern Approach, Global Edition (ISBN 9781292153964). Adlibris Bokhandel [Internet]. Available online: https://<ulink href="http://www.adlibris.com/se/bok/artificial-intelligence-a-modern-approach-global-edition-9781292153964?gclid=Cj0KCQjwsLWDBhCmARIsAPSL3%5f18T0hHwvmO8ajpXmAiu3d9il07p7BqlK%5foSHqol6BHokjL-OXZ1TkaAurjEALw%5fwcB">www.adlibris.com/se/bok/artificial-intelligence-a-modern-approach-global-edition-9781292153964?gclid=Cj0KCQjwsLWDBhCmARIsAPSL3%5f18T0hHwvmO8ajpXmAiu3d9il07p7BqlK%5foSHqol6BHokjL-OXZ1TkaAurjEALw%5fwcB</ulink> (accessed on 7 April 2021)</bibtext> </blist> <blist> <bibtext> Ripley B.D. Pattern Recognition and Neural Networks; Cambridge University Press: Cambridge, UK. 1996. Available online: https://<ulink href="http://www.cambridge.org/core/books/pattern-recognition-and-neural-networks/4E038249C9BAA06C8F4EE6F044D09C5C">www.cambridge.org/core/books/pattern-recognition-and-neural-networks/4E038249C9BAA06C8F4EE6F044D09C5C</ulink> (accessed on 7 April 2021)</bibtext> </blist> <blist> <bibtext> Baştanlar Y., Özuysal M. Introduction to Machine Learning. In miRNomics: MicroRNA Biology and Computational Analysis; Yousef M., Allmer J., Eds.; Humana Press: Totowa, NJ, USA. 2014: 105-128. 10.1007/978-1-62703-748-8_7</bibtext> </blist> <blist> <bibtext> Hutmacher M.M., Kowalski K.G. Covariate selection in pharmacometric analyses: A review of methods. Br. J. Clin. Pharmacol. 2015; 79: 132-147. 10.1111/bcp.12451</bibtext> </blist> <blist> <bibtext> Liu Y., Chen P.-H., Krause J., Peng L. How to Read Articles That Use Machine Learning: Users' Guides to the Medical Literature. JAMA. 2019; 322: 1806-1816. 10.1001/jama.2019.16489</bibtext> </blist> <blist> <bibtext> Alajmi M.S., Almeshal A.M. Predicting the Tool Wear of a Drilling Process Using Novel Machine Learning XGBoost-SDA. Materials. 2020; 13: 4952. 
10.3390/ma13214952</bibtext> </blist> <blist> <bibtext> Hyperparameter Optimization in Machine Learning [Internet]. DataCamp Community. 2018. Available online: https://<ulink href="http://www.datacamp.com/community/tutorials/parameter-optimization-machine-learning-models">www.datacamp.com/community/tutorials/parameter-optimization-machine-learning-models</ulink> (accessed on 7 April 2021)</bibtext> </blist> <blist> <bibtext> Yang L., Shami A. On hyperparameter optimization of machine learning algorithms: Theory and practice. Neurocomputing. 2020; 415: 295-316. 10.1016/j.neucom.2020.07.061</bibtext> </blist> <blist> <bibtext> Polikar R. Ensemble Learning. In Ensemble Machine Learning: Methods and Applications; Zhang C., Ma Y., Eds.; Springer: Boston, MA, USA. 2012: 1-34. 10.1007/978-1-4419-9326-7_1</bibtext> </blist> <blist> <bibtext> Hjort N.L., Claeskens G. Frequentist Model Average Estimators. J. Am. Stat. Assoc. 2003; 98: 879-899. 10.1198/016214503000000828</bibtext> </blist> <blist> <bibtext> Chawla N.V., Bowyer K.W., Hall L.O., Kegelmeyer W.P. SMOTE: Synthetic Minority Over-sampling Technique. J. Artif. Intell. Res. 2002; 16: 321-357. 10.1613/jair.953</bibtext> </blist> <blist> <bibtext> Sheiner L.B., Beal S.L. Bayesian Individualization of Pharmacokinetics: Simple Implementation and Comparison with Non-Bayesian Methods. J. Pharm. Sci. 1982; 71: 1344-1348. 10.1002/jps.2600711209. 7153881</bibtext> </blist> <blist> <bibtext> Van Beek S.W., Ter Heine R., Keizer R.J., Magis-Escurra C., Aarnoutse R.E., Svensson E.M. Personalized Tuberculosis Treatment Through Model-Informed Dosing of Rifampicin. Clin. Pharmacokinet. 2019; 58: 815-826. 10.1007/s40262-018-00732-2</bibtext> </blist> <blist> <bibtext> Boeree M.J., Diacon A.H., Dawson R., Narunsky K., du Bois J., Venter A., Phillips P.P.J., Gillespie S.H., McHugh T.D., Hoelscher M. A Dose-Ranging Trial to Optimize the Dose of Rifampin in the Treatment of Tuberculosis. Am. J. Respir. Crit. Care Med. 2015; 191: 1058-1065. 
10.1164/rccm.201407-1264OC. 25654354</bibtext> </blist> <blist> <bibtext> Sturkenboom M.G.G., Mulder L.W., De Jager A., Van Altena R., Aarnoutse R.E., De Lange W.C.M., Proost J.H., Kosterink J.G.W., Van Der Werf T.S., Jan-Willem C. Pharmacokinetic Modeling and Optimal Sampling Strategies for Therapeutic Drug Monitoring of Rifampin in Patients with Tuberculosis. Antimicrob. Agents Chemother. 2015; 59: 4907-4913. 10.1128/AAC.00756-15</bibtext> </blist> <blist> <bibtext> Wilkins J. Package 'Pmxtools' [Internet]. 2020. Available online: https://github.com/kestrel99/pmxTools (accessed on 21 March 2022)</bibtext> </blist> <blist> <bibtext> Polley E. SuperLearner: Super Learner Prediction [Internet]. 2019. Available online: https://CRAN.R-project.org/package=SuperLearner (accessed on 21 March 2022)</bibtext> </blist> <blist> <bibtext> R Development Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria. 2013. Available online: https://<ulink href="http://www.R-project.org/">www.R-project.org/</ulink> (accessed on 21 March 2022)</bibtext> </blist> <blist> <bibtext> Bedding A., Scott G., Brayshaw N., Leong L., Herrero-Martinez E., Looby M., Lloyd P. Clinical Trial Simulations – An Essential Tool in Drug Development. Available online: https://<ulink href="http://www.abpi.org.uk/publications/clinical-trial-simulations-an-essential-tool-in-drug-development/">www.abpi.org.uk/publications/clinical-trial-simulations-an-essential-tool-in-drug-development/</ulink> (accessed on 21 March 2022)</bibtext> </blist> </ref> <aug> <p>By Lina Keutzer; Huifang You; Ali Farnoud; Joakim Nyberg; Sebastian G. Wicha; Gareth Maher-Edwards; Georgios Vlasakakis; Gita Khalili Moghaddam; Elin M. Svensson; Michael P. Menden and Ulrika S. H. 
Simonsson</p> </aug>
Document Type:	article
File Description:	electronic resource
Language:	English
ISSN:	1999-4923
Relation:	https://www.mdpi.com/1999-4923/14/8/1530; https://doaj.org/toc/1999-4923
DOI:	10.3390/pharmaceutics14081530
Access URL:	https://doaj.org/article/25a790d93aeb473e9a0152c76324f8f0
Accession Number:	edsdoj.25a790d93aeb473e9a0152c76324f8f0