A spatiotemporal recommendation engine for malaria control

4.3. Global utility function

The global utility is defined as the sum of all local utilities minus a penalty on differences in resource allocation between neighboring health zones:

$$U_G(\mathbf{a}_t,\mathbf{s}_t; \boldsymbol{\alpha})=\sum_{l}U(a_{lt},p_{lt}(\mathbf{s}_t, \boldsymbol{\alpha}))-\alpha_0\sum_{i\sim j}(a_{it}-a_{jt})^2,$$

where |$\alpha_0\geq0$|, |$\boldsymbol{\alpha}=(\alpha_0,\dots,\alpha_q)$|, and |$i\sim j$| indicates that zone |$i$| and zone |$j$| are neighbors. A positive penalty weight smooths the resource allocation across neighboring zones and accounts for fairness. A large |$\alpha_0$| encourages spatial clusters of zones to receive intensive treatment together, which could be more effective than treating scattered sites intensively. Because the weight of the penalty term is itself optimized, this additional term allows a more flexible class of policies to be considered.
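As a minimal sketch of how the global utility could be evaluated, the Python below sums the local utilities and subtracts the neighbor penalty. The local utility is defined earlier in the paper and is not reproduced in this section, so `local_utility` (taken here as `a * p`) is a placeholder assumption, and all function and argument names are hypothetical.

```python
def local_utility(a, p):
    """Placeholder local utility (an assumption, not the paper's definition)."""
    return a * p


def global_utility(a, p, neighbor_pairs, alpha0, utility=local_utility):
    """Sum of local utilities minus alpha0 times squared neighbor differences.

    a: allocations a_lt in [0, 1], one per zone
    p: priority scores p_lt, one per zone
    neighbor_pairs: (i, j) index pairs with i ~ j, each pair listed once
    alpha0: non-negative penalty weight
    """
    if alpha0 < 0:
        raise ValueError("alpha0 must be non-negative")
    total = sum(utility(a_l, p_l) for a_l, p_l in zip(a, p))
    penalty = sum((a[i] - a[j]) ** 2 for i, j in neighbor_pairs)
    return total - alpha0 * penalty


# Three zones on a line: 0 ~ 1 and 1 ~ 2.
pairs = [(0, 1), (1, 2)]
scores = [0.8, 0.5, 0.2]
clustered = global_utility([1.0, 1.0, 0.0], scores, pairs, alpha0=0.1)
scattered = global_utility([1.0, 0.0, 1.0], scores, pairs, alpha0=0.1)
```

In this toy example the scattered allocation pays the penalty on both neighbor pairs, while the clustered allocation pays it on only one.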

The RE suggests allocations that maximize the global utility subject to the resource constraints on the total number of bednets distributed to the whole region:

$$\pi(\mathbf{s}_t; \boldsymbol{\alpha})=\underset{\mathbf{a}_{t}}{\arg\max}\text{ }U_G(\mathbf{a}_t,\mathbf{s}_t; \boldsymbol{\alpha}) \quad\text{such that } \sum_{l}a_{lt}N_{lt}/\Big(\sum_{l}N_{lt}\Big)\leq \mathcal{C}\text{ and }0\leq a_{lt}\leq 1. \tag{4.4}$$

Other nuances in the resource allocation decision making can be easily taken into consideration by adding more constraints to (4.4). For example, if it is agreed that allocating bednets to zones with malaria prevalence |$<1$|% is ill-advised, we can add a constraint that |$a_{lt}=0$| if |$z_{lt}<0.01$|⁠.
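To make the constraints in (4.4) concrete, the sketch below solves a simplified special case: no spatial penalty (|$\alpha_0=0$|) and a placeholder linear utility `a * p` with nonnegative scores, both of which are assumptions made only for illustration. Under those assumptions the problem is a fractional knapsack, so a greedy fill is optimal. The general problem with the penalty term needs a numerical constrained optimizer. The optional prevalence cutoff mirrors the extra constraint described above. All names are hypothetical.

```python
def greedy_allocation(p, N, C, z=None, min_prevalence=None):
    """Maximize sum(a_l * p_l) subject to the constraints in (4.4).

    Assumes alpha0 = 0 and a placeholder linear utility a * p with p >= 0.
    p: priority scores; N: zone populations; C: coverage budget in [0, 1]
    z, min_prevalence: optional prevalences and the cutoff below which a_l = 0
    """
    n = len(p)
    budget = C * sum(N)  # number of people that can be covered
    a = [0.0] * n
    # Rank zones by utility gained per person covered.
    order = sorted(range(n), key=lambda l: p[l] / N[l], reverse=True)
    for l in order:
        if budget <= 0:
            break
        if z is not None and min_prevalence is not None and z[l] < min_prevalence:
            continue  # extra constraint: a_l = 0 for low-prevalence zones
        a[l] = min(1.0, budget / N[l])
        budget -= a[l] * N[l]
    return a


def coverage(a, N):
    """Population-weighted coverage, the left side of the budget constraint."""
    return sum(a_l * N_l for a_l, N_l in zip(a, N)) / sum(N)


alloc = greedy_allocation([0.9, 0.2, 0.6], [100, 100, 200], C=0.5)
```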

5. Policy search

The optimal priority score weights |$\boldsymbol{\alpha}$| minimize an optimality criterion |$L(\boldsymbol{\alpha})$|, such as the estimated expected cumulative (over space and time) malaria prevalence over the next 5 years. Therefore, although the policy gives the resource allocation for only one time point, it optimizes long-term outcomes. Given the posterior samples of the parameters in the Bayesian spatiotemporal model, future malaria prevalence can be simulated to construct an estimator |$\tilde{L}(\boldsymbol{\alpha})$| of the expected cumulative prevalence over space and time. The plug-in estimator of the optimal weights is |$\tilde{\boldsymbol{\alpha}}_{\rm opt}=\underset{\boldsymbol{\alpha}}{\arg\min}\text{ }{\tilde{L}(\boldsymbol{\alpha})}$|. For numerical stability, we replace |$\tilde{L}(\boldsymbol{\alpha})$| with |$\hat{L}(\boldsymbol{\alpha})=\tilde{L}(\boldsymbol{\alpha})+0.0001\sum_{i=0}^{q}\alpha_i^2$|. Then |$\widehat{\boldsymbol{\alpha}}_{\rm opt}$| is defined as the minimizer of |$\hat{L}(\boldsymbol{\alpha})$|.
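The two estimators can be sketched as follows. The Monte Carlo average stands in for |$\tilde{L}(\boldsymbol{\alpha})$| and the ridge term gives |$\hat{L}(\boldsymbol{\alpha})$|. The simulator is passed in as a function because the disease model itself is not reproduced here. `toy_simulator` is a purely illustrative stand-in, and every name in the block is hypothetical.

```python
def monte_carlo_loss(simulate_loss, alpha, posterior_draws):
    """Estimate L-tilde: average simulated loss, one trajectory per draw."""
    losses = [simulate_loss(alpha, theta) for theta in posterior_draws]
    return sum(losses) / len(losses)


def regularized_loss(l_tilde, alpha, ridge=1e-4):
    """L-hat = L-tilde + 0.0001 * sum of squared weights (alpha_0..alpha_q)."""
    return l_tilde + ridge * sum(a_i ** 2 for a_i in alpha)


def toy_simulator(alpha, theta):
    """Stand-in for a disease simulator (illustrative only, not the model)."""
    return theta + 0.01 * (alpha[1] - 1.0) ** 2


draws = [0.13, 0.14, 0.15]   # pretend posterior draws
alpha = [0.1, 1.0, 2.0]      # (alpha_0, alpha_1, alpha_2)
l_tilde = monte_carlo_loss(toy_simulator, alpha, draws)
l_hat = regularized_loss(l_tilde, alpha)
```

Because the ridge coefficient is small, the penalty shifts the minimizer only slightly while discouraging very large weights, which is the apparent purpose of the stabilizing term.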

As in the optimal allocation strategy estimation method of Laber and others (2018), we draw samples from the posterior distribution over the postulated system dynamics model and use simulation to estimate the allocation strategy that optimizes the long-term outcome. Unlike the setting in Laber and others (2018), where the decision space is a binary vector, the action space in our setting is a high-dimensional continuous space. We use a Kriging-based optimization method (Picheny and others, 2013), which balances exploration and exploitation, together with simulation to estimate the |$\boldsymbol{\alpha}$| that minimizes |$\hat{L}(\boldsymbol{\alpha})$|. In the first step, we evaluate the loss at |$100$| initial points of |$\boldsymbol{\alpha}$| from a Latin hypercube design generated using the “optimumLHS” function in the R package “lhs.” We apply a linear transformation to the initial design so that |$\alpha_1,\dots,\alpha_q\in[-5,5]$| and |$\alpha_0\in[0,1]$|. The value |$\hat{L}_k$| corresponding to each design point |$\boldsymbol{\alpha}_k$| is estimated by Monte Carlo simulation given the posterior samples of the parameters. Because each simulated trajectory uses a different posterior sample of the model parameters, the trajectories are samples from the full posterior predictive distribution of future prevalence, and thus our policy accounts for both parametric and aleatoric uncertainty. Using the initial training data |$\{\boldsymbol{\alpha}_k,\hat{L}_k\}$|, a Gaussian process regression model is fit to predict the value corresponding to a new |$\boldsymbol{\alpha}$| using kriging. The algorithm sequentially selects the next weight |$\boldsymbol{\alpha}$| to visit as the one that maximizes the expected improvement (EI) defined in Jones and others (1998) and updates the Gaussian process model parameters at each iteration. We use the R package “DiceOptim” (Roustant and others, 2012) to implement the optimization procedure.
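Two ingredients of this procedure can be sketched in plain Python. The first is a random Latin hypercube design rescaled to the stated ranges. It is not the optimized design that “optimumLHS” produces. The second is the closed-form expected improvement for minimization at a candidate point, given a kriging predictive mean and standard deviation that are assumed to come from an already fitted Gaussian process. The block does not reproduce the R packages' interfaces, and all names are hypothetical.

```python
import math
import random


def latin_hypercube(n_points, n_dims, rng):
    """Random Latin hypercube on [0, 1]^n_dims (not the optimized design)."""
    design = [[0.0] * n_dims for _ in range(n_points)]
    for d in range(n_dims):
        strata = list(range(n_points))
        rng.shuffle(strata)  # one point per stratum in every dimension
        for k in range(n_points):
            design[k][d] = (strata[k] + rng.random()) / n_points
    return design


def rescale(design):
    """Keep column 0 as alpha_0 in [0, 1]; map the rest to [-5, 5]."""
    return [[row[0]] + [10.0 * u - 5.0 for u in row[1:]] for row in design]


def expected_improvement(mu, sigma, best):
    """EI for minimization: E[max(best - Y, 0)] with Y ~ N(mu, sigma^2)."""
    if sigma <= 0.0:
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best - mu) * cdf + sigma * pdf


rng = random.Random(0)
# 100 initial points for (alpha_0, alpha_1, alpha_2, alpha_3), i.e. q = 3.
initial_design = rescale(latin_hypercube(100, 4, rng))
```

EI is large where the predicted loss is low (exploitation) or the predictive uncertainty is high (exploration), which is the balance described above.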

At each of the future years, if new data are available, the Bayesian spatiotemporal model can be refitted and the posterior distribution of the parameters can be updated. The resource allocation decision for the future years can be made by reoptimizing the priority score weights based on the updated posterior samples.

Each posterior sample of the parameters and predicted malaria prevalence in the Markov chain Monte Carlo (MCMC) procedure corresponds to an optimal |$ \boldsymbol{\alpha}$|⁠. We can quantify the uncertainty of |$\hat{ \boldsymbol{\alpha}}_{\rm opt}$| by applying the optimization procedure for each posterior draw to get the posterior distribution for |$ \boldsymbol{\alpha}_{\rm opt}$|⁠.

6. Simulation

6.1. Generative model

For the simulation, we assume that there are |$n=100$| health zones arranged as a |$10\times10$| square grid with grid spacing 1 between adjacent sites. We include only one environmental covariate |$X_{l1}$|, simulated from a Gaussian process with mean zero, variance one, and correlation |${\rm Cor}(X_{l1},X_{j1})=\exp(-d_{lj}/2)$|, where |$d_{lj}$| is the distance between the centroids of zone |$l$| and zone |$j$|. We simulate the baseline latent process for the health zones from a Gaussian process such that |$\boldsymbol{\eta}_0=(\eta_{10},\dots,\eta_{n0})^T\sim MVN(0,0.5^2(M-0.9G)^{-1})$|, and the corresponding logit transformations of the disease rates are simulated from |$Y_{l0}\sim N(\eta_{l0},0.01^2)$| for |$l=1,\dots,n$|. We simulate the disease spread for |$T=5$| years following the generative model such that, for |$l=1,\dots,100$| and |$t=1,\dots,5$|, |$\eta_{lt}=(0.9-0.1A_{lt})\eta_{lt-1}+(0.1-0.1A_{lt})/m_l\sum_{j\in N_l}\eta_{jt-1}+0.2-0.7A_{lt}+0.12X_{l1}-0.1X_{l1}A_{lt}+\epsilon_{lt}$|, where |$\boldsymbol{\epsilon}_t\sim MVN(0,0.1^2(M-0.9 G)^{-1})$|. Resource allocation is assumed to increase over time (to mimic the real malaria data): the |$A_{lt}$| are simulated from |$N(0.1t,0.05^2)$| and truncated at |$0$| and |$1$|. The disease rates for the 5 years are simulated from |$Z_{lt}\sim N(Y_{lt},0.01^2)$| for |$l=1,\dots,n$| and |$t=1,\dots,5$|. We simulate |$100$| data sets using this generative model.
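The deterministic part of the generative update can be sketched as below. The neighbor structure of the grid is not spelled out in this section, so rook adjacency (sharing an edge) is an assumption. The spatially correlated noise |$\boldsymbol{\epsilon}_t$| is passed in rather than sampled, since drawing it requires a linear-algebra library. All function names are hypothetical.

```python
def grid_neighbors(n_rows, n_cols):
    """Neighbor lists for a grid under rook adjacency (an assumption)."""
    nbrs = []
    for r in range(n_rows):
        for c in range(n_cols):
            cell = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    cell.append(rr * n_cols + cc)
            nbrs.append(cell)
    return nbrs


def step_eta(eta_prev, A, X, nbrs, eps):
    """One year of the latent-process update eta_{t-1} -> eta_t.

    eta_prev: latent values at t-1; A: allocations at t in [0, 1]
    X: environmental covariate; eps: noise draws supplied by the caller
    """
    eta = []
    for l, nb in enumerate(nbrs):
        nb_mean = sum(eta_prev[j] for j in nb) / len(nb)  # (1/m_l) * sum over N_l
        eta.append(
            (0.9 - 0.1 * A[l]) * eta_prev[l]
            + (0.1 - 0.1 * A[l]) * nb_mean
            + 0.2 - 0.7 * A[l]
            + 0.12 * X[l] - 0.1 * X[l] * A[l]
            + eps[l]
        )
    return eta


nbrs = grid_neighbors(10, 10)
zeros = [0.0] * 100
eta_next = step_eta([1.0] * 100, zeros, zeros, nbrs, zeros)
```

With no allocation, no covariate effect, and no noise, a zone at |$\eta=1$| surrounded by zones at |$\eta=1$| moves to |$0.9+0.1+0.2=1.2$|, which shows the upward drift that the allocation terms counteract.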

6.2. Policy estimation

We consider three risk factors in the policy: the environmental covariate |$f_{1lt}=X_{l1}$|, the logit of the disease rate at the previous time point |$f_{2lt}=Y_{lt-1}$|, and the mean logit of the disease rates of neighboring zones at the previous time point |$f_{3lt}=\sum_{j\sim l}Y_{jt-1}/m_l$|. The priority score of zone |$l$| at time |$t$| is then |$1/\{1+\exp[-(\alpha_1X_{l1}+\alpha_2 Y_{lt-1}+\alpha_3\sum_{j\sim l}Y_{jt-1}/m_l)]\}$|. We assume that the number of individuals is the same in every health zone, which gives the constraint |$\frac{1}{n}\sum_lA_{lt}\leq \mathcal{C}$|. We consider three scenarios with resource constraint levels |$\mathcal{C}=0.2,0.5,$| and |$0.8$|.
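The priority score is a logistic transform of a weighted sum of the risk factors. A minimal sketch follows, with hypothetical function names. Here `alpha` holds only the risk-factor weights |$(\alpha_1,\alpha_2,\alpha_3)$|, not the penalty weight |$\alpha_0$|.

```python
import math


def risk_factors(l, X, Y_prev, nbrs):
    """f_1, f_2, f_3 for zone l: covariate, own logit rate, neighbor mean."""
    nb_mean = sum(Y_prev[j] for j in nbrs[l]) / len(nbrs[l])
    return [X[l], Y_prev[l], nb_mean]


def priority_score(alpha, factors):
    """Logistic transform of the weighted risk factors for one zone."""
    linear = sum(a_k * f_k for a_k, f_k in zip(alpha, factors))
    return 1.0 / (1.0 + math.exp(-linear))


# Three zones on a line; zone 1 has two neighbors.
line_nbrs = [[1], [0, 2], [1]]
X = [0.5, 0.0, -0.5]
Y_prev = [-1.0, 0.0, 1.0]
f_mid = risk_factors(1, X, Y_prev, line_nbrs)
score_mid = priority_score([1.0, 1.0, 1.0], f_mid)
```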

We consider the following resource allocation policies for comparison:

Linear utility (Linear): our proposed policy with linear local utility function and a spatial penalty term to smooth the resource allocation.
Quadratic utility (Quad): our proposed policy with quadratic local utility function and a spatial penalty term to smooth the resource allocation.
Highest rate (Highest_rate): assign a bednet to each individual in the |$n\mathcal{C}$| zones with highest disease rates and no bednets to the remaining zones.
Even: assign the same percentage of bednets |$\mathcal{C}$| to each health zone.
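The two naive baselines are simple enough to state in a few lines. The sketch below assumes equal zone populations, as in the simulation, and rounds |$n\mathcal{C}$| down when it is not an integer. That rounding rule is an assumption. The function names are hypothetical.

```python
def even_policy(n_zones, C):
    """Same coverage C in every zone."""
    return [C] * n_zones


def highest_rate_policy(rates, C):
    """Full coverage for the n*C zones with the highest rates, none elsewhere."""
    n = len(rates)
    k = int(n * C + 1e-9)  # round n*C down, guarding against float error
    ranked = sorted(range(n), key=lambda l: rates[l], reverse=True)
    a = [0.0] * n
    for l in ranked[:k]:
        a[l] = 1.0
    return a


rates = [0.1, 0.5, 0.3, 0.2]
a_hr = highest_rate_policy(rates, C=0.5)
a_ev = even_policy(len(rates), C=0.5)
```

Both allocations use the same total budget. They differ only in how concentrated the coverage is.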

For each simulated data set, we use all information simulated up to year |$T=5$| as the training data to fit our proposed Bayesian spatiotemporal model using MCMC sampling with |$5000$| iterations. For the first two policies, the optimality criterion |$L(\boldsymbol{\alpha})$| for each |$\boldsymbol{\alpha}$| is estimated using Monte Carlo simulations given the posterior samples. Sequential optimization is used to estimate the optimal policy, within the pre-specified class, that minimizes the expected mean malaria prevalence over the next 5 years. The loss value associated with each estimated policy is approximated by |$1000$| Monte Carlo simulations from the true generative model. For the last two policies, there is no model to fit and there are no parameters to estimate; we simply use |$1000$| Monte Carlo simulations from the true generative model to approximate the expected mean malaria prevalence over the next 5 years under the policy. The approximated loss values associated with the four policies for the |$i$|th simulated data set are denoted |$L_{l}^i,L_{q}^i,L_{hr}^i,L_{ev}^i$|, respectively.

We use the two naive policies “Highest_rate” and “Even” as baselines and report the improvement of our proposed policies in terms of loss value relative to each baseline. For each simulated data set, we compute the improvement as |$(L_{hr}^i-L_{l}^i)/L_{hr}^i$|, |$(L_{ev}^i-L_{l}^i)/L_{ev}^i$|, |$(L_{hr}^i-L_{q}^i)/L_{hr}^i$|, and |$(L_{ev}^i-L_{q}^i)/L_{ev}^i$|. Figure 2 plots the sampling distribution of the improvement of our proposed policies with different utility functions. Under this simulation setting, the “Highest_rate” policy is preferred over the “Even” policy. Our proposed policies nevertheless show significant improvement over both naive policies, especially when there is a moderate level of total resources (|$\mathcal{C}=0.5$|). The policy with the linear utility function works slightly better than the policy with the quadratic utility function when the total resource level is low (|$\mathcal{C}=0.2$|). This indicates that more extreme resource allocation might improve the overall benefit under this specific simulation setting.

Fig. 2.

The improvement of the proposed policies with linear utility function or quadratic utility function compared with “Highest_rate” policy (left) and “Even” policy (right) with different resource constraints |$\mathcal{C}=0.2, 0.5$|⁠, and |$0.8$| when the model is correctly specified.

Supplemental materials available at Biostatistics online include an additional simulation study assuming the disease transmission model is misspecified to check the robustness of our method. From the simulation results, we can see under different simulation settings, our proposed policies are significantly better than naive policies and also consider fairness by allocating the resources more smoothly.

7. Application to the DRC data

We illustrate our method using data from the Democratic Republic of the Congo (DRC), primarily based on the Demographic and Health Surveys (DHS). DHS are cross-sectional, population-based cluster household surveys. In each survey, clusters are randomly chosen to be representative of the national population. Within each cluster, households are randomly selected to participate in the survey. Two DHS program surveys, one in 2007 and another in 2014, were conducted to study malaria prevalence and treatment allocations in DRC. In each survey, structured questionnaires were administered to selected households to collect malaria-related information, such as treatment status including bednet use. Dried blood spots were also collected to test malaria status.

Bhatt and others (2015) used the data from the DHS program surveys and five additional non-DHS program surveys to build a Bayesian hierarchical model and construct a malaria endemicity map across Africa from 2000 to 2015 in terms of the Plasmodium falciparum parasite rate (PfPR). Intervention coverage levels from 2000 to 2015, including ITN, IRS, and ACT, were also estimated. We download the surface data of PfPR and ITN for DRC from https://map.ox.ac.uk/country-profiles/#!/COD. In the surface data, PfPR and ITN rates are estimated at a 5 km by 5 km resolution. Using the smoothed surfaces as data may bias our parameter estimates, but they greatly expand the spatiotemporal coverage of our data, which is needed to build a disease progression model.

As malaria intervention resources are allocated at the health zone level in DRC, we map the surface data of PfPR (|$Y_{lt}$|) and ITN coverage rate (|$A_{lt}$|) to each of the 515 health zones in DRC by taking the average of the cells lying in each health zone as the PfPR or ITN coverage rate of that health zone. We assume that the populations of the 5 km by 5 km cells within a health zone are the same, so that the mean rate of the cells can be used as the rate of the health zone. As there was almost no ITN usage before 2007, we use only the mapped data from 2007 to 2015 to train the Bayesian spatiotemporal model. In the model, we include two environmental covariates (|$\mathbf{X}_1$| and |$\mathbf{X}_2$|): annual average temperature and annual average precipitation. Because both are relatively stable within a given area, we assume they are constant over time in each health zone. We download the monthly worldwide average temperature and precipitation data for 1970–2000 from http://worldclim.org/version2 and map them to the health zone level. We use the standardized mean annual average temperature and precipitation in 1970–2000 as the constant environmental covariate values in model training and prediction.
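The aggregation and standardization steps amount to a group-wise mean followed by centering and scaling. A minimal sketch follows, assuming each surface cell has already been assigned to a health zone. That spatial overlay step is not shown. The function names and the use of the sample standard deviation are assumptions.

```python
from collections import defaultdict


def zone_means(cell_values, cell_zone):
    """Unweighted mean of cell values per zone (equal cell populations)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, zone in zip(cell_values, cell_zone):
        sums[zone] += value
        counts[zone] += 1
    return {zone: sums[zone] / counts[zone] for zone in sums}


def standardize(values):
    """Center and scale to mean 0 and sample standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / (n - 1)) ** 0.5
    return [(v - mean) / sd for v in values]


pfpr_by_zone = zone_means([0.2, 0.4, 0.6], ["A", "A", "B"])
temp_std = standardize([1.0, 2.0, 3.0])
```

If cell populations differed within a zone, a population-weighted mean would be the natural replacement for the unweighted average used here.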

The model in (3.1), (3.2), and (3.3) is fitted to the collected and processed data. We run 5000 iterations of Gibbs sampling and discard the first 2000 as burn-in, leaving 3000 posterior samples. Table 1 summarizes the posterior means and 95% credible intervals of the model parameters. Temperature, previous disease status, and previous neighborhood disease status all have significant effects on disease spread, and the ITN intervention can significantly decrease the disease rate. Environmental covariates and previous disease status also interact with ITN in their effects on disease progression.
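The summaries reported in Table 1 can be computed from a parameter's chain as sketched below. The paper does not state which type of credible interval is used, so the equal-tailed interval and the linear-interpolation quantile are assumptions, and the names are hypothetical.

```python
def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def summarize_chain(chain, burn_in=2000):
    """Posterior mean, equal-tailed 95% interval, and whether it excludes 0."""
    kept = sorted(chain[burn_in:])  # drop burn-in, then sort for quantiles
    mean = sum(kept) / len(kept)
    lower, upper = quantile(kept, 0.025), quantile(kept, 0.975)
    excludes_zero = lower > 0.0 or upper < 0.0
    return mean, (lower, upper), excludes_zero


# A fake "chain" of 5000 values, only to show the mechanics.
chain = [float(i) for i in range(5000)]
mean, interval, star = summarize_chain(chain)
```

The `excludes_zero` flag corresponds to the star marker used in Table 1.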

Table 1.

Posterior means and 95% credible intervals for the model parameters. A “|$\star$|” marks a posterior mean whose |$95\%$| credible interval excludes zero.

Parameter	Mean	95% CI	Parameter	Mean	95% CI
Intercept (|$c_0$|)	−0.131|$^\star$|	(−0.192, −0.070)	|$\sigma_e^2$|	0.011|$^\star$|	(0.010, 0.012)
ITN (|$b_0$|)	−0.302|$^\star$|	(−0.366, −0.234)	|$\sigma_2^2$|	0.142|$^\star$|	(0.139, 0.146)
Temperature (|$\beta_{11}$|)	0.033|$^\star$|	(0.020, 0.045)	|$\rho$|	0.999|$^\star$|	(0.998, 0.999)
Precipitation (|$\beta_{12}$|)	0.003	(−0.012, 0.018)
Temperature*ITN (|$\beta_{21}$|)	−0.097|$^\star$|	(−0.125, −0.069)
Precipitation*ITN (|$\beta_{22}$|)	−0.053|$^\star$|	(−0.088, −0.019)
Previous (|$1+c_1$|)	0.917|$^\star$|	(0.902, 0.932)
Previous*ITN (|$b_1$|)	0.096|$^\star$|	(0.062, 0.129)
Prev_neighbor (|$c_2$|)	0.040|$^\star$|	(0.016, 0.065)
Prev_neighbor*ITN (|$b_2$|)	−0.092|$^\star$|	(−0.145, −0.039)

We consider four risk factors in the priority score: the standardized annual average temperature (|$f_{1lt}=X_{l1}$|), the standardized annual average precipitation (|$f_{2lt}=X_{l2}$|), the logit of the disease rate at the previous time point (|$f_{3lt}=Y_{lt-1}$|), and the mean logit of the disease rates of neighboring zones at the previous time point (|$f_{4lt}=(1/m_l)\sum_{j\sim l}Y_{jt-1}$|). We evaluate the policies with either the linear or the quadratic local utility function and use the resource constraint level |$\mathcal{C}=0.5$|.

We randomly draw 100 posterior samples of the parameters and estimate the optimal policy, in terms of |$\boldsymbol{\alpha}_{\rm opt}$|, corresponding to each posterior draw to obtain the posterior distribution of |$\boldsymbol{\alpha}_{\rm opt}$|. Unlike typical MCMC samples, which suffer from autocorrelation, the samples of |$\boldsymbol{\alpha}_{\rm opt}$| should be independent draws from the posterior, so fewer samples are needed for these parameters than in typical MCMC sampling. Figure 3 plots the posterior distribution of |$\boldsymbol{\alpha}_{\rm opt}$| when using either the linear or the quadratic utility function. The posterior mean weights for all risk factors are positive, which suggests that the priority of a health zone for receiving resources tends to be positively correlated with temperature, precipitation, current disease status, and current neighborhood disease status. For both utility functions, temperature and the current disease rate of the health zone seem to be the most important factors in determining the risk score and priority. The posterior distribution of the weight for the spatial penalty term concentrates more towards zero with the quadratic utility function than with the linear utility function. This suggests that the spatial penalty term helps to allocate resources more smoothly and efficiently when the linear utility function is used, but does little when the quadratic utility function is used, because the quadratic utility function inherently smooths the resource allocation.

Fig. 3.

Posterior distribution (5th, 25th, 50th, 75th, and 95th percentiles) of the weights |$ \boldsymbol{\alpha}_{\rm opt}$| for risk factors and the spatial penalty term when using linear utility function (left) or quadratic utility function (right). The risk factors include temperature (standardized), precipitation (standardized), current disease rate (logit), and current neighborhood disease rate (logit).

We also estimate a single optimal policy that averages over the uncertainty in the parameters. The estimated optimal policy with the linear local utility function has the priority score

$$P_{lt}=1/\{1+\exp[-(2.1X_{l1}+1.3X_{l2}+3.1Y_{lt-1}+0.77\sum_{j\sim l}Y_{jt-1}/m_l)]\}$$

and the weight for the spatial penalty |$\alpha_0=0.06$|. The estimated optimal policy with the quadratic local utility function has the priority score

$$P_{lt}=1/\{1+\exp[-(3.5X_{l1}+1.1X_{l2}+3.3Y_{lt-1}+0.23\sum_{j\sim l}Y_{jt-1}/m_l)]\}$$

and the weight for the spatial penalty |$\alpha_0=0.03$|. The estimated optimal weights suggest that health zones with higher temperature, more precipitation, and higher disease rates in the zone and in its neighboring zones tend to have higher priority for receiving resources. The loss values corresponding to the two optimal policies are 0.135 and 0.136, respectively, and the loss values corresponding to the “Highest_rate” and “Even” policies are 0.140 and 0.149, all with standard error |$0.0005$|. The proposed policies improve the value by about |$3\%$| and |$9\%$| compared with the two naive policies. This is a substantial improvement when considering the number of disease cases that can be eliminated.
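As a check on the quoted percentages, the relative improvements follow directly from the reported loss values. For the policy with the linear utility function (loss 0.135):

$$\frac{0.140-0.135}{0.140}\approx 3.6\%,\qquad \frac{0.149-0.135}{0.149}\approx 9.4\%,$$

and for the policy with the quadratic utility function (loss 0.136):

$$\frac{0.140-0.136}{0.140}\approx 2.9\%,\qquad \frac{0.149-0.136}{0.149}\approx 8.7\%.$$

These are computed from the rounded loss values above, so the last digit is only indicative.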

Figure 4 illustrates the resource allocation for the next year under the two estimated optimal policies and under the “Highest_rate” policy. The estimated optimal policy using the quadratic utility function allocates the resources most smoothly, while the “Highest_rate” policy gives only extreme allocations (|$0$| or |$1$|).

Fig. 4.

The resource allocation next year using the estimated optimal policies with either the linear local utility function (left) or the quadratic local utility function (middle), or using “Highest_rate” policy (right).

We refit the model using data from 2007–2012, 2007–2013, and 2007–2014 as training data and estimate the risk factor weights |$\boldsymbol{\alpha}_{\rm opt}$| with the linear utility function for the resource allocation recommendation in 2013, 2014, and 2015, respectively. The posterior distributions of the optimal weights are given in Figure 5. They show that the updated estimated optimal policy is stable across years, suggesting that a dynamic model would not lead to dramatic improvements for this analysis. Of course, when treatment is applied annually there is ample time to update the policy, so updating the policy using all available data would be advisable.

Fig. 5.

Posterior distribution (5th, 25th, 50th, 75th, and 95th percentiles) of the weights |$ \boldsymbol{\alpha}_{\rm opt}$| for risk factors and the spatial penalty term when using linear utility function and data in year 2007–2012 (left), year 2007–2013 (middle), and year 2007–2014 (right) as training data, respectively. The risk factors include temperature (standardized), precipitation (standardized), current disease rate (logit), and current neighborhood disease rate (logit). (a) Training data: year 2007–2012. (b) Training data: year 2007–2013. (c) Training data: year 2007–2014.

8. Discussion

We develop a recommender system for spatiotemporal resource allocation to maximize the efficacy of malaria control efforts. Our proposed statistical framework addresses the challenges of spatial dependence and a continuous action space. We use a hierarchical Bayesian spatiotemporal model to approximate the system dynamics of disease transmission, including the effects of environmental covariates and allocated resources, and construct a flexible and interpretable class of allocation policies that makes the search for the optimal resource allocation policy over the continuous action space computationally feasible. The proposed method performs well in the simulation experiments and improves resource allocation efficacy compared with naive policies in both the simulation studies and the application to the DRC data.

Our study has some limitations, which suggest directions for future work. Our proposed RE relies on the postulated spatiotemporal model for malaria transmission. A more flexible semi- or nonparametric model could be constructed to improve the robustness of the method to model misspecification. In the current framework, we consider only one intervention (ITN) at a time when optimizing the resource allocation. Our method can be extended to consider several interventions simultaneously and recommend the optimal allocation policy for all the related resources. The current framework assumes that resources are allocated yearly, and we do not consider seasonality in the model. If monthly or even more frequent allocation is necessary, seasonality can be incorporated as a covariate in the model, and the strategy can be defined for more frequent resource allocation under the constraint that an annual budget is allocated across space and time. As malaria intervention resources are allocated at the health zone level in DRC, we build the spatiotemporal model on data aggregated at the district level and define the RE to decide how to allocate resources to a finite number of regions. If applying treatment to points rather than to a finite number of regions is of greater interest and data are available at point locations, then a geostatistical model would be preferable to avoid bias in estimating covariate effects. A completely new framework would be required to define the point-level treatment allocation policy, which is a topic for future work.

When we apply our method to the DRC data, the data used to fit the model are the PfPR and ITN surfaces estimated in Bhatt and others (2015) from a Bayesian hierarchical model based on very sparse survey data, rather than real health zone level data collected yearly. Also, we assume that the populations of the 5 km by 5 km cells within a health zone are the same, so that the mean rate of the cells can be used as the rate of the health zone. As a result, we use the data only for illustration, not as an accurate reflection of the situation in DRC.

Software

The R code that implements the proposed method for the simulation study and DRC data analysis is available at https://github.com/qianguan/SpatiotemporalTreatmentAllocation. All the data used for DRC data analysis is available at https://figshare.com/account/home#/projects/99599.

Supplementary material

Supplementary material is available online at http://biostatistics.oxfordjournals.org.

Acknowledgments

The authors thank the Bill and Melinda Gates Foundation (OPP1161913) and the National Institutes of Health (R01ES031651-01) for supporting this research and Dr. Mark Janko of Duke University for assistance with the DRC data.

Conflict of Interest: None declared.

Funding

Bill and Melinda Gates Foundation (OPP1161913) and the National Institutes of Health (R01ES031651-01).

References

Bhadra, A., Ionides, E. L., Laneri, K., Pascual, M., Bouma, M. and Dhiman, R. C. (2011). Malaria in Northwest India: data analysis via partially observed stochastic differential equation models driven by Lévy noise. Journal of the American Statistical Association 106, 440–451.

Bhatt, S., Weiss, D. J., Cameron, E., Bisanzio, D., Mappin, B., Dalrymple, U., Battle, K. E., Moyes, C. L., Henry, A., Eckhoff, P. A. et al. (2015). The effect of malaria control on Plasmodium falciparum in Africa between 2000 and 2015. Nature 526, 207–211.

Chakraborty, B. and Moodie, E. E. (2013). Statistical Methods for Dynamic Treatment Regimes. New York, NY: Springer.

Chen, G., Zeng, D. and Kosorok, M. R. (2016). Personalized dose finding using outcome weighted learning. Journal of the American Statistical Association 111, 1509–1521.

Daniels, N., Bryant, J., Castano, R. A., Dantes, O. G., Khan, K. S. and Pannarunothai, S. (2000). Benchmarks of fairness for health care reform: a policy tool for developing countries. Bulletin of the World Health Organization 78, 740–750.

Eastman, R. T. and Fidock, D. A. (2009). Artemisinin-based combination therapies: a vital tool in efforts to eliminate malaria. Nature Reviews Microbiology 7, 864.

Gibson, J. L., Martin, D. K. and Singer, P. A. (2004). Setting priorities in health care organizations: criteria, processes, and parameters of success. BMC Health Services Research 4, 25.

Griffin, J. T., Bhatt, S., Sinka, M. E., Gething, P. W., Lynch, M., Patouillard, E., Shutes, E., Newman, R. D., Alonso, P., Cibulskis, R. E. et al. (2016). Potential for reduction of burden and local elimination of malaria by reducing Plasmodium falciparum malaria transmission: a mathematical modelling study. The Lancet Infectious Diseases 16, 465–472.

Griffin, J. T., Ferguson, N. M. and Ghani, A. C. (2014). Estimates of the changing age-burden of Plasmodium falciparum malaria disease in sub-Saharan Africa. Nature Communications 5, 3136.

Guan, Q., Reich, B. J., Laber, E. B. and Bandyopadhyay, D. (2020). Bayesian nonparametric policy search with application to periodontal recall intervals. Journal of the American Statistical Association 115, 1066–1078.

Hay, S. I., Guerra, C. A., Gething, P. W., Patil, A. P., Tatem, A. J., Noor, A. M., Kabaria, C. W., Manh, B. H., Elyazar, I. R. F., Brooker, S. et al. (2009). A world malaria map: Plasmodium falciparum endemicity in 2007. PLoS Medicine 6, e1000048.

Jones, D. R., Schonlau, M. and Welch, W. J. (1998). Efficient global optimization of expensive black-box functions. Journal of Global Optimization 13, 455–492.

Kang, S. Y., Battle, K. E., Gibson, H. S., Ratsimbasoa, A., Randrianarivelojosia, M., Ramboarina, S., Zimmerman, P. A., Weiss, D. J., Cameron, E., Gething, P. W. et al. (2018). Spatio-temporal mapping of Madagascar's Malaria Indicator Survey results to assess Plasmodium falciparum endemicity trends between 2011 and 2016. BMC Medicine 16, 71.

Laber, E. B. and Zhao, Y. (2015). Tree-based methods for optimal treatment allocation. Biometrika 102, 501–514.

Laber, E. B., Meyer, N. J., Reich, B. J., Pacifici, K., Collazo, J. A. and Drake, J. M. (2018). Optimal treatment allocations in space and time for on-line control of an emerging infectious disease. Journal of the Royal Statistical Society: Series C (Applied Statistics) 67, 743–789.

Lengeler, C. (1998). Insecticide treated bednets and curtains for malaria control. Cochrane Database of Systematic Reviews, Issue 3.

Mugglin, A. S., Cressie, N. and Gemmell, I. (2002). Hierarchical statistical modelling of influenza epidemic dynamics in space and time. Statistics in Medicine 21, 2703–2721.

Murphy, S. A. (2003). Optimal dynamic treatment regimes. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 65, 331–355.

Nord, E. (2015). Cost-value analysis of health interventions: introduction and update on methods and preference data. Pharmacoeconomics 33, 89–95.

Nord,

E.

,

Pinto,

J. L.

,

Richardson,

J.

,

Menzel,

P.

and

Ubel,

P.

(

1999

).

Incorporating societal concerns for fairness in numerical valuations of health programmes

.

Health Economics

8

,

25

–

39

.

Okell,

L. C.

,

Cairns,

M.

,

Griffin,

J. T.

,

Ferguson,

N. M.

,

Tarning,

J.

,

Jagoe,

G.

,

Hugo,

P.

,

Baker,

M.

,

DAlessandro,

U.

,

Bousema,

T.

et al. (

2014

).

Contrasting benefits of different artemisinin combination therapies as first-line malaria treatments using model-based cost-effectiveness analysis

.

Nature Communications

5

,

5606

.

Orellana,

L.

,

Rotnitzky,

A.

and

Robins,

J. M.

(

2010

).

Dynamic regime marginal structural mean models for estimation of optimal dynamic treatment regimes, Part I: Main content

.

The International Journal of Biostatistics

6

.

Picheny,

V.

,

Ginsbourger,

D.

,

Richet,

Y.

and

Caplin,

G.

(

2013

).

Quantile-based optimization of noisy computer experiments with tunable precision

.

Technometrics

55

,

2

–

13

.

Pluess,

B.

,

Tanser,

F. C.

,

Lengeler,

C.

and

Sharp,

B. L.

(

2010

).

Indoor residual spraying for preventing malaria

.

Cochrane Database Systematic Review

4

.

Rich,

B.

,

Moodie,

E. E. M.

and

Stephens,

D. A.

(

2016

).

Optimal individualized dosing strategies: a pharmacologic approach to developing dynamic treatment regimens for continuous-valued treatments

.

Biometrical Journal

58

,

502

–

517

.

Robins,

J.

,

Orellana,

L.

and

Rotnitzky,

A.

(

2008

).

Estimation and extrapolation of optimal treatment and testing strategies

.

Statistics in Medicine

27

,

4678

–

4721

.

Robins,

J. M.

(

2004

). Optimal structural nested models for optimal sequential decisions. In:

Proceedings of the Second Seattle Symposium in Biostatistics

.

New York, NY

:

Springer

. pp.

189

–

326

.

Roustant,

O.

,

Ginsbourger,

D.

and

Deville,

Y.

(

2012

).

Dicekriging, Diceoptim: Two R packages for the analysis of computer experiments by kriging-based metamodelling and optimization

.

Journal of Statistical Software

51

,

54p

.

Schulte,

P. J.

,

Tsiatis,

A. A.

,

Laber,

E. B.

and

Davidian,

M.

(

2014

).

Q-and A-learning methods for estimating optimal dynamic treatment regimes

.

Statistical Science: A Review Journal of the Institute of Mathematical Statistics

29

,

640

.

Stuckey,

E. M.

,

Stevenson,

J.

,

Galactionova,

K.

,

Baidjoe,

A. Y.

,

Bousema,

T.

,

Odongo,

W.

,

Kariuki,

S.

,

Drakeley,

C.

,

Smith,

T. A.

,

Cox,

J.

et al. (

2014

).

Modeling the cost effectiveness of malaria control interventions in the highlands of western Kenya

.

PLoS One

9

,

e107700

.

Walker,

P. G. T.

,

Griffin,

J. T.

,

Ferguson,

N. M.

and

Ghani,

A. C.

(

2016

).

Estimating the most efficient allocation of interventions to achieve reductions in Plasmodium falciparum malaria burden and transmission in Africa: a modelling study

.

The Lancet Global Health

4

,

e474

–

e484

.

WHO. (

2017

). World malaria report 2017.

Technical Report

,

World Health Organization

.

Google Preview

Zhang,

B.

,

Tsiatis,

A. A.

,

Laber,

E. B.

and

Davidian,

M.

(

2012

).

A robust method for estimating optimal treatment regimes

.

Biometrics

68

,

1010

–

1018

.

Zhang,

B.

,

Tsiatis,

A. A.

,

Laber,

E. B.

and

Davidian,

M.

(

2013

).

Robust estimation of optimal dynamic treatment regimes for sequential treatment decisions

.

Biometrika

100

,

681

–

694

.