Delivery Facility Choice Model
Background
The care that a birthing person receives during labor and delivery depends on where the delivery takes place, because facilities differ in their access to resources and personnel. Healthcare providers trained in emergency obstetric and newborn care (EmONC) are crucial for reducing maternal and neonatal deaths, especially in high-burden settings.
Seven essential obstetric services, known as “signal functions,” have been designated as fundamental to basic emergency obstetric and newborn care (BEmONC): parenteral antibiotic administration; parenteral anticonvulsant administration; parenteral uterotonic administration; manual removal of retained products (manual vacuum aspiration); assisted vaginal delivery; manual placental removal; and newborn resuscitation. [UNICEF_2009] Comprehensive emergency obstetric and newborn care (CEmONC) encompasses all BEmONC services plus surgical capability (e.g., for c-sections) and blood transfusion capacity. These critical services determine a health facility’s capability to manage obstetric and newborn emergencies. [UNICEF_2009] Due to data constraints and for simplicity, in our model we assume that all hospitals are CEmONC facilities and all other delivery facilities (not including home births) are BEmONC facilities.
To capture the complex relationship between choice of delivery facility (home birth vs. a BEmONC facility vs. a CEmONC facility), the belief about gestational age (believed preterm vs. believed full term), and the related factors of antenatal care (ANC) and low birth weight and short gestation (LBWSG) risk exposure, we will include two novel affordances in our simulation: (1) correlated propensities for ANC, in-facility delivery (IFD), and LBWSG category; and (2) causal conditional probabilities for in-facility delivery that differ based on the believed preterm status when labor begins.
As a simplification, we will model the choice of delivery location in two steps: First, the birthing person decides whether to deliver at home or go to a facility, depending on the believed preterm status at the start of labor. Then, among simulants delivering in-facility, we will randomly assign simulants to BEmONC or CEmONC facilities, independent of other choices that have been made. Additionally, we do not currently model facility transfers, and we think of the delivery facility as the final location of the delivery if there were transfers between facilities.
The procedure for deriving values of the correlations and causal probabilities for facility choice that are consistent with GBD and external evidence is outlined at the end of this document. Before we get to that complexity, we start with how these correlations and causal probabilities are used in the simulation.
Note that the calibration procedure, and hence the values we’re using here (i.e., the correlations and the values of \(\Pr[\text{IFD status} \mid \operatorname{do}(\text{believed preterm status})]\)) depend critically on the implementations of other pieces of the model that are described elsewhere (most notably, choice of ultrasound type given ANC status, and deriving an estimated gestational age from the true gestational age – see the AI Ultrasound Module for more details).
Causal model
The following causal diagram shows the simulant attributes needed for choosing each simulant’s delivery facility, and the causal relationships between them that we will simulate:
Causal graph showing the variables affecting birth facility
Legend
Nodes
- black and white oval: dichotomous variable
- green oval: polytomous variable
- orange oval: continuous variable
- blue-grey rectangle: propensity, \(u \sim \operatorname{Uniform}([0,1])\)
Edges
- dashed line: correlation
- black arrow: probabilistic causal relationship
- purple arrow: deterministic causal relationship
- blue-grey arrow: input a propensity to simulate randomness
Note that the only exogenous variables in the model are the propensities, and the simulant attributes in all the ovals are endogenous, being completely determined once the propensities are specified.
The causal model calibration uses observed data and an optimization procedure to find consistent values for the three correlations between the propensities \(u_\text{ANC}\), \(u_\text{IFD}\), and \(u_\text{cat}\), and the causal probabilities \(\Pr[\text{IFD status} \mid \operatorname{do}(P')]\) for the arrow from believed preterm status to in-facility delivery status. The sections below record the values of these correlations and causal probabilities and detail how to use them in the Vivarium simulation to assign the final birth facility node, F.
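As a rough illustration of how three correlated propensities could be generated, the sketch below uses a Gaussian copula, which the calibration section of this page names as the model for the correlations. The function name, the ordering of the propensities, and the correlation values are all assumptions made for illustration; the values are placeholders and not the calibrated ones, and the matrix is interpreted here as the correlation of the underlying normal variables.

```python
import math
import numpy as np

def sample_correlated_propensities(corr, n, seed=0):
    """Draw n rows of correlated Uniform(0, 1) propensities via a Gaussian copula.

    corr is a symmetric positive-definite correlation matrix whose rows and
    columns are ordered (u_ANC, u_IFD, u_cat).
    """
    rng = np.random.default_rng(seed)
    chol = np.linalg.cholesky(np.asarray(corr, dtype=float))
    # Correlated standard normals: each row is one simulant.
    z = rng.standard_normal((n, chol.shape[0])) @ chol.T
    # The standard normal CDF maps each margin to Uniform(0, 1).
    normal_cdf = np.vectorize(lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0))))
    return normal_cdf(z)

# Placeholder correlations for illustration only; NOT the calibrated values.
example_corr = [
    [1.0, 0.5, 0.2],
    [0.5, 1.0, 0.2],
    [0.2, 0.2, 1.0],
]
propensities = sample_correlated_propensities(example_corr, n=10_000)
u_anc, u_ifd, u_cat = propensities.T
```

Each column is marginally uniform on [0, 1], so it can be used as a propensity in the usual way, while the dependence between columns is controlled by the correlation matrix.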
Assumptions and limitations
The causal model was designed to capture the effect of expanded coverage of AI ultrasound on choice of delivery facility, so only the variables deemed important for this effect were included. If in the future we want to intervene on variables besides the ultrasound (U) node (for example, expand ANC coverage), we would likely need to add more nodes and/or edges to the model.
Moving to a higher level care facility during the intrapartum period is common (referred up once labor begins if there is an issue) and the ability to do this is often a result of available transport, distance to clinics, etc. We currently do not include this level of detail and instead have simulants remain at a single facility for the whole intrapartum period. In the future, we may devise a strategy to model facility transfers, which may necessitate some changes to the facility choice model.
The timing of a standard ultrasound affects its accuracy in determining gestational age (ultrasounds in the first trimester are more accurate than ultrasounds in later pregnancy). However, the facility choice model currently uses a dichotomous variable for ANC (“no ANC” vs. “some ANC”), so we are unable to model the timing of the ultrasound, instead defining a single category “standard ultrasound” that uses the average measurement error for ultrasounds taken at any point during pregnancy. In Wave II, we are planning to add more detail to the timing of ANC visits, which should allow us to more accurately model the uncertainty in GA estimation with standard ultrasounds, using the data in this paper.
The diagram posits a causal effect of gestational age (GA) on the error (E) in estimating the gestational age. Specifically, we have some empirical data from GF showing that, in the absence of an accurate ultrasound, larger gestational ages are more likely to be underestimated, while smaller gestational ages are more likely to be overestimated. For example, if the true GA is 42 weeks at the onset of labor, the believed GA is more likely to be 40 weeks than 44 weeks, since very few pregnancies last 44 weeks. This effect would correspond to having the mean of the distribution of E depend on the value of GA, but for simplicity we do not model this effect, instead assuming that the mean error is 0 regardless of GA. Thus, in our current modeling strategy, the arrow from GA to E is a “no-op” relationship, and E depends only on the ultrasound type. Omitting this effect will likely have a small impact on our results, since the effect is more pronounced at the extremes of the GA distribution than near the preterm cutoff of 37 weeks.
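The zero-mean simplification can be sketched as follows. This is only an illustration under several assumptions that are not taken from this page: the error is additive and normally distributed, the ultrasound type labels are hypothetical, and the standard deviations are placeholders rather than model inputs.

```python
import numpy as np

PRETERM_CUTOFF_WEEKS = 37.0

# Hypothetical standard deviations (in weeks) of the estimation error E by
# ultrasound type; placeholders only, not values used in the model.
ERROR_SD_WEEKS = {
    "no ultrasound": 3.0,
    "standard ultrasound": 1.5,
    "AI ultrasound": 0.5,
}

def believed_preterm(true_ga_weeks, ultrasound_type, seed=0):
    """Return a boolean array: True where the believed GA is below 37 weeks."""
    rng = np.random.default_rng(seed)
    true_ga_weeks = np.asarray(true_ga_weeks, dtype=float)
    # The mean error is 0 for every true GA, so the GA -> E arrow is a no-op;
    # only the spread of E depends on the ultrasound type.
    error = rng.normal(
        loc=0.0, scale=ERROR_SD_WEEKS[ultrasound_type], size=true_ga_weeks.shape
    )
    return (true_ga_weeks + error) < PRETERM_CUTOFF_WEEKS
```

Modeling the omitted effect would amount to replacing `loc=0.0` with a function of `true_ga_weeks`.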
The causal model includes birth weight (BW) and low birth weight status (LBW), but these are not currently used in the causal model optimization due to lack of data.
Causal conditional probabilities for in-facility delivery
In addition to the correlations, we posit that a belief about preterm status influences the decision to have a home delivery (see the Facility choice module). We model this as a causal conditional probability of home delivery given a belief about preterm status. Although deriving consistent values for these probabilities is complex (see the final section of this page), using them is simple: select in-facility delivery with probability \(\text{Pr}[\text{in-facility}\mid \operatorname{do}(\text{believed preterm})]\) or \(\text{Pr}[\text{in-facility}\mid \operatorname{do}(\text{believed term})]\) for the corresponding cases, using the correlated IFD propensity and category ordering defined in the previous section.
| Causal probability | Ethiopia | Nigeria | Pakistan |
|---|---|---|---|
| \(\text{Pr}[\text{at-home}\mid \operatorname{do}(\text{believed preterm})]\) | 0.38 | 0.38 | 0.17 |
| \(\text{Pr}[\text{in-facility}\mid \operatorname{do}(\text{believed preterm})]\) | 1 - 0.38 | 1 - 0.38 | 1 - 0.17 |
| \(\text{Pr}[\text{at-home}\mid \operatorname{do}(\text{believed term})]\) | 0.55 | 0.51 | 0.26 |
| \(\text{Pr}[\text{in-facility}\mid \operatorname{do}(\text{believed term})]\) | 1 - 0.55 | 1 - 0.51 | 1 - 0.26 |
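More explicitly, given the simulant’s believed preterm status (either “believed preterm” or “believed term”) and their IFD propensity, \(u_\text{IFD}\), the simulant’s IFD status is given by the following function \(f_\text{IFD}\):

\[
f_\text{IFD}(\text{believed status}, u_\text{IFD}) =
\begin{cases}
\text{at-home} & \text{if } u_\text{IFD} < \text{Pr}[\text{at-home}\mid \operatorname{do}(\text{believed status})], \\
\text{in-facility} & \text{otherwise.}
\end{cases}
\]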
Note that, as described in the previous section, smaller values of \(u_\text{IFD}\) correspond with home delivery, while larger values of \(u_\text{IFD}\) correspond with in-facility delivery. This ordering is important for the model to calibrate using the specified propensity correlations. The function \(f_\text{IFD}\) is one of the structural equations defining the causal model drawn above.
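A minimal sketch of this threshold rule is shown below, using the at-home probabilities from the table above. The function and dictionary names are illustrative, and the treatment of the boundary case where the propensity exactly equals the threshold is an arbitrary choice made here.

```python
# Pr[at-home | do(believed preterm status)], copied from the table above.
P_HOME = {
    "Ethiopia": {"believed preterm": 0.38, "believed term": 0.55},
    "Nigeria": {"believed preterm": 0.38, "believed term": 0.51},
    "Pakistan": {"believed preterm": 0.17, "believed term": 0.26},
}

def f_ifd(location, believed_status, u_ifd):
    """Return the IFD status ("at-home" or "in-facility") for one simulant.

    Small propensities map to home delivery and large ones to in-facility
    delivery, so in-facility delivery has probability 1 - Pr[at-home | do(...)].
    """
    if not 0.0 <= u_ifd <= 1.0:
        raise ValueError("u_ifd must lie in [0, 1]")
    threshold = P_HOME[location][believed_status]
    # Treating u_ifd == threshold as in-facility is an arbitrary choice here.
    return "at-home" if u_ifd < threshold else "in-facility"
```

For example, a simulant in Ethiopia with \(u_\text{IFD} = 0.40\) delivers at home if believed term (0.40 < 0.55) but in a facility if believed preterm (0.40 ≥ 0.38), which is how the same propensity produces different outcomes under the two beliefs.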
The above causal probabilities were computed in the facility_choice_optimization_3_countries notebook in the MNCNH Portfolio research repository.
Note
The above probabilities represent the causal effect of a simulant’s believed preterm status on their choice of home delivery or in-facility delivery. These will be different from the population’s observed conditional probabilities of IFD status given the believed preterm status, because of the correlations of \(u_\text{IFD}\) with \(u_\text{ANC}\) and \(u_\text{cat}\).
Choosing BEmONC vs. CEmONC
For simulants whose IFD status is “in-facility,” we assign CEmONC facility delivery using location-specific probabilities provided by the Health Systems team. These estimates represent the proportion of in-facility deliveries occurring in hospitals, which we are using as a proxy for CEmONC facilities. Since all in-facility deliveries occur in either BEmONC or CEmONC facilities, the probability of delivering in a BEmONC facility equals the complement of the CEmONC probability (i.e., 1 - P(CEmONC)). The decision of whether a simulant who gives birth in-facility delivers in a BEmONC or CEmONC facility should be independent from other choices in the model.
We have copied the HS team estimates to our J drive as-is. Before use in the simulation, we subset to our modeled locations and the latest year (2024) and retain only the draw columns.
```python
import pandas as pd

hosp_any = pd.read_csv('/snfs1/Project/simulation_science/mnch_grant/MNCNH portfolio/hosp_any_st-gpr_results_weighted_aggregates_2025-06-06.csv')

location_ids = [165, 179, 214]  # Pakistan, Ethiopia, Nigeria (modeled locations in the MNCNH portfolio simulation)
# improvement: include some function to get location IDs for the locations used in the simulation
# location_ids = get_location_ids(metadata.LOCATIONS)

# Keep the modeled locations and the latest year, then drop the summary columns
# so that only the identifier and draw columns remain.
hosp_ifd_proportion = hosp_any.loc[
    (hosp_any.location_id.isin(location_ids))
    & (hosp_any.year_id == 2024)  # Use most recent year available
].drop(columns=['mean', 'lower', 'upper'])
```
These data are location-specific and have 100 draws. To provide the 500 draws required in the artifact for the MNCNH simulation with GBD 2021, repeat the data five times so that draw 0 has the same value as draws 100, 200, 300, and 400, and likewise for the other draws. For GBD 2023, repeat the data 2.5 times to produce 250 draws, so that draw 0 has the same value as draws 100 and 200; data for draws 0-49 will be used three times, while data for draws 50-99 will be used twice.
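The draw duplication can be sketched as cycling through the source draws, as below. The column naming (`draw_0` through `draw_99`) and the helper name are assumptions for illustration; the actual file may label its draw columns differently, and identifier columns such as the location ID are left out of this sketch.

```python
import pandas as pd

def expand_draws(df, n_target, n_source=100, prefix="draw_"):
    """Cycle n_source draw columns to produce n_target draw columns.

    Target draw i copies source draw i % n_source, so draw 0 has the same
    value as draws 100, 200, and so on.
    """
    expanded = {
        f"{prefix}{i}": df[f"{prefix}{i % n_source}"] for i in range(n_target)
    }
    return pd.DataFrame(expanded, index=df.index)

# Toy input: one row whose draw_i value is simply i, to make the mapping visible.
source = pd.DataFrame({f"draw_{i}": [float(i)] for i in range(100)})
draws_500 = expand_draws(source, n_target=500)  # five full copies
draws_250 = expand_draws(source, n_target=250)  # two and a half copies
```

With 250 target draws, source draws 0-49 each appear three times and source draws 50-99 each appear twice, matching the description above.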
Once BEmONC or CEmONC has been chosen for all in-facility deliveries, use this choice in conjunction with the IFD status to assign one of the three values “home”, “BEmONC”, or “CEmONC” as the final birth facility (F) of each simulant.
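One way to combine the two steps is sketched below. The function name, status labels, and use of a fresh NumPy random generator are illustrative assumptions; the point is only that the BEmONC vs. CEmONC draw is independent of the other propensities.

```python
import numpy as np

def assign_birth_facility(ifd_status, p_cemonc, seed=0):
    """Assign "home", "BEmONC", or "CEmONC" to each simulant.

    ifd_status: array of "at-home" / "in-facility" labels.
    p_cemonc: probability that an in-facility delivery is in a CEmONC facility.
    """
    rng = np.random.default_rng(seed)
    ifd_status = np.asarray(ifd_status)
    # Independent uniform draw: not tied to the ANC, IFD, or LBWSG propensities.
    u_facility = rng.uniform(size=ifd_status.shape)
    in_facility_choice = np.where(u_facility < p_cemonc, "CEmONC", "BEmONC")
    # Home deliveries ignore the facility-level draw entirely.
    return np.where(ifd_status == "in-facility", in_facility_choice, "home")
```

A draw is generated for every simulant for simplicity, but it only affects those whose IFD status is "in-facility".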
Note
Before switching to using the HS team data, we used microdata-based estimates of the proportion of in-facility deliveries occurring in CEmONC facilities in Pakistan from BMGF. These estimates are not alarmingly different from the HS team estimates: 34% from the BMGF data vs. ~27% from the HS team data.
Overall delivery setting rates
While these values will not be used as direct inputs in assigning a delivery setting to simulants in the simulation, the population-level delivery setting rates will still be relevant in calculating PAFs for interventions that vary by delivery setting as well as for verification and validation. Therefore, the following parameters should be included in the artifact:
| Parameter | Definition | Value | Use |
|---|---|---|---|
| | Proportion of in-facility births (BEmONC and CEmONC combined) that occur in BEmONC facilities | Defined in the Choosing BEmONC vs. CEmONC section | Directly used in assigning a delivery facility in the facility choice model |
| | Proportion of all births that occur in facility settings (including both BEmONC and CEmONC) | mean_value of GBD covariate 51 (do NOT include any parameter uncertainty in this parameter, as only the mean_value was used as an input to the delivery facility model calibration) | Used in the calculation of the following parameters |
| | Proportion of all births that occur at home | | Used in calculating total population intervention coverage as a weighted average across delivery settings for interventions with coverage that varies by delivery facility at baseline, and for V&V |
| | Proportion of all births that occur in BEmONC facilities | | Used in calculating total population intervention coverage as a weighted average across delivery settings for interventions with coverage that varies by delivery facility at baseline, and for V&V |
| | Proportion of all births that occur in CEmONC facilities | | Used in calculating total population intervention coverage as a weighted average across delivery settings for interventions with coverage that varies by delivery facility at baseline, and for V&V |
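The sketch below shows one plausible way the three setting-specific proportions follow from the first two parameters, and how they would enter a weighted-average coverage calculation. The function and argument names are hypothetical, and the formulas are an assumption based on the definitions in the table rather than a transcription of it.

```python
def overall_delivery_setting_rates(p_ifd, p_bemonc_given_ifd):
    """Return the population-level proportion of births in each delivery setting.

    p_ifd: proportion of all births that occur in any facility.
    p_bemonc_given_ifd: proportion of in-facility births in BEmONC facilities.
    """
    for p in (p_ifd, p_bemonc_given_ifd):
        if not 0.0 <= p <= 1.0:
            raise ValueError("proportions must lie in [0, 1]")
    return {
        "home": 1.0 - p_ifd,
        "BEmONC": p_ifd * p_bemonc_given_ifd,
        "CEmONC": p_ifd * (1.0 - p_bemonc_given_ifd),
    }

def population_coverage(setting_rates, coverage_by_setting):
    """Weighted average of setting-specific coverage, weighted by setting rates."""
    return sum(setting_rates[s] * coverage_by_setting[s] for s in setting_rates)
```

The three rates sum to 1 by construction, which is a useful check during verification and validation.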
Challenge of calibrating the model
We have developed a nonlinear optimization model to find a consistent set of parameters for the Gaussian copula and the causal conditional probabilities. It will be described in detail here.
Code for running the causal optimization model can be found in the /facility_choice folder in the MNCNH Portfolio research repo. The original writeup describing the idea behind the optimization is on Sharepoint.
Todo
Add more details about how the calibration works.
Range of propensity and probabilities that are consistent with existing data
An important result of this optimization was determining that the system is underdetermined: with the data we currently have available, there is a range of consistent values for the propensity and probability parameters. This section explores the tradeoffs between the parameters, to guide us in setting appropriate values.
It might be easier to think about “probability gaps”, meaning the difference between the causal probability conditioned on believed full term and the one conditioned on believed preterm, than to think about the absolute magnitudes of these probabilities.
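As a small worked example, the gaps implied by the causal probabilities tabulated earlier on this page can be computed as below. The sign convention (in-facility probability under believed preterm minus under believed term) and the names used are choices made for this illustration.

```python
# Pr[at-home | do(believed status)] by country, from the causal probability table.
P_HOME = {
    "Ethiopia": {"believed preterm": 0.38, "believed term": 0.55},
    "Nigeria": {"believed preterm": 0.38, "believed term": 0.51},
    "Pakistan": {"believed preterm": 0.17, "believed term": 0.26},
}

def probability_gap(p_home):
    """In-facility probability under believed preterm minus under believed term."""
    p_ifd_preterm = 1.0 - p_home["believed preterm"]
    p_ifd_term = 1.0 - p_home["believed term"]
    # Round to two decimals, the precision of the tabulated inputs.
    return round(p_ifd_preterm - p_ifd_term, 2)

gaps = {country: probability_gap(p) for country, p in P_HOME.items()}
```

With the tabulated values, the gaps are 0.17 for Ethiopia, 0.13 for Nigeria, and 0.09 for Pakistan.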
References
UNICEF. (2009). Monitoring emergency obstetric care: a handbook. https://www.who.int/publications/i/item/9789241547734