Alzheimer’s Population Model with Demographic Forecasts

Abbreviations
Abbreviation	Definition
AD	Alzheimer’s Disease
BBBM	Blood-Based Biomarker
GBD	Global Burden of Disease
FHS	Future Health Scenarios
MCI	Mild Cognitive Impairment

Overview 

This population model includes only simulants with Alzheimer’s disease (and other dementias), which reduces the population size needed for the CSU Alzheimer’s simulation. The model document is split into two parts: 1) initializing the population, and 2) adding new simulants during the simulated timeframe. We also describe two versions of the population model, corresponding to successive model versions of the CSU Alzheimer’s simulation:

Models 2 and 3: Modeling simulants with Alzheimer’s disease (AD) and other dementias as defined by GBD
Model 4 and above: Modeling simulants with presymptomatic AD, MCI, or AD dementia

Initializing the Population 

We will first describe how to initialize the population for AD and other dementias as defined by GBD, then we will explain how to modify the initialization strategy when including the presymptomatic and MCI stages of AD.

Model Scale 

Let $t_0$ be the starting time of our simulation, let $X_{t_0}$ be the size of our simulated population at initialization (i.e., the initial population size per draw specified in the concept model), and let $X^\text{real}_{t_0}$ be the corresponding real-world population at time $t_0$ that our simulation is supposed to represent. The model scale, $S$, of our simulation is defined to be $S = X_{t_0} / X^\text{real}_{t_0}$. We will use the model scale both for initializing our simulated population and for adding new simulants.

In our case, $X^\text{real}_{t_0}$ is the population of people with Alzheimer’s disease and other dementias at time $t_0$ in a particular country. We can compute this as

\[X^\text{real}_{t_0} = p_\text{AD} \cdot Y^\text{real}_{t_0},\]

where $Y^\text{real}_{t_0}$ is the total population at time $t_0$ in our simulated location according to GBD, and $p_\text{AD}$ is the prevalence of Alzheimer’s disease and other dementias across all age groups and sexes in that location. Note that the model scale can also be computed as $S = Y_{t_0} / Y^\text{real}_{t_0}$, where $Y_{t_0} = X_{t_0} / p_\text{AD}$ is the size of an imagined total model population including all people with and without Alzheimer’s disease, of which those with Alzheimer’s are the ones who appear in our simulation. Putting everything together,

\[S = \frac{X_{t_0}}{p_\text{AD}\cdot Y^\text{real}_{t_0}}, \tag{1}\]

which computes the model scale in terms of known parameters.
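
As a minimal sketch of formula (1), the following Python snippet computes the model scale from its three inputs. The function name and all numbers in the usage lines are hypothetical placeholders chosen for illustration; they are not values from GBD or the concept model.

```python
def model_scale(initial_sim_population, prevalence_ad, real_total_population):
    """Model scale S = X_{t_0} / (p_AD * Y^real_{t_0}), as in formula (1)."""
    if not 0 < prevalence_ad <= 1:
        raise ValueError("prevalence_ad must be in (0, 1]")
    if real_total_population <= 0:
        raise ValueError("real_total_population must be positive")
    # X^real_{t_0}: real-world number of people with AD and other dementias
    real_ad_population = prevalence_ad * real_total_population
    return initial_sim_population / real_ad_population


# Hypothetical inputs, for illustration only.
X_t0 = 100_000          # initial simulated population
p_AD = 0.02             # all-age, both-sex prevalence
Y_real_t0 = 50_000_000  # total real population at t_0

S = model_scale(X_t0, p_AD, Y_real_t0)

# Equivalent form: S = Y_{t_0} / Y^real_{t_0}, with Y_{t_0} = X_{t_0} / p_AD.
S_alt = (X_t0 / p_AD) / Y_real_t0
```

With these placeholder inputs, each simulant stands for ten real people with dementia, and both forms of the scale agree up to floating-point rounding.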

Initializing Demographic Subgroups 

Let $g$ denote a demographic subgroup of the population in a given location, namely $g = (\text{age group, sex})$. For each demographic group $g$ and time $t$, we generalize the notation in the previous section to define the following populations in demographic group $g$ at time $t$:

$X_{g,t}$ = the number of simulants in group $g$ at time $t$
$X^\text{real}_{g,t}$ = the real population corresponding to our simulated population $X_{g,t}$
$Y_{g,t}$ = the imagined total model population in group $g$, including people with and without AD, of which $X_{g,t}$ counts the subset with AD
$Y^\text{real}_{g,t}$ = the total real population in group $g$ at time $t$ according to GBD

We need to determine $X_{g,t_0}$ (the initial simulated population) for each demographic group $g$. Let $p_{g,t}$ be the prevalence of Alzheimer’s disease and other dementias in demographic group $g$ at time $t$, for a given location. Two relations among the above quantities are:

\[\begin{aligned} X_{g,t} = S \cdot X^\text{real}_{g,t} \quad\text{and}\quad X^\text{real}_{g,t} = p_{g,t} \cdot Y^\text{real}_{g,t}. \end{aligned}\]

(For $t\ne t_0$, the first relation assumes that our simulated population accurately tracks the real-world population over time.) Therefore, at time $t_0$,

\[X_{g,t_0} = S \cdot X^\text{real}_{g,t_0} = S\cdot p_{g,t_0} \cdot Y^\text{real}_{g,t_0} = X_{t_0} \cdot \frac{p_{g,t_0}}{p_\text{AD}} \cdot \frac{Y^\text{real}_{g,t_0}}{Y^\text{real}_{t_0}}, \tag{2}\]

where the final equality follows from plugging in formula (1) for the model scale $S$. This equation tells us how many simulants to initialize into each demographic group based on known parameters.

Note

Another way to write (2) is

\[X_{g,t_0} = X_{t_0} \cdot \frac{\text{\# of real people in subgroup $g$ with Alzheimer's}} {\text{\# of real people in whole population with Alzheimer's}}.\]

Thus, we could compute $X_{g,t_0}$ using prevalence counts from GBD instead of prevalence rates.

To verify that (2) gives us the correct total number of initial simulants, note that

\[\begin{split}\begin{aligned} \sum_g X_{g,t_0} = \sum_g X_{t_0} \cdot \frac{p_{g,t_0} \cdot Y^\text{real}_{g,t_0}} {p_\text{AD} \cdot Y^\text{real}_{t_0}} &= X_{t_0} \cdot \sum_g \frac{X^\text{real}_{g,t_0}}{X^\text{real}_{t_0}} \\ &= X_{t_0} \cdot \frac{\sum_g X^\text{real}_{g,t_0}}{X^\text{real}_{t_0}} = X_{t_0} \cdot \frac{X^\text{real}_{t_0}}{X^\text{real}_{t_0}} = X_{t_0}. \end{aligned}\end{split}\]
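
The allocation in (2) and the check that the subgroup sizes sum to $X_{t_0}$ can be sketched in Python as follows. This uses the count form from the note above (real prevalent cases in group $g$ divided by real prevalent cases overall). The function name, group labels, and numbers are hypothetical. The function returns expected counts, which may be fractional; how these are converted to whole simulants is not specified here.

```python
def initial_group_sizes(initial_sim_population, group_prevalence, group_population):
    """Expected initial simulants per demographic group, following formula (2).

    group_prevalence: dict mapping group -> p_{g,t_0}
    group_population: dict mapping group -> Y^real_{g,t_0}
    """
    # X^real_{g,t_0} = p_{g,t_0} * Y^real_{g,t_0}
    real_cases = {
        g: group_prevalence[g] * group_population[g] for g in group_population
    }
    # X^real_{t_0} = sum over groups = p_AD * Y^real_{t_0}
    total_real_cases = sum(real_cases.values())
    return {
        g: initial_sim_population * cases / total_real_cases
        for g, cases in real_cases.items()
    }


# Hypothetical subgroup data, for illustration only.
prevalence = {
    ("60_to_64", "female"): 0.010,
    ("60_to_64", "male"): 0.008,
    ("80_to_84", "female"): 0.150,
    ("80_to_84", "male"): 0.110,
}
population = {
    ("60_to_64", "female"): 1_000_000,
    ("60_to_64", "male"): 1_000_000,
    ("80_to_84", "female"): 400_000,
    ("80_to_84", "male"): 200_000,
}

sizes = initial_group_sizes(10_000, prevalence, population)
total = sum(sizes.values())  # equals X_{t_0} up to floating-point rounding
```

In this made-up example most simulants land in the older groups, because the allocation is proportional to prevalent cases rather than to total population.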

Todo

Add a note about how the initial values in each subgroup are related to the “population structure” of the simulation.

Initializing simulants with presymptomatic and MCI stages 

Starting in Model 4 of the CSU Alzheimer’s simulation, the Alzheimer’s cause model includes two predementia stages, BBBM-AD and MCI-AD, in addition to the dementia stage AD-dementia. When computing the model scale and initializing demographic subgroups, $p_\text{AD}$ should be replaced by $p_\text{(all AD states)}$, the combined prevalence of the three states BBBM-AD, MCI-AD, and AD-dementia, across all demographic groups at time $t_0$. Similarly, $p_{g,t}$ should now refer to the combined prevalence of all three AD stages in demographic group $g$ at time $t$. The value of $p_{g,t}$ is defined on the Alzheimer’s cause model page. With these updated definitions, the model scale and initial population size in each group are defined the same as above:

\[S = \frac{X_{t_0}}{p_\text{(all AD states)}\cdot Y^\text{real}_{t_0}} = \frac{X_{t_0}}{\sum_{g'} p_{g',t_0}\cdot Y^\text{real}_{g',t_0}}, \qquad X_{g,t_0} = X_{t_0} \cdot \frac{p_{g,t_0} \cdot Y^\text{real}_{g,t_0}} {\sum_{g'} p_{g',t_0} \cdot Y^\text{real}_{g',t_0}}.\]

Adding New Simulants 

Let $N_{g,t}$ denote the number of new simulants in demographic group $g$ that we want to add to the simulation at time $t$. We will assume that $N_{g,t}$ is a Poisson random variable with mean $\lambda_{g,t} \cdot \Delta t \cdot 1_{\{\text{simulation step times}\}}(t)$, where $\lambda_{g,t}$ is the entrance rate of new simulants (measured in count of simulants per unit time) at time $t$, $\Delta t$ is the length of a simulation time step, and $1_A$ is the indicator function of the set $A$ (the indicator function zeros out the entrance rate at times when the simulation is not taking a step). Our goal is to determine the entrance rate $\lambda_{g,t}$ for each $g$ and $t$.
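
The sampling step described above can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not the simulation's actual implementation: the function names, the seed, and the rate and step-size values are hypothetical. The Poisson sampler uses Knuth's multiplication method, chosen only because the standard library has no built-in Poisson generator.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw one Poisson(mean) variate using Knuth's multiplication method.

    The mean is split into chunks of at most 500 so that exp(-chunk) does not
    underflow; a sum of independent Poisson draws is Poisson with the summed mean.
    """
    if mean < 0:
        raise ValueError("mean must be non-negative")
    total = 0
    remaining = mean
    while remaining > 0:
        chunk = min(remaining, 500.0)
        threshold = math.exp(-chunk)
        count = 0
        product = rng.random()
        # Multiply uniforms until the product drops to exp(-chunk) or below.
        while product > threshold:
            count += 1
            product *= rng.random()
        total += count
        remaining -= chunk
    return total


def new_simulants(entrance_rate, step_size, is_step_time, rng):
    """Sample N_{g,t} ~ Poisson(lambda_{g,t} * dt * 1{t is a step time})."""
    mean = entrance_rate * step_size if is_step_time else 0.0
    return sample_poisson(mean, rng)


rng = random.Random(12345)  # fixed seed so the sketch is reproducible
lam = 40.0                  # hypothetical entrance rate, simulants per year
dt = 0.5                    # hypothetical step size, in years

draws = [new_simulants(lam, dt, True, rng) for _ in range(20_000)]
sample_mean = sum(draws) / len(draws)          # close to lam * dt = 20
off_step = new_simulants(lam, dt, False, rng)  # indicator is 0, so no entrants
```

The indicator function appears here as the `is_step_time` flag: between simulation steps the Poisson mean is zero, so no simulants are added.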

Calculating entrance rate when simulating AD-dementia only 

First we describe how to calculate the entrance rate in the case where we are modeling only simulants with AD-dementia (i.e., we are not modeling the presymptomatic or MCI stages). Let $A_g(t)$ be the cumulative number of incident cases of AD by time $t$ in demographic group $g$ in the real population. Since our simulation is scaled down by a factor of $S$, the rate at which we want to add simulants is

\[\lambda_{g,t} = S \cdot \dot A_g(t),\]

where $\dot A_g(t)$ is the derivative of $A_g(t)$ with respect to $t$. To calculate $\lambda_{g,t}$, we rewrite it in terms of quantities that we can estimate from the available data:

\[\lambda_{g,t} = S \cdot \dot A_g(t) = S \cdot \frac{\dot A_g(t)}{Y^\text{real}_{g,t}} \cdot Y^\text{real}_{g,t} = S \cdot i_{g,t}^\text{AD} \cdot Y^\text{real}_{g,t}, \tag{3}\]

where $i_{g,t}^\text{AD} = \dot A_g(t) /Y^\text{real}_{g,t}$ is the total population incidence hazard of AD in demographic group $g$ at time $t$. We know the model scale $S$ from (1) above, and we can estimate the quantities $i_{g,t}^\text{AD}$ and $Y^\text{real}_{g,t}$ from GBD as follows.

Let $y(t)$ denote the year to which time $t$ belongs. If we assume that the hazard $i_{g,t}^\text{AD}$ is constant throughout the year $y(t)$, then it is equal to its person-time-average over the year, which is the total population incidence rate:

\[i_{g,t}^\text{AD} = \frac{\text{\# of incident cases of AD in group $g$ in year $y(t)$}} {\text{total person-years in group $g$ in year $y(t)$}}.\]

This is the raw AD incidence rate we pull from GBD (not the susceptible population incidence rate usually calculated by Vivarium Inputs). If we assume that the population $Y^\text{real}_{g,t}$ is constant throughout the year $y(t)$, then it is equal to its time-average over the year:

\[Y^\text{real}_{g,t} = \text{average population in group $g$ during the year $y(t)$}.\]

This is the population we pull from GBD using get_population. Thus, (3) expresses the entrance rate $\lambda_{g,t}$ in terms of quantities we can estimate from data.
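
As a rough sketch of formula (3), the Python snippet below looks up the yearly incidence rate and population for the year $y(t)$ and multiplies them by the model scale. The function name, the dictionary layout keyed by (group, year), and all numeric values are assumptions made for illustration; they do not reflect the actual structure of the GBD data or of get_population output.

```python
from datetime import date


def entrance_rate(scale, incidence_rate, population, group, t):
    """Entrance rate lambda_{g,t} = S * i^AD_{g,t} * Y^real_{g,t}, from formula (3).

    incidence_rate and population are dicts keyed by (group, year); both are
    treated as constant within the calendar year y(t). With incidence in cases
    per person-year, the result is in simulants per year.
    """
    year = t.year  # y(t), the year to which time t belongs
    return scale * incidence_rate[(group, year)] * population[(group, year)]


# Hypothetical values for one demographic group, for illustration only.
group = ("80_to_84", "female")
S = 0.1
incidence_AD = {(group, 2025): 0.005, (group, 2026): 0.005}    # cases per person-year
population = {(group, 2025): 400_000, (group, 2026): 410_000}  # average persons in year

rate_2025 = entrance_rate(S, incidence_AD, population, group, date(2025, 3, 15))
rate_2026 = entrance_rate(S, incidence_AD, population, group, date(2026, 3, 15))
```

In this made-up example the entrance rate rises from 2025 to 2026 only because the forecasted population grows, since the incidence rate is held fixed.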

Calculating entrance rate of simulants in the BBBM-AD state 

The number of simulants to add in the BBBM-AD state was determined through an MCMC optimization, fitted simultaneously with the other parameters of the cause model. See Cause Model Calibration Strategy Docs for details.

Data Tables 

The following table shows the variables that come directly from our data sources. Other quantities needed for the simulation are defined above in terms of these values.

Data Sources
Variable	Definition	Source or value	Notes
$X_{t_0}$	The initial size of our simulated population	“Initial population size per draw” in the simulation parameter specifications table in the concept model	Includes all demographic groups
$Y^\text{real}_{g,t}$	Total population of demographic group $g$ at time $t$	population_forecast in AD cause model data sources table	From GBD 2021 Forecasting Capstone. Available for years 2021–2050.
$p_{g,t}$	The combined prevalence of all AD stages in demographic group $g$ at time $t$	$p_\text{(All AD states)}$ in the Attention box on the AD cause model page	Calculated from the GBD 2023 dementia envelope using the dementia subtype proportions provided by the dementia modelers. We will only need the value for the single year $t_0$.
$i^\text{AD}_{g,t}$	Total-population incidence rate of AD dementia in demographic group $g$ at time $t$	incidence_AD in AD cause model data sources table	Calculated from the GBD 2023 dementia envelope using the dementia subtype proportions provided by the dementia modelers. Assumed to be independent of $t$.
$m_{g,t}$	Background mortality hazard in demographic group $g$ at time $t$	m_BBBM or m_MCI in the AD cause model data sources table	Equal to all-cause mortality rate minus cause-specific mortality rate for AD-dementia. Uses all-cause mortality rate forecasts for 2021–2050 from GBD 2021 Forecasting Capstone.
$\Delta_\text{BBBM}$, $\Delta_\text{MCI}$	Average duration of the BBBM-AD state or MCI-AD state, respectively	$\Delta_\text{BBBM}$ and $\Delta_\text{MCI}$ in the AD cause model data sources table
