Further increases in the AOW-age from 2022 (and age 67) onward, are to be estimated using a stochastic mortality model and will be based on the best estimate of the remaining life expectancy forecasts.

Whether or not an increase in the AOW-age is deemed justified, depends on the result of the formulas used to determine whether an increase is necessary.

These formulas (as found in textit{art. 7a AOW}) look like:

begin{eqnarray}

V^{mbox{AOW}}_{t} := (L_{t}-18.26)-(P_{t-1}-65)nonumber\

V^{mbox{pension}}_{t} := (L_{t+10}-18.26)-(P_{t-1}-65)nonumber\

end{eqnarray}

For the AOW-age and the pensionable-age, respectively.

Here, $t$ is the year in which a possible increase in the AOW-age or pensionable-age will happen, $V_{t}$ determines the increase in the AOW-/pensionable-age and the initial age of accrual, measured in periods of years, $L_{t}$ is the best estimate of the sex-neutral remaining life expectancy of a 65-year old in year $t$. This estimate will be determined by the CBS at $t-5$ for the AOW age.

$P_{t-1}$ is the previous AOW-age or pensionable age.

As a minimum increase in $V^{mbox{AOW}}_{t}$ in one year that prompts an increase in the AOW-age, a value of 0.25 (or three months) is maintained. For any increase below 0.25 (or possibly even a decrease), the AOW-age in that year remains unchanged.

Similarly for the pensionable age, for any increase in $V^{mbox{pension}}_{t}$ in a given year greater than or equal to 1, the pensionable age increases with 1 year, while any increase below 1 or any decrease leaves the pensionable age unchanged in that year.

From this it follows that the forecast of the AOW-age can be expressed as a linear transformation of the projection of the sex-neutral life expectancy:

begin{equation}

P_{t} := P_{t-1}+V_{t} Rightarrow P_{t-1}+(L_{t}-18.26)-(P_{t-1}-65) = L_{t}+46.74

end{equation}

Similarly for the pensionable age:

begin{equation}

P_{t} := P_{t-1}+V_{t} Rightarrow P_{t-1}+(L_{t+10}-18.26)-(P_{t-1}-65) = L_{t+10}+46.74

end{equation}

The incremental increases with three months for the AOW-age and one year for the pensionable-age described above will inevitably cause a mismatch between AOW and pension and the payout of entitlements. While everyone would agree this is undesirable, there is not really a straightforward solution: Having the pensionable age increase in steps of three months would undoubtedly result in large amounts of additional administrative work and costs associated with that, which will ultimately be passed on to the individuals participating in the pension plan. On the other hand, an increase of the AOW-age in stepts of one year would disadvantage individuals very close to retirement to a great extent, since there is usually already a set date for retirement. Increasing the AOW-age by a year will in that case result in such an individual having to bridge a 1-year financial gap with no income and no, or insufficient, pension entitlements.

section{Data}

If we let $x$ be the age of a person and $t$ is the year that this person is $x$ years old, then we need

the number of deaths $D_{x,t}$ and the exposures-to-risk $E_{x,t}$ to construct the mortality model.

$D_{x,t}$ then gives the observed death count of people aged $[x, x + 1)$ in the year $[t,t + 1)$.

In other words, these are the people who die in year $t$ with exact age $x$, before reaching the

age $x+1$. And in a similar way, $E_{x,t}$ is an estimate of the population aged $[x, x + 1)$ exposed to the risk of death during interval $[t,t + 1)$,

based on annual population estimates.

All estimations, forecasts and further computations in this thesis are based on Dutch data of death rates and exposure-to-risk, as made available by the Royal Dutch Actuarial Association (Koninklijk Actuariaal Genootschap), see cite{22}. This dataset contains observations of both 1 year in age and time $(1 times 1)$ with ages $x in mathcal{X} := (0,1,dots,90)$ and years $t in mathcal{T} := (1970,1971,dots,2015)$.

section{Assumptions}

Here we describe the notation used and the assumptions made for further derivation of formulas, formulation of models and estimation of parameters and variables.

subsection{Estimation for total population}

It is common practice to make separate estimations for males and females when using stochastic mortality models, but AOW- and pensionable ages are ultimately based on estimations for the total population, since discrimination based on gender is not allowed (even though gender has significant effect on the remaining life expectancy). Taking this into consideration, we only focus on modeling mortality for the total population, with exception of some visualization of mortality rates for both sexes.

subsection{Stochastic Framework}

First we let the 1-year probability of death $q_{x,t}$ be the probability that an individual aged $x$ is alive at the first day of year $t$, but dies somewhere in the interval $(t,t+1)$.

Taking this one step further, we define $_{k}q_{x,t}$ as being the probability of death for an individual aged $x$ at time $t$ in the interval $(t,t+k)$.

From this it immediately follows that the survival probability of the same individual in the same interval can be defined as $_{k}p_{x,t} = 1- _{k}q_{x,t}$ and we assume that $_{k}q_{x,t}$ as a function of $k$ is differentiable. With these functions it is possible to express textit{death in the next moment of time conditional on survival up to age $x$}, by a conditional density function.

This is the force of mortality $mu_{x,t}$ which is defined as:

begin{equation}

mu_{x,t} := frac{1}{_{k}p_{x,t}} frac{partial_{k} q_{x,t}}{partial k} := -frac{1}{_{k}p_{x,t}} frac{partial_{k} p_{x,t}}{partial k}.

end{equation}

This can be rewritten using an integral, making it look like:

begin{equation}

_{k}p_{x,t} quad := quad text{exp}left(-int_{0}^{k} mu_{x+u,t+u}diff u right).

end{equation}

subsection{Piece-wise constant hazard rates}

To ultimately be able to do calculations, we need approximations of the 1-year mortality rate $q_{x,t}$. If we assume $mu_{x+k,t+k} = mu_{x,t}$ for $k in [0,1)$ and that $mu_{x,t}$ is a piece-wise constant hazard rate, we can write:

subsection{Remaining life excpectancy}

We define remaining life expectancy as the average number of years that an $x$-year old survives.

This we calculate in two different ways, the first being based on the assumption that the 1-year mortality rate does not change over time, notated as $q_{x,t}=q_{x,t+s}$, for all $s>0$.

This would give us the complete remaining period life expectancy, given as:

The other way in which we calculate the remaining life expectancy, is by taking into account all future improvements and deteriorations in mortality rates:

Which is defined as the complete remaining cohort life expectancy, where $q_{x+l},{t+l}$ is the 1-year mortality rate of an individual aged $x+l$ in the year $t+l$.

Although remaining life expectancy in the past was commonly calculated based on periods (likely due to insufficient data and/or computational limitations), it makes much more sense to use the cohort-method, given that an individual experiences mortality rates over consecutive years, which one can visualize as moving diagonally through a table of mortality years for given ages.

In this thesis, we will use both methods for further calculations.

subsection{Closing the life tables for the higher ages}

The dataset used in this thesis only considers the ages $mathcal{X} :={0,dots,90}$, but considering the fact that longevity is the red thread running through this thesis, the higher ages certainly deserve attention and for estimating and forecasting cohort life expectancy, including these higher ages even becomes a necessity. The ages $mathcal{X}_{old} := {91, dots, 120}$ are likely excluded because the corresponding mortality rates are very volatile, caused by a low number of observations and small exposure-to risk. To close the life table, we can use the logistic mortality law of cite{17}. Life tables as established by the Koninklijk Actuarieel Genootschap cite{22} use this method for closing the lifetables, as well. Kannist”{o}’s method works as follows:\

The force of mortality $mu{x,t}$ for the higher ages is modeled as:

and these parameters $phi_{1,t}$ and $phi_{2,t}$ can be estimated for any $t$ by first transforming the mortality rates into hazard rates, under the assumption that these hazard rates are constant during an age-time interval:

We now apply a logistic transform of the form ‘logit($x$) = $ln(frac{x}{1-x})$’ to this equation:

With this linear model, the parameters $phi_{1,t}$ and $phi_{1,t}$ can be estimated by applying Ordinary Least Squares (OLS) to the ages $mathcal{X}_{fit} := {80, dots, 90}$:

and the OLS estimate $hat{phi}_{t}$ is then found by matrix multiplication:

begin{equation}

hat{phi}_{t} quad := quad (X’X)^{-1}X’text{logit}(mu_{80:90,t}).

end{equation}

The estimates $hat{phi}_{1,t}$ and $hat{phi}_{2,t}$ that are found this way, enable the extrapolation of the mortality rates for higher ages:

begin{equation}

q^{text{old.age}}_{x,t} := 1 – text{e}^{(-frac{hat{phi}_{1,t} cdot text{e}^{hat{phi}_{2,t} cdot x}}{1+hat{phi}_{1,t} cdot text{e}^{hat{phi}_{2,t} cdot x}})}, qquad forall x in mathcal{X}_{text{old.age}} := {91, 92, dots, 120}.

end{equation}

Ultimately, we can use the force of mortality found from the above equation to determine the mortality rates for each year $t$, using an equation for the approximation of the 1-year mortality rate $q_{x,t}$:

begin{equation}

q_{x,t} quad := quad 1-p_{x,t} quad = quad 1-text{exp} left( -int_{0}^{1} mu_{x+u,t+u} diff u right) quad approx quad 1-text{exp}(-mu_{x,t})).

end{equation}

subsection{Best estimate and confidence interval}

In this thesis, as in cite{24}, we define the best estimate as the most likely outcome under uncertainty in the projections of our stochastic mortality model.

For the mortality probabilities $q_{x,t}$, we obtain this best estimate by setting the error terms in the stochastic time series to zero. The remaining life expectancy caclulated from the best estimate of the projection table of these $q_{x,t}$ then becomes a best estimate of the remaining life expectancy of its own.\

The above-mentioned best estimate follows from a number of simulations $N$ of the mortality projections, with realizations $X_{1}, dots, X_{N}$.

For a certain confidence level $alpha$, we define the confidence level as $[hat{Phi}^(frac{alpha}{2}), hat{Phi}^(1-frac{alpha}{2})]$, where $hat{Phi}^(frac{alpha}{2})$ and $hat{Phi}^(1-frac{alpha}{2})$ are the estimated lower and upper bound for the $(frac{alpha}{2})$-th percentile and $(1-frac{alpha}{2})$-th percentile, respectively. These estimates for the upper and lower bounds become more accurate as the number of realizations $N$ is increased.

section{Mortality rates}

The maximum likelihood estimator of the mortality rates, as defined previously, is equal to the number of deaths at age $x$ in year $t$ denominated by the corresponding exposure-to-risk $E_{x,t}$:

begin{equation}

mu_{x,t} = frac{D_{x,t}}{E_{x,t}}.

end{equation}

This result follows from the assumption that the distribution of the number of deaths $D_{x,t}$ is Poisson distributed with mean $E_{x,t} cdot mu_{x,t}$ proportional to the population size, as in cite{27}.

The likelihood function $L$ in this case looks like:

begin{equation}

L quad := quad prodlimits_{x in mathcal{X}} prodlimits_{t in mathcal{T}}left[frac{(E_{x,t}cdot mu_{x,t})^{D_{x,t}} cdot text{e}^{-E_{x,t} cdot mu_{x,t}}}{D_{x,t}!} right].

end{equation}

The log-likelihood is then derived from this as:

begin{equation}

ln(L) quad := quad sumlimits_{x in mathcal{X}} sumlimits_{t in mathcal{T}} left[D_{x,t} cdot (ln(E_{x,t}cdot mu_{x,t})-E_{x,t} cdot mu_{x,t} – ln(D_{x,t}!)right],

end{equation}

and from this, we maximize this log-likelihood by taking the partial derivative with respect to $mu_{x,t}$:

begin{equation}

frac{partial ln(L)}{partial mu_{x,t}} quad := quad frac{D_{x,t}}{mu_{x,t}}-E_{x,t} quad = quad 0.

end{equation}

When we rewrite this we end up with the equation introduced at the beginning of this subsection, which is the estimator of the hazard rate $hat{mu}_{x,t}$, for which the log-likelihood is maximized:

begin{equation}

hat{mu}_{x,t} quad = quad frac{D_{x,t}}{E_{x,t}} quad = quad m_{x,t}.

end{equation}

This $hat{mu}_{x,t}$ stands for the number of people that died during an $(x,t)$-interval, divided by the number of people that textit{could have died} during the same interval, and this estimator is known as the textit{central death rate} which is denoted, as in the above equation, as $m_{x,t}$. This ratio allows for visualizing using the available datasets. The image below shows the pattern of logarithmic death rates for ages 0 to 90 for both males, females and the total population, respectively:

begin{figure}[H]

centering

includegraphics[width=0.7linewidth]{“../Data/Rplot Log Death Rates Age”}

caption{Logarithm of death rates for Dutch males, females and total population for ages 0 to 90 and all years}

label{fig:rplot-log-death-rates-age}

end{figure}

Notice the rates declining for children as they age, with the ‘accident bump’ around age 20 much more pronounced for males.\

If we take a look at the log central death rates relative to time, we get different insights:\

begin{figure}[H]

centering

includegraphics[width=0.7linewidth]{“../Data/Rplot Log Death Rates Time”}

caption{Estimates of the log force of mortality for Dutch males, females and total population for the period 1970 to 2015 for all ages}

label{fig:rplot-log-death-rates-time}

end{figure}

This figure shows that mortality in the Dutch population is falling rapidly for the younger ages, but not all that much for the highest ages.

Finally, we can create a 3D plot for the log death rates, plotting them against both time and age, resulting in the following image:

begin{figure}[H]

centering

includegraphics[width=0.7linewidth]{“../Data/Rplot central death rates”}

caption{Estimates of the log force of mortality for ages 0 to 90 and years 1970 to 2015 for the total Dutch population}

label{fig:rplot-central-death-rates}

end{figure}

This image shows consistenly higher log central death rates for the earlier years in the data set, as well as a more pronounced ‘accident bump’. Later years show a more smooth progression as the age increases and lower values for the log central death rates, supporting the accepted and obvious view that going into the future, longevity is increasing.\

chapter{Stochastic Mortality Model}

Human mortality modeling is done using the Lee-Carter model, defined as:

begin{equation}

ln (hat{mu}_{x,t}) := ln left(frac{D_{x,t}}{E_{x,t}}right) := alpha_{x}+beta_{x} kappa_{t}+epsilon_{x,t}.

end{equation}

Here $alpha_{x}$ and $beta_{x}$ are age-dependent parameters which do not change over time, while $kappa_{t}$ is a time-varying parameter which does not depend on the age $x$. More specifically:

begin{itemize}

item $alpha_{x}$ is the average of the logarithm of the central death rates ($ln(m_{x,t})$) over time;

item $beta_{x}$ is the rate of change of the central death rates for age $x$ given a change in the time-varying parameter $kappa_{t}$;

item $kappa_{t}$ is defined as the time-varying index of the level of mortality;

end{itemize}

This way, the model separates age- and period effects. The error term is assumed to be an independent identically distributed Gaussian variable for all pairs $(x,t)$.

To fit the model, there are three fitting procedures that are most commonly associated with the Lee-Carter model.

The first option is through singular value composition, where one would have a least squares criterion minimizing the sum of squared residuals as below:

begin{equation}

SSR := underset{alpha_{x},beta_{x},kappa_{t}}{text{min}} sumlimits_{x in mathcal{X}} sumlimits_{t in mathcal{T}}left(lnleft(frac{D_{x,t}}{E_{x,t}}right)-alpha_{x}-beta_{x} cdot kappa_{t}right)^{2}

end{equation}

The problem with this approach is that in the event that the number of deaths $D_{x,t}$ or exposure-to-risk $E_{x,t}$ equals zero, it results in an error. Sticking with singular value decomposition, this can only be avoided by excluding the conflicting age(s) $x$ or year(s) $t$, which becomes problematic when we want to use the full dataset.

A second method of fitting then, is to use weighted least squares, where the $D_{x,t}$ act as weights for the observations:

begin{equation}

WSSR := underset{alpha_{x},beta_{x},kappa_{t}}{text{min}} sumlimits_{x in mathcal{X}} sumlimits_{t in mathcal{T}}D_{x,t} cdot left(lnleft(frac{D_{x,t}}{E_{x,t}}right)-alpha_{x}-beta_{x} cdot kappa_{t}right)^{2}

end{equation}

Although this method avoids the problem of errors caused by taking the logarithm of zero, both these methods share the drawback that errors are assumed to be normally distributed and homoskedastic. Considering that the smaller (absolute) number of deaths at the older ages makes the logarithm of the central death rates much more variable for these ages, these seem like unrealistic assumptions.

The third option available for the fitting procedure, which does not suffer from the problems mentioned above, is ‘maximum likelihood estimation’, which comes down to maximizing the likelihood of a Poisson distribution, explained in much greater detail by cite{25}, among others.

If we let $D_{x,t} sim text{Pois}(E_{x,t}cdotmu_{x,t})$, and take the hazard rates to be $mu_{x,t} := text{e}^{alpha_{x}+beta_{x} kappa_{t}}$, this makes it so that the natural logarithm of the hazard rates is now specified like the Lee-Carter model. We can now define the likelihood function $L$:

begin{equation}

L := prodlimits_{x in mathcal{X}} prodlimits_{t in mathcal{T}}D_{x,t} cdot left(frac{(E_{x,t}cdot text{e}^{alpha_{x}+beta_{x} kappa_{t}})^{D_{x,t}}}{D_{x,t}!} cdot text{e}^{-E_{x,t}cdot text{e}^{alpha_{x}+beta_{x} kappa_{t}}}right)

end{equation}

with log-likelihood, $ln(L)$:

begin{equation*}

begin{aligned}

&ln(L) quad& := quad & sumlimits_{x in mathcal{X}} sumlimits_{t in mathcal{T}} left(D_{x,t} cdot (ln(E_{x,t})+alpha_{x}+beta_{x}kappa_{t})-E_{x,t} cdot text{e}^{alpha_{x}+beta_{x}kappa_{t}} – ln(D_{x,t}!)right) \

& quad & := quad & sumlimits_{x in mathcal{X}} sumlimits_{t in mathcal{T}} left(D_{x,t} cdot (alpha_{x}+beta_{x}kappa_{t})-E_{x,t} cdot text{e}^{alpha_{x}+beta_{x}kappa_{t}} right) + C,

end{aligned}

end{equation*}

with C a constant.

To maximize the log-likelihood, we first have to perform partial differentiation with respect to $alpha_{x}$, $beta_{x}$ and $kappa_{t}$:

begin{equation*}

begin{aligned}

& frac{partial ln(L)}{partial alpha_{x}} & := &text{ } sumlimits_{t in mathcal{T}} left(D_{x,t} – E_{x,t} cdot text{e}^{alpha_{x}+beta_{x}kappa_{t}} right) & = &text{ } 0, & forall x in mathcal{X}. \

& frac{partial ln(L)}{partial beta_{x}} & := &text{ } sumlimits_{t in mathcal{T}} left(D_{x,t} cdot kappa_{t} – E_{x,t} cdot kappa_{t} cdot text{e}^{alpha_{x}+beta_{x}kappa_{t}} right) & = &text{ } 0, & forall x in mathcal{X}. \

& frac{partial ln(L)}{partial kappa_{t}} & := &text{ } sumlimits_{x in mathcal{X}} left(D_{x,t} cdot beta_{x} – E_{x,t} cdot beta_{x} cdot text{e}^{alpha_{x}+beta_{x}kappa_{t}} right) & = & text{ }0, & forall t in mathcal{T}. \

end{aligned}

end{equation*}

This system of equations cannot be solved analytically, but it is possible to find an approximation through an iterative procedure using the Newton-Raphson method. This can be written as:

begin{equation*}

begin{aligned}

&hat{alpha}_{x,n+1} & := & &hat{alpha}_{x,n} – & frac{sumlimits_{t in mathcal{T}}left(D_{x,t} – E_{x,t} cdot text{e}^{hat{alpha}_{x,n}+hat{beta}_{x,n}hat{kappa}_{t,n}} right)}{-sumlimits_{t in mathcal{T}}left(E_{x,t} cdot text{e}^{hat{alpha}_{x,n}+hat{beta}_{x,n}hat{kappa}_{t,n}}right)}, & forall x in mathcal{X}. \

&hat{beta}_{x,n+1} & := & &hat{beta}_{x,n} – & frac{sumlimits_{t in mathcal{T}} hat{kappa}_{t,n+1} cdot left(D_{x,t} – E_{x,t} cdot text{e}^{hat{alpha}_{x,n+1}+hat{beta}_{x,n}hat{kappa}_{t,n+1}} right)}{-sumlimits_{t in mathcal{T}} hat{kappa}_{t,n+1}^{2} cdotleft(E_{x,t} cdot text{e}^{hat{alpha}_{x,n+1}+hat{beta}_{x,n}hat{kappa}_{t,n+1}}right)}, & forall x in mathcal{X}. \

&hat{kappa}_{x,n+1} & := & & hat{kappa}_{x,n} – & frac{sumlimits_{t in mathcal{T}} hat{beta}_{x,n} cdot left(D_{x,t} – E_{x,t} cdot text{e}^{hat{alpha}_{x,n+1}+hat{beta}_{x,n}hat{kappa}_{t,n}} right)}{-sumlimits_{x in mathcal{X}} hat{beta}_{x,n}^{2} cdotleft(E_{x,t} cdot text{e}^{hat{alpha}_{x,n+1}+hat{beta}_{x,n}hat{kappa}_{t,n}}right)}, & forall t in mathcal{T}. \

end{aligned}

end{equation*}

By imposing the restrictions that

begin{eqnarray}

sumlimits_{x in mathcal{X}} hat{beta}_{x} = 1 & text{and} & sumlimits_{t in mathcal{T}} hat{kappa}_{t} = 0,

end{eqnarray}

using a set of starting values $S_{0} := (alpha_{x,0}=0, beta_{x,0}=1, kappa_{t,0}=0)$, and following through the iterative procedure, the parameters can be solved numerically by setting a remainder term and continuining the procedure of computing and adjusting the parameters until successive computations result in a change that is no larger than the specified remainder term.

section{Forecasting}

Once fitted with historical data, this model yields a set of parameter estimates $mathcal{X} = (hat{alpha}_{x},hat{beta}_{x},hat{kappa}_{t})$. The time factor $hat{kappa}_{t}$ is a stochastic process for all $t$ after today. We need an ARIMA time series model to forecast this time factor, and for this we proceed in the same manner as Lee and Carter did for the U.S. population, where they used a random walk with drift model (ARIMA(0,1,0)).

This model is defined as:

begin{equation}

kappa_{t+1} := kappa_{t} + theta + sigma cdot zeta_{t},

end{equation}

with $theta$ the drift term, $sigma$ the standard deviation of the (random) noise and the $zeta_{t}$ are considered iid standard Gaussian variables.

To get estimates $hat{theta}$ and $hat{sigma}$, we consider the following:

begin{equation}

begin{aligned}

Deltakappa_{t} quad &:= quad & quad kappa_{t-1}-kappa_{t} quad & = & quad theta+sigma cdot zeta_{t}. quad & & \

E[Deltakappa_{t}] quad &:= quad & quad E[theta+sigma cdot zeta_{t}] quad & = & quad theta+sigma cdot E[zeta_{t}] quad & = & hat{theta}. \

Var[Deltakappa_{t}] quad & := quad & quad Var[theta+sigma cdot zeta_{t}] quad & = & quad sigma^{2} cdot E[zeta_{t}] quad & = & hat{theta}^{2}. \

end{aligned}

end{equation}

The estimates $hat{theta}$ and $hat{sigma}$ that we obtain from this along with the last value of $hat{kappa}_{t}$ generate mortality-level forecasts, which are denoted as $tilde{kappa}_{t}$. If we assume $alpha_{x}$ and $beta_{x}$ to remain constant over time, these $tilde{kappa}_{t}$-forecasts enable us to make forefasts of future mortality rates:

begin{equation}

begin{aligned}

mu_{x,t} quad & approx & quad m_{x,t} quad &:=& quad text{e}^{hat{alpha}_{x}+hat{beta}_{x} hat{kappa}_{t}}.\

q_{x,t} quad & approx & quad 1-text{e}^{-mu_{x,t}} quad & approx & quad 1-text{e}^{-text{e}^{hat{alpha}_{x}+hat{beta}_{x} hat{kappa}_{t}}}.\

end{aligned}

end{equation}

Lee and Carter went with this time series model because of its relative simplicity and good fit, but as mentioned, they concerned themselves with forecasting $hat{kappa}_{t}$ for the U.S. population, which means it might not necessarily be the best choice for the Dutch population.

Regardless, the ease of using ARIMA(0,1,0) compared to more complex time series models is too good to pass up and as such, forecasting of $hat{kappa}_{t}$

for the Dutch data will be based on this time series model.\

It should be noted that forecasts based on the Lee-Carter method are nothing more than extrapolations of past trends. The model does not take into account any improvements or deteriorations in mortality due to external factors, such as medical innovations, disasters, war, socio-economic changes.\

chapter{Results and Analysis}

The previous section covered the different components of the stochastic mortality model used. In this section we will look at and discuss the parameters as estimated and fitted using the mortality model. Using this, we find estimations of the remaining life expectations and make a distinction for groups of different education leve/income. Forecasts are done starting frome the year 2015 (as this is the last year for which data is available). All estimations are done in $R$, with the exception of some tinkering in Excel, due to the CBS data coming in an $.xls$-file and not needing much programming.

section{Parameter estimates}

We fit the Lee-Carter model with historical mortality data, which we obtain from the datasets provided by cite{22} (Koninklijk Actuarieel Genootschap). The data of importance are $D^{text{NL}}_{x,t}$ and $E^{text{NL}}_{x,t}$, which are the observed number of deaths and the exposures-to-risk for the Netherlands for the period 1970-2015. We only consider the data for the total population, which includes the ages 0-90. All the ages above need to be extrapolated using the method by Kannist”{o} as explained in the previous section.\

It was found earlier that the maximum likelihood estimation method was best suited for our fitting procedure with Lee-Carter and the estimated parameters ($hat{alpha}_{x},hat{beta}_{x}, hat{kappa}_{t}$) are shown below:

begin{figure}[H]

centering

includegraphics[width=1linewidth, height=.31textheight]{“../Data/Rplot Parameters x3”}

caption{Estimated parameters of the Lee-Carter model}

label{fig:rplot-parameters-x3}

end{figure}

Here $hat{alpha}_{x}$ is the effect of age $x$, or the average of the log central death rates $m_{x,t}$ over time. The figure for $hat{alpha}_{x}$ clearly shows how mortality decreases rapidly from infancy to (early) adolescence, with the ‘accident bump’ somewhere around the age of 20 and then increasing with age after that.

The estimates $hat{kappa}_{t}$ stand for the effect of calendar year $t$, capturing the time-variation in the level of mortality. At each age, and when $kappa_{t}$ is linear in time, the central death rates change in accordance with this time-variation at a constant exponential rate, as can be seen in the figure. The values of $hat{kappa}_{t}$ are decreasing approximately linearly for the total population.

Finally, $hat{beta}_{x}$ is the effect of a specific age responding with a particular calendar year, or how $m_{x,t}$ declines for age $x$ in response to a change in $hat{kappa}_{t}$.

From the figure we see that the values of $hat{beta}_{x}$ are decreasing and positive, with higher values at the low ages and a hump near the retirement age, suggesting that developments in the central death rates are strongest for both the lowest and the highest ages.\

The mortality-level index $hat{kappa}_{t}$ is a stochastic process for $t$ into the future. As described in the previous chapter, we use an appopriate ARIMA$(p,d,q)$ time series model to forecast this time factor $hat{kappa}_{t}$ using a random walk with drift (RWD) model in the same way Lee and Carter did for the U.S. population. This model might not necessarily be the best option for the Dutch population, as mentioned before, but for the sake of parsimony we will use it regardless. After computation, this results in the parameter estimates $(hat{theta},hat{sigma})$ of the RWD-model for the Dutch population shown in the table below.\

begin{table}[H]

centering

begin{tabular}{ccc} cline{1-3}

& Drift Term & Standard error\

Population & $hat{theta}$& $hat{sigma}$\ hline

hline

Total & -1.905928 & 2.368908 \ [.25ex]

hline

end{tabular}

caption{Estimates of the drift term and standard error for the total Dutch population}

end{table}

section{Forecasting the remaining life expectancy}

Reiterating our assumptions that remaining life expectancy is estimated based on both period- and cohort methods and that the best estimate is the most likely outcome under uncertainty in the projections of our stochastic mortality model, combined with Kannist”{o}’s method for closing the life tables, we are able to produce projections of the remaining life expectancy for 0-, 25-, 65- and 85-year old individuals based on 10,000 simulations of mortality rates projections, which are then transformed into life expectancies. These projections are shown for period life expectancy in the following figure:\

begin{figure}[H]

label{}

begin{minipage}[b]{0.5linewidth}

centering

includegraphics[width=textwidth]{“../Data/Rplot Rem Life Exp 0”}

caption*{Remaining life expectancy of a 0-year old}

vspace{2ex}

end{minipage}%%

begin{minipage}[b]{0.5linewidth}

centering

includegraphics[width=textwidth]{“../Data/Rplot Rem Life Exp 25”}

caption*{Remaining life expectancy of a 25-year old}

vspace{2ex}

end{minipage}

begin{minipage}[b]{0.5linewidth}

centering

includegraphics[width=textwidth]{“../Data/Rplot Rem Life Exp 65”}

caption*{Remaining life expectancy of a 65-year old}

vspace{2ex}

end{minipage}%%

begin{minipage}[b]{0.5linewidth}

centering

includegraphics[width=textwidth]{“../Data/Rplot Rem Life Exp 85”}

caption*{Remaining life expectancy of an 85-year old}

vspace{2ex}

end{minipage}

caption{Projections of the remaining period life expectancies for the total Dutch population based on 10,000 simulations}

end{figure}

What becomes visible from these projections is that remaining life expectancy for all ages is increasing as we go into the future, and that the rate at which remaining life expectancy increases, is decelerating. The projected life expectancy increasing, results directly from the decreasing projected mortality rates, as per our assumptions about the stochastic framework of the used model. While the (admittedly poor) choice of units on the $y$-axis makes it so that it’s not directly clear, the figures suggest that improvements in life expectancy are on the one hand larger for the lower ages compared to the higher ages, while on the other hand the systematic uncertainty as expressed by the confidence intervals is also considerably higher for these lower ages in contrast with the higher ages.\

For the cohort remaining life expectancy we move through the life table diagonally, taking the 1-year mortality rates of the years in which a certain age is attained, so it is necessary to make sure that for age $x$, there are just as many calendar years $t$ (starting from 1970) as there are textit{years after $x$}, or $(x-120)$, since we take $x=120$ to be the maximum attainable age. This approach is more realistic, but due to the increased amount of estimation and forecasting involved, overall there is more uncertainty, as well. Because of this, the confidence intervals appear earlier on compared to the period life expectancy, because they are partially forecasted even for calendar years for which estimates of the remaining life expectancy are available. This is shown in the following figure:

begin{figure}[H]

label{}

begin{minipage}[b]{0.5linewidth}

centering

includegraphics[width=textwidth]{“../Data/Rplot Cohort Rem Lif Exp 0”}

caption*{Remaining life expectancy of a 0-year old}

vspace{2ex}

end{minipage}%%

begin{minipage}[b]{0.5linewidth}

centering

includegraphics[width=textwidth]{“../Data/Rplot Cohort Rem Lif Exp 25”}

caption*{Remaining life expectancy of a 25-year old}

vspace{2ex}

end{minipage}

begin{minipage}[b]{0.5linewidth}

centering

includegraphics[width=textwidth]{“../Data/Rplot Cohort Rem Lif Exp 65”}

caption*{Remaining life expectancy of a 65-year old}

vspace{2ex}

end{minipage}%%

begin{minipage}[b]{0.5linewidth}

centering

includegraphics[width=textwidth]{“../Data/Rplot Cohort Rem Lif Exp 85”}

caption*{Remaining life expectancy of an 85-year old}

vspace{2ex}

end{minipage}

caption{Projections of the remaining cohort life expectancies for the total Dutch population based on 10,000 simulations}

end{figure}

From these figures, we can see that incorporating the effects of future improvements/deteriorations in mortality into these projections results in a much more smooth increase.

Furthermore, the best estimates of the projections for cohort life expectancy seem to be higher than for period life expectancy. Considering the fact that the cohort method takes future effects on mortality rates into account while the period method does not, leads to the conjecture that projections of the period life expectancy are understimated for this method.

A final thing we observe is that while there is more uncertainty overall with cohort life expectancy, the confidence intervals for the later years are actually more narrow than with period life expectancy. This is due to the fact that the dynamics in later years as described before are not taken into account with the latter method.\

section{Forecasting the AOW-age}

In the previous section, we looked at projections of the remaining life expectancy for both period- and cohort life expectancy for different ages. In this section, we focus on the life expectancies of 65-year old individuals over time and we use these forecasts as the best estimates $(L_{t}$ in the formula first introduced in Chapter 2:

begin{equation}

V^{mbox{AOW}}_{t} := (L_{t}-18.26)-(P_{t-1}-65)nonumber\

end{equation}

As described in Chapter 2, we only consider increases in remaining life expectancy after retirement of greater than or equal to 0.25, or three months. Applying this formula to the remaining life expectancy projections as described in the previous section, results in the following figure:

begin{figure}[H]

centering

includegraphics[width=1linewidth]{“../Data/Rplot AOW-age”}

caption{AOW-age forecast for both period- and cohort life expectancy for a 65-year old}

label{fig:rplot-aow-age}

end{figure}

We see the best estimate and the upper- and lower limits for the AOW-age projections into the future.

Interestingly, both period- and cohort remaining life expectancy projections yield this same AOW-age projection, despite the fact that the cohort-based projections are higher. This is probably because of the constraint that an increase in remaining life expectancy after retirement has to be at least three months, so even if the cohort-based projections are somewhat higher than the period-based projections, because remaining life expectancy at $x=65$ increases at more or less the same rate, the AOW-age will rise equally for both approaches.

This same constraint for the AOW-age increase is also responsible for the clear staircase pattern we see in the figure. We see that the maximum and minimum values for the AOW-age increase appear from the year 2022 onwards. This is because the increase schedule up to the year 2021 is fixed (see Chapter 2).

Below we compare the best estimates of the increase in AOW-age as forecasted using the Lee-Carter model with the predictions made by the CBS. From the table we can see that both approaches give different best estimates right from the start and the difference increases more and more over time, which finally results in a difference of 1.5 years in the best estimate of the AOW-age for the two different models. It is safe to assume that the Lee-Carter model, being much simpler than the models used by the CBS, underestimates increases in longevity and the associated AOW-age increases.\

begin{table}[H]

renewcommand{arraystretch}{1.75}

setlengthtabcolsep{.5cm}

label{their-label}

begin{tabular}{@{hskip3pt}c@{hskip3pt}||@{hskip8pt}c@{hskip8pt}@{hskip8pt}c@{hskip8pt}c@{hskip8pt}c@{hskip8pt}@{hskip8pt}c@{hskip8pt}@{hskip8pt}c@{hskip8pt}@{hskip8pt}c@{hskip8pt}@{hskip8pt}c@{hskip8pt}@{hskip8pt}c@{hskip8pt}@{hskip8pt}c@{hskip8pt}} cline{1-10}

multirow{2}{*}{Model} & multicolumn{9}{c}{textbf{Best Estimate AOW-age per year}} \

&2022 &2025 &2030 &2035 &2040 &2045 &2050 &2055 &2060 \ hline

hline

CBS (2014-2060) &67.25 &67.50 &68.00 &68.50 &69.25 &69.75 &70.25 &71.00 &71.50 \ hline

Lee-Carter &67.00 &67.25 &67.75 &68.00 &68.50 &69.00 &69.25 &69.75 &70 \ hline

hline

end{tabular}

caption{Best estimates of the AOW-age as published by the CBS and as forecasted with Lee-Carter (Period+Cohort). Source: cite{28}.}

end{table}

vspace{2cm}

section{Forecasting the pensionable age}

Similar to the AOW-age, the pensionable age increases based on the following formula first introduced in Chapter 2:

begin{equation}

V^{mbox{pension}}_{t} := (L_{t+10}-18.26)-(P_{t-1}-65)nonumber.\

end{equation}

For the pensionable-age, we only consider increases $V^{mbox{pension}}_{t}$ greater than or equal to 1 year. Any year-to-year increase in $V_{t}$ less than 1 will leave the pensionable age unchanged for that year. Furthermore, the best estimate of the remaining life expectancy is taken from a calendar year that is 10 years later compared to the year of a possible pensionable-age increase. Because of this, AOW-age and pensionable-age will be different.

Below are the projections of the pensionable age using both period- and cohort life expectancy for the total Dutch population:

begin{figure}[H]

label{}

begin{minipage}[b]{0.5linewidth}

centering

captionsetup{justification=centering}

includegraphics[width=textwidth]{“../Data/Rplot Period Pens age”}

caption{Pensionable-age forecast\ for period life expectancy}

vspace{2ex}

end{minipage}%%

begin{minipage}[b]{0.5linewidth}

centering

captionsetup{justification=centering}

includegraphics[width=textwidth]{“../Data/Rplot Cohort Pens age”}

caption{Pensionable-age forecast\ for cohort life expectancy}

vspace{2ex}

end{minipage}

end{figure}

As with the AOW-age projections, we see the pensionable-age projections following a staircase pattern as they are steadily increasing into the future. The same goes for the minimum and maximum values of the pensionable age projections, as visualized by the blue squares. Another observation we make is that the increase seems to be decelerating: the horizontal levels between pensionable-age increases get longer going into the future, the tables below also seem to suggest this.

begin{table} [H]

renewcommand{arraystretch}{1.75}

setlengthtabcolsep{.5cm}

label{tho-label}

begin{tabular}{@{hskip3pt}c@{hskip3pt}||@{hskip8pt}c@{hskip8pt}@{hskip8pt}c@{hskip8pt}@{hskip8pt}c@{hskip8pt}@{hskip8pt}c@{hskip8pt}@{hskip8pt}c@{hskip8pt}@{hskip8pt}c@{hskip8pt}@{hskip8pt}c@{hskip8pt}@{hskip8pt}c@{hskip8pt}@{hskip8pt}c@{hskip8pt}@{hskip8pt}c@{hskip8pt}} cline{1-11}

multirow{2}{*}{Model} & multicolumn{10}{c}{textbf{Best Estimate pensionable-age per year (period)}} \

&2015 &2020 &2025 &2030 &2035 &2040 &2045 &2050 &2055 &2060 \ hline

hline

CBS (2014-2060)&67 &68 &68 &69 &69 &70 &71 &71 &… &… \ hline

Lee-Carter&67 &67 &68 &68 &69 &69 &69 &70 &70 &70 \ hline

hline

end{tabular}

caption{Best estimates of the AOW-age as published by the CBS and as forecasted with Lee-Carter (Period). Source: cite{28}.}

end{table}

begin{table} [H]

renewcommand{arraystretch}{1.75}

setlengthtabcolsep{.5cm}

label{his-label}

begin{tabular}{@{hskip3pt}c@{hskip3pt}||@{hskip8pt}c@{hskip8pt}@{hskip8pt}c@{hskip8pt}@{hskip8pt}c@{hskip8pt}@{hskip8pt}c@{hskip8pt}@{hskip8pt}c@{hskip8pt}@{hskip8pt}c@{hskip8pt}@{hskip8pt}c@{hskip8pt}@{hskip8pt}c@{hskip8pt}@{hskip8pt}c@{hskip8pt}@{hskip8pt}c@{hskip8pt}} cline{1-11}

multirow{2}{*}{Model} & multicolumn{10}{c}{textbf{Best Estimate pensionable-age per year (cohort)}} \

&2015 &2020 &2025 &2030 &2035 &2040 &2045 &2050 &2055 &2060 \ hline

hline

CBS (2014-2060) &67 &68 &68 &69 &69 &70 &71 &71 &… &… \ hline

Lee-Carter &67 &67 &67 &68 &68 &69 &69 &69 &70 &70 \ hline

hline

end{tabular}

caption{Best estimates of the AOW-age as published by the CBS and as forecasted with Lee-Carter (Cohort). Source: cite{28}.}

end{table}

When we compare AOW-age increase and pensionable-age increase, we observe that on average the projections of the the pensionable age are higher than the AOW-age projections, the result from taking the best estimate from a calendar year 10 years into the future in the case of pensionable-age projections.\

In the above tables, we also compare the best estimates of the pensionable-age as estimated using the Lee-Carter forecasts and using the remaining life-expectancy forecasts as published by the CBS cite{28}. As with the AOW-age projections, the Lee-Carter model underestimates the increase in pensionable-age compared to the CBS-model.

The missing values for CBS in the years $x=2055$ and $x=2060$ are due to the fact that the best estimate of the remaining life expectancy is taken 10 years into the future, while the CBS only has data available up until the year 2060.

subsection{Difference between AOW- and pensionable-age}

When we look at the projections for AOW- and pensionable-age, we see that the pensionable-age increases faster on average, but since this increase also looks like it is decelerating over time, we can expect to see that the difference between AOW-age and pensionable-age will fluctuate less and less over time. This is visualized in the following figure:

begin{figure}[H]

label{}

begin{minipage}[b]{0.5linewidth}

centering

captionsetup{justification=centering}

includegraphics[width=textwidth]{“../Data/Rplot Period diff Pens-AOW”}

caption{Best-estimate projection of the difference between AOW-age and pensionable age (Period)}

vspace{2ex}

end{minipage}%%

begin{minipage}[b]{0.5linewidth}

centering

captionsetup{justification=centering}

includegraphics[width=textwidth]{“../Data/Rplot Cohort diff Pens-AOW”}

caption{Best-estimate projection of the difference between AOW-age and pensionable age (Cohort)}

vspace{2ex}

end{minipage}

end{figure}

There is a clear pattern visible, where the difference decreases to 0 over the span of a few years as the AOW-age catches up to the increased pensionable-age, only for this difference to shoot up from one year to the next as the pensionable-age increases with 1 year in the new year.

For both period life expectancy and cohort life expectancy we see that the first spike in difference in considerably higher than the other spikes. This is because of the pensionable-age increasing by 2 years starting from the year 2014. We see this a decline from this spike as the AOW-age is also increased according to the increase schedule as provided in Chapter 2, then starting from 2022, any further increase in both AOW-age and pensionable-age becomes dependent on the best estimate of the remaining life expectancy at age 65, and the staircase pattern emerges. One observation that can be made for both these figures is that the difference on average decreases for each time the pattern is repeated: For period life expectancy we see that the 0-level is held longer each time it is reached, while for the cohort life expectancy we see the 0.5-level (the largest difference) being held shorter each time the pattern repeats. Interesting to note is that for cohort life expectancy, the difference also becomes negative, meaning that sometimes the AOW-age will actually be larger than the pensionable-age when forecasts are based on this method.\

section{Comparing remaining life expectancy and entitlemens for different education levels}

In this next section we consider the effect of differences in social standing on the remaining life expectancy at the pensionable age and AOW-age after both of these have been increased. CBS offers data on life expectancy and education level cite{18} and life expectancy and income level cite{19}. Both of these have few datapoints that can be used for the specified purpose, with six datapoints for education level and three for income level. For this reason, we make the generalizations that education and income are strongly correlated, that any result stemming from the effect of education also applies to income and as such, that we can only look at the effect of education (supposing that twice the amount of datapoints is likely to give a more accurate result).

cite{18} gives the average remaining life expectancy at 65 for 6 periods for both males and females and for different education levels (which can be translated to ‘elementary-‘, ‘lower secondary-‘, ‘higher secondary-‘ and ‘higher education’). In line with the rest of this thesis, men and women are not considered separately, so the average of the remaining life expectancies of both is taken. Furthermore, the average life expectancy regardless of education level is taken to be the average of all four education levels.

This gives the following graphical representation:

begin{figure}[H]

centering

includegraphics[width=0.7linewidth]{../Data/Excel-afbeelding}

caption{Remaining life expectancy at age 65 for different levels of education, including tangent lines.}

label{fig:excel-afbeelding}

end{figure}

As can be seen, the (scarce) data suggests that the effect of different education levels decreases as time progresses, which seems like a plausible assumption, although the tangent lines in this graph also suggest that an individual with only elementary education will eventually outlive the average individual, who in turn will at some point outlive a person with higher education. This seems highly unlikely, but since our forecasts from the previous *chapter? do not reach the years in which these tangent lines cross, we can proceed without too much trouble. On further note, the first intuition would be to use something like a logarithmic function instead of a linear function for the tangent lines (based on the assumption that remaining life expectancy will level off at some point), but because there is so little data, this results in a very exaggerated progression of the tangent line that makes it impossible to work with.

We take the plotted ‘Average’ as being representative of the remaining life expectancy as was estimated and forecasted previously, and we find the equivalents of these for elementary- and higher education by discounting the ‘Average’. First, divide the remaining life expectancy of Elementary and Higher by Average, to find the ratio by which they differ from the average in the year 2000. This gives:

begin{eqnarray}

15.75/17.975=0.8762 & text{and} &19.95/17.975=1.1099,

end{eqnarray}

for Elementary and Higher, respectively. Next, we solve the tangent equations:

So Elementary equals Average roughly 106 years after the year 2000, and similarly Higher would we equal to Average about 310 years after the year 2000. Solving the following, gives the required discount yields:

Taking $t=2000$ as the starting point, it is now possible to discount for lower-than-average education in any year down to 1970 by multiplying with $0.8762*(1.001248)^{(j-31)}$, and similarly for higher-than-average education by multiplying with $1.1099*(.999664))^{(j-31)}$. Applied to both the period- and cohort method, this gives the following visual representations: