Loss Curve Modelling

Mick Cooney mickcooney@gmail.com

2018-07-17

Introduction

NAIC Schedule P Dataset

## Observations: 77,900
## Variables: 14
## $ grcode          <chr> "266", "266", "266", "266", "266", "266", "266", "2...
## $ grname          <chr> "Public Underwriters Grp", "Public Underwriters Grp...
## $ accidentyear    <int> 1988, 1988, 1988, 1988, 1988, 1988, 1988, 1988, 198...
## $ developmentyear <int> 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 199...
## $ developmentlag  <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7,...
## $ incurloss       <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 22, 24, 21, 24, 25, 2...
## $ cumpaidloss     <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 20, 21, 23, 24, 24...
## $ bulkloss        <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 1, 0, 0, 0, 0, 0, ...
## $ earnedpremdir   <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 25, 25, 25, 25, 25, 2...
## $ earnedpremceded <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
## $ earnedpremnet   <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 25, 25, 25, 25, 25, 2...
## $ single          <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
## $ postedreserve97 <int> 932, 932, 932, 932, 932, 932, 932, 932, 932, 932, 9...
## $ lob             <chr> "comauto", "comauto", "comauto", "comauto", "comaut...
## Observations: 77,900
## Variables: 10
## $ grcode     <chr> "266", "266", "266", "266", "266", "266", "266", "266", ...
## $ grname     <chr> "Public Underwriters Grp", "Public Underwriters Grp", "P...
## $ lob        <chr> "comauto", "comauto", "comauto", "comauto", "comauto", "...
## $ acc_year   <chr> "1988", "1988", "1988", "1988", "1988", "1988", "1988", ...
## $ dev_year   <int> 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 19...
## $ dev_lag    <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9...
## $ premium    <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 25, 25, 25, 25, 25, 25, 25...
## $ cum_loss   <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 20, 21, 23, 24, 24, 24,...
## $ loss_ratio <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, 0.2400...
## $ dev_factor <dbl> NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, 0.2500...
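
The two derived columns can be sketched as follows. This is a minimal Python illustration, not the code used to build the table: `loss_ratio` as cumulative loss over premium matches the visible values (6 / 25 = 0.24), while `dev_factor` as the share of the latest cumulative loss is an assumed definition, and the cumulative losses beyond the first seven lags are filled in hypothetically.

```python
import math

def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    if denominator == 0:
        return math.nan
    return numerator / denominator

# One cohort: premium and cumulative paid loss by development lag.
# Values after the seventh lag are hypothetical (the output above is truncated).
premium = 25
cum_loss = [6, 20, 21, 23, 24, 24, 24, 24, 24, 24]

# Loss ratio: cumulative loss relative to premium.
loss_ratio = [safe_ratio(c, premium) for c in cum_loss]

# Development factor (assumed definition): share of the latest
# cumulative loss that has emerged by each lag.
dev_factor = [safe_ratio(c, cum_loss[-1]) for c in cum_loss]

print(loss_ratio[0])  # 0.24
print(dev_factor[0])  # 0.25
```

Cohorts with zero premium and zero loss, such as the first one shown above, produce NaN in both columns.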

Grid Approximation

2D Grid

Specifying the Model

Core Concept


\[ \text{Loss}(t) = \text{Premium} \times \text{ULR} \times \text{GF}(t) \]
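
As a worked illustration with made-up numbers (not taken from the Schedule P data): a cohort with premium 100, an ultimate loss ratio of 0.6, and half of its losses developed by time t = 3 has an expected cumulative loss of

\[ \text{Loss}(3) = 100 \times 0.6 \times 0.5 = 30 \]

As \(\text{GF}(t)\) approaches 1, the expected loss approaches \(\text{Premium} \times \text{ULR} = 60\).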

Full Specification


\[ \text{Loss}(Y, t) \sim \text{Normal}(\mu(Y, t), \, \sigma_Y) \]

where

\[\begin{eqnarray*}
\mu(Y, t) &=& \text{Premium}(Y) \times \text{LR}_Y \times \text{GF}(t) \\
\text{GF}(t) &=& \text{growth function of } t \\
\sigma_Y &=& \text{Premium}(Y) \times \sigma \\
\text{LR}_Y &\sim& \text{Lognormal}(\mu_{\text{LR}}, \sigma_{\text{LR}}) \\
\mu_{\text{LR}} &\sim& \text{Normal}(0, 0.5)
\end{eqnarray*}\]
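
One way to read the specification is as a recipe for simulating data. The sketch below draws a single cohort in plain Python. It assumes a Weibull growth function and uses hypothetical parameter values throughout; it is an illustration of the generative structure, not the model-fitting code.

```python
import math
import random

def growth_factor_weibull(t, omega, theta):
    """Weibull growth curve: share of ultimate loss developed by time t."""
    return 1.0 - math.exp(-((t / theta) ** omega))

def simulate_cohort(premium, times, mu_lr, sd_lr, sigma, omega, theta, rng):
    """Draw one cohort's loss ratio and its noisy cumulative losses."""
    # LR_Y ~ Lognormal(mu_LR, sigma_LR)
    lr = rng.lognormvariate(mu_lr, sd_lr)
    losses = []
    for t in times:
        # mu(Y, t) = Premium(Y) * LR_Y * GF(t)
        mean = premium * lr * growth_factor_weibull(t, omega, theta)
        # sigma_Y = Premium(Y) * sigma
        losses.append(rng.gauss(mean, premium * sigma))
    return lr, losses

# All parameter values below are hypothetical.
rng = random.Random(42)
lr, losses = simulate_cohort(
    premium=1000.0, times=range(1, 11),
    mu_lr=-0.5, sd_lr=0.2, sigma=0.02,
    omega=1.5, theta=3.0, rng=rng,
)
print(round(lr, 3), [round(x, 1) for x in losses])
```

Because the noise is Normal with constant standard deviation per cohort, simulated cumulative losses can decrease between lags, which real cumulative paid losses rarely do.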

Importance of the Functional Form


Does the choice of function matter?

Not much difference…

(probably)
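
To get a feel for the two candidate shapes, the sketch below tabulates the Weibull and loglogistic growth curves from the Stan model side by side in Python. The values of `omega` and `theta` are hypothetical, and both curves share them here only for convenience. In a fit each curve would get its own posterior, so this table shows the shapes and does not test the claim above.

```python
import math

def growth_factor_weibull(t, omega, theta):
    """Weibull growth curve: 1 - exp(-(t / theta)^omega)."""
    return 1.0 - math.exp(-((t / theta) ** omega))

def growth_factor_loglogistic(t, omega, theta):
    """Loglogistic growth curve: t^omega / (t^omega + theta^omega)."""
    pow_t_omega = t ** omega
    return pow_t_omega / (pow_t_omega + theta ** omega)

omega, theta = 1.5, 3.0  # hypothetical values, shared by both curves

print(" t  weibull  loglogistic")
for t in range(1, 11):
    w = growth_factor_weibull(t, omega, theta)
    ll = growth_factor_loglogistic(t, omega, theta)
    print(f"{t:2d}  {w:7.3f}  {ll:11.3f}")
```

Both curves start at 0 and rise monotonically towards 1. With identical parameters the loglogistic curve approaches 1 more slowly, so it implies more development still to come at late lags.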

Building the Stan Model

Getting Started

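functions {
    // Weibull growth curve: share of ultimate loss developed by time t.
    real growth_factor_weibull(real t, real omega, real theta) {
        return 1 - exp(-(t/theta)^omega);
    }

    // Loglogistic growth curve: same role, but approaches 1 more slowly.
    real growth_factor_loglogistic(real t, real omega, real theta) {
        real pow_t_omega = t^omega;
        return pow_t_omega / (pow_t_omega + theta^omega);
    }
}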
data {
    int<lower=0,upper=1> growthmodel_id;  // 1 = Weibull, 0 = loglogistic

    int n_data;
    int n_time;
    int n_cohort;

    // Index arrays (this array syntax requires Stan 2.26 or later).
    array[n_data] int cohort_id;
    array[n_data] int t_idx;

    array[n_cohort] int cohort_maxtime;

    vector<lower=0>[n_time] t_value;

    vector[n_cohort] premium;
    vector[n_data]   loss;
}
parameters {
    real<lower=0> omega;
    real<lower=0> theta;

    vector<lower=0>[n_cohort] LR;

    real mu_LR;
    real<lower=0> sd_LR;

    real<lower=0> loss_sd;
}
transformed parameters {
    vector[n_time] gf;
    vector[n_data] lm;

    for(i in 1:n_time) {
        gf[i] = growthmodel_id == 1 ?
            growth_factor_weibull    (t_value[i], omega, theta) :
            growth_factor_loglogistic(t_value[i], omega, theta);
    }

    for (i in 1:n_data) {
        lm[i] = LR[cohort_id[i]] * premium[cohort_id[i]] * gf[t_idx[i]];
    }
}
model {
    mu_LR ~ normal(0, 0.5);
    sd_LR ~ lognormal(0, 0.5);

    LR ~ lognormal(mu_LR, sd_LR);

    loss_sd ~ lognormal(0, 0.7);

    omega ~ lognormal(0, 0.5);
    theta ~ lognormal(0, 0.5);

    loss ~ normal(lm, (loss_sd * premium)[cohort_id]);
}
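
The data block expects long-format observations plus index vectors that map each row to a cohort and a time point. The sketch below shows one way to assemble that input in Python from a handful of rows. The rows are hypothetical, the helper `build_stan_data` is not part of the original material, and treating `cohort_maxtime` as the largest observed time index per cohort is an assumption, since the model shown does not use it.

```python
# Hypothetical long-format rows:
# (accident year, development lag, premium, cumulative loss).
rows = [
    ("1988", 1, 25, 6), ("1988", 2, 25, 20), ("1988", 3, 25, 21),
    ("1989", 1, 30, 8), ("1989", 2, 30, 19),
]

def build_stan_data(rows, growthmodel_id=1):
    """Assemble a dict whose keys match the Stan data block."""
    cohorts = sorted({r[0] for r in rows})
    lags = sorted({r[1] for r in rows})
    # Stan indexes from 1, so shift the Python positions by one.
    cohort_index = {c: i + 1 for i, c in enumerate(cohorts)}
    lag_index = {lag: i + 1 for i, lag in enumerate(lags)}

    premium = {}
    maxtime = {}
    for cohort, lag, prem, _ in rows:
        premium[cohort] = prem
        # Assumed meaning: largest time index observed for the cohort.
        maxtime[cohort] = max(maxtime.get(cohort, 0), lag_index[lag])

    return {
        "growthmodel_id": growthmodel_id,  # 1 = Weibull, 0 = loglogistic
        "n_data": len(rows),
        "n_time": len(lags),
        "n_cohort": len(cohorts),
        "cohort_id": [cohort_index[r[0]] for r in rows],
        "t_idx": [lag_index[r[1]] for r in rows],
        "cohort_maxtime": [maxtime[c] for c in cohorts],
        "t_value": [float(lag) for lag in lags],
        "premium": [float(premium[c]) for c in cohorts],
        "loss": [float(r[3]) for r in rows],
    }

stan_data = build_stan_data(rows)
print(stan_data["cohort_id"], stan_data["t_idx"])
```

A dict of this shape can be passed to whichever Stan interface is in use; the sampling call itself is left out here.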

Model Fits and Diagnostics

Model Convergence

Parameter Values

Constructing the Predictive Checks

Range of Loss Ratios


Project out ULR values


Find smallest and largest value for each iteration


Compare to data
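
The three steps above can be sketched in Python as follows. The posterior draws and observed loss ratios are hypothetical, and summarising the comparison as a tail fraction is one possible choice, not necessarily the one used in the talk.

```python
def lr_range_per_iteration(ulr_draws):
    """For each iteration, return (min, max) of the ULRs across cohorts."""
    return [(min(draw), max(draw)) for draw in ulr_draws]

def tail_fraction(simulated, observed):
    """Share of simulated values at or above the observed value."""
    return sum(s >= observed for s in simulated) / len(simulated)

# Hypothetical posterior draws: one row per iteration, one ULR per cohort.
ulr_draws = [
    [0.55, 0.62, 0.71],
    [0.58, 0.60, 0.80],
    [0.50, 0.66, 0.69],
    [0.61, 0.64, 0.75],
]
# Hypothetical loss ratios observed in the data, one per cohort.
observed_lr = [0.57, 0.63, 0.72]

ranges = lr_range_per_iteration(ulr_draws)
min_frac = tail_fraction([low for low, _ in ranges], min(observed_lr))
max_frac = tail_fraction([high for _, high in ranges], max(observed_lr))
print(min_frac, max_frac)  # 0.5 0.5
```

A fraction close to 0 or 1 would suggest that the model struggles to reproduce the spread of loss ratios seen in the data.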

Total Reserves


Project out ultimate loss ratios


Calculate difference to current loss ratio


Add up across accident years
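
A minimal Python sketch of this calculation, using hypothetical draws and cohorts. Multiplying each loss-ratio gap by the cohort premium, to turn it into a monetary amount before summing, is an assumption about how the steps combine.

```python
def total_reserve_per_iteration(ulr_draws, premium, current_lr):
    """Sum premium * (ULR - current loss ratio) over cohorts, per iteration."""
    totals = []
    for draw in ulr_draws:
        total = 0.0
        for ulr, prem, lr_now in zip(draw, premium, current_lr):
            # Loss still expected to emerge for this cohort.
            total += prem * (ulr - lr_now)
        totals.append(total)
    return totals

# Hypothetical inputs: two cohorts, two posterior iterations.
premium = [100.0, 200.0]
current_lr = [0.50, 0.30]
ulr_draws = [
    [0.60, 0.50],
    [0.55, 0.45],
]

totals = total_reserve_per_iteration(ulr_draws, premium, current_lr)
print([round(x, 2) for x in totals])  # [50.0, 35.0]
```

Each entry of `totals` is one posterior draw of the total reserve, so the full list approximates its posterior distribution.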

Conclusion

Further Iterations


Non-constant variance


Other functional forms (CDFs)


Multiple lines of business


Multiple coverholders

On that note…


https://magesblog.com/post/2018-07-15-hierarchical-loss-reserving-with-growth-cruves-using-brms/

Questions


Thank You!!!


mickcooney@gmail.com


http://mc-stan.org/users/documentation/case-studies/losscurves_casestudy.html