Loss Data Analytics

5.1 Introduction

The objective of this chapter is to build a probability model to describe the aggregate claims of an insurance system occurring in a fixed time period. The insurance system could be a single policy, a group insurance contract, a business line, or an entire book of an insurer’s business. In this chapter, aggregate claims refer to either the number or the amount of claims from a portfolio of insurance contracts. However, the modeling framework applies readily to more general settings.

Consider an insurance portfolio of $n$ individual contracts, and let $S$ denote the aggregate losses of the portfolio in a given time period. There are two approaches to modeling the aggregate losses $S$: the individual risk model and the collective risk model. The individual risk model emphasizes the loss from each individual contract and represents the aggregate losses as: \[\begin{aligned} S_n=X_1 +X_2 +\cdots+X_n, \end{aligned}\] where $X_i~(i=1,\ldots,n)$ is interpreted as the loss amount from the $i$th contract. It is worth stressing that $n$ denotes the number of contracts in the portfolio and thus is a fixed number rather than a random variable. For the individual risk model, one usually assumes that the $X_{i}$’s are independent. Because of different contract features such as coverage and exposure, the $X_{i}$’s are not necessarily identically distributed. A notable feature of the distribution of each $X_i$ is the probability mass at zero corresponding to the event of no claims.

The collective risk model represents the aggregate losses in terms of a frequency distribution and a severity distribution: \[\begin{aligned} S_N=X_1 +X_2 +\cdots+X_N. \end{aligned}\] Here, one thinks of a random number of claims $N$ that may represent either the number of losses or the number of payments. In contrast, in the individual risk model, we use a fixed number of contracts $n$. We think of $X_1, X_2, \ldots, X_N$ as representing the amount of each loss. Each loss may or may not correspond to a unique contract. For instance, there may be multiple claims arising from a single contract. It is natural to think about $X_i>0$ because if $X_i=0$ then no claim has occurred. Typically we assume that conditional on $N=n$, $X_{1},X_{2},\ldots ,X_{n}$ are iid random variables. The distribution of $N$ is known as the frequency distribution, and the common distribution of $X$ is known as the severity distribution. We further assume $N$ and $X$ are independent. With the collective risk model, we may decompose the aggregate losses into the frequency ($N$) process and the severity ($X$) model. This flexibility allows the analyst to comment on these two separate components. For example, sales growth due to lower underwriting standards could lead to higher frequency of losses but might not affect severity. Similarly, inflation or other economic forces could have an impact on severity but not on frequency.
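
The frequency and severity decomposition described above can be illustrated with a small simulation. This is only a sketch under assumptions that the text does not make: a Poisson frequency with mean 2 and an exponential severity with mean 100 are chosen purely for illustration, and the helper names `poisson_draw` and `simulate_aggregate_loss` are hypothetical.

```python
import math
import random


def poisson_draw(rng, lam):
    """Draw a Poisson(lam) count by multiplying uniforms (Knuth's method)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_aggregate_loss(rng, lam, mean_severity):
    """One draw of S_N = X_1 + ... + X_N; the sum is 0 when N = 0."""
    n_claims = poisson_draw(rng, lam)  # frequency component N
    # Severity component: iid positive losses, drawn independently of N.
    return sum(rng.expovariate(1.0 / mean_severity) for _ in range(n_claims))


rng = random.Random(2024)  # fixed seed so the run is reproducible
draws = [simulate_aggregate_loss(rng, lam=2.0, mean_severity=100.0)
         for _ in range(20000)]

sample_mean = sum(draws) / len(draws)
share_zero = sum(d == 0 for d in draws) / len(draws)
print(f"sample mean of S_N: {sample_mean:.1f}")
print(f"share of periods with no claims: {share_zero:.3f}")
```

Under these illustrative assumptions the sample mean should sit near $2 \times 100 = 200$, and the share of zero aggregate losses near $e^{-2}$, the probability that $N=0$. Changing only `lam` or only `mean_severity` mimics a shock to frequency or to severity alone.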

5.2 Individual Risk Model

As noted earlier, for the individual risk model, we think of $X_i$ as the loss from the $i$th contract and interpret \[\begin{aligned} S_n=X_1 +X_2 +\cdots+X_n \end{aligned}\] to be the aggregate loss from all contracts in a portfolio or group of contracts. Here, the $X_i$’s are not necessarily identically distributed and we have \[\begin{aligned} {\rm E}(S_n) &= \sum_{i=1}^{n} {\rm E}(X_i)~. \end{aligned}\]

Under the independence assumption on $X_i$’s (which implies $\mathrm{Cov}\left( X_i, X_j \right) = 0$ for all $i \neq j$), it can further be shown that \[\begin{aligned} {\rm Var}(S_n) &= \sum_{i=1}^{n} {\rm Var}(X_i) \\ P_{S_n}(z) &= \prod_{i=1}^{n}P_{X_i}(z) \\ M_{S_n}(t) &= \prod_{i=1}^{n}M_{X_i}(t), \end{aligned}\] where $P_{S_n}(\cdot)$ and $M_{S_n}(\cdot)$ are the probability generating function (pgf) and the moment generating function (mgf) of $S_n$, respectively. The distribution of each $X_i$ contains a probability mass at zero, corresponding to the event of no claims from the $i$th contract. One strategy to incorporate the zero mass in the distribution is to use the two-part framework: \[\begin{aligned} X_i = I_i\times B_i = \left\{\begin{array}{ll} 0~, & \text{if }~ I_i=0 \\ B_i~, & \text{if }~ I_i=1 \end{array} \right. \end{aligned}\] Here, $I_i$ is a Bernoulli variable indicating whether or not a loss occurs for the $i$th contract, and $B_i$ is a random variable with nonnegative support representing the amount of losses of the contract given loss occurrence. Assume that $I_1 ,\ldots,I_n ,B_1 ,\ldots,B_n$ are mutually independent. Denote ${\rm Pr} (I_i =1)=q_i$, $\mu_i={\rm E}(B_i)$, and $\sigma_i^2={\rm Var}(B_i)$. It can be shown (see Technical Supplement B.1 for details) that \[\begin{aligned} \mathrm{E}(S_n)& =\sum_{i=1}^n ~q_i ~\mu _i \\ \mathrm{Var}(S_n) & =\sum_{i=1}^n \left( q_i \sigma _i^2+q_i (1-q_i)\mu_i^2 \right)\\ P_{S_n}(z) & =\prod_{i=1}^n \left( 1-q_i+q_i P_{B_i}(z) \right)\\ M_{S_n}(t) & =\prod_{i=1}^n \left( 1-q_i+q_i M_{B_i}(t) \right) \end{aligned}\] A special case of the above model is when $B_i$ follows a degenerate distribution with $\mu_i=b_i$ and $\sigma^2_i=0$. One example is term life insurance or a pure endowment insurance where $b_i$ represents the insurance benefit amount of the $i$th contract.
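
As a brief sketch of where the two-part mean and variance formulas come from (the full argument is in the technical supplement), consider a single contract. Because $I_i$ and $B_i$ are independent and $I_i^2 = I_i$ for a Bernoulli variable, \[\begin{aligned} {\rm E}(X_i) &= {\rm E}(I_i)\,{\rm E}(B_i) = q_i \mu_i \\ {\rm E}(X_i^2) &= {\rm E}(I_i)\,{\rm E}(B_i^2) = q_i \left( \sigma_i^2 + \mu_i^2 \right) \\ {\rm Var}(X_i) &= q_i \left( \sigma_i^2 + \mu_i^2 \right) - q_i^2 \mu_i^2 = q_i \sigma_i^2 + q_i (1-q_i) \mu_i^2 . \end{aligned}\] Summing over $i=1,\ldots,n$, and using independence across contracts for the variance, gives the expressions for ${\rm E}(S_n)$ and ${\rm Var}(S_n)$.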

Another strategy to accommodate the zero mass in the loss from each contract is to consider them in aggregate on the portfolio level, as in the collective risk model. Here, the aggregate loss is $S_{N} = X_1 + \cdots + X_N$, where $N$ is a random variable representing the number of non-zero claims that occurred out of the entire group of contracts. Thus, not every contract in the portfolio may be represented in this sum, and $S_N=0$ when $N=0$. The collective risk model will be discussed in detail in the next section.

Example 5.2.1. Actuarial Exam Question. An insurance company sold 300 fire insurance policies as follows:

\[ {\small \begin{matrix} \begin{array}{c c c} \hline \text{Number of} & \text{Policy} & \text{Probability of}\\ \text{Policies} & \text{Maximum} & \text{Claim Per Policy}\\ & (M_i) & (q_i) \\ \hline 100 & 400 & 0.05\\ 200 & 300 & 0.06\\ \hline \end{array} \end{matrix} } \]

You are given:
(i) The claim amount for each policy, $X_i$, is uniformly distributed between $0$ and the policy maximum $M_i$.
(ii) The probability of more than one claim per policy is $0$.
(iii) Claim occurrences are independent.

Calculate the mean, $\mathrm{E}(S_{300})$, and variance, $\mathrm{Var}(S_{300})$, of the aggregate claims. How would these results change if every claim were equal to the policy maximum?
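
One way to work this example numerically is to plug the table into the two-part formulas for $\mathrm{E}(S_n)$ and $\mathrm{Var}(S_n)$. The sketch below assumes the claim amount given a claim is uniform on $(0, M_i)$, so that $\mu_i = M_i/2$ and $\sigma_i^2 = M_i^2/12$, and then repeats the calculation with a degenerate claim amount at $M_i$. The function name `irm_moments` is hypothetical, introduced only for this illustration.

```python
def irm_moments(groups):
    """Mean and variance of S_n under the two-part individual risk model.

    Each group is (count, q, mu, sigma2): the number of independent policies
    of that type, the claim probability, and the mean and variance of the
    claim amount B given that a claim occurs.
    """
    mean = sum(n * q * mu for n, q, mu, s2 in groups)
    var = sum(n * (q * s2 + q * (1 - q) * mu ** 2) for n, q, mu, s2 in groups)
    return mean, var


# (number of policies, q_i, policy maximum M_i) from the table
policies = [(100, 0.05, 400), (200, 0.06, 300)]

# Claim amount uniform on (0, M): mean M/2, variance M^2/12
uniform_case = [(n, q, M / 2, M ** 2 / 12) for n, q, M in policies]
# Every claim equal to the policy maximum: mean M, variance 0
maximum_case = [(n, q, M, 0.0) for n, q, M in policies]

mean_u, var_u = irm_moments(uniform_case)
mean_f, var_f = irm_moments(maximum_case)
print(f"uniform claims: mean = {mean_u:.0f}, variance = {var_u:.0f}")
print(f"claims at maximum: mean = {mean_f:.0f}, variance = {var_f:.0f}")
```

With uniform claim amounts this gives a mean of 2,800 and a variance of about 600,467. If every claim equals the policy maximum, the mean doubles to 5,600 and the variance rises to 1,775,200, because the larger $\mu_i^2$ term more than offsets the loss of the $\sigma_i^2$ term.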
