Chapter 9 Experience Rating Using Credibility Theory

Chapter Preview. This chapter introduces credibility theory, an important actuarial tool for estimating pure premiums, frequencies, and severities for individual risks or classes of risks. Credibility theory provides a convenient framework for combining the experience for an individual risk or class with other data to produce more stable and accurate estimates. Several models for calculating credibility estimates will be discussed, including limited fluctuation, Bühlmann, Bühlmann-Straub, and nonparametric and semiparametric credibility methods. The chapter will also show a connection between credibility theory and Bayesian estimation, which was introduced in Chapter 4.

9.1 Introduction to Applications of Credibility Theory

What premium should be charged to provide insurance? The answer depends upon the exposure to the risk of loss. A common method to compute an insurance premium is to rate an insured using a classification rating plan. A classification plan is used to select an insurance rate based on an insured’s rating characteristics such as geographic territory, age, etc. All classification rating plans use a limited set of criteria to group insureds into a “class” and there will be variation in the risk of loss among insureds within the class.

An experience rating plan attempts to capture some of the variation in the risk of loss among insureds within a rating class by using the insured’s own loss experience to complement the rate from the classification rating plan. One way to do this is to use a credibility weight \(Z\) with \(0\leq Z \leq 1\) to compute

\[\begin{equation*} \hat{R}=Z\bar{X}+(1-Z)M, \end{equation*}\]

where

\[\begin{eqnarray*} \hat{R}&=&\textrm{credibility weighted rate for risk,}\\ \bar{X}&=&\textrm{average loss for the risk over a specified time period,}\\ M&=&\textrm{the rate for the classification group, often called the manual rate.}\\ \end{eqnarray*}\]

For a large risk whose loss experience is stable from year to year, \(Z\) might be close to 1. For a smaller risk whose losses vary widely from year to year, \(Z\) may be close to 0.
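
The weighting can be sketched in a few lines of Python. The function name and the numeric inputs (an observed average loss of 800 and a manual rate of 1,000) are hypothetical and chosen only for illustration.

```python
def credibility_weighted_rate(z, x_bar, manual_rate):
    """Return Z * x_bar + (1 - Z) * M for a credibility weight 0 <= Z <= 1."""
    if not 0.0 <= z <= 1.0:
        raise ValueError("credibility weight must lie in [0, 1]")
    return z * x_bar + (1.0 - z) * manual_rate

# Hypothetical inputs: observed average loss 800, manual rate 1000.
large_risk = credibility_weighted_rate(0.9, 800.0, 1000.0)  # leans on own experience
small_risk = credibility_weighted_rate(0.1, 800.0, 1000.0)  # leans on the manual rate
print(round(large_risk, 2), round(small_risk, 2))
```

With \(Z=0.9\) the rate sits close to the risk's own experience, while with \(Z=0.1\) it stays close to the manual rate.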

Credibility theory is also used for computing rates for individual classes within a classification rating plan. When classification plan rates are being determined, some or many of the groups may not have sufficient data to produce stable and reliable rates. The actual loss experience for a group will be assigned a credibility weight \(Z\) and the complement of credibility \(1-Z\) may be given to the average experience for risks across all classes. Or, if a class rating plan is being updated, the complement of credibility may be assigned to the current class rate. Credibility theory can also be applied to the calculation of expected frequencies and severities.

Computing numeric values for \(Z\) requires analysis and understanding of the data. What are the variances in the number of losses and sizes of losses for risks? What is the variance between expected values across risks?

9.2 Limited Fluctuation Credibility


In this section, you learn how to:

  • Calculate full credibility standards for number of claims, average size of claims, and aggregate losses.
  • Learn how the relationship between means and variances of underlying distributions affects full credibility standards.
  • Determine credibility-weight \(Z\) using the square-root partial credibility formula.

Limited fluctuation credibility, also called “classical credibility”, was given this name because the method explicitly attempts to limit fluctuations in estimates for claim frequencies, severities, or losses. For example, suppose that you want to estimate the expected number of claims for a group of risks in an insurance rating class. How many risks are needed in the class to ensure that a specified level of accuracy is attained in the estimate? First the question will be considered from the perspective of how many claims are needed.

9.2.1 Full Credibility for Claim Frequency

Let \(N\) be a random variable representing the number of claims for a group of risks. The observed number of claims will be used to estimate \(\mu_N=\mathrm{E}[N]\), the expected number of claims. How big does \(\mu_N\) need to be to get a good estimate? One way to quantify the accuracy of the estimate would be a statement like: “The observed value of \(N\) should be within 5\(\%\) of \(\mu_N\) at least 90\(\%\) of the time.” Writing this as a mathematical expression would give \(\Pr[0.95\mu_N\leq N \leq1.05\mu_N] \geq 0.90\). Generalizing this statement by letting \(k\) replace 5\(\%\) and probability \(p\) replace 0.90 produces a confidence interval

\[\begin{equation} \Pr[(1-k)\mu_N\leq N \leq(1+k)\mu_N] \geq p. \tag{9.1} \end{equation}\]

The expected number of claims required for the probability on the left-hand side of (9.1) to equal \(p\) is called the full credibility standard.

If the expected number of claims is greater than or equal to the full credibility standard then full credibility can be assigned to the data so \(Z=1\). Usually the expected value \(\mu_N\) is not known so full credibility will be assigned to the data if the actual observed value of \(N\) is greater than or equal to the full credibility standard. The \(k\) and \(p\) values must be selected and the actuary may rely on experience, judgment, and other factors in making the choices.

Subtracting \(\mu_N\) from each term in (9.1) and dividing by the standard deviation \(\sigma_N\) of \(N\) gives

\[\begin{equation} \Pr\left[\frac{-k\mu_N}{\sigma_N}\leq \frac{N-\mu_N}{\sigma_N} \leq \frac{k\mu_N}{\sigma_N}\right] \geq p. \tag{9.2} \end{equation}\]

For large values of \(\mu_N=\mathrm{E}[N]\) it may be reasonable to approximate the distribution for \(Z=(N-\mu_N)/\sigma_N\) with the standard normal distribution. Here \(Z\) denotes a standardized random variable, not the credibility weight.

Let \(y_p\) be the value such that \(\Pr[-y_p\leq Z \leq y_p]=\Phi(y_p)-\Phi(-y_p)=p\) where \(\Phi( )\) is the cumulative standard normal distribution. Because \(\Phi(-y_p)=1-\Phi(y_p)\), the equality can be rewritten as \(2\Phi(y_p)-1=p\). Solving for \(y_p\) gives \(y_p=\Phi^{-1}((p+1)/2)\) where \(\Phi^{-1}( )\) is the inverse of the cumulative normal.

Equation (9.2) will be satisfied if \(k\mu_N/\sigma_N \geq y_p\) assuming the normal approximation. First we will consider this inequality for the case when \(N\) has a Poisson distribution: \(\Pr[N=n] = \lambda^n\textrm{e}^{-\lambda}/n!\). Because \(\lambda=\mu_N=\sigma_N^2\) for the Poisson, taking square roots yields \(\mu_N^{1/2}=\sigma_N\). So, \(k\mu_N/\mu_N^{1/2} \geq y_p\) which is equivalent to \(\mu_N \geq (y_p/k)^2\). Let’s define \(\lambda_{kp}\) to be the value of \(\mu_N\) for which equality holds. Then the full credibility standard for the Poisson distribution is

\[\begin{equation} \lambda_{kp} = \left(\frac{y_p}{k}\right)^2 \quad \textrm{with } y_p=\Phi^{-1}((p+1)/2). \tag{9.3} \end{equation}\]

If the expected number of claims \(\mu_N\) is greater than or equal to \(\lambda_{kp}\) then equation (9.1) is assumed to hold and full credibility can be assigned to the data. As noted previously, because \(\mu_N\) is usually unknown, full credibility is given if the observed value of \(N\) satisfies \(N \geq \lambda_{kp}.\)

Example 9.2.1. The full credibility standard is set so that the observed number of claims is to be within 5% of the expected value with probability \(p=0.95\). If the number of claims has a Poisson distribution find the number of claims needed for full credibility.
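
One way to check this example numerically is the short Python sketch below, which evaluates equation (9.3) using the standard library's normal quantile function. The function name is an illustrative choice, not notation from the text.

```python
from statistics import NormalDist

def full_credibility_poisson(k, p):
    """Full credibility standard lambda_kp = (y_p / k)^2 from equation (9.3)."""
    y_p = NormalDist().inv_cdf((p + 1.0) / 2.0)  # two-sided standard normal quantile
    return (y_p / k) ** 2

standard_90 = full_credibility_poisson(0.05, 0.90)
standard_95 = full_credibility_poisson(0.05, 0.95)
print(round(standard_90, 1), round(standard_95, 1))
```

For \(k=0.05\) and \(p=0.95\) this gives roughly 1,537 claims. Rounding \(y_p\) to 1.96 before squaring gives the slightly larger value \((1.96/0.05)^2=1536.64\).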

If claims are not Poisson distributed then equation (9.2) does not imply (9.3). Setting the upper bound of \(Z\) in (9.2) equal to \(y_p\) gives \(k\mu_N/\sigma_N=y_p\). Squaring both sides and moving everything to the right side except for one of the \(\mu_N\)’s gives \(\mu_N=(y_p/k)^2(\sigma_N^2/\mu_N)\). This is the full credibility standard for frequency and will be denoted by \(n_f\),

\[\begin{equation} n_f=\left(\frac{y_p}{k}\right)^2\left(\frac{\sigma_N^2}{\mu_N}\right)=\lambda_{kp}\left(\frac{\sigma_N^2}{\mu_N}\right). \tag{9.4} \end{equation}\]

This is the same equation as the Poisson full credibility standard except for the \((\sigma_N^2/\mu_N)\) multiplier. When the claims distribution is Poisson this extra term is one because the variance equals the mean.

Example 9.2.2. The full credibility standard is set so that the observed number of claims is to be within 5\(\%\) of the expected value with probability \(p=0.95\). The number of claims has a negative binomial distribution

\[\begin{equation*} \Pr(N=x)={x+r-1\choose x} \left(\frac{1}{1+\beta}\right)^r \left(\frac{\beta}{1+\beta}\right)^x \end{equation*}\]

with \(\beta=1\). Calculate the full credibility standard.

We see that the negative binomial distribution with \((\sigma_N^2/\mu_N)>1\) requires more claims for full credibility than a Poisson distribution for the same \(k\) and \(p\) values. The next example shows that a binomial distribution which has \((\sigma_N^2/\mu_N)<1\) will need fewer claims for full credibility.

Example 9.2.3. The full credibility standard is set so that the observed number of claims is to be within 5\(\%\) of the expected value with probability \(p=0.95\). The number of claims has a binomial distribution

\[\begin{equation*} \Pr(N=x)={m\choose x}q^x(1-q)^{m-x}. \end{equation*}\]

Calculate the full credibility standard for \(q=1/4\).
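
The effect of the variance-to-mean multiplier in equation (9.4) can be sketched as follows for the two preceding examples. The sketch relies on standard results for these parameterizations: \(\sigma_N^2/\mu_N=1+\beta\) for the negative binomial and \(\sigma_N^2/\mu_N=1-q\) for the binomial. The function name is illustrative.

```python
from statistics import NormalDist

def full_credibility_frequency(k, p, variance_to_mean):
    """Equation (9.4): lambda_kp scaled by the variance-to-mean ratio of N."""
    y_p = NormalDist().inv_cdf((p + 1.0) / 2.0)
    return (y_p / k) ** 2 * variance_to_mean

beta, q = 1.0, 0.25
n_poisson = full_credibility_frequency(0.05, 0.95, 1.0)        # Poisson: ratio = 1
n_negbin = full_credibility_frequency(0.05, 0.95, 1.0 + beta)  # negative binomial: 1 + beta
n_binom = full_credibility_frequency(0.05, 0.95, 1.0 - q)      # binomial: 1 - q
print(round(n_binom), round(n_poisson), round(n_negbin))
```

The negative binomial standard is twice the Poisson standard, and the binomial standard is three quarters of it.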

Rather than use expected number of claims to define the full credibility standard, the number of exposures can be used for the full credibility standard. An exposure is a measure of risk. For example, one car insured for a full year would be one car-year. Two cars each insured for exactly one-half year would also result in one car-year. Car-years attempt to quantify exposure to loss. Two car-years would be expected to generate twice as many claims as one car-year if the vehicles have the same risk of loss. To translate a full credibility standard denominated in terms of number of claims to a full credibility standard denominated in exposures one needs a reasonable estimate of the expected number of claims per exposure.

Example 9.2.4. The full credibility standard should be selected so that the observed number of claims will be within 5\(\%\) of the expected value with probability \(p=0.95\). The number of claims has a Poisson distribution. If one exposure is expected to have about 0.20 claims per year, find the number of exposures needed for full credibility.
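
A minimal sketch of the conversion from a claim-count standard to an exposure standard, assuming the Poisson standard of equation (9.3) and dividing by the expected claims per exposure. The function name is hypothetical.

```python
from statistics import NormalDist

def exposures_for_full_credibility(k, p, claims_per_exposure):
    """Convert the Poisson claim-count standard into a number of exposures."""
    y_p = NormalDist().inv_cdf((p + 1.0) / 2.0)
    claims_needed = (y_p / k) ** 2  # lambda_kp from equation (9.3)
    return claims_needed / claims_per_exposure

exposures = exposures_for_full_credibility(0.05, 0.95, 0.20)
print(round(exposures))
```

With 0.20 expected claims per exposure, about five times as many exposures as claims are needed, here roughly 7,700.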

Frequency can be defined as the number of claims per exposure. Letting \(m\) represent the number of exposures, the observed claim frequency is \(N/m\), which is used to estimate \(\mathrm{E}(N/m)\):

\[\begin{equation*} \Pr[(1-k)\mathrm{E}(N/m)\leq N/m \leq(1+k)\mathrm{E}(N/m)] \geq p. \end{equation*}\]

Because the number of exposures is not a random variable, \(\mathrm{E}(N/m)=\mathrm{E}(N)/m=\mu_N/m\) and the prior equation becomes

\[\begin{equation*} \Pr\left[(1-k)\frac{\mu_N}{m}\leq \frac{N}{m} \leq(1+k)\frac{\mu_N}{m}\right] \geq p. \end{equation*}\]

Multiplying through by \(m\) results in equation (9.1) at the beginning of the section. The full credibility standards that were developed for estimating expected number of claims also apply to frequency.

9.2.2 Full Credibility for Aggregate Losses and Pure Premium

Aggregate losses are the total of all loss amounts for a risk or group of risks. Letting \(S\) represent aggregate losses then

\[\begin{equation*} S=X_1+X_2+\cdots+X_N. \end{equation*}\]

The random variable \(N\) represents the number of losses and random variables \(X_1, X_2,\ldots,X_N\) are the individual loss amounts. In this section it is assumed that \(N\) is independent of the loss amounts and that \(X_1, X_2,\ldots,X_N\) are iid.

The mean and variance of \(S\) are

\[\begin{equation*} \mu_S=\mathrm{E}(S)=\mathrm{E}(N)\mathrm{E}(X)=\mu_N\mu_X\textrm{ and} \end{equation*}\] \[\begin{equation*} \sigma^{2}_S=\mathrm{Var}(S)=\mathrm{E}(N)\mathrm{Var}(X)+[\mathrm{E}(X)]^{2}\mathrm{Var}(N)=\mu_N\sigma^{2}_X+\mu^{2}_X\sigma^{2}_N. \end{equation*}\]

where \(X\) is the amount of a single loss.

Observed losses \(S\) will be used to estimate expected losses \(\mu_S=\mathrm{E}(S)\). As with the frequency model in the previous section, the observed losses must be close to the expected losses as quantified in the equation

\[\begin{equation*} \Pr[(1-k)\mu_S\leq S \leq(1+k)\mu_S] \geq p. \end{equation*}\]

After subtracting the mean and dividing by the standard deviation,

\[\begin{equation*} \Pr\left[\frac{-k\mu_S}{\sigma_S}\leq Z \leq \frac{k\mu_S}{\sigma_S}\right] \geq p \end{equation*}\]

with \(Z = (S-\mu_S)/\sigma_S\). As done in the previous section the distribution for \(Z\) is assumed to be normal and \(k\mu_S/\sigma_S=y_p=\Phi^{-1}((p+1)/2)\). This equation can be rewritten as \(\mu_S^2=(y_p/k)^2\sigma_S^2\). Using the prior formulas for \(\mu_S\) and \(\sigma_{S}^2\) gives \((\mu_N\mu_X)^2=(y_p/k)^2(\mu_N\sigma^{2}_X+\mu^{2}_X\sigma^{2}_N)\). Dividing both sides by \(\mu_N\mu_X^2\) and reordering terms on the right side results in a full credibility standard \(n_S\) for aggregate losses

\[\begin{equation} n_S=\left(\frac{y_p}{k}\right)^2\left[\left(\frac{\sigma_N^2}{\mu_N}\right)+\left(\frac{\sigma_X}{\mu_X}\right)^2\right]=\lambda_{kp}\left[\left(\frac{\sigma_N^2}{\mu_N}\right)+\left(\frac{\sigma_X}{\mu_X}\right)^2\right]. \tag{9.5} \end{equation}\]

Example 9.2.5. The number of claims has a Poisson distribution. Individual loss amounts are independently and identically distributed with a Pareto distribution \(F(x)=1-[\theta/(x+\theta)]^{\alpha}\). The number of claims and loss amounts are independent. If observed aggregate losses should be within 5\(\%\) of the expected value with probability \(p=0.95\), how many losses are required for full credibility?

When the number of claims is Poisson distributed, equation (9.5) can be simplified using \((\sigma_N^2/\mu_N)=1\). It follows that \([(\sigma_N^2/\mu_N)+(\sigma_X/\mu_X)^2]=[1+(\sigma_X/\mu_X)^2]=[(\mu_X^2+\sigma_X^2)/\mu_X^2]=\mathrm{E}(X^2)/\mathrm{E}(X)^2\) using the relationship \(\mu_X^2+\sigma_X^2=\mathrm{E}(X^2)\). The full credibility standard is \(n_S=\lambda_{kp}\mathrm{E}(X^2)/\mathrm{E}(X)^2\).
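
The Poisson form of the standard can be sketched in Python for Pareto severities. For the Pareto distribution \(F(x)=1-[\theta/(x+\theta)]^{\alpha}\) with \(\alpha>2\), the standard moment formulas give \(\mathrm{E}(X^2)/\mathrm{E}(X)^2=2(\alpha-1)/(\alpha-2)\), so \(\theta\) cancels. The value \(\alpha=3\) below is an assumed input for illustration, and the function names are hypothetical.

```python
from statistics import NormalDist

def pareto_second_moment_ratio(alpha):
    """E(X^2) / E(X)^2 for F(x) = 1 - (theta / (x + theta))^alpha, alpha > 2."""
    if alpha <= 2:
        raise ValueError("the second moment is finite only for alpha > 2")
    return 2.0 * (alpha - 1.0) / (alpha - 2.0)

def full_credibility_aggregate_poisson(k, p, second_moment_ratio):
    """n_S = lambda_kp * E(X^2) / E(X)^2 when claim counts are Poisson."""
    y_p = NormalDist().inv_cdf((p + 1.0) / 2.0)
    return (y_p / k) ** 2 * second_moment_ratio

ratio = pareto_second_moment_ratio(3.0)  # assumed alpha = 3
n_s = full_credibility_aggregate_poisson(0.05, 0.95, ratio)
print(ratio, round(n_s))
```

Under these assumptions the aggregate loss standard is four times the Poisson frequency standard, which reflects the heavy tail of the severity distribution.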

The pure premium \(PP\) is equal to aggregate losses \(S\) divided by exposures \(m\): \(PP=S/m\). The full credibility standard for pure premium will require

\[\begin{equation*} \Pr\left[(1-k)\mu_{PP}\leq PP \leq(1+k)\mu_{PP}\right] \geq p. \end{equation*}\]

The number of exposures \(m\) is assumed fixed and not a random variable so \(\mu_{PP}=\mathrm{E}(S/m)=\mathrm{E}(S)/m=\mu_S/m\).

\[\begin{equation*} \Pr\left[(1-k)\left(\frac{\mu_S}{m}\right)\leq \left(\frac{S}{m}\right) \leq(1+k)\left(\frac{\mu_S}{m}\right)\right] \geq p. \end{equation*}\]

Multiplying through by exposures \(m\) returns the confidence interval for losses

\[\begin{equation*} \Pr[(1-k)\mu_S\leq S \leq(1+k)\mu_S] \geq p. \end{equation*}\]

This means that the full credibility standard \(n_{PP}\) for the pure premium is the same as that for aggregate losses

\[\begin{equation*} n_{PP}=n_S=\lambda_{kp}\left[\left(\frac{\sigma_N^2}{\mu_N}\right)+\left(\frac{\sigma_X}{\mu_X}\right)^2\right]. \end{equation*}\]

9.2.3 Full Credibility for Severity

Let \(X\) be a random variable representing the size of one claim. Claim severity is \(\mu_X=\mathrm{E}(X)\). Suppose that \(\{X_1,X_2, \ldots, X_n\}\) is a random sample of \(n\) claims that will be used to estimate claim severity \(\mu_X\). The claims are assumed to be iid. The average value of the sample is

\[\begin{equation*} \bar{X}=\frac{1}{n}\left(X_1+X_2+\cdots+X_n\right). \end{equation*}\]

How big does \(n\) need to be to get a good estimate? Note that \(n\) is not a random variable whereas it is in the aggregate loss model.

In Section 9.2.1 the accuracy of an estimator was defined in terms of a confidence interval. For severity this confidence interval is

\[\begin{equation*} \Pr[(1-k)\mu_X\leq \bar{X} \leq(1+k)\mu_X ]\geq p \end{equation*}\]

where \(k\) and \(p\) need to be specified. Following the steps in Section 9.2.1, mean claim severity \(\mu_X\) is subtracted from each term and the standard deviation of the claim severity estimator \(\sigma_{\bar{X}}\) is divided into each term yielding

\[\begin{equation*} \Pr\left[\frac{-k\mu_X}{\sigma_{\bar{X}}}\leq Z \leq \frac{k\mu_X}{\sigma_{\bar{X}}}\right] \geq p \end{equation*}\]

with \(Z = (\bar{X}-\mu_X)/\sigma_{\bar{X}}\). As in prior sections, it is assumed that \(Z\) is approximately normally distributed and the prior equation is satisfied if \(k\mu_X/\sigma_{\bar{X}}\geq y_p\) with \(y_p=\Phi^{-1}((p+1)/2)\). Because \(\bar{X}\) is the average of individual claims \(X_1, X_2,\dots, X_n\), its standard deviation is equal to the standard deviation of an individual claim divided by \(\sqrt{n}\): \(\sigma_{\bar{X}}=\sigma_X/\sqrt{n}\). So, \(k\mu_X/(\sigma_X/\sqrt{n})\geq y_p\) and with a little algebra this can be rewritten as \(n \geq (y_p/k)^2(\sigma_X/\mu_X)^2\). The full credibility standard for severity is

\[\begin{equation*} n_X=\left(\frac{y_p}{k}\right)^2\left(\frac{\sigma_X}{\mu_X}\right)^2=\lambda_{kp}\left(\frac{\sigma_X}{\mu_X}\right)^2. \end{equation*}\]

Note that the term \(\sigma_X/\mu_X\) is the coefficient of variation for an individual claim. Even though \(\lambda_{kp}\) is the full credibility standard for frequency given a Poisson distribution, there is no assumption about the distribution for the number of claims.
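
As a rough illustration, the severity standard can be computed from a mean and standard deviation of individual claims. The inputs below (mean 2,000 and standard deviation 3,000) are hypothetical values, not data from the text, and the function name is an illustrative choice.

```python
from statistics import NormalDist

def full_credibility_severity(k, p, mean_claim, sd_claim):
    """n_X = lambda_kp * (sigma_X / mu_X)^2, the full credibility standard for severity."""
    y_p = NormalDist().inv_cdf((p + 1.0) / 2.0)
    coefficient_of_variation = sd_claim / mean_claim
    return (y_p / k) ** 2 * coefficient_of_variation ** 2

# Hypothetical severity moments: mean 2000, standard deviation 3000 (CV = 1.5).
n_x = full_credibility_severity(0.05, 0.95, 2000.0, 3000.0)
print(round(n_x))
```

Only the coefficient of variation matters: scaling all claim sizes by a constant leaves the standard unchanged.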

Example 9.2.6. Individual loss amounts are independently and identically distributed with a Pareto distribution \(F(x)=1-[\theta/(x+\theta)]^{\alpha}\). How many claims are required for the average severity of observed claims to be within 5\(\%\) of the expected severity with probability \(p=0.95\)?

9.2.4 Partial Credibility

In prior sections full credibility standards were calculated for estimating frequency (\(n_f\)), pure premium (\(n_{PP}\)), and severity (\(n_X\)). In this section these full credibility standards will be denoted by \(n_{0}\). In each case the full credibility standard was the expected number of claims required to achieve a defined level of accuracy when using empirical data to estimate an expected value. If the observed number of claims is greater than or equal to the full credibility standard then a full credibility weight \(Z=1\) is given to the data.

In limited fluctuation credibility, credibility weights \(Z\) assigned to data are

\[\begin{equation*} Z=\quad \sqrt{\frac{n}{n_{0}}} \quad \textrm{if} \quad n < n_{0} \quad \textrm{and} \quad Z=\quad 1 \quad \textrm{for} \quad n \geq n_{0} \end{equation*}\]

where \(n_0\) is the full credibility standard. The quantity \(n\) is the number of claims for the data that is used to estimate the expected frequency, severity, or pure premium.

Example 9.2.7. The number of claims has a Poisson distribution. Individual loss amounts are independently and identically distributed with a Pareto distribution \(F(x)=1-[\theta/(x+\theta)]^{\alpha}\). Assume that \(\alpha=3\). The number of claims and loss amounts are independent. The full credibility standard is that the observed pure premium should be within 5\(\%\) of the expected value with probability \(p=0.95\). What credibility \(Z\) is assigned to a pure premium computed from 1,000 claims?
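
One way to work this example numerically is sketched below. It combines the square-root rule with the pure premium standard for Poisson claim counts, using the standard Pareto result \((\sigma_X/\mu_X)^2=\alpha/(\alpha-2)\) for \(\alpha>2\). The function and variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def partial_credibility(n, n_full):
    """Square-root rule: Z = sqrt(n / n_0), capped at 1 for n >= n_0."""
    return min(1.0, sqrt(n / n_full))

alpha = 3.0
y_p = NormalDist().inv_cdf((0.95 + 1.0) / 2.0)
lambda_kp = (y_p / 0.05) ** 2
cv_squared = alpha / (alpha - 2.0)      # Pareto (sigma_X / mu_X)^2
n_pp = lambda_kp * (1.0 + cv_squared)   # Poisson counts: sigma_N^2 / mu_N = 1
z = partial_credibility(1000, n_pp)
print(round(n_pp), round(z, 3))
```

Under these assumptions the full credibility standard is a little over 6,100 claims, so 1,000 claims receive a credibility of about 0.40.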

Limited fluctuation credibility uses the formula \(Z=\sqrt{n/n_0}\) to limit the fluctuation in the credibility-weighted estimate to match the fluctuation allowed for data with expected claims at the full credibility standard. Variance or standard deviation is used as the measure of fluctuation. Rather than derive the square-root formula, an example is shown.

Suppose that average claim severity is being estimated from a sample of size \(n\) that is less than the full credibility standard \(n_0=n_X\). Applying credibility theory the estimate \(\hat{\mu}_X\) would be

\[\begin{equation*} \hat{\mu}_X=Z\bar{X}+(1-Z)M_X \end{equation*}\]

with \(\bar{X}=(X_1+X_2+\cdots+X_n)/n\) and independent random variables \(X_i\) representing the sizes of individual claims. The complement of credibility is applied to \(M_X\) which could be last year’s estimated average severity adjusted for inflation, the average severity for a much larger pool of risks, or some other relevant quantity selected by the actuary. It is assumed that the variance of \(M_X\) is zero or negligible. With this assumption

\[\begin{equation*} \mathrm{Var}(\hat{\mu}_X)=\mathrm{Var}(Z\bar{X})=Z^2\mathrm{Var}(\bar{X})=\frac{n}{n_0}\mathrm{Var}(\bar{X}). \end{equation*}\]

Because \(\bar{X}=(X_1+X_2+\cdots+X_n)/n\) it follows that \(\mathrm{Var}(\bar{X})=\mathrm{Var}(X)/n\) where random variable \(X\) is one claim. So,

\[\begin{equation*} \mathrm{Var}(\hat{\mu}_X)=\frac{n}{n_0}\mathrm{Var}(\bar{X})=\frac{n}{n_0}\frac{\mathrm{Var}(X)}{n}=\frac{\mathrm{Var}(X)}{n_0}. \end{equation*}\]

The last term is exactly the variance of a sample mean \(\bar{X}\) when the sample size is equal to the full credibility standard \(n_0=n_X\).

9.3 Bühlmann Credibility


In this section, you learn how to:

  • Compute a credibility-weighted estimate for the expected loss for a risk or group of risks.
  • Determine the credibility \(Z\) assigned to observations.
  • Calculate the values required in Bühlmann credibility including the Expected Value of the Process Variance (EPV), Variance of the Hypothetical Means (VHM) and collective mean \(\mu\).
  • Recognize situations when the Bühlmann model is appropriate.

A classification rating plan groups policyholders together into classes based on risk characteristics. Although policyholders within a class have similarities, they are not identical and their expected losses will not be exactly the same. An experience rating plan can supplement a class rating plan by credibility weighting an individual policyholder’s loss experience with the class rate to produce a more accurate rate for the policyholder.

In the presentation of Bühlmann credibility it is convenient to assign a risk parameter \(\theta\) to each policyholder. Losses \(X\) for the policyholder will have a common distribution function \(F_{\theta}(x)\) with mean \(\mu(\theta)=\mathrm{E}(X|\theta)\) and variance \(\sigma^2(\theta)=\mathrm{Var}(X|\theta)\). In the prior sentence losses can represent pure premiums, aggregate losses, number of claims, claim severities, or some other measure of loss. Parameter \(\theta\) can be continuous, discrete, or multivariate depending on the model.

If the policyholder had losses \(x_1, \ldots, x_n\) during \(n\) observation periods then we want to find E(\(\mu(\theta)|x_1,\ldots, x_n)\), the conditional expectation of \(\mu(\theta)\) given \(x_1,\ldots, x_n\). Another way to view this is to consider random variable \(X_{n+1}\) which is the observation during period \(n+1\). Finding E\((X_{n+1}|x_1, x_2,\ldots, x_n)\) is equivalent to finding E(\(\mu(\theta)|x_1, x_2,\ldots, x_n)\) assuming that \(X_1,\ldots, X_n, X_{n+1}\) are iid.

The Bühlmann credibility-weighted estimate for E(\(\mu(\theta)|X_1,\ldots, X_n)\) for the policyholder is

\[\begin{equation} \hat{\mu}(\theta)=Z\bar{X}+(1-Z)\mu \tag{9.6} \end{equation}\]

with

\[\begin{eqnarray*} \theta&=&\textrm{a risk parameter that identifies a policyholder's risk level}\\ \hat{\mu}(\theta)&=&\textrm{estimated expected loss for a policyholder with parameter }\theta\\ & & \textrm{and loss experience } \bar{X}\\ \bar{X}&=&(X_1+\cdots+X_n)/n \textrm{ is the average of $n$ observations of the policyholder } \\ Z&=&\textrm{credibility assigned to $n$ observations } \\ \mu&=&\textrm{the expected loss for a randomly chosen policyholder in the class.}\\ \end{eqnarray*}\]

Random variables \(X_j\) are assumed to be iid for \(j=1,\ldots,n\). The quantity \(\bar{X}\) is the average of \(n\) observations and \(\mathrm{E}(\bar{X}|\theta)=\mathrm{E}(X_j|\theta)=\mu(\theta)\).

If a policyholder is randomly chosen from the class and there is no loss information about the risk then its expected loss is \(\mu=\mathrm{E}(\mu(\theta))\) where the expectation is taken over all \(\theta\)’s in the class. In this situation \(Z=0\) and the expected loss is \(\hat\mu(\theta)=\mu\) for the risk. The quantity \(\mu\) can also be written as \(\mu=\mathrm{E}(X_j)\) or \(\mu=\mathrm{E}(\bar{X})\) and is often called the overall mean or collective mean. Note that E(\(X_j\)) is evaluated with the “law of total expectation”: E(\(X_j\))=E(E(\(X_j|\theta)\)).

Example 9.3.1. The number of claims \(X\) for an insured in a class has a Poisson distribution with mean \(\theta>0\). The risk parameter \(\theta\) is exponentially distributed within the class with pdf \(f(\theta)=e^{-\theta}\). What is the expected number of claims for an insured chosen at random from the class?
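
One way to approach this example is to apply the law of total expectation directly. Because \(\mathrm{E}(X|\theta)=\theta\) for the Poisson distribution, the collective mean is the mean of the exponential density:

\[\begin{equation*} \mu=\mathrm{E}(X)=\mathrm{E}(\mathrm{E}(X|\theta))=\mathrm{E}(\theta)=\int_0^{\infty}\theta e^{-\theta}\,d\theta=1. \end{equation*}\]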

The prior example has risk parameter \(\theta\) as a positive real number but the risk parameter can be a categorical variable as shown in the next example.

Example 9.3.2. For any risk (policyholder) in a population the number of losses \(N\) in a year has a Poisson distribution with parameter \(\lambda\). Individual loss amounts \(X_i\) for a risk are independent of \(N\) and are iid with Pareto distribution \(F(x)=1-[\theta/(x+\theta)]^{\alpha}\). There are three types of risks in the population as follows:

\[\begin{matrix} \begin{array}{|c|c|c|c|} \hline \text{Risk } & \text{Percentage} & \text{Poisson} & \text{Pareto} \\ \text{Type} & \text{of Population} & \text{Parameter} & \text{Parameters} \\ \hline A & 50\% & \lambda=0.5 & \theta=1000, \alpha=2.0 \\ B & 30\% & \lambda=1.0 & \theta=1500, \alpha=2.0 \\ C & 20\% & \lambda=2.0 & \theta=2000, \alpha=2.0 \\ \hline \end{array} \end{matrix}\]

If a risk is selected at random from the population, what is the expected aggregate loss in a year?
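
A short Python sketch of one way to compute this collective mean is shown below. It uses \(\mathrm{E}(S|\textrm{type})=\lambda\,\mathrm{E}(X)\) together with the standard Pareto mean \(\theta/(\alpha-1)\) for \(\alpha>1\), then averages over the risk types. The dictionary layout and function name are illustrative.

```python
# (population weight, Poisson lambda, Pareto theta, Pareto alpha) from the table
risk_types = {
    "A": (0.5, 0.5, 1000.0, 2.0),
    "B": (0.3, 1.0, 1500.0, 2.0),
    "C": (0.2, 2.0, 2000.0, 2.0),
}

def pareto_mean(theta, alpha):
    """Mean of F(x) = 1 - (theta / (x + theta))^alpha, valid for alpha > 1."""
    return theta / (alpha - 1.0)

# Hypothetical mean for each type: expected count times expected severity.
hypothetical_means = {
    name: lam * pareto_mean(theta, alpha)
    for name, (_, lam, theta, alpha) in risk_types.items()
}

# Law of total expectation: weight each type's mean by its population share.
mu = sum(risk_types[name][0] * mean for name, mean in hypothetical_means.items())
print(hypothetical_means, round(mu, 2))
```

The type means are 500, 1,500, and 4,000, and the population-weighted average is 1,500.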

Although formula (9.6) was introduced using experience rating as an example, the Bühlmann credibility model has wider application. Suppose that a rating plan has multiple classes. Credibility formula (9.6) can be used to determine individual class rates. The overall mean \(\mu\) would be the average loss for all classes combined, \(\bar{X}\) would be the experience for the individual class, and \(\hat{\mu}(\theta)\) would be the estimated loss for the class.

9.3.1 Credibility Z, EPV, and VHM

When computing the credibility estimate \(\hat{\mu}(\theta)=Z\bar{X}+(1-Z)\mu\), how much weight \(Z\) should go to experience \(\bar{X}\) and how much weight \((1-Z)\) to the overall mean \(\mu\)? In Bühlmann credibility there are three factors that need to be considered:

  • How much variation is there in a single observation \(X_j\) for a selected risk? With \(\bar{X}=(X_1+\cdots+X_n)/n\) and assuming that the observations are iid, it follows that Var(\(\bar{X}|\theta)\)=Var(\(X_j|\theta)/n\). For larger Var(\(\bar{X}|\theta)\) less credibility weight \(Z\) should be given to experience \(\bar{X}\). The Expected Value of the Process Variance, abbreviated EPV, is the expected value of Var(\(X_j|\theta\)) across all risks:
\[\begin{equation*} EPV=\mathrm{E}(\mathrm{Var}(X_j|\theta)). \end{equation*}\]

Because Var(\(\bar{X}|\theta)\)=Var(\(X_j|\theta)/n\) it follows that E(Var(\(\bar{X}|\theta)\))=EPV/\(n\).

  • How homogeneous is the population of risks whose experience was combined to compute the overall mean \(\mu\)? If all the risks are similar in loss potential then more weight \((1-Z)\) would be given to the overall mean \(\mu\) because \(\mu\) is the average for a group of similar risks whose means \(\mu(\theta)\) are not far apart. The homogeneity or heterogeneity of the population is measured by the Variance of the Hypothetical Means with abbreviation VHM:
\[\begin{equation*} VHM=\mathrm{Var}(\mathrm{E}(X_j|\theta))=\mathrm{Var}(\mathrm{E}(\bar{X}|\theta)). \end{equation*}\]

Note that we used \(\mathrm{E}(\bar{X}|\theta)=\mathrm{E}(X_j|\theta)\) for the second equality.

  • How many observations \(n\) were used to compute \(\bar{X}\)? More observations would imply a larger \(Z\).

Example 9.3.3. The number of claims \(N\) in a year for a risk in a population has a Poisson distribution with mean \(\lambda>0\). The risk parameters \(\lambda\) for the population are uniformly distributed over the interval (0,2). Calculate the EPV and VHM for the population.
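
One way to work this example uses two facts: for a Poisson distribution \(\mathrm{E}(N|\lambda)=\mathrm{Var}(N|\lambda)=\lambda\), and a uniform distribution on \((a,b)\) has mean \((a+b)/2\) and variance \((b-a)^2/12\). The sketch below uses exact fractions, and the variable names are illustrative.

```python
from fractions import Fraction

a, b = Fraction(0), Fraction(2)  # lambda is uniform on (0, 2)

# EPV = E(Var(N | lambda)) = E(lambda), the mean of the uniform distribution.
epv = (a + b) / 2

# VHM = Var(E(N | lambda)) = Var(lambda), the variance of the uniform distribution.
vhm = (b - a) ** 2 / 12

print(epv, vhm)
```

Under these facts the EPV is 1 and the VHM is 1/3.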

The Bühlmann credibility formula includes values for \(n\), EPV, and VHM:

\[\begin{equation} Z=\frac{n}{n+K} \quad , \quad K =\frac{EPV}{VHM}. \tag{9.7} \end{equation}\]

If \(n\) increases then so does \(Z\). If the VHM increases then \(Z\) increases. If the EPV increases then \(Z\) gets smaller. Unlike limited fluctuation credibility, where \(Z=1\) when the expected number of claims is greater than or equal to the full credibility standard, in the Bühlmann model \(Z\) approaches but never equals 1 as the number of observations \(n\) goes to infinity.

If you multiply the numerator and denominator of the \(Z\) formula by (VHM/\(n\)) then \(Z\) can be rewritten as

\[\begin{equation*} Z=\frac{VHM}{VHM+(EPV/n)} . \end{equation*}\]

The number of observations \(n\) is captured in the term (EPV/\(n\)). As shown in the first bullet at the beginning of the section, E(Var(\(\bar{X}|\theta)\))=EPV/\(n\). As the number of observations gets larger, the expected variance of \(\bar{X}\) gets smaller and credibility \(Z\) increases so that more weight gets assigned to \(\bar{X}\) in the credibility-weighted estimate \(\hat{\mu}(\theta)\).
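
The two equivalent forms of \(Z\) can be compared with a small Python sketch. The EPV and VHM values below are hypothetical inputs chosen so that \(K=3\), and the function names are illustrative.

```python
def buhlmann_z(n, epv, vhm):
    """Equation (9.7): Z = n / (n + K) with K = EPV / VHM."""
    return n / (n + epv / vhm)

def buhlmann_z_alt(n, epv, vhm):
    """Equivalent form: Z = VHM / (VHM + EPV / n)."""
    return vhm / (vhm + epv / n)

# Hypothetical inputs: EPV = 3 and VHM = 1, so K = 3.
epv, vhm = 3.0, 1.0
z_values = [buhlmann_z(n, epv, vhm) for n in (1, 3, 30, 3000)]
print([round(z, 3) for z in z_values])
```

As \(n\) grows, \(Z\) rises toward 1 without reaching it, and \(Z=0.5\) exactly when \(n=K\).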

Example 9.3.4. Use the “law of total variance” to show that Var(\(\bar{X}\)) = VHM + (EPV/n) and derive a formula for \(Z\) in terms of \(\bar{X}\).

The following long example and solution demonstrate how to compute the credibility-weighted estimate with frequency and severity data.

Example 9.3.5. For any risk in a population the number of losses \(N\) in a year has a Poisson distribution with parameter \(\lambda\). Individual loss amounts \(X\) for a selected risk are independent of \(N\) and are iid with exponential distribution \(F(x)=1-e^{-x/\beta}\). There are three types of risks in the population as shown below. A risk was selected at random from the population and all losses were recorded over a five-year period. The total amount of losses over the five-year period was 5,000. Use Bühlmann credibility to estimate the annual expected aggregate loss for the risk.
\[\begin{matrix} \begin{array}{|c|c|c|c|} \hline \text{Risk } & \text{Percentage} & \text{Poisson} & \text{Exponential} \\ \text{Type} & \text{of Population} & \text{Parameter} & \text{Parameter} \\ \hline A & 50\% & \lambda=0.5 & \beta=1000 \\ B & 30\% & \lambda=1.0 & \beta=1500 \\ C & 20\% & \lambda=2.0 & \beta=2000 \\ \hline \end{array} \end{matrix}\]
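
One way to organize the calculation is sketched below in Python. It treats annual aggregate losses as the observations and uses the compound Poisson moments \(\mathrm{E}(S|\textrm{type})=\lambda\beta\) and \(\mathrm{Var}(S|\textrm{type})=\lambda\,\mathrm{E}(X^2)=2\lambda\beta^2\) for exponential severities with mean \(\beta\). The variable names are illustrative.

```python
# (population weight, Poisson lambda, exponential mean beta) from the table
risk_types = [(0.5, 0.5, 1000.0), (0.3, 1.0, 1500.0), (0.2, 2.0, 2000.0)]

weights = [w for w, _, _ in risk_types]
# Hypothetical means and process variances of annual aggregate losses by type.
means = [lam * beta for _, lam, beta in risk_types]
variances = [2.0 * lam * beta ** 2 for _, lam, beta in risk_types]

mu = sum(w * m for w, m in zip(weights, means))                   # collective mean
epv = sum(w * v for w, v in zip(weights, variances))              # E(Var(S | type))
vhm = sum(w * m ** 2 for w, m in zip(weights, means)) - mu ** 2   # Var(E(S | type))

n_years = 5
x_bar = 5000.0 / n_years          # observed average annual loss
k_ratio = epv / vhm               # K = EPV / VHM
z = n_years / (n_years + k_ratio)
estimate = z * x_bar + (1.0 - z) * mu
print(round(mu), round(epv), round(vhm), round(z, 3), round(estimate, 2))
```

Under these assumptions \(\mu=1{,}500\), EPV is 5,050,000, VHM is 1,750,000, \(Z\) is about 0.634, and the credibility-weighted estimate is roughly 1,183, which lies between the observed average of 1,000 and the collective mean.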

In real world applications of Bühlmann credibility the value of \(K=EPV/VHM\) must be estimated. Sometimes a value for \(K\) is selected using judgment. A smaller \(K\) makes estimator \(\hat{\mu}(\theta)\) more responsive to actual experience \(\bar{X}\) whereas a larger \(K\) produces a more stable estimate by giving more weight to \(\mu\). Judgment may be used to balance responsiveness and stability. A later section in this chapter will discuss methods for determining \(K\) from data.

For a policyholder with risk parameter \(\theta\), Bühlmann credibility uses a linear approximation \(\hat{\mu}(\theta)=Z\bar{X}+(1-Z)\mu\) to estimate E(\(\mu(\theta)|X_1,\ldots,X_n\)), the expected loss for the policyholder given prior losses \(X_1,\ldots, X_n\). We can rewrite this as \(\hat{\mu}(\theta)=a+b\bar{X}\) which makes it obvious that the credibility estimate is a linear function of \(\bar{X}\).

If E(\(\mu(\theta)|X_1,\ldots,X_n\)) is approximated by the linear function \(a+b\bar{X}\) and constants \(a\) and \(b\) are chosen so that E[(E(\(\mu(\theta)|X_1,\ldots,X_n)-(a+b\bar{X}))^2\)] is minimized, what are \(a\) and \(b\)? The answer is \(b=n/(n+K)\) and \(a=(1-b)\mu\) with \(K=EPV/VHM\) and \(\mu=E(\mu(\theta))\). More detail can be found in references (Buhlmann 1967), (Buhlmann and Gisler 2005), (Klugman, Panjer, and Willmot 2012), and (Tse 2009).

Bühlmann credibility is also called least-squares credibility, greatest accuracy credibility, or Bayesian credibility.

9.4 Bühlmann-Straub Credibility


In this section, you learn how to:

  • Compute a credibility-weighted estimate for the expected loss for a risk or group of risks using the Bühlmann-Straub model.
  • Determine the credibility \(Z\) assigned to observations.
  • Calculate required values including the Expected Value of the Process Variance (EPV), Variance of the Hypothetical Means (VHM) and collective mean \(\mu\).
  • Recognize situations when the Bühlmann-Straub model is appropriate.

With standard Bühlmann or least-squares credibility as described in the prior section, losses \(X_1,\ldots,X_n\) for a policyholder are assumed to be iid. If the subscripts indicate year 1, year 2 and so on up to year \(n\), then the iid assumption means that the policyholder has the same exposure to loss every year. For commercial insurance this assumption is frequently violated.

Consider a commercial policyholder that uses a fleet of vehicles in its business. In year 1 there are \(m_1\) vehicles in the fleet, \(m_2\) vehicles in year 2, \(\ldots\), and \(m_n\) vehicles in year \(n\). The exposure to loss from ownership and use of this fleet is not constant from year to year. The annual losses for the fleet are not iid.

Define \(Y_{jk}\) to be the loss for the \(k^{th}\) vehicle in the fleet for year \(j\). Then, the total losses for the fleet in year \(j\) are \(Y_{j1}+\cdots+Y_{jm_j}\) where we are adding up the losses for each of the \(m_j\) vehicles. In the Bühlmann-Straub model it is assumed that random variables \(Y_{jk}\) are iid across all vehicles and years for the policyholder. With this assumption the means E(\(Y_{jk}|\theta)=\mu(\theta)\) and variances Var(\(Y_{jk}|\theta)=\sigma^2(\theta)\) are the same for all vehicles and years. The quantity \(\mu(\theta)\) is the expected loss and \(\sigma^2(\theta)\) is the variance in the loss for one year for one vehicle for a policyholder with risk parameter \(\theta\).

If \(X_j\) is the average loss per unit of exposure in year \(j\), \(X_j=(Y_{j1}+\cdots+Y_{jm_j})/m_j\), then E(\(X_j|\theta)=\mu(\theta)\) and Var(\(X_j|\theta)=\sigma^2(\theta)/m_j\) for a policyholder with risk parameter \(\theta\). The average loss per vehicle for the entire \(n\)-year period is

\[\begin{equation*} \bar{X}= \frac{1}{m} \sum_{j=1}^{n} m_j X_{j} \quad , \quad m=\sum_{j=1}^{n} m_j. \end{equation*}\]

It follows that E\((\bar{X}|\theta)=\mu(\theta)\) and Var\((\bar{X}|\theta)=\sigma^2(\theta)/m\) where \(\mu(\theta)\) and \(\sigma^2(\theta)\) are the mean and variance for a single vehicle for one year for the policyholder.

Example 9.4.1. Prove that Var\((\bar{X}|\theta)=\sigma^2(\theta)/m\) for a risk with risk parameter \(\theta\).
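A sketch of one possible argument follows. It assumes, as in the model above, that the yearly averages \(X_1,\ldots,X_n\) are independent given \(\theta\) with Var(\(X_j|\theta)=\sigma^2(\theta)/m_j\):

\[\begin{eqnarray*} \mathrm{Var}(\bar{X}|\theta)&=&\mathrm{Var}\left(\frac{1}{m}\sum_{j=1}^{n} m_j X_j \,\Big|\, \theta\right)=\frac{1}{m^2}\sum_{j=1}^{n} m_j^2\,\mathrm{Var}(X_j|\theta)\\ &=&\frac{1}{m^2}\sum_{j=1}^{n} m_j^2\,\frac{\sigma^2(\theta)}{m_j}=\frac{\sigma^2(\theta)}{m^2}\sum_{j=1}^{n} m_j=\frac{\sigma^2(\theta)}{m}. \end{eqnarray*}\]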


The Bühlmann-Straub credibility estimate is:

\[\begin{equation}\hat{\mu}(\theta)=Z\bar{X}+(1-Z)\mu \tag{9.8} \end{equation}\]

with

\[\begin{eqnarray*} \theta&=&\textrm{a risk parameter that identifies a policyholder's risk level}\\ \hat{\mu}(\theta)&=&\textrm{estimated expected loss for one exposure for the policyholder}\\ & & \textrm{with loss experience } \bar{X}\\ \bar{X}&=& \frac{1}{m} \sum_{j=1}^{n} m_j X_j \textrm{ is the average loss per exposure for m exposures } \\ Z&=&\textrm{credibility assigned to $m$ exposures } \\ \mu&=&\textrm{expected loss for one exposure for randomly chosen}\\ & & \textrm{ policyholder from population.}\\ \end{eqnarray*}\]

Note that \(\hat{\mu}(\theta)\) is the estimator for the expected loss for one exposure. If the policyholder has \(m_j\) exposures then the expected loss is \(m_j\hat{\mu}(\theta)\).

In an example in the prior section it was shown that \(Z\)=Var(E(\(\bar{X}|\theta\)))/Var(\(\bar{X}\)) where \(\bar{X}\) is the average loss for \(n\) observations. In equation (9.8) the \(\bar{X}\) is the average loss for \(m\) exposures and the same \(Z\) formula can be used:

\[\begin{equation*} Z=\frac{\mathrm{Var}(\mathrm{E}(\bar{X}|\theta))}{\mathrm{Var}(\bar{X})}= \frac{\mathrm{Var}(\mathrm{E}(\bar{X}|\theta))}{\mathrm{E}(\mathrm{Var}(\bar{X}|\theta))+\mathrm{Var}(\mathrm{E}(\bar{X}|\theta))}. \end{equation*}\]

The denominator was expanded using “the law of total variance.” As noted above \(\mathrm{E}(\bar{X}|\theta)=\mu(\theta)\) so \(\mathrm{Var}(\mathrm{E}(\bar{X}|\theta))=\mathrm{Var}(\mu(\theta))=VHM\). Because Var\((\bar{X}|\theta)=\sigma^2(\theta)/m\) it follows that E(Var(\(\bar{X}|\theta\)))=E(\(\sigma^2(\theta))/m\)=\(EPV/m\). Making these substitutions and a little algebra gives

\[\begin{equation} Z=\frac{m}{m+K} \quad , \quad K =\frac{EPV}{VHM}. \tag{9.9} \end{equation}\]

This is the same \(Z\) as for Bühlmann credibility except number of exposures \(m\) replaces number of years or observations \(n\).

Example 9.4.2.
A commercial automobile policyholder had the following exposures and claims over a three-year period: \[\begin{matrix} \begin{array}{|c|c|c|} \hline \text{Year} & \text{Number of Vehicles} & \text{Number of Claims} \\ \hline 1 & 9 & 5 \\ 2 & 12 & 4 \\ 3 & 15 & 4 \\ \hline \end{array} \end{matrix}\]

  • The number of claims in a year for each vehicle in the policyholder’s fleet is Poisson distributed with the same mean (parameter) \(\lambda\).
  • Parameter \(\lambda\) is distributed among the policyholders in the population with pdf \(f(\lambda)=6\lambda(1-\lambda)\) with \(0<\lambda<1\).

The policyholder has 18 vehicles in its fleet in year 4. Use Bühlmann-Straub credibility to estimate the expected number of policyholder claims in year 4.
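The arithmetic for this example can be sketched with exact fractions. This is an illustrative outline under the stated assumptions, not the chapter's own solution. The two prior moments are obtained by integrating the given pdf directly, and the variable names are chosen here for illustration.

```python
from fractions import Fraction

vehicles = [9, 12, 15]
claims = [5, 4, 4]

# Prior pdf f(lam) = 6*lam*(1-lam) on (0, 1).
# E(lam)   = integral of 6*lam^2*(1-lam) = 6*(1/3 - 1/4) = 1/2
# E(lam^2) = integral of 6*lam^3*(1-lam) = 6*(1/4 - 1/5) = 3/10
mean_lam = Fraction(1, 2)
second_moment_lam = Fraction(3, 10)

mu = mean_lam                              # collective mean per vehicle-year
epv = mean_lam                             # Poisson: Var(N | lam) = lam, so EPV = E(lam)
vhm = second_moment_lam - mean_lam ** 2    # Var(lam)

m = sum(vehicles)                          # total exposures over three years
x_bar = Fraction(sum(claims), m)           # observed claims per vehicle-year
k = epv / vhm
z = Fraction(m) / (m + k)
per_vehicle = z * x_bar + (1 - z) * mu     # estimate for one exposure
year4_estimate = 18 * per_vehicle          # 18 vehicles in year 4
print(k, z, per_vehicle, float(year4_estimate))
```

The per-vehicle estimate is multiplied by the 18 vehicles in year 4 because \(\hat{\mu}(\theta)\) is an estimate for a single exposure.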


9.5 Bayesian Inference and Bühlmann


In this section, you learn how to:

  • Use Bayes Theorem to determine a formula for the expected loss of a risk when given a likelihood and prior distribution.
  • Determine the posterior distributions for the Gamma-Poisson and Beta-Binomial Bayesian models and compute expected values.
  • Understand the connection between the Bühlmann and Bayesian estimates for the Gamma-Poisson and Beta-Binomial models.

Section 4.4 reviews Bayesian inference and it is assumed that the reader is familiar with that material. This section will compare Bayesian inference and Bühlmann credibility and show connections between the two models.

A risk with risk parameter \(\theta\) has expected loss \(\mu(\theta)=E(X|\theta)\) with random variable \(X\) representing pure premium, aggregate loss, number of claims, claim severity, or some other measure of loss. If the risk had \(n\) losses \(x_1,\ldots, x_n\) then E(\(\mu(\theta)|x_1,\ldots, x_n)\) is the conditional expectation of \(\mu(\theta)\). The Bühlmann credibility formula \(\hat{\mu}(\theta)=Z\bar{X}+(1-Z)\mu\) is a linear function of \(\bar{X}=(x_1+\cdots+x_n)/n\) used to estimate \(E(\mu(\theta)|x_1,\ldots,x_n)\).

Expectation \(E(\mu(\theta)|x_1,\ldots,x_n)\) can be calculated from the conditional density function \(f(x|\theta)\) and the posterior distribution \(\pi(\theta|x_1,\ldots,x_n)\):

\[\begin{eqnarray*} \mathrm{E}(\mu(\theta)|x_1,\ldots,x_n)&=&\int \mu(\theta) \pi(\theta|x_1,\ldots,x_n) d\theta \\ \mu(\theta)&=&\mathrm{E}(X|\theta)=\int xf(x|\theta) dx .\\ \end{eqnarray*}\]

The posterior distribution comes from Bayes theorem

\[\begin{equation*} \pi(\theta|x_1,\ldots,x_n)=\frac{\prod_{j=1}^{n} f(x_j|\theta)}{f(x_1,\ldots,x_n)}\pi({\theta}). \end{equation*}\]

The conditional density function \(f(x|\theta)\) and the prior distribution \(\pi(\theta)\) must be specified. The numerator on the right-hand side is called the likelihood.

9.5.1 Gamma-Poisson Model

In the Gamma-Poisson model the number of claims \(X\) has a Poisson distribution Pr(\(X=x|\lambda)=\lambda^xe^{-\lambda}/x!\) for a risk with risk parameter \(\lambda\). The prior distribution for \(\lambda\) is gamma with \(\pi(\lambda)=\beta^\alpha\lambda^{\alpha-1}e^{-\beta\lambda}/\Gamma(\alpha)\). (Note that a rate parameter \(\beta\) is being used in the gamma distribution rather than a scale parameter.) The mean of the gamma is E(\(\lambda)=\alpha/\beta\) and the variance is Var(\(\lambda)=\alpha/\beta^2\). In this section we will assume that \(\lambda\) is the expected number of claims per year though we could have chosen another time interval.

If a risk is selected at random from the population then the expected number of claims in a year is E(\(N\))=E(E(\(N|\lambda\)))=E(\(\lambda\))=\(\alpha/\beta\). If we had no observations for the selected risk then the expected number of claims for the risk is \(\alpha/\beta\).

During \(n\) years the following number of claims by year was observed for the randomly selected risk: \(x_1,\ldots,x_n\). From Bayes theorem the posterior distribution is

\[\begin{equation*} \pi(\lambda|x_1,\ldots,x_n)=\frac{\prod_{j=1}^{n} (\lambda^{x_j}e^{-\lambda}/x_j!)}{\Pr(x_1,\ldots,x_n)}\beta^\alpha\lambda^{\alpha-1}e^{-\beta\lambda}/\Gamma(\alpha). \end{equation*}\]

Combining terms that have a \(\lambda\) and putting all other terms into constant \(C\) gives

\[\begin{equation*} \pi(\lambda|x_1,\ldots,x_n)=C\lambda^{(\alpha+\sum_{j=1}^{n}x_j)-1}e^{-(\beta+n)\lambda}. \end{equation*}\]

This is a gamma distribution with parameters \(\alpha'=\alpha+\sum_{j=1}^{n}x_j\) and \(\beta'=\beta+n\). The constant must be \(C={\beta'}^{\alpha'}/\Gamma(\alpha')\) so that \(\int_{0}^{\infty}\pi(\lambda|x_1,\ldots,x_n) d\lambda=1\) though we do not need to know \(C\). As explained in Chapter 4 the gamma distribution is a conjugate prior for the Poisson distribution so the posterior distribution is also gamma.

Because the posterior distribution is gamma the expected number of claims for the selected risk is

\[\begin{equation*} \mathrm{E}(\lambda|x_1,\ldots,x_n) = \frac{\alpha+\sum_{j=1}^{n}x_j}{\beta+n}=\frac{\alpha + \textrm{number of claims}}{\beta+\textrm{number of years}}. \end{equation*}\]

This formula is slightly different from the one in Chapter 4 because \(\beta\) multiplies \(\lambda\) in the exponential of the gamma pdf, whereas in Chapter 4 \(\lambda\) is divided by parameter \(\theta\).

Now we will compute the Bühlmann credibility estimate for the Gamma-Poisson model. The variance for a Poisson distribution with parameter \(\lambda\) is \(\lambda\) so EPV=E(Var(\(X|\lambda\)))=E(\(\lambda\))=\(\alpha/\beta\). The mean number of claims for the risk is \(\lambda\) so VHM=Var(E(\(X|\lambda\)))=Var(\(\lambda\))=\(\alpha/\beta^2\). The credibility parameter is \(K\)=EPV/VHM=\((\alpha/\beta)/(\alpha/\beta^2)=\beta\). The overall mean is E(E(\(X|\lambda\)))=E(\(\lambda\))=\(\alpha/\beta\). The sample mean is \(\bar{X}=(\sum_{j=1}^{n}x_j)/n\). The credibility-weighted estimate for the expected number of claims for the risk is

\[\begin{equation*} \hat{\mu}=\frac{n}{n+\beta}\frac{\sum_{j=1}^{n}x_j}{n} +\left(1-\frac{n}{n+\beta}\right)\frac{\alpha}{\beta}=\frac{\alpha+\sum_{j=1}^{n}x_j}{\beta+n}. \end{equation*}\]

For the Gamma-Poisson model the Bühlmann credibility estimate equals the Bayesian analysis answer.
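The equality can be checked numerically. The sketch below uses exact fractions and hypothetical inputs (prior parameters \(\alpha=3\), \(\beta=2\) and four years of claim counts); the function names are illustrative only.

```python
from fractions import Fraction

def bayes_posterior_mean(alpha, beta, claims):
    # Mean of the gamma posterior with shape alpha + sum(x) and rate beta + n
    return Fraction(alpha + sum(claims), beta + len(claims))

def buhlmann_estimate(alpha, beta, claims):
    n = len(claims)
    epv = Fraction(alpha, beta)          # E(lambda)
    vhm = Fraction(alpha, beta ** 2)     # Var(lambda)
    k = epv / vhm                        # reduces to beta
    z = Fraction(n) / (n + k)
    x_bar = Fraction(sum(claims), n)
    mu = Fraction(alpha, beta)           # collective mean
    return z * x_bar + (1 - z) * mu

alpha, beta = 3, 2          # hypothetical prior parameters
claims = [1, 0, 2, 1]       # hypothetical annual claim counts
bayes = bayes_posterior_mean(alpha, beta, claims)
buhlmann = buhlmann_estimate(alpha, beta, claims)
print(bayes, buhlmann)
```

Both functions return the same fraction for these inputs, as the algebra above predicts.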

9.5.2 Exact Credibility

For the Gamma-Poisson claims model the Bühlmann credibility estimate for the expected number of claims exactly matches the Bayesian answer. The term exact credibility is applied in this situation. Exact credibility may occur if the probability distribution for \(X_j\) is in the linear exponential family and the prior distribution is a conjugate prior. Besides the Gamma-Poisson model other examples include Gamma-Exponential, Normal-Normal, and Beta-Binomial. More information about exact credibility can be found in (Buhlmann and Gisler 2005), (Klugman, Panjer, and Willmot 2012), and (Tse 2009).

The beta-binomial model is useful for modeling the probability of an event. Assume that random variable \(X\) is the number of successes in \(n\) trials and that \(X\) has a binomial distribution Pr(\(X=x|p)=\binom{n}{x}p^x(1-p)^{n-x}\). In the beta-binomial model the prior distribution for probability \(p\) is a beta distribution with pdf

\[\begin{equation*} \pi(p)=\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}p^{\alpha-1}(1-p)^{\beta-1} , \quad 0<p<1, \alpha>0, \beta>0. \end{equation*}\]

The posterior distribution for \(p\) given outcome \(x\) is

\[\begin{equation*} \pi(p|x)=\frac{\binom{n}{x}p^x(1-p)^{n-x}}{\Pr(x)}\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}p^{\alpha-1}(1-p)^{\beta-1}. \end{equation*}\]

Combining terms that have a \(p\) and putting everything else into the constant \(C\) yields

\[\begin{equation*} \pi(p| x)=Cp^{\alpha+x-1}(1-p)^{\beta+(n-x)-1}. \end{equation*}\]

This is a beta distribution with new parameters \(\alpha^\prime=\alpha+x\) and \(\beta^\prime=\beta+(n-x)\). The constant must be \(C=\frac{\Gamma(\alpha+\beta+n)}{\Gamma(\alpha+x)\Gamma(\beta+n-x)}\).

The mean for the beta distribution with parameters \(\alpha\) and \(\beta\) is E(\(p)=\frac{\alpha}{\alpha+\beta}\). Given \(x\) successes in \(n\) trials in the beta-binomial model the mean of the posterior distribution is E(\(p|x)=\frac{\alpha+x}{\alpha+\beta+n}\). As the number of trials \(n\) and successes \(x\) increase, the expected value of \(p\) approaches \(x/n\). The Bühlmann credibility estimate for E(\(p|x\)) is exactly the same as shown in the following example.

Example 9.5.1. The probability that a coin toss will yield heads is \(p\). The prior distribution for probability \(p\) is beta with parameters \(\alpha\) and \(\beta\). On \(n\) tosses of the coin there were exactly \(x\) heads. Use Bühlmann credibility to estimate the expected value of \(p\).


Parameter \(K\)=EPV/VHM=[E(\(p\))-E(\(p^2\))]/Var(\(p\)). With some algebra this reduces to \(K=\alpha+\beta\). The Bühlmann credibility-weighted estimate is

\[\begin{align*} \hat{p} &= \frac{n}{n+\alpha+\beta}\left(\frac{x}{n}\right)+\left(1-\frac{n}{n+\alpha+\beta}\right)\frac{\alpha}{\alpha+\beta} \\ \hat{p} & =\frac{\alpha+x}{\alpha+\beta+n}\\ \end{align*}\]

which is the same as the Bayesian posterior mean.
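The reduction \(K=\alpha+\beta\) and the agreement with the posterior mean can be verified for particular numbers. The sketch below uses hypothetical values (\(\alpha=2\), \(\beta=3\), \(n=10\), \(x=4\)) and the standard beta moments; the function name is illustrative only.

```python
from fractions import Fraction

def beta_binomial_buhlmann(alpha, beta, n, x):
    a, b = Fraction(alpha), Fraction(beta)
    mean_p = a / (a + b)
    var_p = a * b / ((a + b) ** 2 * (a + b + 1))
    second_moment_p = var_p + mean_p ** 2
    epv = mean_p - second_moment_p     # E(p(1-p)), the process variance of one toss
    vhm = var_p
    k = epv / vhm
    z = Fraction(n) / (n + k)
    return k, z * Fraction(x, n) + (1 - z) * mean_p

k, p_hat = beta_binomial_buhlmann(alpha=2, beta=3, n=10, x=4)
posterior_mean = Fraction(2 + 4, 2 + 3 + 10)
print(k, p_hat, posterior_mean)
```

For these inputs \(K\) equals \(\alpha+\beta=5\) and the two estimates coincide.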

9.6 Estimating Credibility Parameters


In this section, you learn how to:

  • Perform nonparametric estimation with the Bühlmann and Bühlmann-Straub credibility models.
  • Identify situations when semiparametric estimation is appropriate.
  • Use data to approximate the EPV and VHM.
  • Balance credibility-weighted estimates.

The examples in this chapter have provided assumptions for calculating credibility parameters. In actual practice the actuary must use real world data and judgment to determine credibility parameters.

9.6.1 Full Credibility Standard for Limited Fluctuation Credibility

Limited-fluctuation credibility requires a full credibility standard. The general formula for aggregate losses or pure premium is

\[\begin{equation*} n_S=\left(\frac{y_p}{k}\right)^2\left[\left(\frac{\sigma_N^2}{\mu_N}\right)+\left(\frac{\sigma_X}{\mu_X}\right)^2\right] \end{equation*}\]

with \(N\) representing number of claims and \(X\) the size of claims. If one assumes \(\sigma_X=0\) then the full credibility standard for frequency results. If \(\sigma_N=0\) then the full credibility formula for severity follows. Probability \(p\) and \(k\) value are often selected using judgment and experience.

In practice it is often assumed that the number of claims is Poisson distributed so that \(\sigma_N^2/\mu_N=1\). In this case the formula can be simplified to

\[\begin{equation*} n_S=\left(\frac{y_p}{k}\right)^2\left[\frac{\mathrm{E}(X^2)}{(\mathrm{E}(X))^2}\right]. \end{equation*}\]

An empirical mean and second moment for the sizes of individual claim losses can be computed from past data, if available.
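A minimal sketch of this calculation follows. It assumes Poisson claim counts and assumes that \(y_p\) is the standard normal value satisfying \(\Pr(-y_p\leq Z\leq y_p)=p\). The claim amounts, the default choices \(p=0.90\) and \(k=0.05\), and the function name are hypothetical.

```python
from statistics import NormalDist

def full_credibility_claims(claim_sizes, p=0.90, k=0.05):
    # y_p: assumed to satisfy Pr(-y_p <= Z <= y_p) = p for standard normal Z
    y_p = NormalDist().inv_cdf((1 + p) / 2)
    n = len(claim_sizes)
    first_moment = sum(claim_sizes) / n                  # empirical E(X)
    second_moment = sum(x * x for x in claim_sizes) / n  # empirical E(X^2)
    return (y_p / k) ** 2 * second_moment / first_moment ** 2

sizes = [500.0, 1200.0, 800.0, 3000.0, 1500.0]   # hypothetical claim amounts
print(round(full_credibility_claims(sizes)))
```

When all claim sizes are equal the ratio \(\mathrm{E}(X^2)/(\mathrm{E}(X))^2\) is 1 and the result reduces to the frequency-only standard \((y_p/k)^2\).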

9.6.2 Nonparametric Estimation for Bühlmann and Bühlmann-Straub Models

Bayesian analysis as described previously requires assumptions about a prior distribution and likelihood. It is possible to produce estimates without these assumptions and these methods are often referred to as empirical Bayes methods. Bühlmann and Bühlmann-Straub credibility with parameters estimated from the data are included in the category of empirical Bayes methods.

Bühlmann Model. First we will address the simpler Bühlmann model. Assume that there are \(r\) risks in a population. For risk \(i\) with risk parameter \(\theta_i\) the losses for \(n\) periods are \(X_{i1},\ldots, X_{in}\). The losses for a risk are iid across periods as assumed in the Bühlmann model. For risk \(i\) the sample mean is \(\bar{X}_i=\sum_{j=1}^{n}X_{ij}/n\) and the unbiased sample process variance is \(s_i^2=\sum_{j=1}^{n}(X_{ij}-\bar{X}_i)^2/(n-1)\). An unbiased estimator for the EPV can be calculated by taking the average of \(s_i^2\) for the \(r\) risks in the population:

\[\begin{equation} \widehat{EPV}=\frac{1}{r}\sum_{i=1}^{r} s_i^2 = \frac{1}{r(n-1)} \sum_{i=1}^{r} \sum_{j=1}^{n}(X_{ij}-\bar{X}_i)^2 . \tag{9.10} \end{equation}\]

The individual risk means \(\bar{X}_i\) for \(i=1,\ldots, r\) can be used to estimate the VHM. An unbiased estimator of Var(\(\bar{X}_i\)) is

\[\begin{equation*} \widehat{\mathrm{Var}}(\bar{X}_i)=\frac{1}{r-1} \sum_{i=1}^{r}(\bar{X}_i-\bar{X})^2 \textrm{ and } \bar{X}=\frac{1}{r}\sum_{i=1}^{r} \bar{X}_i, \end{equation*}\]

but Var(\(\bar{X}_i\)) is not the VHM. The total variance formula is

\[\begin{equation*} \mathrm{Var}(\bar{X}_i)=\textrm{E(Var}(\bar{X}_i|\Theta=\theta_i))+\textrm{Var(E}(\bar{X}_i|\Theta=\theta_i)). \end{equation*}\]

The VHM is the second term on the right because \(\mu(\theta_i)=\mathrm{E}(\bar{X}_i|\Theta=\theta_i)\) is the hypothetical mean for risk \(i\). So,

\[\begin{equation*} VHM=\mathrm{Var}(\mu(\theta_i)) = \mathrm{Var}(\bar{X}_i) - \textrm{E(Var}(\bar{X}_i|\Theta=\theta_i)). \end{equation*}\]

As discussed previously in Section 9.3.1, EPV/n = E(Var(\(\bar{X}_i|\Theta=\theta_i\))) and using the above estimators gives an unbiased estimator for the VHM:

\[\begin{equation} \widehat{VHM} = \frac{1}{r-1} \sum_{i=1}^{r}(\bar{X}_i-\bar{X})^2 - \frac{\widehat{EPV}}{n} . \tag{9.11} \end{equation}\]

Although the expected loss for a risk with parameter \(\theta_i\) is \(\mu(\theta_i)\)=E(\(\bar{X}_i|\Theta=\theta_i\)), the variance of the sample mean \(\bar{X}_i\) is greater than or equal to the variance of the hypothetical means: Var(\(\bar{X}_i)\geq\)Var(\(\mu(\theta_i)\)). The variance in the sample means Var(\(\bar{X}_i\)) includes both the variance in the hypothetical means plus a process variance term because for individual observations \(X_{ij}\), \(Var(X_{ij}|\Theta=\theta_i)>0\).

In some cases formula (9.11) can produce a negative value for \(\widehat{VHM}\) because of the subtraction of \(\widehat{EPV}/n\), but a variance cannot be negative. This happens when the process variance within risks is so large that it overwhelms the measurement of the variance in means between risks. In that case we cannot use this method to determine the values needed for Bühlmann credibility.

Example 9.6.1. Two policyholders had claims over a three-year period as shown in the table below. Estimate the expected number of claims for each policyholder using Bühlmann credibility and calculating necessary parameters from the data.

\[\begin{matrix} \begin{array}{|c|c|c|} \hline \text{Year} & \text{Risk A} & \text{Risk B} \\ \hline 1 & 0 & 2 \\ 2 & 1 & 1 \\ 3 & 0 & 2 \\ \hline \end{array} \end{matrix}\]
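The estimators in equations (9.10) and (9.11) can be applied to this table with a short Python sketch. It is an illustration under the assumption of an equal number of periods for every risk, not the chapter's own solution, and the function name is chosen here.

```python
from statistics import mean, variance

def nonparametric_buhlmann(losses_by_risk):
    # losses_by_risk: one list of n period losses per risk (equal n assumed)
    n = len(losses_by_risk[0])
    risk_means = [mean(x) for x in losses_by_risk]
    overall_mean = mean(risk_means)
    epv = mean([variance(x) for x in losses_by_risk])   # equation (9.10)
    vhm = variance(risk_means) - epv / n                 # equation (9.11)
    if vhm <= 0:
        raise ValueError("VHM estimate is not positive; method not usable")
    k = epv / vhm
    z = n / (n + k)
    estimates = [z * xb + (1 - z) * overall_mean for xb in risk_means]
    return epv, vhm, z, estimates

risk_a = [0.0, 1.0, 0.0]
risk_b = [2.0, 1.0, 2.0]
epv, vhm, z, estimates = nonparametric_buhlmann([risk_a, risk_b])
print(round(epv, 4), round(vhm, 4), round(z, 4), [round(e, 4) for e in estimates])
```

Here `statistics.variance` computes the unbiased sample variance with divisor \(n-1\), which matches the definition of \(s_i^2\).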


Example 9.6.2. Two policyholders had claims over a three-year period as shown in the table below. Calculate the nonparametric estimate for the VHM.

\[\begin{matrix} \begin{array}{|c|c|c|} \hline \text{Year} & \text{Risk A} & \text{Risk B} \\ \hline 1 & 3 & 3 \\ 2 & 0 & 0 \\ 3 & 0 & 3 \\ \hline \end{array} \end{matrix}\]
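A direct evaluation of equations (9.10) and (9.11) for this table can be sketched as follows. The variable names are illustrative.

```python
from statistics import mean, variance

risk_a = [3.0, 0.0, 0.0]
risk_b = [3.0, 0.0, 3.0]
n = len(risk_a)

# Average of the unbiased within-risk sample variances, equation (9.10)
epv_hat = mean([variance(risk_a), variance(risk_b)])
# Sample variance of the risk means less EPV/n, equation (9.11)
vhm_hat = variance([mean(risk_a), mean(risk_b)]) - epv_hat / n
print(epv_hat, vhm_hat)
```

The resulting \(\widehat{VHM}\) is negative for this data, so the table illustrates the situation in which the nonparametric method cannot be used.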


Bühlmann-Straub Model. Empirical formulas for EPV and VHM in the Bühlmann-Straub model are more complicated because a risk’s number of exposures can change from one period to another. Also, the number of experience periods does not have to be constant across the population because exposure rather than time measures loss potential. First some definitions:

  • \(X_{ij}\) is the losses per exposure for risk \(i\) in period \(j\). Losses can refer to number of claims or amount of loss. There are \(r\) risks so \(i=1,\ldots,r\).
  • \(n_i\) is the number of observation periods for risk \(i\)
  • \(m_{ij}\) is the number of exposures for risk \(i\) in period \(j\) for \(j=1,\ldots,n_i\)

Risk \(i\) with risk parameter \(\theta_i\) has \(m_{ij}\) exposures in period \(j\) which means that the losses per exposure random variable can be written as \(X_{ij}=(Y_{i1}+\cdots+Y_{im_{ij}})/m_{ij}\). Random variable \(Y_{ik}\) is the loss for one exposure. For risk \(i\) losses \(Y_{ik}\) are iid with mean E(\(Y_{ik}\))=\(\mu(\theta_i)\) and process variance Var(\(Y_{ik}\))=\(\sigma^2(\theta_i)\). It follows that Var(\(X_{ij})\)=\(\sigma^2(\theta_i)/m_{ij}\).

Two more important definitions are:

  • \(\bar{X}_i=\frac{1}{m_i}\sum_{j=1}^{n_i} m_{ij}X_{ij}\) with \(m_i = \sum_{j=1}^{n_i} m_{ij}\). \(\bar{X}_i\) is the average loss per exposure for risk \(i\) for all observation periods combined.
  • \(\bar{X}=\frac{1}{m}\sum_{i=1}^{r} m_i \bar{X}_i\) with \(m=\sum_{i=1}^r m_i\). \(\bar{X}\) is the average loss per exposure for all risks for all observation periods combined.

Random variable \(\bar{X}_i\) is the average loss for all \(m_i\) exposures for risk \(i\) for all years combined. Random variable \(\bar{X}\) is the average loss for all exposures for all risks for all years combined.

An unbiased estimator for the process variance \(\sigma^2(\theta_i)\) of one exposure for risk \(i\) is

\[\begin{equation*} {s_i}^2=\frac{\sum_{j=1}^{n_i} m_{ij}(X_{ij}-\bar{X}_i)^2}{n_i-1}. \end{equation*}\]

The \(m_{ij}\) weights are applied to the squared differences because the \(X_{ij}\) are the averages of \(m_{ij}\) exposures. Taking the weighted average of the sample variances \({s_i}^2\) across the risks in the population, with weights proportional to \((n_i-1)\), produces the expected value of the process variance (EPV) estimate

\[\begin{equation*} \widehat{EPV}=\frac{\sum_{i=1}^r (n_i-1){s_i}^2}{\sum_{i=1}^r (n_i-1)}=\frac{\sum_{i=1}^r \sum_{j=1}^{n_i} m_{ij}(X_{ij}-\bar{X}_i)^2}{\sum_{i=1}^r (n_i-1)}. \end{equation*}\]

The quantity \(\widehat{EPV}\) is an unbiased estimator for the process variance of one exposure for a risk chosen at random from the population.

To calculate an estimator for the variance in the hypothetical means (VHM) the squared differences of the individual risk sample means \(\bar{X}_i\) and population mean \(\bar{X}\) are used. An unbiased estimator for the VHM is

\[\begin{equation*} \widehat{VHM}=\frac{\sum_{i=1}^r m_i(\bar{X}_i-\bar{X})^2 - (r-1)\widehat{EPV}}{m-\frac{1}{m}\sum_{i=1}^r m_i^2}. \end{equation*}\]

This complicated formula is necessary because of the varying number of exposures. Proofs that the EPV and VHM estimators shown above are unbiased can be found in several references mentioned at the end of this chapter including (Buhlmann and Gisler 2005), (Klugman, Panjer, and Willmot 2012), and (Tse 2009).

Example 9.6.3. Two policyholders had claims shown in the table below. Estimate the expected number of claims for each policyholder using Bühlmann-Straub credibility and calculating parameters from the data.

\[\begin{matrix} \begin{array}{|c|c|c|c|c|c|} \hline \text{Policyholder} & & \text{Year 1} & \text{Year 2} & \text{Year 3} & \text{Year 4} \\ \hline \text{A} & \text{Number of claims} & 0 & 2 & 2 & 3 \\ \hline \text{A} & \text{Insured vehicles} & 1 & 2 & 2 & 2\\ \hline & & & & & \\ \hline \text{B} & \text{Number of claims} & 0 & 0 & 1 & 2\\ \hline \text{B} & \text{Insured vehicles} & 0 & 2 & 3 & 4\\ \hline \end{array} \end{matrix}\]
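The nonparametric Bühlmann-Straub estimators can be applied to this table with the sketch below. It is an illustration rather than the chapter's own solution. One assumption is made explicit: a period with zero exposures (policyholder B in year 1) is dropped, so that policyholder B has \(n_B=3\) observation periods. The function name and data layout are chosen here.

```python
def buhlmann_straub(claims, exposures):
    # claims[i][j], exposures[i][j]: totals for risk i in period j.
    # Periods with zero exposure are dropped (an assumption of this sketch).
    data = [[(c, m) for c, m in zip(cs, ms) if m > 0]
            for cs, ms in zip(claims, exposures)]
    r = len(data)
    m_i = [sum(m for _, m in row) for row in data]
    xbar_i = [sum(c for c, _ in row) / mi for row, mi in zip(data, m_i)]
    m = sum(m_i)
    xbar = sum(mi * xb for mi, xb in zip(m_i, xbar_i)) / m

    # EPV: exposure-weighted squared deviations of X_ij = c / m_ij from the risk mean
    ss_within = sum(mij * (c / mij - xb) ** 2
                    for row, xb in zip(data, xbar_i) for c, mij in row)
    epv = ss_within / sum(len(row) - 1 for row in data)

    # VHM: between-risk sum of squares corrected for process variance
    ss_between = sum(mi * (xb - xbar) ** 2 for mi, xb in zip(m_i, xbar_i))
    vhm = (ss_between - (r - 1) * epv) / (m - sum(mi ** 2 for mi in m_i) / m)

    k = epv / vhm
    z = [mi / (mi + k) for mi in m_i]
    estimates = [zi * xb + (1 - zi) * xbar for zi, xb in zip(z, xbar_i)]
    return k, z, estimates

claims = [[0, 2, 2, 3], [0, 0, 1, 2]]
exposures = [[1, 2, 2, 2], [0, 2, 3, 4]]
k, z, per_vehicle = buhlmann_straub(claims, exposures)
print(round(k, 4), [round(v, 4) for v in z], [round(v, 4) for v in per_vehicle])
```

The returned estimates are expected claims per insured vehicle. An expected claim count for a future year would require multiplying by that year's number of vehicles, which the example does not specify.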


9.6.3 Semiparametric Estimation for Bühlmann and Bühlmann-Straub Models

In the prior section on nonparametric estimation, there were no assumptions about the distribution of the losses per exposure random variables \(X_{ij}\). Assuming that the \(X_{ij}\) have a particular distribution and using properties of the distribution along with the data to determine credibility parameters is referred to as semiparametric estimation.

An example of semiparametric estimation would be the assumption of a Poisson distribution when estimating claim frequencies. The Poisson distribution has the property that the mean and variance are identical and this property can simplify calculations. The following simple example comes from the prior section but now includes a Poisson assumption about claim frequencies.

Example 9.6.4. Two policyholders had claims over a three-year period as shown in the table below. Assume that the number of claims for each risk has a Poisson distribution. Estimate the expected number of claims for each policyholder using Bühlmann credibility and calculating necessary parameters from the data. \[\begin{matrix} \begin{array}{|c|c|c|} \hline \text{Year} & \text{Risk A} & \text{Risk B} \\ \hline 1 & 0 & 2 \\ 2 & 1 & 1 \\ 3 & 0 & 2 \\ \hline \end{array} \end{matrix}\]
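Under the Poisson assumption the process variance equals the mean, so one possible approach, sketched below, is to estimate the EPV by the overall sample mean while still using the variance of the risk means for the VHM. This is an illustrative outline, not necessarily the chapter's own solution, and the variable names are chosen here.

```python
from statistics import mean, variance

risk_a = [0.0, 1.0, 0.0]
risk_b = [2.0, 1.0, 2.0]
n = len(risk_a)

risk_means = [mean(risk_a), mean(risk_b)]
overall_mean = mean(risk_means)

# Poisson assumption: process variance equals the mean, so EPV = E(lambda),
# which is estimated here by the overall sample mean.
epv_hat = overall_mean
vhm_hat = variance(risk_means) - epv_hat / n
k = epv_hat / vhm_hat
z = n / (n + k)
estimates = [z * xb + (1 - z) * overall_mean for xb in risk_means]
print(round(z, 4), [round(e, 4) for e in estimates])
```

Compared with the fully nonparametric calculation on the same data, only the EPV estimate changes; the rest of the calculation is the same.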


We did not have to make the Poisson assumption in the prior example because there was enough data to use nonparametric estimation but the following example is commonly used to demonstrate a situation where semiparametric estimation is needed. There is insufficient data for nonparametric estimation but with the Poisson assumption estimates can be calculated.

Example 9.6.5. A portfolio of 2,000 policyholders generated the following claims profile during a five-year period: \[\begin{matrix} \begin{array}{|c|c|} \hline \text{Number of Claims} & \\ \text{In 5 Years} & \text{Number of policies}\\ \hline 0 & 923 \\ 1 & 682 \\ 2 & 249 \\ 3 & 70 \\ 4 & 51 \\ 5 & 25 \\ \hline \end{array} \end{matrix}\] In your model you assume that the number of claims for each policyholder has a Poisson distribution and that a policyholder’s expected number of claims is constant through time. Use Bühlmann credibility to estimate the annual expected number of claims for policyholders with 3 claims during the five-year period.
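One way to organize this calculation is sketched below. It treats each policy's five-year claim count as a single observation (\(n=1\)), uses the Poisson assumption to set the EPV equal to the mean, and converts the five-year estimate to an annual figure at the end. This is an illustrative outline, not the chapter's own solution, and the variable names are chosen here.

```python
# Number of policies by five-year claim count
policies_by_count = {0: 923, 1: 682, 2: 249, 3: 70, 4: 51, 5: 25}

r = sum(policies_by_count.values())
mean_5yr = sum(c * f for c, f in policies_by_count.items()) / r
# Unbiased sample variance of the five-year claim counts across policies
var_5yr = sum(f * (c - mean_5yr) ** 2 for c, f in policies_by_count.items()) / (r - 1)

epv_hat = mean_5yr            # Poisson: process variance equals the mean
vhm_hat = var_5yr - epv_hat   # one five-year observation per policy, so n = 1
k = epv_hat / vhm_hat
z = 1 / (1 + k)

claims_observed = 3
estimate_5yr = z * claims_observed + (1 - z) * mean_5yr
estimate_annual = estimate_5yr / 5
print(round(z, 4), round(estimate_annual, 4))
```

Working with annual data instead (five observations per policy, each with one fifth of the mean) leads to the same credibility factor, because both the EPV and the variance of the sample means rescale consistently.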


9.6.4 Balancing Credibility Estimators

The estimated loss for risk \(i\) in a credibility weighted model is \(\hat{\mu}(\theta_i)=Z_i\bar{X}_i+(1-Z_i)\bar{X}\) where \(\bar{X}_i\) is the loss per exposure for risk \(i\) and \(\bar{X}\) is loss per exposure for the population. The overall mean in the Bühlmann-Straub model is \(\bar{X}=\sum_{i=1}^r(m_i/m) \bar{X}_i\) where \(m_i\) and \(m\) are number of exposures for risk \(i\) and population, respectively. The same formula works for the simpler Bühlmann model by setting \(m_i=1\) and \(m=r\) where \(r\) is the number of risks.

For the credibility-weighted estimators to be in balance we want

\[\begin{equation*} \bar{X}=\sum_{i=1}^r(m_i/m) \bar{X}_i=\sum_{i=1}^r(m_i/m) \hat{\mu}(\theta_i). \end{equation*}\]

If this equation is satisfied then the estimated losses for each risk will add up to the population total, an important goal in ratemaking, but this may not happen if \(\bar{X}\) is used for the complement of credibility.

In order to find a complement of credibility that will bring the credibility-weighted estimators into balance we will set \(\hat{\mu}\) as the complement of credibility:

\[\begin{equation*} \sum_{i=1}^r(m_i/m) \bar{X}_i=\sum_{i=1}^r(m_i/m) (Z_i\bar{X}_i+(1-Z_i)\hat{\mu}) . \end{equation*}\]

A little algebra gives

\[\begin{equation*} \sum_{i=1}^r m_i \bar{X}_i=\sum_{i=1}^r m_i Z_i\bar{X}_i + \hat{\mu}\sum_{i=1}^r m_i(1-Z_i), \end{equation*}\]

and

\[\begin{equation*} \hat{\mu}=\frac{\sum_{i=1}^r m_i(1-Z_i)\bar{X}_i}{\sum_{i=1}^r m_i(1-Z_i)}. \end{equation*}\]

This can be simplified using the following relationship

\[\begin{equation*} m_i(1-Z_i)=m_i\left(1-\frac{m_i}{m_i+K}\right)=m_i\left(\frac{(m_i+K)-m_i}{m_i+K}\right)=KZ_i . \end{equation*}\]

A complement of credibility that will bring the credibility-weighted estimators into balance with the overall mean loss per exposure is

\[\begin{equation*} \hat{\mu}=\frac{\sum_{i=1}^r Z_i \bar{X}_i}{\sum_{i=1}^r Z_i}. \end{equation*}\]

Example 9.6.6. An example from the nonparametric Bühlmann-Straub section had the following data for two risks. Find the complement of credibility \(\hat{\mu}\) that will produce credibility-weighted estimates that are in balance.

\[\begin{matrix} \begin{array}{|c|c|c|c|c|c|} \hline \text{Policyholder} & & \text{Year 1} & \text{Year 2} & \text{Year 3} & \text{Year 4} \\ \hline \text{A} & \text{Number of claims} & 0 & 2 & 2 & 3 \\ \hline \text{A} & \text{Insured vehicles} & 1 & 2 & 2 & 2\\ \hline & & & & & \\ \hline \text{B} & \text{Number of claims} & 0 & 0 & 1 & 2\\ \hline \text{B} & \text{Insured vehicles} & 0 & 2 & 3 & 4\\ \hline \end{array} \end{matrix}\]
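The balancing formula can be illustrated with the sketch below. The value \(K=2.0874\) is an assumption of this sketch: it is approximately what the nonparametric Bühlmann-Straub estimators give for this data when the zero-exposure period for policyholder B is dropped. The function name is chosen here for illustration.

```python
def balanced_complement(exposures, risk_means, k):
    # Credibility per risk from Z_i = m_i / (m_i + K)
    z = [m / (m + k) for m in exposures]
    # Credibility-weighted average of the risk means
    mu_hat = sum(zi * xb for zi, xb in zip(z, risk_means)) / sum(z)
    estimates = [zi * xb + (1 - zi) * mu_hat for zi, xb in zip(z, risk_means)]
    return mu_hat, estimates

exposures = [7, 9]            # total insured vehicle-years for A and B
risk_means = [7 / 7, 3 / 9]   # claims per vehicle-year for A and B
k = 2.0874                    # assumed value of K (see the lead-in)

mu_hat, estimates = balanced_complement(exposures, risk_means, k)
total_estimated = sum(m * e for m, e in zip(exposures, estimates))
total_observed = sum(m * xb for m, xb in zip(exposures, risk_means))
print(round(mu_hat, 4), round(total_estimated, 6), round(total_observed, 6))
```

The exposure-weighted total of the credibility estimates matches the observed total of 10 claims, which is the balance property. The match holds for any positive \(K\), since it depends only on the identity \(m_i(1-Z_i)=KZ_i\).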


9.7 Further Resources and Contributors

Exercises

Here is a set of exercises that guide the viewer through some of the theoretical foundations of Loss Data Analytics. Each tutorial is based on one or more questions from the professional actuarial examinations, typically the Society of Actuaries Exam C.

Credibility Guided Tutorials

Contributors

  • Gary Dean, Ball State University is the author of the initial version of this chapter. Email: cgdean@bsu.edu for chapter comments and suggested improvements.

Bibliography

Buhlmann, Hans. 1967. “Experience Rating and Credibility.” ASTIN Bulletin, 199–207.

Buhlmann, Hans, and Alois Gisler. 2005. A Course in Credibility Theory and Its Applications. ACTEX Publications.

Klugman, Stuart A., Harry H. Panjer, and Gordon E. Willmot. 2012. Loss Models: From Data to Decisions. John Wiley & Sons.

Tse, Yiu-Kuen. 2009. Nonlife Actuarial Models: Theory, Methods and Evaluation. Cambridge University Press.