
CHAPTER 3
Statistics and Methods
Part V Hypothesis Testing and Confidence Intervals


The Sample Mean Revisited

Imagine we take the output from a standard random number generator on a computer, and multiply it by 100. The resulting data generating process (DGP) is a uniform random variable, which ranges between 0 and 100, with a mean of 50. If we generate 20 draws from this DGP and calculate the sample mean of those 20 draws, it is unlikely that the sample mean will be exactly 50. The sample mean might round to 50, say 50.03906724, but exactly 50 is next to impossible. In fact, given that we have only 20 data points, the sample mean might not even be close to the true mean.

The sample mean is actually a random variable itself. If we continue to repeat the experiment – generating 20 data points and calculating the sample mean each time – the calculated sample mean will be different every time. Although any one calculated sample mean is almost never exactly 50, the expected value of each sample mean is in fact 50. It might sound strange to say it, but the mean of our sample mean is the true mean of the distribution. Using our standard notation:

$E[\hat{\mu}] = \mu = 50$   (3.71)

Instead of 20 data points, what if we generate 1,000 data points? With 1,000 data points, the expected value of our sample mean is still 50, just as it was with 20 data points. While we still don't expect our sample mean to be exactly 50, we expect our sample mean will tend to be closer when we are using 1,000 data points. The reason is simple: a single outlier won't have nearly the impact in a pool of 1,000 data points that it will in a pool of 20. If we continue to generate sets of 1,000 data points, it stands to reason that the standard deviation of our sample mean will be lower with 1,000 data points than it would be if our sets contained only 20 data points.

It turns out that the variance of our sample mean doesn't just decrease with the sample size; it decreases in a predictable way, in inverse proportion to the sample size. In other words, if our sample size is n and the true variance of our DGP is σ2, then the variance of the sample mean is:

$\sigma^2_{\hat{\mu}} = \frac{\sigma^2}{n}$   (3.72)

It follows that the standard deviation of the sample mean decreases with the square root of n. This square root is important. In order to reduce the standard deviation of the mean by a factor of 2, we need four times as many data points. To reduce it by a factor of 10, we need 100 times as much data. This is yet another example of the famous square root rule for independent and identically distributed (i.i.d.) variables.

In our current example, because the DGP follows a uniform distribution, we can easily calculate the variance of each data point. The variance of each data point is (100 − 0)²/12 = 833.33. This is equivalent to a standard deviation of approximately 28.87. For 20 data points, the standard deviation of the mean will then be 28.87/√20 = 6.45, and for 1,000 data points, the standard deviation will be 28.87/√1,000 = 0.91.
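
As a quick illustration (not part of the original text), the following Python sketch simulates this DGP and checks both numbers against the square root rule; it assumes NumPy is available.

```python
import numpy as np

rng = np.random.default_rng(42)

true_std = 100 / np.sqrt(12)  # ~28.87, the standard deviation of U(0, 100)

for n in (20, 1000):
    # Draw 10,000 independent samples of size n and compute each sample's mean.
    sample_means = rng.uniform(0, 100, size=(10_000, n)).mean(axis=1)
    predicted = true_std / np.sqrt(n)
    print(f"n={n:4d}  simulated std of mean={sample_means.std():.2f}  "
          f"predicted={predicted:.2f}")
    # Expect roughly 6.45 for n=20 and 0.91 for n=1000.
```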

We have the mean and the standard deviation of our sample mean, but what about the shape of the distribution? You might think that the shape of the distribution would depend on the shape of the underlying distribution of the DGP. If we recast our formula for the sample mean slightly, though:

$\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i = \sum_{i=1}^{n}\left(\frac{1}{n}\right)x_i$   (3.73)

and regard each of the (1/n)x_i's as a random variable in its own right, we see that our sample mean is equivalent to the sum of n i.i.d. random variables, each with a mean of μ/n and a standard deviation of σ/n. Using the central limit theorem, we claim that the distribution of the sample mean converges to a normal distribution. For large values of n, the distribution of the sample mean will be extremely close to a normal distribution. Practitioners will often assume that the sample mean is normally distributed.
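
To make this claim concrete, here is a minimal sketch (assuming NumPy and SciPy) that checks how normal the simulated sample means look:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# 10,000 sample means, each computed from 20 uniform draws on [0, 100].
sample_means = rng.uniform(0, 100, size=(10_000, 20)).mean(axis=1)

# For a normal distribution, skewness and excess kurtosis are both zero.
print(f"skewness:        {stats.skew(sample_means):+.3f}")
print(f"excess kurtosis: {stats.kurtosis(sample_means):+.3f}")

# Compare an upper-tail percentile against the normal prediction.
normal = stats.norm(loc=50, scale=(100 / np.sqrt(12)) / np.sqrt(20))
print(f"empirical 97.5th percentile: {np.percentile(sample_means, 97.5):.2f}")
print(f"normal    97.5th percentile: {normal.ppf(0.975):.2f}")
```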

Sample Problem

Question:

You are given 10 years of monthly returns for a portfolio manager. The mean monthly return is 2.3 percent, and the standard deviation of the returns series is 3.6 percent. What is the standard deviation of the mean?

The portfolio manager is being compared against a benchmark with a mean monthly return of 1.5 percent. What is the probability that the portfolio manager's mean return exceeds the benchmark? Assume the sample mean is normally distributed.

Answer:

There are a total of 120 data points in the sample (10 years × 12 months per year). The standard deviation of the mean is then 0.33 percent:

$\hat{\sigma}_{\hat{\mu}} = \frac{3.6\%}{\sqrt{120}} = 0.33\%$

The distance between the portfolio manager's mean return and the benchmark is –2.43 standard deviations: (1.50 percent – 2.30 percent)/0.33 percent = –2.43. For a normal distribution, 99.25 percent of the distribution lies above –2.43 standard deviations, and only 0.75 percent lies below. The difference between the portfolio manager and the benchmark is highly significant.
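
The same arithmetic can be reproduced in a few lines of Python (a sketch assuming SciPy; the inputs are the numbers from the problem):

```python
from math import sqrt
from scipy.stats import norm

mean_return, benchmark = 0.023, 0.015   # monthly means from the problem
std_monthly, n = 0.036, 120

std_of_mean = std_monthly / sqrt(n)            # ~0.33%
z = (benchmark - mean_return) / std_of_mean    # ~-2.43
print(f"P[manager's mean > benchmark] = {norm.sf(z):.2%}")   # ~99.25%
```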

Sample Variance Revisited

Just as with the sample mean, we can treat the sample variance as a random variable. For a given DGP, if we repeatedly calculate the sample variance, the expected value of the sample variance will equal the true variance, and the variance of the sample variance will equal:

$\sigma^2_{\hat{\sigma}^2} = \sigma^4\left(\frac{2}{n-1} + \frac{\kappa}{n}\right)$   (3.74)

where n is the sample size, and κ is the excess kurtosis.

If the DGP has a normal distribution, then we can also say something about the shape of the distribution of the sample variance. If we have n sample points and $\hat{\sigma}^2$ is the sample variance, then the scaled statistic $(n-1)\hat{\sigma}^2/\sigma^2$ will follow a chi-squared distribution with (n – 1) degrees of freedom:

$\frac{(n-1)\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{n-1}$   (3.75)

where σ2 is the population variance. Note that this is true only when the DGP has a normal distribution. Unfortunately, unlike the case of the sample mean, we cannot apply the central limit theorem here. Even when the sample size is large, if the underlying distribution is nonnormal, the statistic in Equation 3.75 can vary significantly from a chi-squared distribution.
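
A small simulation (an illustration assuming NumPy and SciPy, not from the original text) shows Equation 3.75 at work for normally distributed data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n, sigma2 = 20, 4.0

# 50,000 samples of size n from a normal DGP with variance sigma2.
samples = rng.normal(0.0, np.sqrt(sigma2), size=(50_000, n))
scaled = (n - 1) * samples.var(axis=1, ddof=1) / sigma2

# Compare simulated quantiles with the chi-squared(n - 1) distribution.
for q in (0.05, 0.50, 0.95):
    print(f"q={q:.2f}  simulated={np.quantile(scaled, q):6.2f}  "
          f"chi2={stats.chi2.ppf(q, df=n - 1):6.2f}")
```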

Confidence Intervals

In our discussion of the sample mean, we assumed that the standard deviation of the underlying distribution was known. In practice, the true standard deviation is likely to be unknown. At the same time we are measuring our sample mean, we will typically be measuring a sample variance as well.

It turns out that if we first standardize our estimate of the sample mean using the sample standard deviation, the new random variable follows a Student's t-distribution with (n – 1) degrees of freedom:

$t = \frac{\hat{\mu} - \mu}{\hat{\sigma}/\sqrt{n}}$   (3.76)

Here the numerator is simply the difference between the sample mean and the population mean, while the denominator is the sample standard deviation divided by the square root of the sample size. To see why this new variable follows a t-distribution, we simply need to divide both the numerator and the denominator by the population standard deviation. This turns the numerator into a standard normal variable, and the denominator into the square root of a chi-squared variable divided by its degrees of freedom. We know from discussions on distributions that this combination of random variables follows a t-distribution. This standardized version of the sample mean is so frequently used that it is referred to as a t-statistic, or simply a t-stat.

Technically, this result requires that the underlying distribution be normally distributed. As was the case with the sample variance, the denominator may not follow a chi-squared distribution if the underlying distribution is nonnormal. Oddly enough, for large sample sizes the overall t-statistic still converges to a t-distribution. If the sample size is small and the data distribution is nonnormal, be aware that the t-statistic, as defined here, may not be well approximated by a t-distribution.

By looking up the appropriate values for the t-distribution, we can establish the probability that our t-statistic is contained within a certain range:

$P[x_L \le t \le x_U] = 1 - \alpha$   (3.77)

where xL and xU are constants, which, respectively, define the lower and upper bounds of the range within the t-distribution, and (1 – α) is the probability that our t-statistic will be found within that range. The right-hand side may seem a bit awkward, but, by convention, (1 – α) is called the confidence level, while α by itself is known as the significance level.

In practice, the population mean, μ, is often unknown. By rearranging the previous equation we come to an equation with a more interesting form:

$P\left[\hat{\mu} - x_U\frac{\hat{\sigma}}{\sqrt{n}} \le \mu \le \hat{\mu} - x_L\frac{\hat{\sigma}}{\sqrt{n}}\right] = 1 - \alpha$   (3.78)

Looked at this way, we are now giving the probability that the population mean will be contained within the defined range. When it is formulated this way, we call this range the confidence interval for the population mean. Confidence intervals are not limited to the population mean. Though it may not be as simple, in theory we can define a confidence interval for any distribution parameter.
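
As an illustration (the return series below is hypothetical, and SciPy is assumed), a 95 percent confidence interval for the population mean can be computed directly from Equation 3.78:

```python
import numpy as np
from scipy import stats

# Hypothetical sample of 10 monthly returns.
returns = np.array([0.021, 0.034, -0.012, 0.018, 0.009,
                    0.027, -0.005, 0.015, 0.031, 0.002])

n = len(returns)
mu_hat = returns.mean()
sigma_hat = returns.std(ddof=1)     # sample standard deviation

alpha = 0.05
x_u = stats.t.ppf(1 - alpha / 2, df=n - 1)   # upper t quantile; x_L = -x_u
half_width = x_u * sigma_hat / np.sqrt(n)
print(f"95% CI for the mean: "
      f"[{mu_hat - half_width:.4f}, {mu_hat + half_width:.4f}]")
```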

Hypothesis Testing

One problem with confidence intervals is that they require us to settle on an arbitrary confidence level. While 95 percent and 99 percent are common choices for the confidence level in risk management, there is nothing sacred about these numbers. It would be perfectly legitimate to construct a 74.92 percent confidence interval. At the same time, we are often concerned with the probability that a certain variable exceeds a threshold. For example, given the observed returns of a mutual fund, what is the probability that the standard deviation of those returns is less than 20 percent?

In a sense, we want to turn the confidence interval around. Rather than saying there is an x percent probability that the population mean is contained within a given interval, we want to know what the probability is that the population mean is greater than y. When we pose the question this way, we are in the realm of hypothesis testing.

Traditionally the question is put in the form of a null hypothesis. If we are interested in knowing if the expected return of a portfolio manager is greater than 10 percent, we would write:

$H_0: \mu_r > 10\%$   (3.79)

where H0 is known as the null hypothesis. Even though the true population mean is unknown, for the hypothesis test we assume the population mean is 10 percent. In effect, we are asking, if the true population mean is 10 percent, what is the probability that we would see a given sample mean? With our null hypothesis in hand, we gather our data, calculate the sample mean, and form the appropriate t-statistic. In this case, the appropriate t-statistic is:

$t = \frac{\hat{\mu}_r - 10\%}{\hat{\sigma}_r/\sqrt{n}}$   (3.80)

We can then look up the corresponding probability from the t-distribution.
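
A short sketch of this procedure (the sample statistics here are hypothetical, and SciPy is assumed):

```python
from math import sqrt
from scipy import stats

# Hypothetical sample: mean return 12%, standard deviation 8%, 36 observations.
mu_hat, sigma_hat, n = 0.12, 0.08, 36

t_stat = (mu_hat - 0.10) / (sigma_hat / sqrt(n))      # Equation 3.80

# Probability of a t-statistic at least this low if the true mean were 10%.
p_lower = stats.t.cdf(t_stat, df=n - 1)
print(f"t-stat = {t_stat:.2f}, P[t <= {t_stat:.2f}] = {p_lower:.2%}")
```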

In addition to the null hypothesis, we can offer an alternative hypothesis. In the previous example, where our null hypothesis is that the expected return is greater than 10 percent, the logical alternative would be that the expected return is less than or equal to 10 percent:

$H_1: \mu_r \le 10\%$   (3.81)

In principle, we could test any number of hypotheses. In practice, as long as the alternative is trivial, we tend to limit ourselves to stating the null hypothesis.

WHICH WAY TO TEST?

If we want to know if the expected return of a portfolio manager is greater than 10 percent, the obvious statement of the null hypothesis might seem to be μr > 10 percent. But there is no reason that we couldn't have started with the alternative hypothesis, that μr ≤ 10 percent. Finding that the first is true and finding that the second is false are logically equivalent.

Many practitioners construct the null hypothesis so that the desired result is false. If we are an investor trying to find good portfolio managers, then we would make the null hypothesis μr ≤ 10 percent. Because we want the expected return to be greater than 10 percent, testing for the opposite makes us seem more objective. Unfortunately, in the case where there is a high probability that the manager's expected return is greater than 10 percent (a good result), we have to say, “We reject the null hypothesis that the manager's returns are less than or equal to 10 percent at the x percent level.” This is very close to a double negative. Like a medical test where the good outcome is negative and the bad outcome is positive, we often find that the good outcome for a null hypothesis is rejection.

To make matters more complicated, what happens if the portfolio manager doesn't seem to be that good? If we rejected the null hypothesis when there was a high probability that the portfolio manager's expected return was greater than 10 percent, should we accept the null hypothesis when there is a high probability that the returns are less than 10 percent? In the realm of statistics, outright acceptance seems too certain. In practice, we can do two things. First, we can state that the probability of rejecting the null hypothesis is low (e.g., “The probability of rejecting the null hypothesis is only 4.2 percent”). More often we say that we fail to reject the null hypothesis (e.g., “We fail to reject the null hypothesis at the 95.8 percent level”).

Sample Problem

Question:

At the start of the year, you believed that the annualized volatility of XYZ Corporation's equity was 45 percent. At the end of the year, you have collected a year of daily returns, 256 business days' worth. You calculate the standard deviation, annualize it, and come up with a value of 48 percent. Can you reject the null hypothesis, H0: σ = 45 percent, at the 95 percent confidence level?

Answer:

The appropriate test statistic is:

$\frac{(n-1)\hat{\sigma}^2}{\sigma^2} = \frac{(256-1) \times 0.48^2}{0.45^2} = 290.13$

Notice that annualizing the standard deviation has no impact on the test statistic. The same factor would appear in the numerator and the denominator, leaving the ratio unchanged. For a chi-squared distribution with 255 degrees of freedom, 290.13 corresponds to a probability of 6.44 percent. We fail to reject the null hypothesis at the 95 percent confidence level.
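
A sketch of the same test in Python (SciPy assumed):

```python
from scipy.stats import chi2

n = 256
test_stat = (n - 1) * 0.48**2 / 0.45**2      # ~290.13
p_value = chi2.sf(test_stat, df=n - 1)       # upper-tail probability
print(f"test statistic = {test_stat:.2f}, p-value = {p_value:.2%}")  # ~6.44%
```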

Application: VaR

Value at risk (VaR) is one of the most widely used risk measures in finance. VaR was popularized by J.P. Morgan in the 1990s. The executives at J.P. Morgan wanted their risk managers to generate one statistic at the end of each day, which summarized the risk of the firm's entire portfolio. What they came up with was VaR.

Figure 3.8 provides a graphical representation of VaR. If the 95 percent VaR of a portfolio is $100, then we expect the portfolio will lose $100 or less in 95 percent of the scenarios, and lose $100 or more in 5 percent of the scenarios. We can define VaR for any level of confidence, but 95 percent has become an extremely popular choice in finance. The time horizon also needs to be specified for VaR. On trading desks, with liquid portfolios, it is common to measure the one-day 95 percent VaR. In other settings, in which less liquid assets may be involved, time frames of up to one year are not uncommon. VaR is decidedly a one-tailed confidence interval.


FIGURE 3.8 Value at Risk Example


For a given confidence level, 1 – α, we can define value at risk more formally as:

$P[L \ge \mathrm{VaR}] = \alpha$   (3.82)

where the random variable L is our loss.

Value at risk is often described as a confidence interval. As we saw earlier in this chapter, the term confidence interval is generally applied to the estimation of distribution parameters. In practice, when calculating VaR, the distribution is often taken as a given. Either way, the tools, concepts, and vocabulary are the same. So even though VaR may not technically be a confidence interval, we still refer to the confidence level of VaR.

Most practitioners reverse the sign of L when quoting VaR numbers. By this convention, a 95 percent VaR of $400 implies that there is a 5 percent probability that the portfolio will lose $400 or more. Because this represents a loss, others would say that the VaR is –$400. The former is more popular, and is the convention used throughout the rest of the book. In practice, it is often best to avoid any ambiguity by, for example, stating that the VaR is equal to a loss of $400.

If an actual loss exceeds the predicted VaR threshold, that event is known as an exceedance. Another assumption of VaR models is that exceedance events are independent of each other. In other words, if our VaR measure is set at a one-day 95 percent confidence level, and there is an exceedance event today, then the probability of an exceedance event tomorrow is still 5 percent. An exceedance event today has no impact on the probability of future exceedance events.

Sample Problem

Question:

The probability density function (PDF) for daily profits at Triangle Asset Management can be described by the following function:

$p(\pi) = \begin{cases} \frac{10+\pi}{100} & -10 \le \pi \le 0 \\ \frac{10-\pi}{100} & 0 < \pi \le 10 \end{cases}$

Triangular Probability Density Function

What is the one-day 95 percent VaR for Triangle Asset Management?

Answer:

To find the 95 percent VaR, we need to find a, such that:

$P[\pi \le a] = \int_{-10}^{a} p(\pi)\,d\pi = 0.05$

By inspection, half the distribution is below zero, so we need only bother with the first half of the function:

$\int_{-10}^{a} \frac{10+\pi}{100}\,d\pi = \left[\frac{\pi^2}{200} + \frac{\pi}{10}\right]_{-10}^{a} = \frac{a^2}{200} + \frac{a}{10} + \frac{1}{2} = 0.05$

Using the quadratic formula, we can solve for a:

$a^2 + 20a + 90 = 0 \quad\Rightarrow\quad a = -10 \pm \sqrt{10}$

Because the distribution is not defined for π < –10, we can ignore the negative root, giving us the final answer:

$a = -10 + \sqrt{10} \approx -6.84$

The one-day 95 percent VaR for Triangle Asset Management is a loss of approximately 6.84.
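
As a cross-check (an illustration, not part of the original text), the 5 percent quantile can be computed in closed form from the triangular CDF:

```python
from math import sqrt

# On [-10, 0] the CDF of the triangular PDF is P[pi <= a] = (a + 10)**2 / 200.
alpha = 0.05
a = -10 + sqrt(200 * alpha)      # solves (a + 10)**2 / 200 = alpha

print(f"5% quantile of daily profit: {a:.2f}")    # ~ -6.84
print(f"one-day 95% VaR (as a loss): {-a:.2f}")   # ~ 6.84
```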

BACK-TESTING

An obvious concern when using VaR is choosing the appropriate confidence level. As mentioned, 95 percent has become a very popular choice in risk management. In some settings there may be a natural choice for the confidence level, but most of the time the exact choice is arbitrary.

A common mistake for newcomers is to choose a confidence level that is too high. Naturally, a higher confidence level sounds more conservative. A risk manager who measures one-day VaR at the 95 percent confidence level will, on average, experience an exceedance event every 20 days. A risk manager who measures VaR at the 99.9 percent confidence level expects to see an exceedance only once every 1,000 days. Is an event that happens once every 20 days really something that we need to worry about? It is tempting to believe that the risk manager using the 99.9 percent confidence level is concerned with more serious, riskier outcomes, and is therefore doing a better job.

The problem is that, as we go further and further out into the tail of the distribution, we become less and less certain of the shape of the distribution. In most cases, the assumed distribution of returns for our portfolio will be based on historical data. If we have 1,000 data points, then there are 50 data points to back up our 95 percent confidence level, but only one to back up our 99.9 percent confidence level. As with any distribution parameter, the variance of our estimate of the parameter decreases with the sample size. One data point is hardly a good sample size on which to base a parameter estimate.

A related problem has to do with back-testing. Good risk managers should regularly back-test their models. Back-testing entails checking the predicted outcome of a model against actual data. Any model parameter can be back-tested.

In the case of VaR, back-testing is easy. Each period can be viewed as a Bernoulli trial. In the case of one-day 95 percent VaR, there is a 5 percent chance of an exceedance event each day, and a 95 percent chance that there is no exceedance. Because exceedance events are independent, over the course of n days, the distribution of exceedances follows a binomial distribution:

$P[K = k] = \binom{n}{k} p^k (1-p)^{n-k}$   (3.83)

In this case, n is the number of periods that we are using to back-test, k is the number of exceedances, and (1 – p) is our confidence level.

Sample Problem

Question:

As a risk manager, you are tasked with calculating a daily 95 percent VaR statistic for a large fixed income portfolio. Over the past 100 days, there have been four exceedances. How many exceedances should you have expected? What was the probability of exactly four exceedances during this time? Four or less? Four or more?

Answer:

Over 100 days, with a 5 percent chance of an exceedance each day, the expected number of exceedances is 100 × 0.05 = 5. The probability of exactly four exceedances is 17.81 percent:

$P[K = 4] = \binom{100}{4} 0.05^4\, 0.95^{96} = 17.81\%$

Remember, by convention, for a 95 percent VaR the probability of an exceedance is 5 percent, not 95 percent.

The probability of four or fewer exceedances is 43.60 percent. Here we simply do the same calculation as in the first part of the problem, but for zero, one, two, three, and four exceedances. It's important not to forget zero:

$P[K \le 4] = \sum_{k=0}^{4} \binom{100}{k} 0.05^k\, 0.95^{100-k} = 43.60\%$

For the final result, we could use the brute force approach and calculate the probability for k = 4, 5, 6, … , 99, 100, a total of 97 calculations. Instead we realize that the sum of all probabilities from 0 to 100 must be 100 percent; therefore, if the probability of K ⩽ 4 is 43.60 percent, then the probability of K > 4 must be 100 percent – 43.60 percent = 56.40 percent. Be careful, though, as what we want is the probability for K ⩾ 4. To get this, we simply add the probability that K = 4, from the first part of our question, to get the final answer, 74.21 percent:

$P[K \ge 4] = P[K > 4] + P[K = 4] = 56.40\% + 17.81\% = 74.21\%$
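
These binomial probabilities are easy to reproduce (a sketch assuming SciPy):

```python
from scipy.stats import binom

n, p, k = 100, 0.05, 4    # back-test length, exceedance probability, observed

print(f"expected exceedances: {n * p:.0f}")         # 5
print(f"P[K = 4]  = {binom.pmf(k, n, p):.2%}")      # ~17.81%
print(f"P[K <= 4] = {binom.cdf(k, n, p):.2%}")      # ~43.60%
print(f"P[K >= 4] = {binom.sf(k - 1, n, p):.2%}")   # ~74.21%
```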


EXPECTED SHORTFALL

Another criticism of VaR is that it does not tell us anything about the tail of the distribution. Two portfolios could have the exact same 95 percent VaR, but very different distributions beyond the 95 percent confidence level.

What we really want to know, then, is how large the loss will be when we have an exceedance event. Using the concept of conditional probability, we can define the expected value of a loss, given an exceedance, as follows:

$S = E[L \mid L \ge \mathrm{VaR}]$   (3.84)

We refer to this conditional expected loss, S, as the expected shortfall.

If the profit function has a probability density function given by f(x), and VaR is the value at risk at the (1 – α) confidence level, expressed as a quantile of the profit distribution, we can find the expected shortfall as:

$S = \frac{1}{\alpha}\int_{-\infty}^{\mathrm{VaR}} x\, f(x)\, dx$   (3.85)

In most cases the VaR for a portfolio will correspond to a loss, and Equation 3.85 will produce a negative value. As with VaR, it is common to reverse the sign when speaking about the expected shortfall.

Expected shortfall does answer an important question. What's more, expected shortfall turns out to be subadditive, thereby avoiding one of the major criticisms of VaR. As our discussion on back-testing suggests, though, the reliability of our expected shortfall measure may be difficult to gauge.

Sample Problem

Question:

In a previous example, the probability density function of Triangle Asset Management's daily profits could be described by the following function:

$p(\pi) = \begin{cases} \frac{10+\pi}{100} & -10 \le \pi \le 0 \\ \frac{10-\pi}{100} & 0 < \pi \le 10 \end{cases}$

We calculated Triangle's one-day 95 percent VaR as a loss of $10 - \sqrt{10} \approx 6.84$. For the same confidence level and time horizon, what is the expected shortfall?

Answer:

Because the VaR occurs in the region where π < 0, we need to utilize only the first half of the function. Using Equation 3.85, we have:

$S = \frac{1}{0.05}\int_{-10}^{\sqrt{10}-10} \pi\,\frac{10+\pi}{100}\,d\pi = 20\left[\frac{\pi^3}{300} + \frac{\pi^2}{20}\right]_{-10}^{\sqrt{10}-10} \approx -7.89$

Thus, the expected shortfall is a loss of 7.89. Intuitively this should make sense. The expected shortfall must be greater than the VaR, 6.84, but less than the maximum loss of 10. Because extreme events are less likely (the height of the PDF decreases away from the center), it also makes sense that the expected shortfall is closer to the VaR than it is to the maximum loss.
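
The integral can also be checked numerically (an illustration assuming SciPy):

```python
from math import sqrt
from scipy.integrate import quad

alpha = 0.05
var_quantile = -10 + sqrt(10)     # the 5% profit quantile from the VaR example

# Equation 3.85: average profit over the 5 percent tail of the triangular PDF.
tail_integral, _ = quad(lambda x: x * (10 + x) / 100, -10, var_quantile)
print(f"expected shortfall: {tail_integral / alpha:.2f}")   # ~ -7.89
```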

