1. Estimating Credit Scores with Logit
Typically, several factors can affect a borrower’s default probability. In the retail segment, one would consider salary, occupation, age and other characteristics of the loan applicant; when dealing with corporate clients, one would examine the firm’s leverage, profitability or cash flows, to name but a few. A scoring model specifies how to combine the different pieces of information in order to get an accurate assessment of default probability, thus serving to automate and standardize the evaluation of default risk within a financial institution.
In this chapter, we will show how to specify a scoring model using a statistical technique called logistic regression or simply logit. Essentially, this amounts to coding information into a specific value (e.g. measuring leverage as debt/assets) and then finding the combination of factors that does the best job in explaining historical default behavior.
After clarifying the link between scores and default probability, we show how to estimate and interpret a logit model. We then discuss important issues that arise in practical applications, namely the treatment of outliers and the choice of functional relationship between variables and default.
An important step in building and running a successful scoring model is its validation. Since validation techniques are applied not just to scoring models but also to agency ratings and other measures of default risk, they are described separately in Chapter 7.
LINKING SCORES, DEFAULT PROBABILITIES AND OBSERVED DEFAULT BEHAVIOR
A score summarizes the information contained in factors that affect default probability. Standard scoring models take the most straightforward approach by linearly combining those factors. Let x denote the factors (their number is K) and b the weights (or coefficients) attached to them; we can represent the score that we get in scoring instance i as:

Score_i = b1 x_i1 + b2 x_i2 + ... + bK x_iK    (1.1)
It is convenient to have a shortcut for this expression. Collecting the b’s and the x’s in
column vectors b and x we can rewrite (1.1) to:

Score_i = b′x_i    (1.2)
If the model is to include a constant b1, we set x_i1 = 1 for each i.
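The score in (1.1) is simply an inner product of the weight and factor vectors. A minimal Python sketch, with purely hypothetical factor values and weights:

```python
import numpy as np

# Hypothetical factor values for one scoring instance i:
# x_i[0] = 1 is the constant, the others could be, e.g., leverage
# and profitability (all numbers are illustrative)
x_i = np.array([1.0, 0.6, 0.05])

# Hypothetical weight vector b (a constant plus one weight per factor)
b = np.array([-2.5, 3.0, -4.0])

# The linear score is the inner product b'x_i
score_i = b @ x_i
print(score_i)  # -0.9
```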
Assume, for simplicity, that we have already agreed on the choice of the factors x – what is then left to determine is the weight vector b. Usually, it is estimated on the basis of the observed default behavior. Imagine that we have collected annual data on firms with factor values and default behavior. We show such a data set in Table 1.1.
Note that the same firm can show up more than once if there is information on this firm for several years. Upon defaulting, firms often stay in default for several years; in such cases, we would not use the observations following the year in which default occurred. If a firm moves out of default, we would again include it in the data set.
The default information is stored in the variable yi. It takes the value 1 if the firm defaulted in the year following the one for which we have collected the factor values, and zero otherwise. The overall number of observations is denoted by N.
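Given such a data set, the weights b are typically estimated by maximum likelihood. The following Python sketch fits a logit model to synthetic data; the factor, the true weights, and all numbers are purely illustrative, not taken from Table 1.1:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Synthetic data in the spirit of Table 1.1: a constant plus one
# hypothetical factor (e.g. leverage), and a default indicator y
N = 500
X = np.column_stack([np.ones(N), rng.uniform(0.0, 1.0, N)])
true_b = np.array([-4.0, 3.0])
p = 1.0 / (1.0 + np.exp(-(X @ true_b)))
y = rng.binomial(1, p)  # 1 = default in the following year, 0 otherwise

def neg_log_likelihood(b):
    s = X @ b
    # Logit log-likelihood: sum over i of y_i * s_i - log(1 + exp(s_i))
    return -np.sum(y * s - np.log1p(np.exp(s)))

b_hat = minimize(neg_log_likelihood, x0=np.zeros(2)).x
print(b_hat)  # the estimates should be close to true_b
```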
The scoring model should predict a high default probability for those observations that defaulted and a low default probability for those that did not. In order to choose the appropriate weights b, we first need to link scores to default probabilities. This can be done by representing default probabilities as a function F of scores:

Prob(Default_i) = F(Score_i)    (1.3)
Models that link information to probabilities using the logistic distribution function Λ are called logit models. The default probability is then given by:

Prob(Default_i) = Λ(Score_i) = exp(Score_i) / (1 + exp(Score_i))    (1.4)
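The logistic link is easy to evaluate numerically; a minimal sketch:

```python
import math

def logit_pd(score):
    """Logistic link: maps a score to a default probability in (0, 1)."""
    return math.exp(score) / (1.0 + math.exp(score))

# Higher scores translate into higher default probabilities
for s in (-4.0, -2.0, 0.0, 2.0):
    print(s, round(logit_pd(s), 4))
```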
In Table 1.2, we list the default probabilities associated with some score values and illustrate the relationship with a graph. As can be seen, higher scores correspond to a higher default probability. In many financial institutions, credit scores have the opposite property: they are higher for borrowers with a lower credit risk.
In addition, they are often constrained to some set interval, e.g. 0 to 100. Preferences for such characteristics can easily be met. If we use (1.4) to define a scoring system with scores from −9 to 1, but want to work with scores from 0 to 100 instead (100 being the best), we could transform the original score to myscore = −10 × score + 10.
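The rescaling described above is a simple affine transformation; as a sketch:

```python
def myscore(score):
    # Affine transformation of the original scores in [-9, 1] to the
    # range [0, 100], with 100 for the least risky borrower
    return -10 * score + 10

print(myscore(-9))  # 100 (best: lowest default probability)
print(myscore(1))   # 0 (worst)
```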
2. The Structural Approach to Default Prediction and Valuation
Structural models of default risk are cause-and-effect models. From economic reasoning, we identify conditions under which we expect borrowers to default and then estimate the probability that these conditions come about to obtain an estimate of the default probability.
For limited liability companies, default is expected to occur if the asset value (i.e. the value of the firm) is not sufficient to cover the firm’s liabilities. Why should this be so? From the identity
Asset value = Value of equity + Value of liabilities
and the rule that equity holders receive the residual value of the firm, it follows that the value of equity is negative if the asset value is smaller than the value of liabilities. If you have something with negative value, and you can give it away at no cost, you are more than willing to do so.
This is what equity holders are expected to do. They exercise the walk-away option that they have because of limited liability and leave the firm to the creditors. As the asset value is smaller than the value of liabilities, creditors’ claims are not fully covered, meaning that the firm is in default. The walk-away option can be priced with standard approaches from option pricing theory.
This is why structural models are also called option-theoretic or contingent-claim models. Another common name is Merton models because it was Robert C. Merton (1974) who first applied option theory to the problem of valuing a firm’s liabilities in the presence of default and limited liability.
In this chapter, we first explain how structural models can be used for estimating default probabilities and valuing a firm’s liabilities. We then show how to implement structural models in the spirit of the original Merton model. We focus on the estimation of default probabilities rather than valuation.
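Under the lognormal asset-value dynamics assumed in Merton (1974), the probability that the asset value falls below the liability level at horizon T has a closed form. The sketch below uses purely hypothetical firm parameters:

```python
from math import log, sqrt
from statistics import NormalDist

def merton_pd(V, D, mu, sigma, T):
    """Probability that a lognormal asset value V (drift mu, volatility
    sigma) ends up below the liability level D at horizon T."""
    d = (log(V / D) + (mu - 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    return NormalDist().cdf(-d)

# Hypothetical firm: assets 100, liabilities 80, 5% drift,
# 20% asset volatility, one-year horizon
print(merton_pd(100.0, 80.0, 0.05, 0.20, 1.0))  # ≈ 0.10
```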
3. Transition Matrices
A credit-rating system uses a limited number of rating grades to rank borrowers according to their default probability. Ratings are assigned by rating agencies such as Fitch, Moody’s and Standard & Poor’s, but also by financial institutions. Rating assignments can be based on a qualitative process or on default probabilities estimated with a scoring model (cf. Chapter 1), a structural model (cf. Chapter 2) or other means.
To translate default probability estimates into ratings, one defines a set of rating grade boundaries, e.g. rules that borrowers are assigned to grade AAA if their probability of default is lower than 0.02%, to grade AA if their probability of default is between 0.02% and 0.05% and so on.
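Such a mapping from default probabilities to grades can be sketched with a sorted list of boundaries; the first two boundaries follow the example in the text (0.02% and 0.05%), the remaining ones are purely illustrative:

```python
import bisect

# Upper PD boundary of each grade; everything above the last
# boundary falls into the lowest grade
boundaries = [0.0002, 0.0005, 0.001, 0.005, 0.02, 0.10]
grades = ["AAA", "AA", "A", "BBB", "BB", "B", "CCC"]

def assign_grade(pd):
    # bisect_right returns the index of the first boundary
    # strictly greater than pd
    return grades[bisect.bisect_right(boundaries, pd)]

print(assign_grade(0.0001))  # AAA
print(assign_grade(0.0004))  # AA
```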
In this chapter, we introduce methods for answering questions such as ‘With what probability will the credit risk rating of a borrower decrease by a given degree?’ In credit risk lingo, we show how to estimate probabilities of rating transition or rating migration. They are usually presented in transition matrices.
Consider a rating system with two rating classes A and B, and a default category D. The transition matrix for this rating system is a table listing the probabilities that a borrower rated A at the start of a period has rating A, B or D at the end of the period; analogously for B-rated companies. Table 3.1 illustrates the transition matrix for this simple rating system.
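A transition matrix of this kind can be represented as a small array; the probabilities below are hypothetical, not taken from Table 3.1:

```python
import numpy as np

# Hypothetical one-year transition matrix for the two-grade system:
# rows = rating at the start of the period (A, B),
# columns = rating at the end of the period (A, B, D)
T = np.array([
    [0.90, 0.08, 0.02],   # A-rated: stay in A, move to B, default
    [0.15, 0.75, 0.10],   # B-rated
])

# Each row must sum to one: every borrower ends in exactly one state
assert np.allclose(T.sum(axis=1), 1.0)

# Probability that an A-rated borrower is in default after one period
print(T[0, 2])  # 0.02
```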
4. Prediction of Default and Transition Rates
Default and transition rates are essential to pricing or risk management. Based on a forecast for next year’s default rate, for example, a bank can set appropriate loan rates for short-term loans.
In Chapter 3, we showed how to estimate average transition rates based on data extending over several years. If such rates are used to estimate next year’s transition rates, one would implicitly assume the next year to be a typical or average year.
This may be an appropriate assumption in some situations; in others, however, we may have good reason to believe that the following year should be relatively good or bad for credits. If the economy is just moving into a recession, for example, we should expect default rates to be relatively high.
In this chapter, we show how to use readily available information to predict default and transition rates for corporates rated by a major rating agency. The fact that default and transition rates can indeed be predicted might cast doubt on the efficiency of agency ratings.
If there were good reasons to believe, say at the end of 2001, that the default rate of BB-rated issuers would be relatively high in 2002, why did the agency not downgrade more BB-rated issuers? To understand this, it is crucial to know that agencies do not aim at assigning ratings in such a way that the one-year default probability of a rating category is constant across time.
Instead, ratings are meant to be relative assessments of credit quality. If overall economic conditions have deteriorated, affecting all borrowers in a similar way, the previous relative ordering would still be correct, even though the default probability of a given rating category may substantially deviate from its average in the past.
5. Modeling and Estimating Default Correlations with the Asset Value Approach
The previous chapters have focused on the measurement of individual default probabilities. For a financial institution that wants to assess the default risk of its loan portfolio, however, individual default probabilities are not enough. Consider the simplest case: a portfolio comprises only two borrowers, and the bank would like to know the probability that both borrowers default in the next period.
This cannot be measured with the default probabilities alone. We could assume that the two borrowers are independent. The probability that both of them default would then equal the product of the two individual default probabilities. Default rates of firms, however, fluctuate with macroeconomic or industry-specific conditions, so we should not rely on defaults being independent.
What we need to know in this case is the joint default probability. As we will see in this chapter, this will lead us directly to the default correlation. We will also examine a widely used way of modeling default correlations, the so-called asset-value approach. We show how to estimate the relevant parameters based on historical default experience and how to assess the quality of the parameter estimates. The two estimation methods that we consider are the method of moments approach and the maximum likelihood approach.
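With hypothetical numbers, the relationship between the individual default probabilities, the joint default probability, and the default correlation can be sketched as follows (the joint probability below is an assumed input, not an estimate from real data):

```python
import math

# Hypothetical individual default probabilities of the two borrowers
p1, p2 = 0.01, 0.02

# Under independence, the joint default probability is just the product
joint_independent = p1 * p2  # 0.0002

# With positively correlated defaults, the joint probability is larger;
# assume it has (hypothetically) been estimated as:
p12 = 0.0008

# Default correlation = correlation of the two default indicator variables
rho = (p12 - p1 * p2) / math.sqrt(p1 * (1 - p1) * p2 * (1 - p2))
print(rho)  # ≈ 0.043
```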