In a departure from the Basel standard, the Basel Finalization Proposal intends to prohibit the use of internal models for setting credit risk capital requirements for banks operating in the U.S. The agencies contend that banks’ internal credit models, because they may rely on subjective modeling assumptions, have resulted in “unwarranted” variability of credit risk capital requirements across banks. To support this view, the proposal cites two Basel studies. In this note, we review the evidence the agencies have cited, along with other relevant academic evidence, and conclude that there is no justification for prohibiting the use of internal models for credit risk capital and that there are good reasons for permitting their use.
Agency Concerns on Internal Models in Credit
The use of internal models for credit capital is allowed under the current capital standard and would remain allowed for banks outside the U.S. under Basel finalization. In the internal models framework, banks may use their own data and modeling techniques to estimate key credit risk parameters, which are then input into regulatory capital formulas. For example, using their own internal data as well as external data sets, banks can estimate the probability of default (PD) of companies to which they provide financing. Under Basel rules, the PD is required to be a long-term average default probability, i.e., a rate of default that would be expected to be observed on average over one or several business cycles. Banks are also allowed to estimate loss given default (LGD), the fraction of the exposure that would not be recovered in bankruptcy upon default. In contrast to PD, the capital rules require the LGD, termed the downturn LGD, to be an estimate of what would be observed during a downturn rather than the average value that would be seen over a long period. Thus, LGD estimates deliberately overstate average losses. The internal models approach allows other parameters, such as exposure at default (EAD), to be estimated as well, but in this note we will focus on PD and LGD.
In 2013, the Basel Committee conducted a study on how banks estimate PDs and LGDs in the banking book under the internal models approach. Entitled “Analysis of risk-weighted assets for credit risk in the banking book,” the study analyzed how banks’ estimates of internal model credit risk parameters vary, discussed the reasons for the variability, and proposed some policy responses. In 2016, the Basel Committee followed up with a study of the same title, which focused on PD and LGD estimation for retail and small and medium enterprise (SME) banking book exposures. The 2016 Basel study likewise assessed the variability of PD and LGD estimates across banks and offered some policy options.
The motivation for these studies is the expectation that two banks estimating credit parameters for the same risk should obtain roughly the same values. For example, if two banks are making the same $100 loan to company A, which has an average long-term default probability of 1 percent, both banks should estimate a PD of about 1 percent. Similarly, if company A were to default during a downturn and 40 percent of the loan would be recovered in bankruptcy court, then both banks should estimate a downturn LGD of about 60 percent. Although two banks need not estimate exactly the same parameters for similar risks, the estimates should be reasonably comparable. The goal of the Basel studies was to examine to what extent credit parameter estimates vary across banks and to understand the reasons for any differences.
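The arithmetic in the example above can be made concrete with a short sketch. The numbers are the hypothetical ones from the example, and the final line applies the standard expected-loss identity EL = PD × LGD × EAD:

```python
# Stylized example with hypothetical numbers: a $100 loan to company A.
ead = 100.0      # exposure at default, in dollars
pd_est = 0.01    # long-term average probability of default (1 percent)
recovery = 0.40  # fraction of the loan recovered in bankruptcy in a downturn

# The downturn LGD is the fraction NOT recovered in a downturn default.
downturn_lgd = 1.0 - recovery  # 0.60, i.e., 60 percent

# Standard expected-loss identity: EL = PD x LGD x EAD.
expected_loss = pd_est * downturn_lgd * ead

print(f"Downturn LGD: {downturn_lgd:.0%}")     # 60%
print(f"Expected loss: ${expected_loss:.2f}")  # $0.60
```

Any bank starting from the same inputs arrives at the same parameters; the question the Basel studies examine is why, in practice, the inputs and estimates differ.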
The 2013 and 2016 Basel studies show that in practice there has been substantial variation in banks’ estimates of credit risk parameters under the internal models approach. The proposal’s interpretation of the studies is that this variation is caused by bank reliance on “subjective” model assumptions coupled with the difficulty of verifying bank estimates of credit parameters empirically, since credit events happen very infrequently. Together, these allow banks to obtain a range of plausible estimates, producing variation in internal model credit capital between banks for similar risks. Although the proposal does not define “unwarranted,” it nonetheless asserts that the variation is unwarranted and on that basis would prohibit the use of internal models.
Do the Cited Basel Studies Support the Agencies’ View That Variation is Unwarranted?
The Basel Finalization Proposal cites the 2013 and 2016 Basel studies to support its view that internal models have produced “unwarranted” variability of credit exposures. Yet, the 2013 Basel study clearly states in its introduction that “The study did not attempt to identify an appropriate or acceptable level of variation of RWA in the banking book and its findings are sensitive to a number of assumptions.”
The 2016 Basel study concludes its analysis of PD and LGD variability of retail and SME exposures by noting that “the nature of the analysis used in this study, combined with material data limitations, does not allow for a definitive measure of how much observed variation in retail and SME RWAs is driven by differing practices or differences in risk.” In other words, the 2016 Basel study was unable in its analysis to separate the effects of modeling assumptions and other practices from the effects of differences in portfolio risk.
Thus, neither the 2013 nor 2016 Basel studies support the central argument made in the proposal that the observed variability is unwarranted. Instead, the Basel references present a significantly more nuanced and heavily caveated view on the reasons for internal model variability. The Basel references simply do not say what the proposal claims they say.
Significant Caveats in the Cited References Were Not Heeded by the Proposal
Variability Can Be Misleading
Results must be caveated because focusing exclusively on variability can be misleading. For example, consider a set of banks that all underestimate PDs by 50 percent in exactly the same way, so that each assigns the same underestimated default probability to the same risk. In that case, there would be no variability, and presumably internal models would be allowed. In reality, of course, internal models would not be appropriate without correcting the underestimates. On the other hand, the same set of banks might, for model validation reasons, impose different levels of conservatism in their PD estimates, so that PD estimates are variable but always overstate the actual default rates. In that case, variability could be high, and so presumably internal models would not be permitted. In reality, though, internal models would be appropriate, since credit capital estimates would not understate risk.
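The two scenarios can be illustrated numerically. The figures below are hypothetical: a true long-run PD of 2 percent, five banks that all underestimate it identically, and five banks that apply different conservatism multipliers:

```python
import statistics

true_pd = 0.02  # hypothetical true long-run default probability

# Scenario 1: every bank underestimates the PD by 50 percent in the same way.
# Measured dispersion is zero, yet every estimate understates risk.
uniform_underestimates = [true_pd * 0.5] * 5
print(statistics.pstdev(uniform_underestimates))  # 0.0 -- no variability

# Scenario 2: banks apply different conservatism multipliers (all >= 1),
# as their model validation standards might require. Dispersion is high,
# yet no estimate understates risk.
conservatism = [1.1, 1.3, 1.5, 1.8, 2.0]
conservative_estimates = [true_pd * c for c in conservatism]
print(statistics.pstdev(conservative_estimates))  # positive -- visible variability
```

A variability metric alone ranks Scenario 1 as the better outcome, even though it is the only one in which capital understates risk.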
The problem with using variability as a metric is that without also considering the underlying risks, differences in variability can be very misleading. The 2013 Basel study is aware of this problem in using variability, noting in section 1.3 that
A significant challenge for this work is the fact that ‘true’ levels of underlying risk are unknown. As a result, in many cases, the analysis is able to identify differences in RWA across banks, but cannot definitively determine whether these differences correspond to differences in underlying risk. Taken together, these caveats suggest that a degree of caution should be exercised when interpreting the results in this section. 
Difficult To Disentangle Differences in Risk From Other Factors in Explaining Variability
In footnote 22, the 2013 Basel study explains:
Two banks may have identical RWAs, both of which are based on faulty estimates, or they may report very different RWAs for superficially similar portfolios that are in fact different in risk (e.g. due to different credit risk mitigation). An additional complication is that risk depends in part on risk management practices at the level of the bank or portfolio (e.g., collection practices for problem loans) that are difficult or impossible to identify and assess based on available data.
Despite these important caveats on interpreting the Basel studies’ results on variability, the proposal relied on the cited references to assert that variability is “unwarranted.”
The Proposal Mischaracterizes the Reasons for Variability
As the Basel references show, the reasons for variability are not as simple as presented in the proposal. The proposal asserts that “These internal models rely on a banking organization’s choice of modeling assumptions and data. Such model assumptions include a degree of subjectivity, which can result in varying risk-based capital requirements for similar exposures,” implying that banks’ model assumption subjectivity is the only or the chief reason for variability. The Basel references show that the reasons for variability are much more complex, however, involving differences in both bank and supervisory practices as well as other factors. For example, the 2013 Basel study describes differences in supervisory practices that contribute to variability:
- Rollout of various IRB approaches on different timeframes across jurisdictions, leading to variability across banks
- Basel rules that permit some discretion in the definition of default, producing differences in estimates
- Basel rules that are “deliberately flexible” in the definition of downturn LGD
- Model validation rules that encourage conservatism in estimates
Model validation requirements are likely a significant source of internal model variability. In the U.S., banks must perform model validation according to the requirements of SR 11-7, which calls for higher estimates and add-ons to deal with model uncertainty:
In either case, it can be prudent for banks to account for model uncertainty by explicitly adjusting model inputs or calculations to produce more severe or adverse model output in the interest of conservatism. Accounting for model uncertainty can also include judgmental adjustments to model output to avoid understating risks, placing less emphasis on that model’s output, or ensuring that the model is only used when supplemented by other models or approaches.
Different amounts of internal credit data can also lead to variability. For example, banks that lend regularly in emerging markets will have more and better internal data on emerging market credit events, such as defaults, than banks that do not. Banks with fewer data points will generally be required by their internal model validation departments to produce more conservative estimates in compliance with SR 11-7. Banks’ internal audit departments will review the model validation process, as SR 11-7 also requires, and the national supervisory authorities will then review the results. At each step of the review process, more conservatism can be imposed. Because banks have different tolerances for model risk, model validation and internal audit departments have different procedures, and national supervisors differ in their views and practices, it is not at all surprising that estimates of PDs and LGDs vary substantially simply because of differences in the application of model validation requirements. This variation would not imply that internal models are not working, since credit capital estimates would very likely overstate risk on average. The 2016 Basel study discusses harmonization of model validation practices as one way to reduce variability of internal model estimates.
The Proposal Does Not Account for Contrary Evidence
The only references cited by the proposal to support its analysis are the two Basel references already discussed. These references do not say what the proposal represents, but even if they did conclude that variability is unwarranted, they would still not be directly relevant, since they are not focused specifically on U.S. banks. The proposal, however, does not mention two recent studies that analyze variability of PDs and LGDs at large U.S. banks. Firestone and Rezende examine the variability of estimated PDs and LGDs of syndicated loans made by nine large U.S. banks, finding that, while PD estimates can vary substantially, most banks’ PD estimates do not vary systematically from the median bank. On the other hand, Firestone and Rezende find that LGD estimates do vary systematically across banks, although they also find that LGD variation can be partially explained by differences in bank practices that comply with Basel rules. Similarly, Covas and Stepankova find that using internal PD estimates results in little variation in risk weights across banks under the standardized approach for 12 large banks with headquarters in the U.S. or with significant exposure to U.S. firms. These studies are not an exhaustive analysis of the universe of U.S. banks, but they do show that the proposal may be exaggerating the problem of PD variability for corporates. Turning to retail and SME exposures, the 2016 Basel study finds that actual default rates line up with estimated PDs, although that result does not hold for LGDs. Summing up, then, to the extent there is a concern about variability, it seems to be more of a problem with LGDs than with PDs.
How Much of a Problem is LGD Variation?
Capital varies linearly with LGD, so large differences in LGD can cause large differences in capital estimates; LGD variation can therefore be material. LGD variation is caused by differences in both supervisory practices and bank practices. If supervisors want to reduce LGD variability caused by their own practices, they merely need to change those practices. For example, supervisors could allow less discretion in the definition of downturn LGD or of default itself. The conservatism imposed on LGD estimates by model validation requirements likely creates significant variability, as the levels of conservatism that should be applied are vaguely defined. Model validation groups may consequently apply different standards of conservatism to similar LGD risks across banks, and supervisory examination groups may not treat banks equally when reviewing credit risk models. Supervisors could reduce variability of LGD caused by model validation requirements by setting standards for conservatism. They could also better harmonize the standards used by supervisory groups who cover different banks. Moving to a standardized approach is not necessary to solve this problem.
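The linearity of capital in LGD follows directly from the Basel IRB risk-weight function, in which LGD enters only as a multiplicative factor. The sketch below uses the Basel corporate asset-correlation formula but, for simplicity, omits the maturity adjustment and the firm-size and large-financial adjustments:

```python
from math import exp, sqrt
from statistics import NormalDist

N = NormalDist()  # standard normal distribution

def irb_capital(pd_est: float, lgd: float) -> float:
    """Simplified Basel IRB capital requirement K per unit of exposure
    for corporate exposures (maturity adjustment omitted)."""
    # Supervisory asset correlation for corporates.
    w = (1 - exp(-50 * pd_est)) / (1 - exp(-50))
    r = 0.12 * w + 0.24 * (1 - w)
    # 99.9th percentile conditional default rate, minus expected loss.
    cond_pd = N.cdf((N.inv_cdf(pd_est) + sqrt(r) * N.inv_cdf(0.999)) / sqrt(1 - r))
    return lgd * (cond_pd - pd_est)

# LGD multiplies the whole expression, so capital is linear in LGD:
k40 = irb_capital(0.01, 0.40)
k80 = irb_capital(0.01, 0.80)
print(k80 / k40)  # 2.0: doubling the LGD estimate doubles capital
```

Because every other term in the formula is unaffected by LGD, any dispersion in LGD estimates passes through one-for-one into dispersion in capital.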
Some LGD variation is likely caused by differences in bank practices. If these differences are produced by variations in risk management competence of banks, then they are legitimate. We illustrate with an example.
A Tale of Two Banks
Consider two hypothetical banks that are identical in every respect except in their risk management practices. Bumble Bank and Trust has generally weak credit risk management. Its risk management department is not able to challenge its business units and so Bumble tends not to get good collateral terms from its counterparties. Its credit risk management department does not anticipate credit events well and therefore does not manage down exposures or get enhanced mitigants before default occurs. In addition, its workout department is ineffective and its legal agreements are weak. Bumble Bank estimates relatively high LGDs as a result.
Gallant Bank, on the other hand, has a highly effective credit risk management department. It tends to get strong collateral terms, negotiates effective legal agreements, works down exposure at the first sign of problems, and has an extremely effective workout group. As a result, when it estimates its LGDs, Gallant obtains relatively lower LGDs than does Bumble.
In this case, differences in bank practices—disparate levels of risk management competence—have produced differences in LGD estimates for similar risks. From a risk management perspective, this is the right answer. Bumble really does recover less in default than does Gallant and Bumble’s LGD estimates should reflect the higher risk.
The agencies seem to see this situation very differently. The implicit assumption in the proposal is that the discrepancy between Gallant’s and Bumble’s estimates is evidence of unwarranted variation; in this view, Bumble and Gallant should have computed roughly the same LGD estimates. The reason the estimates differ, according to the proposal, is that Bumble and Gallant were using subjective modeling assumptions, and given how rarely defaults and recoveries occur, it is difficult for supervisors to determine whether Bumble or Gallant has the right or wrong estimates. Since the banks should have obtained roughly the same estimate, the agencies propose to force the same estimate by imposing a standardized approach.
The standardized approach is the wrong solution, of course, since the LGD estimates should have been different. The proposal implies that a further motivation for using the standardized approach is that supervisors cannot easily evaluate the validity of internal credit risk models. What is perhaps most surprising here is the agencies’ implicit lack of confidence in the ability of their staff to effectively supervise Bumble’s and Gallant’s internal credit risk models. It is true that estimation of rare credit events can be difficult given limited data and that banks’ models sometimes must use assumptions with limited empirical support. It is also true that empirical validation of credit parameters can be challenging. But these are general problems that exist everywhere in bank risk management; they are not peculiar to credit risk. Regulators have spent decades developing very effective tools to ensure that the Bumbles and Gallants of the world get the appropriate answers.
Regulatory staff already assess the quality of risk management in banks and can condition approval of credit estimates on remedying deficiencies in risk practices. There is a large amount of third-party credit data and models available for both supervisors and banks to supplement limited internal credit data if necessary. To deal with model assumptions that have limited empirical support, SR 11-7 ensures that model risk management departments will require add-ons to avoid understating risks. Regulators can also set standards or guidelines for model estimates. Ironically enough, the variations in internal models that worry the agencies have been materially caused by their own effective supervision. Variations caused by other factors can be reduced by policy changes if the regulatory community wants to do that.
Despite these successes, the agencies want to throw the supervisory toolbox away and rely on a standardized approach for credit that requires minimal regulatory supervision. That may be an easier approach in the short run for a regulatory community that seems to feel it cannot effectively supervise internal credit models, but it does not come without longer-term costs, most of which will be hidden. A standardized approach will reduce Gallant Bank’s incentive to maintain effective credit risk management. Because every bank will be assessed the same capital charge for a given exposure no matter what the actual risk, banks with good risk management practices will be incentivized to finance worse credits to earn the extra return that defrays the capital charge. Credit risk management practices will also tend to atrophy, as they will become less important when not tied to capital.
The agencies may dismiss this concern since they expect to maintain vigorous supervision of bank risk management under a standardized model approach. However, the use of standardized models will also disincentivize their supervisory activities. Just as the most effective bank risk personnel will tend to migrate away from risk areas that are perceived to be less important, the same effect will happen in the supervisory community. Removing the connection between capital and bank risk models lowers the stakes, making supervision of risk models less important. The long run effect will be less effective bank risk management and less effective supervision, even with the best of intentions. As the proverb says, the road to hell is paved with good intentions.
What if the Variability is Caused By Banks Gaming Internal Models?
Of course, using standardized models would be justified if banks were gaming internal models to underestimate credit risk. Neither the proposal nor the Basel studies claim that banks are doing so; if banks have been, that would represent a serious and ongoing failure of bank supervision. Nonetheless, since some recent academic papers have suggested evidence of gaming or underestimation of credit risk, we review those results.
Berg and Koziol provide econometric tests of the hypothesis that banks with the greatest incentives to report lower PDs and LGDs should be observed to do so if banks are attempting to game internal models. As proxies for bank incentives, Berg and Koziol use 1) the tier 1 capital ratio; 2) recapitalization needs as determined by the European Banking Authority; and 3) return on assets. Under their hypothesis, for the first proxy, banks with lower capital ratios, i.e., weakly capitalized banks, would want to report lower PDs and LGDs to reduce their capital requirements. The other proxies have a similar interpretation.
The obvious problem with the analysis, as the authors acknowledge, is the endogeneity between estimates of PDs and LGDs and the proxies for bank incentives. Do lower capital ratios cause lower PDs and LGDs, or is the causation reversed? Berg and Koziol attempt to address endogeneity and other statistical issues using econometric methods such as instrumental variables and bank and time fixed effects. As the authors note, however, this approach cannot account for bank characteristics that vary over time and are simultaneously correlated with capital increases. In addition, these econometric techniques cannot cope with loan-specific factors or any of the other causes of variation discussed earlier.
Mariathasan and Merrouche perform a similar analysis, arguing that banks with lower capital ratios have an incentive to underestimate credit risk parameters. They also attempt to control for endogeneity and other statistical problems. However, the same critiques that apply to Berg and Koziol apply as well to Mariathasan and Merrouche. There are simply too many potential causes of variation that cannot be easily accounted for in econometric models.
More direct evidence would compare bank estimates of PDs to realized defaults. A recent paper by Behn, Haselman, and Vig attempts to provide such an analysis. For a subset of 45 German banks that adopted the IRB approach, the authors obtained loan-level estimates of PDs and compared them to realized default rates during the years 2008–2012. They find that banks underestimated PDs during those years. However, their analysis suffers from a serious difficulty. Bank estimates of PDs are required to represent long-term average default risk, implying default risk over at least one business cycle. Unfortunately, Behn, Haselman, and Vig limit their comparison of bank PD estimates and default rates to the highly unusual and stressed period spanning the Great Recession, its aftermath, and the European sovereign debt crisis. This five-year period is too short to make a fair comparison; the authors would need to include at least seven more years of realized default experience, from 2013–2019, when default rates were more benign. Moreover, the analysis focused exclusively on German banks and thus provides little evidence relevant to banks operating in the U.S.
In a very recent paper, Baena, using European data from 2018-2022, reports preliminary econometric evidence that when the countercyclical capital buffer (CCyB) increases, so that capital requirements rise, banks tend to reduce estimates of PDs. Any attempt to infer a causal relationship between changes in the CCyB and changes in PDs is of course bedeviled by econometric problems, which the paper goes to great lengths to try to resolve. Unfortunately, proving that one factor causes another and that there is not a third cause explaining both is incredibly difficult using econometric methods. But even if causation could be established, the paper suffers from the same fundamental problem that most studies in this literature have: demonstrating that PD estimates have changed is not the same as demonstrating that they were underestimated. The paper does not show that PDs were underestimated, which would have required an empirical analysis of credit data over long periods. As noted previously, model validation requirements incentivize banks to report conservative estimates of credit parameters and so parameter estimates could change without being necessarily underestimated. Notably, this paper focuses on European banks, thus limiting its application to U.S. institutions.
The 2016 Basel study, on the other hand, finds some evidence that internal model estimates have been conservative. Comparing PDs to realized defaults, it found that “for most banks, on average, the PD parameters are higher than the actual default rates.” It also found that “on a per-bank basis, the LGD parameters appear generally higher than actual losses.” In other words, bank internal model estimates of PDs for retail and SME exposures overstate actual default rates, and estimated LGDs exceed realized losses. Since estimated LGDs are required to be conditioned on a downturn, they should naturally be higher than average losses; even so, the study found no evidence that downturn LGDs are underestimated.
Overall, the academic evidence does not establish that banks in general, much less U.S. banks, have systematically gamed the internal model framework by underestimating credit risk. This is not a surprising result given the robust regulatory supervision of U.S. banks.
The Basel standard is intended to harmonize capital practices across jurisdictions, but it does not require absolute consistency. The agencies might reasonably decide to depart from Basel if they believe there is some compelling defect in the agreement. However, the references cited in the proposal in no way support its justification for departing from Basel. Moreover, the academic evidence provides no reason to suppose that banks in general or U.S. banks in particular have underestimated credit risk using internal models, demonstrating that regulatory supervision of internal bank credit risk models is effective.
Viewed from that vantage point, the agencies’ proposal to depart from the Basel standard on internal models appears to be arbitrary, sacrificing the benefits and motivation for entering into an international agreement for no apparent reason. Removing the use of internal models to set credit capital would damage U.S. banks’ international competitiveness and incentivize worse credit risk management practices for both banks and supervisors. The nuclear option of using standardized models is unnecessary and counterproductive. The agencies should permit U.S. banks to use internal models for credit capital calculations under Basel finalization.
 “Regulatory capital rule: Amendments applicable to large banking organizations and to banking organizations with significant trading activity,” available at https://www.govinfo.gov/content/pkg/FR-2023-09-18/pdf/2023-19200.pdf
 Regulatory Consistency Assessment Programme (RCAP), “Analysis of risk-weighted assets for credit risk in the banking book,” July 2013, available at https://www.bis.org/publ/bcbs256.pdf
 Regulatory Consistency Assessment Programme (RCAP), “Analysis of risk-weighted assets for credit risk in the banking book,” April 2016, available at https://www.bis.org/bcbs/publ/d363.pdf
 “Some amount of variation would be expected in any regime based on internal models and especially for low-default portfolios.”, 2013 Basel study, pg 8
 Basel Finalization Proposal, pg 64031
 2013 Basel study, pg 4
 2013 Basel study, pg 12
 2013 Basel study, pg 12
 Basel Finalization Proposal, pg 64031
 2013 Basel study, pg 42
 “Supervisory Guidance on Model Risk Management,” April 4, 2011, Federal Reserve Board of Governors and Office of the Comptroller of the Currency, available at https://www.federalreserve.gov/boarddocs/srletters/2011/sr1107a1.pdf
 Supervisory Guidance, pg 8
 2016 Basel study, pg 36
 Firestone, S and Rezende, M, (2016), “Are Banks’ Internal Risk Parameters Consistent? Evidence from Syndicated Loans,” Journal of Financial Services Research
 Covas, F and Stepankova, B, (2022), “Consistency in Risk Weights for Corporate Exposures Under the Standardized Approach,” available at https://bpi.com/wp-content/uploads/2022/01/Consistency-in-Risk-Weights-for-Corporate-Exposures-Under-the-Standardized-Approach.pdf
 2016 Basel study, pg 9
 Berg, T and Koziol, P (2017), “An analysis of the consistency of bank ratings,” Journal of Banking and Finance
 Mariathasan, M and Merrouche, O (2014), “The Manipulation of Basel Risk Weights,” Journal of Financial Intermediation
 Behn, M, Haselman, R and Vig, V (2022), “The Limits of Model-Based Regulation,” Journal of Finance
 Baena, A, (2023), “Do Capital Requirements Really Reduce the Riskiness of Banks,” available at https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4612544
 2016 Basel study, pg 12
 2016 Basel study, pg 15