## A critical assessment of “Modelling your Stress Away” by Niepmann and Stebunovs (2018)

A recent paper by Niepmann and Stebunovs (2018) (henceforth N&S) presents data suggesting that some European banks may have “gamed” the European Banking Authority’s (EBA) stress tests by adjusting their models to offset an increase in the severity of the 2016 adverse scenario. In this blog post, we show that N&S’s findings are likely driven by (i) a misspecification of the relationship between macroeconomic variables and banks’ credit losses and (ii) not explicitly controlling for differences in loss rates at the start of the stress tests and changes in macroeconomic scenarios. As a result, it is highly unlikely that N&S’s findings are driven by systematic changes to banks’ own models.

The role of banks’ own models in determining stress losses is an important question because supervisory stress tests are the foundation of post-crisis capital regulations. Models utilized in supervisory stress tests are a key pillar because they estimate losses and revenues under stress, and thus the mapping between supervisory scenarios and a bank’s post-stress regulatory capital ratios. In the EU, post-stress regulatory capital ratios are determined using models developed by each participating bank. In contrast, in the U.S. stress tests, banks’ post-stress regulatory capital ratios are determined by supervisory models developed by the Federal Reserve. Meanwhile, the stress testing regime developed by the Bank of England employs a hybrid approach by using both supervisory and banks’ own models to determine banks’ post-stress capital ratios. Overall, these different approaches elevate even further the importance of the debate on the use of banks’ own models in stress tests.

As part of their stress tests disclosure, the EBA publishes granular information on exposure amounts and impairment rates over the stress horizon for each bank as well as bank-specific impairment rates for the top 10 countries of the bank’s counterparties. N&S use the published results to estimate the sensitivity of credit losses to the macroeconomic factors over the last two stress testing cycles. They find that the model coefficients estimated using 2014 stress test results imply larger credit losses compared with the estimated model coefficients obtained using the 2016 stress test results. According to the EBA, the adverse scenario in the 2016 stress tests was stricter. Thus, N&S conclude that banks changed their models in order to generate lower losses when the scenario got harsher and, as a result, supervisors should not rely on banks’ own internal models to assess their capital adequacy.^{1}

In the remainder of this blog post, we explain why the lower estimated model coefficients based on EBA’s stress test 2016 results reported by N&S are likely driven by several factors that are unrelated to the estimation of banks’ own models and, therefore, why the lower estimated coefficients do not reflect banks’ adjustments to their own models. We focus on the following two main reasons:

- The model used by N&S to estimate the sensitivity of credit losses to macroeconomic variables should be dynamic instead of static; that is, their model omits the lagged value of credit losses from the set of explanatory variables;
- Banks’ loan loss rates at the start of the 2014 and 2016 stress tests were quite different, as were the paths of the macroeconomic variables in the 2014 and 2016 adverse scenarios. Thus, N&S’s results are also driven by differences in initial conditions that were not explicitly controlled for in their econometric specification.

The regression model used in the N&S paper omits a crucial explanatory variable: the lagged value of credit losses. As noted by Covas et al. (2014) and Hirtle et al. (2015), projections for credit losses are highly persistent and are importantly determined by their most recent historical values. This is particularly true for so-called “top-down” models, which relate aggregate loan losses to macroeconomic outcomes.^{2} The role of this dynamic structure of credit losses is illustrated in the chart below, which depicts annualized net charge-offs for U.S. banks during the 2007-2009 financial crisis (black line). The blue line in Figure 1 shows predicted credit losses (fitted values) using a dynamic model. The red dashed line shows the predictions of credit losses using a static model (omitting the lagged value of credit losses as an explanatory variable). As shown by the difference between the black and blue lines, the dynamic model does a good job of matching the realized value of credit losses, perhaps only slightly understating the path of net charge-offs during 2009. In contrast, the static model would have understated loan losses by a sizable amount during the 2007-2009 financial crisis.

The fact that a static model understates credit losses is consistent with the observation N&S make with respect to the projected loss rates obtained from their model and noted on page 12 of the paper:

…Of note, observed loss rates are significantly higher than projected loss rates. This is not only true for the loss rates that follow from the 2014 and the 2016 models, but also for the loss rates that the banks reported for the baseline scenario. The likely reason for this is that SNL uses a different definition of provisions and gross loans than the banks themselves. We therefore attribute the discrepancy to external factors and do not think there is a flaw in the banks’ or our projections in that respect.

As a result, it would be important for the N&S paper to use a dynamic panel data model and control explicitly for the lagged value of credit losses. In addition, as shown by Hirtle et al. (2015), when one includes the lagged value of credit losses in the regression, it is no longer the *level* of the unemployment rate that is used as an explanatory variable; instead, it is the *change* in the unemployment rate.^{3}
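To make the distinction concrete, the two specifications can be sketched in generic notation. The symbols below are illustrative assumptions of ours and are not taken from N&S or from Hirtle et al. (2015): $L_{i,t}$ is the loss rate of bank $i$ in period $t$, $u_{i,t}$ is the exposure-relevant unemployment rate, and $\Delta u_{i,t} = u_{i,t} - u_{i,t-1}$.

```latex
% Static specification (level of unemployment, no lagged losses)
L_{i,t} = a + b\,u_{i,t} + e_{i,t}

% Dynamic specification (lagged losses and the change in unemployment)
L_{i,t} = \alpha + \rho\,L_{i,t-1} + \beta\,\Delta u_{i,t} + \varepsilon_{i,t},
\qquad 0 < \rho < 1

% Iterating the dynamic specification back to the starting loss rate L_{i,0}
L_{i,t} = \rho^{t} L_{i,0}
        + \sum_{s=1}^{t} \rho^{\,t-s}\left(\alpha + \beta\,\Delta u_{i,s} + \varepsilon_{i,s}\right)
```

Under these assumptions, the last line shows that the starting loss rate enters every projected period with weight $\rho^{t}$, a term that the static specification has no way to absorb other than through its constant and slope.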

The use of a static model by N&S, together with the inclusion of the level of the unemployment rate as an explanatory variable, could also explain why they find the 2014 adverse scenario to be more severe than the 2016 adverse scenario on average, which is the opposite of what the EBA states on its website:

Compared to 2014 this year’s adverse scenario is stricter as it contains more conservative elements. Moreover, shocks have been frontloaded, so that the adverse impact materialises earlier during the stress test horizon. As bank losses follow macroeconomic stress with a lag, this feature increases the expected impact of the scenario…

Another important limitation of the N&S analysis is the lack of controls for differences (i) in banks’ loss rates that existed at the start of the stress tests, as well as (ii) in the stress scenarios themselves. In a dynamic model, the projections for credit losses are highly dependent on credit losses at the start of the stress tests used to initialize the projections. For example, a bank that starts the stress tests with a high value for loan losses will have that initial observation affect its entire path of loan losses over the stress test horizon.
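The persistence of the starting point can be illustrated with a minimal sketch. It assumes a simple first-order dynamic model; the function name `project_losses` and the parameter values (`alpha`, `rho`, `beta`) are hypothetical choices for illustration and are not estimates from the EBA data or from N&S.

```python
def project_losses(initial_loss, d_unemp, alpha=0.1, rho=0.8, beta=0.3):
    """Project a loss-rate path from an assumed dynamic model.

    loss_t = alpha + rho * loss_{t-1} + beta * d_unemp_t
    All parameter values are illustrative assumptions.
    """
    path = []
    prev = initial_loss
    for du in d_unemp:
        prev = alpha + rho * prev + beta * du
        path.append(prev)
    return path


# Same scenario (change in unemployment each year), different starting loss rates.
scenario = [0.5, 0.5, 0.5]
high_start = project_losses(2.0, scenario)
low_start = project_losses(1.0, scenario)

for year, (h, l) in enumerate(zip(high_start, low_start), start=1):
    print(f"year {year}: high start {h:.3f}, low start {l:.3f}, gap {h - l:.3f}")
```

In this sketch the gap between the two paths shrinks by the factor `rho` each year but never disappears within the horizon, so two banks facing an identical scenario with identical model parameters still report different loss paths.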

We illustrate the importance of initial conditions in the estimated coefficients of the model via a Monte Carlo simulation. In the exercise, we simulate loan losses assuming they are generated by a dynamic model. Specifically, we assume that credit losses depend on their lagged value and the change in the unemployment rate.^{3} We use the same model parameters across all simulations and change only the set of initial conditions, namely:

- Loss rates at the start of the stress tests;
- The path of macroeconomic series in the supervisory scenarios;
- Unemployment rate at the start of the stress tests.

We picked these three initial conditions because they are evident in the EBA data. In particular, banks’ defaulted exposures were lower at the start of the 2016 stress tests relative to the 2014 stress tests; the dispersion of unemployment rates across countries was also lower during the stress horizon in the 2016 adverse scenario, and the (average) unemployment rate at the start of the stress tests declined between 2014 and 2016. In the last step of the Monte Carlo exercise, we use the simulated data to estimate the models following the N&S procedure and report the average coefficient values across all simulations.
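The logic of the exercise can be sketched as follows. This is not the code behind Table 1: the function name `simulate_static_fit`, the parameter values, the noise scale, and the numbers of banks and simulations are all hypothetical assumptions chosen for illustration. Losses are generated by a dynamic model with fixed "true" parameters, and a static regression of losses on the level of the unemployment rate is then fitted to each simulated sample.

```python
import numpy as np


def simulate_static_fit(initial_loss, u0, shock_sd, n_sims=200, n_banks=50,
                        horizon=3, alpha=0.1, rho=0.8, beta=0.5, seed=0):
    """Average static-OLS (constant, slope) over simulated dynamic-model samples.

    Assumed data-generating process (illustrative, not the authors' calibration):
        loss_t = alpha + rho * loss_{t-1} + beta * d_u_t + noise_t
    Fitted (misspecified) static model:
        loss_t = const + slope * u_t
    """
    rng = np.random.default_rng(seed)
    coefs = np.zeros((n_sims, 2))
    for s in range(n_sims):
        # Positive-only unemployment shocks, one path per bank.
        d_u = np.abs(rng.normal(0.0, shock_sd, size=(n_banks, horizon)))
        noise = rng.normal(0.0, 0.05, size=(n_banks, horizon))
        u = u0 + np.cumsum(d_u, axis=1)  # level of the unemployment rate
        loss = np.empty((n_banks, horizon))
        prev = np.full(n_banks, float(initial_loss))
        for t in range(horizon):
            prev = alpha + rho * prev + beta * d_u[:, t] + noise[:, t]
            loss[:, t] = prev
        # Pooled static regression of losses on the unemployment level.
        slope, const = np.polyfit(u.ravel(), loss.ravel(), 1)
        coefs[s] = (const, slope)
    return coefs.mean(axis=0)


# True parameters are identical in every call; only initial conditions differ.
base = simulate_static_fit(initial_loss=2.0, u0=10.0, shock_sd=1.0)
low_loss = simulate_static_fit(initial_loss=1.0, u0=10.0, shock_sd=1.0)
low_var = simulate_static_fit(initial_loss=2.0, u0=10.0, shock_sd=0.5)
low_u0 = simulate_static_fit(initial_loss=2.0, u0=8.0, shock_sd=1.0)

for name, (const, slope) in [("baseline", base), ("lower initial loss", low_loss),
                             ("lower shock variance", low_var),
                             ("lower initial unemployment", low_u0)]:
    print(f"{name:28s} const={const:7.3f} slope={slope:6.3f}")
```

Because the seed is fixed, each call draws the same underlying shocks, so any difference in the fitted constant or slope across calls comes from the initial conditions alone rather than from a change in `alpha`, `rho`, or `beta`.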

**Table 1: Monte Carlo Exercise**

Table 1 shows the impact of the initial conditions on the estimated coefficients of a static loss model. Note that N&S interpret changes in model coefficients as a change in banks’ own models. For example, if the constant rises and the sensitivity to the unemployment rate remains unchanged, N&S would interpret it as a conservative model change because a higher constant in the regression leads to higher projected loan losses. The first column represents the baseline estimate of the constant and the sensitivity to the unemployment rate of the misspecified model, which are very different from the “true” model coefficients (see footnote 3).

The second column denotes the sensitivity of estimated model coefficients to the loss rate at the start of the tests. Intuitively, a lower loss rate at the start of the stress tests would have reduced the projected path of loss rates over the stress horizon, so the estimated constant in the model would have been lower, which is what we find. The third column reports the impact of a lower variance of the unemployment rate path in the adverse scenario. Because we only included positive shocks to the unemployment rate in the generation of the “adverse” scenario, a lower variance reduces the increase in the unemployment rate over the stress horizon, but also lessens the dispersion of the unemployment rate shocks across banks (note that banks have exposures to different countries, hence the differences in the shock to the unemployment rate). The lower variance of the unemployment rate shock increases the constant and reduces the sensitivity of credit losses to the level of the unemployment rate. The fourth column shows the impact of the level of the unemployment rate at the start of the stress tests. This change is just a scaling of the explanatory variable and only impacts the estimated constant of the model. The last column combines all previous changes into an overall impact.

As shown in the last column of the table, we were able to generate lower coefficients for the constant term of the regression and the sensitivity of loan losses to the unemployment rate by changing all three initial conditions simultaneously. These lower coefficients would have reduced the projected path of losses during the stress horizon. In the Monte Carlo exercise, we kept the true model coefficients unchanged across all simulations; therefore, these results show that the reduction in model coefficients found by N&S was not necessarily driven by adjustments to banks’ own models. That said, changes in model coefficients are not a sufficient condition to prove N&S’s findings, but the simulation exercise clearly illustrates the variability of results that can be attained when the model is incorrectly specified.

In summary, this critique of N&S boils down to the following two elements: (i) The relationship between macroeconomic outcomes and credit losses is misspecified in their paper; and (ii) the analysis does not explicitly control for changes in loss rates at the start of stress tests and changes in the supervisory scenarios themselves. Putting these two together, our Monte Carlo simulation suggests that lower historical loss rates at the start of the 2016 stress tests and differences in supervisory scenarios – not model changes by banks – were likely the main driver of N&S’s findings.

^{1} In particular, N&S find that banks whose losses increase the most under the 2016 adverse scenario were also the ones that seemed to have adjusted their model coefficients the most to reduce losses under the adverse scenario.

^{2} Banks and regulators typically use a “bottom-up” approach, which relies on detailed information about individual loan characteristics, to estimate credit losses. The type of dynamic structure we discuss in this blog post is likely less relevant for bottom-up approaches since those models control directly for the characteristics of the loan portfolio.

^{3} The model we use is as follows:

*Disclaimer: The views expressed in this post are those of the author(s) and do not necessarily reflect the position of the Bank Policy Institute or its membership, and are not intended to be, and should not be construed as, legal advice of any kind.*