In addition to extending loans to households and businesses, the largest banks earn revenue from trading, investment banking, and other advisory services as part of their market-making and underwriting activities on behalf of customers. These sources of noninterest income help these banks diversify against loan losses during economic recessions and reductions in net interest income typically registered during periods of correspondingly low interest rates.
In this note, we examine whether the benefit of this diversification is adequately recognized in the U.S. stress tests. In reality, mark-to-market losses on these activities (the losses that banks’ trading assets experience when prices fall for long positions or rise for short positions) are being “double-counted” in the U.S. stress tests. The double-count arises from the inclusion of trading losses both in the global market shock (GMS) and, separately, in trading revenue. This duplication undercuts the benefits of diversification.
Banks with large trading operations and private equity exposures are subject to the GMS, which applies to 11 of the 33 banks subject to the stress tests. The GMS is a set of instantaneous, hypothetical shocks to a large set of risk factors, intended to capture mark-to-market losses that banks with large trading operations could experience during a sudden market stress. Generally, these shocks involve large and sudden changes in asset prices, interest rates and spreads, reflecting general market distress and heightened uncertainty.
The stress tests also include a separate projection for trading revenue, which is a subcomponent of noninterest income. Ideally, the trading revenue series would include only fees, commissions and trading gains reflecting bid-ask spreads associated with further client activity. However, the trading revenue series available in the regulatory reports (specifically, the FR Y-9C trading revenue line item) includes both mark-to-market gains and losses and trading-activity-based revenue. Because of this combination, the supervisory stress test projections include the mark-to-market losses first as a component of the GMS and then again as a component of trading revenue. Therefore, the trading revenue projections for banks subject to the GMS double-count the mark-to-market losses on banks’ trading inventories that are expected to be incurred under a stress event.
In this note, we demonstrate that the double counting of mark-to-market losses for banks subject to the GMS is sizable and significantly overstates banks’ capital needs. More precisely, we re-estimate the model specification for trading revenue based on the available regulatory data and show a positive correlation between the trading revenue series and the level of stock market volatility (VIX). This positive correlation supports the view that periods of heightened volatility increase trading activity at large banks, which results in higher market-making fees and trading gains reflecting bid-ask spreads (both because trading on behalf of customers is more frequent and because bid-ask spreads are wider). Next, we show that accounting for the positive correlation between the VIX and trading revenue would have increased projected noninterest income by about $40 billion cumulatively for the 11 banks subject to the GMS under the macroeconomic assumptions of the Federal Reserve’s severely adverse scenario in the June 2020 stress tests.
Lastly, we use BPI’s capital calculator to estimate the effect of the additional trading revenue on the peak-to-trough decline in the common equity Tier 1 (CET1) ratio for each of the banks subject to the GMS. Our results show that if the double-counting error were corrected, the aggregate minimum CET1 ratio would be about 50 basis points higher, on average, in the June 2020 stress tests for those banks. Moreover, we estimate that the resulting overstatement of the maximum decline in the CET1 ratio of the banks subject to the GMS varies between 7 and 98 bps, mainly driven by differences in bank holdings of trading assets.
Although we show a positive correlation between trading revenue and the level of stock market volatility, our analysis still relies on the series available through the regulatory reports. Unfortunately, the data available from these reports are not sufficiently granular: they combine mark-to-market gains and losses with trading revenues from increased client activity. To overcome this challenge, the Federal Reserve could update the reporting forms that banks provide. These forms could include historical series for trading revenue that separate the mark-to-market profits and losses on trading inventory positions from the fees, commissions and gains reflecting bid-ask spreads arising from new transactions. The Federal Reserve could then use the revised series to re-estimate the supervisory stress testing model for trading revenue.
We believe banks already break out these two sources of trading revenue as part of their compliance with the Volcker Rule, although we do not know if banks can also extend these series going all the way back to the 2007–2009 crisis. The use of more granular data would improve the stress testing methodology results and lead to a more accurate representation of capital requirements associated with market-making activities. This greater accuracy would, in turn, make banks more willing to hold higher trading inventory positions on behalf of their customers and could increase market liquidity, including during periods of market stress.
Background on the Federal Reserve’s Trading Revenue Model
The revenues banks generate over the stress planning horizon are the first line of defense against the losses projected over the stress planning horizon. Specifically, pre-provision net revenue (PPNR) allows a bank to fund the increase in provisions for loan losses. Such revenue also helps offset the increase in mark-to-market and counterparty losses and operational risk losses banks experience over the stress planning horizon. PPNR has three components: net interest income, noninterest income and noninterest expense. For typical commercial banks, net interest income accounts for most of the revenues. But for the largest market maker banks and other less typical banks (e.g., a bank that runs an important payment network), the noninterest income component is also extremely important.
In the June 2020 stress tests, the Federal Reserve projected that, in the aggregate, the 33 subject banks would generate $790 billion in net interest income, $794.5 billion in noninterest income and $1,154.7 billion in noninterest expense cumulatively over the nine quarters of the stress planning horizon. As a result, the Fed projected that the banks in the stress tests would generate $430 billion in PPNR cumulatively over the planning horizon. Just to offer some perspective, the Fed projected loan losses of $433 billion for the 33 banks cumulatively over the projection horizon, so PPNR effectively covers all the projected loan losses for all the banks under stress. Moreover, 2020 is a representative year, in that this is a fairly typical result looking back on prior stress test disclosures.
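The PPNR arithmetic above can be checked with a short sketch. The dollar figures are the aggregates quoted in the text; the function and variable names are ours and purely illustrative.

```python
# Aggregate DFAST 2020 figures quoted in the text, in $ billions
# (33 banks, cumulative over the nine-quarter planning horizon).
net_interest_income = 790.0
noninterest_income = 794.5
noninterest_expense = 1154.7
projected_loan_losses = 433.0


def ppnr(nii, nonii, nonie):
    """Pre-provision net revenue = net interest income
    + noninterest income - noninterest expense."""
    return nii + nonii - nonie


total_ppnr = ppnr(net_interest_income, noninterest_income, noninterest_expense)

# Share of projected loan losses that PPNR alone would absorb.
coverage = total_ppnr / projected_loan_losses

print(f"PPNR: ${total_ppnr:.1f} billion")
print(f"PPNR as a share of projected loan losses: {coverage:.1%}")
```

The components sum to roughly $430 billion, which is why PPNR is described as covering essentially all of the $433 billion in projected loan losses.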
A significant subcomponent of noninterest income is trading revenue. The Federal Reserve’s stress test results do not reveal trading revenue projections separately from noninterest income. However, after 2010, on average, trading revenue accounts for approximately 20 percent of noninterest income based on the FR Y-9C reports of the banks subject to the GMS. In addition, in 2019 the Federal Reserve published additional descriptions of the models used to generate the projections of banks’ regulatory capital ratios in the stress tests, describing the trading revenue model as follows:
For firms subject to the global market shock, the Federal Reserve models trading revenues in the aggregate as a function of stock market returns and changes in stock market volatility and allocates revenues to each firm based on a measure of the firm’s market share. Firms’ trading revenues include both changes in the market value of trading assets and fees from market-making activities. Trading revenue for this group of firms is modeled using a median regression approach to lessen the influence of extreme movements in trading revenues and thereby mitigate the double-counting of trading losses that are captured under the global market shock. Trading revenues for remaining firms are modeled in an autoregressive framework similar to that of other PPNR components.
This description is critical to understanding some of the limitations of the Federal Reserve’s trading revenue projections studied in this note. First, the trading revenue series used by the Federal Reserve to estimate the trading revenue model and generate stress projections includes both trading-activity-based revenues and mark-to-market losses. However, since the Federal Reserve also applies a GMS to the trading portfolio of 11 firms with large trading exposures that also generates mark-to-market losses on the same portfolio, there is effectively a double count of such losses in the stress tests.
As explained in the passage quoted, the Federal Reserve attempts to lessen the double-counting effects by using a median regression, but the real challenge lies in the lack of granularity of the series available in the regulatory reports. A median regression estimates the median of the dependent variable, and it is robust to the presence of outliers in the data. In the case of trading revenues, the median regression is intended to dampen the correlation between the large movements in trading revenue registered in 2007–2009 and the behavior of the macroeconomic variables included in the regression. However, the dampening effect is limited, because historical data points with volatility levels similar to CCAR projections are dominated by the 2007–2009 global financial crisis (GFC) period. Another concern is that banks no longer generally hold material amounts of the structured credit and mortgage inventories that drove mark-to-market losses during the GFC. As a result, the current trading revenue projection is based on historical data disconnected from the current risk profile and produces distortive outcomes, regardless of the intended mitigation impacts of the median regression. Rather than trying to reduce the correlations between trading revenue and macroeconomic factors, we recommend that the Federal Reserve improve the granularity of the trading revenue data so that the stress test projections of trading revenue can exclude the trading losses accounted for in the GMS.
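To illustrate what a median regression does relative to ordinary least squares, the following is a minimal sketch on made-up data. It is not the Federal Reserve’s model: the data are hypothetical, there is a single regressor, and the median fit is computed by brute force (for a line with an intercept, a least-absolute-deviations solution always exists that passes through two data points), which is only practical for small samples.

```python
from itertools import combinations


def ols_fit(x, y):
    """Ordinary least squares for y = a + b*x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def median_fit(x, y):
    """Median (least-absolute-deviations) regression for y = a + b*x.

    Brute force: evaluate every line through two data points and keep
    the one with the smallest sum of absolute residuals.
    """
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # vertical line, not a valid regression fit
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        loss = sum(abs(yi - intercept - slope * xi) for xi, yi in zip(x, y))
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]


# Hypothetical data: ten "normal" quarters in which revenue rises with the
# volatility measure, plus one crisis-like quarter with a large loss.
vol_change = [-4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 20]
revenue = [8.2, 8.4, 9.1, 9.3, 10.0, 10.6, 10.9, 11.7, 11.8, 12.6, -30.0]

_, ols_slope = ols_fit(vol_change, revenue)
_, median_slope = median_fit(vol_change, revenue)

print(f"OLS slope:    {ols_slope:.2f}")
print(f"Median slope: {median_slope:.2f}")
```

In this toy sample, the single extreme quarter flips the sign of the OLS slope while the median fit keeps the slope of the normal quarters. The protection has limits, however: if most of the high-volatility observations came from the crisis period, as the text argues is the case in the historical data, the median fit itself would be pulled toward them.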
Another important fact in the Federal Reserve’s quote above is that the macroeconomic drivers of trading revenues are stock market returns and the change in stock market volatility. We will show that the Federal Reserve’s model builds in a negative correlation between trading revenue and the change in such volatility. That is, when stock market volatility goes up, trading revenue goes down. This estimated negative correlation arises because, in the GFC, mark-to-market trading losses occurred while stock market implied volatility went up. Aside from that episode, increases in stock market volatility are usually associated with higher trading revenue as a result of increased client activity, inconsistent with the negative correlation obtained using the specification of the model as described by the Federal Reserve. If these banks were not subject to the GMS, the negative correlation between the change in the VIX and trading revenues would be less of a problem, because there would not be double counting. Although we recognize that a stress test should cover the unusual, tough event—rather than the norm, especially when the unusual tough event has recent precedent—there is no need to double count losses based on the unusual tough event.
Meanwhile, it could be hypothesized that if PPNR projections were to include trading-activity-based revenues, it would also be important to account for mark-to-market gains and losses of the new positions. We believe the following three reasons significantly ameliorate the need to add mark-to-market losses of positions acquired over the planning horizon: (i) although GMS losses are booked immediately, in practice firms would be holding a portion of those inventories for a longer period of time and would be able to offset some of the mark-to-market losses as markets recover over the planning horizon; (ii) a firm is more likely to book mark-to-market gains from new positions (or, alternatively, losses are insignificant) because the second half of the stress scenario typically assumes a gradual recovery in financial markets; and (iii) as noted earlier, banks don’t hold the type of structured products that drove mark-to-market losses during the GFC and, as a result, the historical data should not be used to project losses associated with new trading activity.
A Better Way to Mitigate the Double-Counting of Mark-to-Market Losses
A natural way to eliminate the double-counting of trading losses in the stress tests would be for the trading revenue model to use a series that only includes trading-activity-based revenues and excludes mark-to-market gains and losses, since those gains and losses are already included in the GMS. Unfortunately, that series is not publicly available, and the Federal Reserve would have to ask firms to create historical data with such information. We believe this would be a good approach in the medium term if the Federal Reserve decides to update the regulatory reporting forms to require firms to provide more granular data.
In the meantime, it is possible to mitigate some of the lack of granularity in the trading revenue series and reduce the double-counting of mark-to-market losses on banks’ trading inventory positions by estimating the median regression using the level of the stock market volatility index instead of the change in stock market volatility. By using the VIX in levels in the regression, we find a positive correlation between the trading revenue series reported in the regulatory report and volatility in equity markets. Also, it is the lagged value of the VIX that is positively correlated with trading revenue, probably because trading revenues as reported in the regulatory forms offer an imperfect proxy for trading-activity-based revenues.
According to the entries in Table 1, the coefficients on the change and level of stock market volatility have opposite signs and are both statistically different from zero at conventional levels. When the macroeconomic factor is the change in stock market volatility, we find a negative correlation between trading revenue and heightened periods of volatility. This is the opposite of what we would expect if the behavior of the trading revenue series were being driven by an increase in client activity as a result of increased volatility in capital markets. However, when the macroeconomic factor is the level of the VIX, we find that trading revenue is positively correlated with market volatility. Moreover, in the latter specification, the coefficient associated with stock market returns is no longer statistically different from zero. For that reason, it is dropped from the regression.
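For readers who prefer notation, the two specifications compared in Table 1 can be written roughly as follows. This is our shorthand, not a published formula: the exact functional form, any scaling of the dependent variable and the timing of the regressors in the supervisory model are not disclosed, so the symbols below are assumptions. Here $TR_t$ denotes aggregate trading revenue, $R_t$ stock market returns, and $Q_{0.5}(\cdot)$ the conditional median.

```latex
\begin{align*}
\text{Supervisory specification:}\quad
  & Q_{0.5}\!\left(TR_t \mid R_t,\ \Delta VIX_t\right)
    = \alpha_0 + \alpha_1 R_t + \alpha_2\,\Delta VIX_t,
  && \hat{\alpha}_2 < 0 \\
\text{Adjusted specification:}\quad
  & Q_{0.5}\!\left(TR_t \mid VIX_{t-1}\right)
    = \beta_0 + \beta_1\,VIX_{t-1},
  && \hat{\beta}_1 > 0
\end{align*}
```

The sign restrictions simply restate the findings described above: a negative coefficient on the change in volatility, and a positive coefficient on the lagged level once stock market returns are dropped.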
Please note that we have excluded data from the current pandemic from the sample used to estimate the coefficients of the two regressions. We left out the period covering the events of the spring of 2020 and the associated Federal Reserve support to make clear that our results are not being driven by those events, which strengthens the robustness of our findings. The positive correlation between the VIX and trading revenue therefore cannot be attributed to the Federal Reserve’s actions during COVID-19. However, we think this event was informative and, as a general matter, the Federal Reserve should include that sample period in the estimation of its own supervisory model.
In Table 2, we compare the out-of-sample predictability of the two models using a rolling window of 30 quarters (between 3Q01 and 2Q20). We use traditional evaluation metrics such as the mean absolute error (MAE) and the mean squared error (MSE) to evaluate the one-step-ahead forecasts produced by each model. The adjusted model performs better than the supervisory model, because both the MAE and the MSE are lower. However, the difference in forecasting performance is modest: the Diebold-Mariano test statistic indicates that the difference in performance between the two models is not statistically significant. This finding is not very surprising, because the median regression lessens the influence of extreme movements in trading revenues, for the reasons we discussed earlier. Other models have better forecasting performance but would only exacerbate the problem of double-counting trading losses.
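The evaluation metrics named above can be sketched as follows. The forecast errors are invented for illustration and are not the Table 2 results; the Diebold-Mariano statistic is computed in its simplest form, assuming squared-error loss, one-step-ahead forecasts (so no autocorrelation correction) and the asymptotic normal approximation.

```python
import math
from statistics import mean, variance


def mae(errors):
    """Mean absolute error of a list of forecast errors."""
    return mean([abs(e) for e in errors])


def mse(errors):
    """Mean squared error of a list of forecast errors."""
    return mean([e * e for e in errors])


def diebold_mariano(errors_a, errors_b):
    """Simple Diebold-Mariano test under squared-error loss.

    Returns (statistic, two-sided p-value). A positive statistic means
    model B had the lower average squared error.
    """
    loss_diff = [ea * ea - eb * eb for ea, eb in zip(errors_a, errors_b)]
    n = len(loss_diff)
    dm = mean(loss_diff) / math.sqrt(variance(loss_diff) / n)
    p_value = math.erfc(abs(dm) / math.sqrt(2))  # standard normal, two-sided
    return dm, p_value


# Hypothetical one-step-ahead forecast errors for eight quarters.
errors_supervisory = [1.2, -0.8, 2.5, -1.9, 0.4, -2.2, 1.6, -0.5]
errors_adjusted = [1.0, -1.2, 2.1, -1.7, 0.9, -1.8, 1.5, -0.7]

dm_stat, dm_p = diebold_mariano(errors_supervisory, errors_adjusted)

print(f"MAE  supervisory={mae(errors_supervisory):.3f}  adjusted={mae(errors_adjusted):.3f}")
print(f"MSE  supervisory={mse(errors_supervisory):.3f}  adjusted={mse(errors_adjusted):.3f}")
print(f"Diebold-Mariano statistic={dm_stat:.2f}, p-value={dm_p:.2f}")
```

The made-up errors reproduce the qualitative pattern described in the text: one model has lower MAE and MSE, yet the loss differential is too noisy for the test to reject equal forecast accuracy.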
As we will see, the differences in model specifications have a sizable impact on trading revenue projections. Because the VIX increases sharply at the onset of economic downturns in the Federal Reserve’s severely adverse scenario, trading revenue rises during those initial quarters of the stress tests. In the next section, we quantify the effect of the differences in the two model specifications for trading revenue projections and the peak-to-trough declines in CET1 ratios under stress.
Trading Revenue Projections under the Federal Reserve’s Severely Adverse Scenario
To assess the impact of the change in the specification of trading revenue, we must redo the projections of trading revenue and the common equity Tier 1 ratio of each of the 11 banks subject to the global market shock over the stress planning horizon. To simulate the path of the CET1 ratio under stress of each bank, we use BPI’s version of the CLASS model developed by Hirtle, Kovner, Vickery, and Bhanot (2016).
The path of the VIX in the severely adverse scenario released by the Federal Reserve in February 2020 is shown in Exhibit 1. In the first quarter of 2020, the level of the VIX increases sharply to 69 and stays above 60 for the entire year. At the end of the nine quarters of the planning horizon in the stress scenario, the VIX is at 33. The Fed’s trading revenue model specification also depends on stock market returns; in the scenario, equity prices fall 50 percent through the end of 2020. As noted earlier, when the VIX enters the specification in levels, the coefficient associated with stock market returns is no longer statistically different from zero.
The projected path of trading revenue over the stress planning horizon is shown in Exhibit 2. The forecast that the Federal Reserve uses—with stock market returns and changes in stock market volatility as the macroeconomic factors—is shown by the dashed red line. The alternative specification that uses the level of stock market volatility is plotted in green. As expected, the model that shows a positive correlation between trading revenue and VIX results in significantly higher projections of trading revenue in each of the nine quarters of the stress planning horizon. More precisely, we project aggregate trading revenue to be $107 billion under the supervisory specification for trading revenue and $146 billion under the alternative specification. The difference between the two projections is about $40 billion.
The difference between our projections shows that the double-counting of mark-to-market losses in the stress tests is sizable. As noted earlier, these 11 banks are already subject to the GMS. In the June 2020 stress tests, GMS losses were about $81.8 billion for these banks.
The results in Exhibit 2 suggest that trading revenue is understated by about $40 billion for all the 11 banks cumulatively as a result of the lack of granularity of the trading revenue series available in the regulatory reports. We try to correct for the underestimation of trading revenue by using an alternative model specification with a positive correlation between trading revenue and the level of stock market volatility. A better approach would be to construct a trading revenue series that only captures trading-activity-based revenue and excludes mark-to-market losses, since those losses are already included in the GMS.
The projected decline in the aggregate CET1 ratios under the two different trading revenue model specifications is shown in Exhibit 3. We maintain the Fed’s assumptions in DFAST that both share repurchases and dividends are set to zero over the planning horizon. The red line shows the path of the CET1 ratio under the supervisory specification of trading revenue, which declines from 12.42 percent in the fourth quarter of 2019 to a minimum of 10.55 percent in the third quarter of 2021. Similarly, the green line shows the aggregate CET1 ratio path under the adjusted model for trading revenue. In this case, the minimum CET1 ratio is 11.04 percent, or about 50 basis points above the projections that suffer from the double-counting of mark-to-market losses.
It is worth noting that we cannot perfectly replicate the minimum CET1 ratio as reported by the Federal Reserve, because the Fed’s projections of loan losses and other subcomponents of pre-provision net revenue use confidential and much more granular data compared with BPI’s top-down models. We also have no way of estimating the size of operational risk losses based on publicly available data. As a result, the difference in the minimum aggregate CET1 ratio between the BPI and Federal Reserve projections is about 70 bps in aggregate for the 11 banks subject to the GMS.
Despite the differences in the minimum CET1 ratio, we believe the variation in the projections of the two trading revenue models and the implications for the aggregate CET1 ratio give a reasonable estimate of the effect of the double-counting of mark-to-market losses on the CET1 ratio of each bank. These estimates are probably the best we can do without more transparency into the Federal Reserve’s projections. Bank-by-bank estimates of the effect of the double-counting of trading losses on banks’ CET1 ratios are reported in Table 3. More precisely, the difference between the minimum CET1 ratios in the two cases considered here is reported in column (4). The median impact on the minimum CET1 ratio is 44 basis points across the 11 banks subject to the GMS. The highest estimated effect for a bank is a change of 98 basis points, and the lowest a change of 7 basis points. The wide variability in the effect reflects the level of trading assets of each bank and the quarter in which the bank reaches the trough in its CET1 capital ratio.
The effect of the results on capital requirements of banks can be inferred from the analysis of columns (4) and (5). Obviously, the trading revenue adjustment has no effect on the stress capital buffer of a bank subject to the 2.5-percent floor. However, even for banks slightly above the 2.5-percent floor, the impact would be significant.
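The interaction with the floor can be sketched as follows, assuming the standard stress capital buffer construction: the larger of 2.5 percent and the peak-to-trough decline in the CET1 ratio plus a planned-dividend add-on. The bank figures below are hypothetical and are not taken from Table 3; the 44-basis-point uplift is the median impact reported above.

```python
SCB_FLOOR_PCT = 2.5  # regulatory floor on the stress capital buffer, in percent


def stress_capital_buffer(start_cet1, min_cet1, dividend_addon=0.0):
    """SCB in percent of risk-weighted assets: the CET1 decline from the
    starting point to the stressed minimum, plus the dividend add-on,
    subject to the 2.5 percent floor."""
    return max(SCB_FLOOR_PCT, (start_cet1 - min_cet1) + dividend_addon)


def scb_change(start_cet1, min_cet1, uplift_bps, dividend_addon=0.0):
    """Change in the SCB (percentage points) if the stressed minimum CET1
    ratio were higher by `uplift_bps` basis points."""
    before = stress_capital_buffer(start_cet1, min_cet1, dividend_addon)
    after = stress_capital_buffer(start_cet1, min_cet1 + uplift_bps / 100, dividend_addon)
    return after - before


UPLIFT_BPS = 44  # median impact on the minimum CET1 ratio reported in the text

# Hypothetical banks: (starting CET1 %, stressed minimum CET1 %, dividend add-on %)
floored_bank = (12.0, 10.5, 0.5)        # 1.5 + 0.5 = 2.0 -> bound by the floor
near_floor_bank = (12.0, 9.8, 0.5)      # 2.2 + 0.5 = 2.7 -> just above the floor
well_above_floor_bank = (12.0, 8.5, 0.5)  # 3.5 + 0.5 = 4.0

change_floored = scb_change(floored_bank[0], floored_bank[1], UPLIFT_BPS, floored_bank[2])
change_near = scb_change(near_floor_bank[0], near_floor_bank[1], UPLIFT_BPS, near_floor_bank[2])
change_above = scb_change(well_above_floor_bank[0], well_above_floor_bank[1], UPLIFT_BPS, well_above_floor_bank[2])

for name, change in [("floored", change_floored),
                     ("near floor", change_near),
                     ("well above floor", change_above)]:
    print(f"{name:>17}: SCB change = {change:+.2f} percentage points")
```

A bank already at the floor sees no change, a bank just above the floor benefits only until it reaches the floor, and a bank well above the floor receives the full reduction.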
The lack of granular data in regulatory reports poses a substantial challenge for the projections of pre-provision net revenue. As a result, multiple income items known to move in opposite directions are reported in the same line item. In this note, we show that the problem is particularly acute for the trading revenue projections of the 11 banks subject to the GMS because of the double-counting of mark-to-market losses. More precisely, we estimate that noninterest income was underestimated by about $40 billion for the banks subject to the GMS in the June 2020 stress tests. Moreover, BPI’s top-down model shows that the minimum CET1 ratio is understated by about 50 basis points on average across all GMS banks.
Although we believe our approach could better mitigate the double-counting of mark-to-market losses in the stress tests relative to the current approach and could serve as a stopgap measure, we believe the collection of more granular historical series for trading revenue would be the best approach to update the supervisory stress testing methodology. This would provide more accurate stress capital buffers (SCBs) and offer stronger incentives for market maker banks to hold higher trading inventory positions on behalf of customers, thereby helping improve market liquidity during a period of market stress. Furthermore, more accurate SCBs would also enable firms to allocate capital across their businesses more efficiently and reduce the potential for pricing distortions or migration of activity to the shadow banking sector.
It is worth noting that the GMS methodology also has problems. In particular, rather than stressing the trading book by combining the results of a high number of internally plausible shocks (as sophisticated financial institutions do), it uses a single exceptionally implausible shock. The single-shock approach is not only inaccurate, but it also disincentivizes banks from holding a diversified portfolio.
We believe our recommendations are in the spirit of remarks made by the Federal Reserve’s Vice Chair for Supervision, Randal K. Quarles. He addressed the importance of continuing to refine stress testing to maintain its credibility, so that the Federal Reserve can achieve its goal of providing relevant information to investors, counterparties and markets about the capital condition of banks:
Stress testing has evolved, and must continue to evolve, to take on what we as supervisors learn from our work and what we can learn from others. Each year, we have refined both the substance and the process of the stress tests, guided by our own experience and by critiques and suggestions from others. . . . Without such adjustments, regulators, banks, and the broader public cannot get a clear and dynamic view of the capital positions of the largest banks. . . . Stress tests results should allow investors, counterparties, analysts and markets to make more informed judgments about the condition of banks.
More broadly, we believe our results continue to demonstrate that it would be better to set capital standards for banks based on models that are publicly disclosed through continuing to improve transparency, a goal toward which the Federal Reserve has taken some initial, commendable steps over the past few years. Although the limited information provided makes this constructive feedback possible, our analysis would be more accurate if we saw better transparency in the supervisory stress test models.
 Forthcoming research by Federal Reserve economists indicates that the post-crisis regulatory regime significantly curtailed risk-taking by banks and that trading activities at the largest U.S. banks are a source of revenues when volatility in capital markets rises as a result of repricing events and increased client activity. See Abboud et al., “COVID-19 as a Stress Test: Assessing the Bank Regulatory Framework,” Federal Reserve Board (forthcoming).
 The global market shock currently applies to any domestic BHC or U.S. IHC subject to supervisory stress tests and with aggregate trading assets and liabilities of $50 billion or more, or aggregate trading assets and liabilities equal to 10 percent or more of total consolidated assets, and that is not a large and noncomplex bank holding company as the term is used in 12 CFR 225.8. See 12 CFR 252.54(b). There is an outstanding proposal to replace the term “large and noncomplex” with Category IV as defined in the Fed’s tailoring framework. The 11 firms subject to the global market shock are Bank of America Corporation; Barclays US LLC; Citigroup Inc.; Credit Suisse Holdings (USA), Inc.; DB USA Corporation; The Goldman Sachs Group, Inc.; HSBC North America Holdings Inc.; JPMorgan Chase & Co.; Morgan Stanley; UBS Americas Holding LLC; and Wells Fargo & Company. Though not addressed here, we continue to believe that the Fed should improve alignment of the scoping of firms subject to the GMS with its recently finalized tailoring framework.
 With regard to the overly conservative risk factor shocks included in the global market shock, see “Global Market Shock and Large Counterparty Default Study: Recommendations for Reforms Based on a Statistical Analysis of Stress Testing Scenarios,” SIFMA (August 2019), and this post.
 See Federal Reserve Board, Dodd-Frank Act Stress Test 2020: Supervisory Stress Test Results at 7–9 (June 2020), available at https://www.federalreserve.gov/publications/files/2020-dfast-results-20200625.pdf (hereinafter DFAST 2020 Results).
 Consolidated Financial Statements for Holding Companies (FR Y-9C), Schedule HI, line item 5.c; see also Instructions for Preparation of Consolidated Financial Statements for Holding Companies, HI-9-10.
 It is not even clear that a bank should normally be expected to incur mark-to-market losses when markets are stressed. Over the post-crisis period, bank trading revenue and other revenue (including loan losses) have been negatively correlated, including during the COVID-19 crisis.
 Abboud et al. (2020) were the first to document the positive correlation between the level of the VIX and activity-based trading revenue using Volcker Rule metrics data (confidential supervisory data).
 DFAST 2020 Results at 1, 21, and 23.
 This is a good, rough proxy, but an oversimplification. For example, typically provisions for loan losses exceed net charge-offs, and we are omitting losses associated with the global market shock, credit losses on investment securities, and other losses.
 Federal Reserve Board, Dodd-Frank Act Stress Test 2019: Supervisory Stress Test Methodology at 21 (March 2019), available at federalreserve.gov (emphasis added).
 The null of a unit root in the VIX is rejected using the Augmented Dickey-Fuller (ADF) test with a drift term. A model that includes a constant and two lags of the change in the VIX yields an ADF t-statistic of –3.5. The 5-percent critical value is –2.9, so the null hypothesis of a unit root is rejected at the 5-percent significance level.
 See Hirtle, B., A. Kovner, J. Vickery, and M. Bhanot, 2016, “Assessing Financial Stability: The Capital and Loss Assessment under Stress Scenarios (CLASS) Model”, Journal of Banking and Finance, 69(S1), pp. S35–S55. Available at https://www.sciencedirect.com/science/article/abs/pii/S0378426615002940
BPI’s own version of the CLASS model uses different model specifications to generate the projections of loan losses and pre-provision net revenue. For example, the projections for pre-provision net revenue include bank-specific fixed effects (more specifically, a trailing multiyear fixed effect) to capture each bank’s average performance in recent years, and loan losses are modeled using quantile regressions. The percentiles of the quantile regressions are chosen to match the level of losses projected by supervisory models.
 We excluded Bank of New York Mellon and State Street Corporation, because they are only subject to the large counterparty default (LCD) component. As a result, their GMS trading losses are zero. Aggregate trading and counterparty losses were $83.2 billion when banks subject to the LCD component are also included.
 Making the trading revenue series more granular for the non-GMS banks would also improve the performance of trading revenue projections for those banks.
 Vice Chair for Supervision Randal K. Quarles, Federal Reserve Board, “Stress Testing: A Decade of Continuity and Change,” at Stress Testing: A Discussion and Review, a research conference sponsored by the Federal Reserve Bank of Boston, Boston, Massachusetts (July 9, 2019).