Ensuring Stress Tests Remain Effective

“Fed Vice Chairman for Supervision Randal Quarles said Friday he isn’t concerned that ‘if you tell people what the rules are, they will game the rules,’ adding that, instead, ‘they will comply with the rules.’”
                                                      Ryan Tracy, Wall Street Journal, December 1, 2017.

Last month, the Federal Reserve Board published proposed refinements to its annual Comprehensive Capital Analysis and Review (CCAR) exercise—the supervisory stress test that evaluates the capital adequacy of the largest U.S. banks (34 in the 2017 test). The changes include both increases in transparency and adjustments to the macroeconomic scenarios. The first of these calls for additional disclosure about the models that the Fed uses. The second concerns the method used for constructing the path for housing prices and the unemployment rate, and would add a path for short-term wholesale funding costs.

Stress testing assesses the losses a financial institution would suffer under very adverse conditions. The practice has been around for decades. A risk manager might choose a specific historical episode, like the stock market crash of October 1987 or the fall 1998 collapse of Long-Term Capital Management, and simulate the impact on the value of the bank’s assets. The bank’s management could use the resulting report both for planning its capital levels and for setting its risk tolerance.

But, as Schuermann describes, comprehensive stress testing that encompassed all the financial risks of a private bank only emerged during and after the financial crisis. In 2009, U.S. authorities employed stress tests as a way to restore credibility to the financial system. Today, these tests have three primary objectives: guaranteeing that banks have rigorous internal risk management processes; ensuring that banks’ management and boards of directors are attentive to the risks their enterprises face; and providing the authorities with a comprehensive map of the risks and vulnerabilities in the financial system. (For more on the history and uses of stress tests, see our earlier post here. And, for a survey of their current widespread use, see the Basel Committee’s description here.)

The Case of the GSEs. We can summarize any stress testing regime by its characteristics along three dimensions: transparency, flexibility and severity. The mix of these features determines the regime’s effectiveness.

To understand the tradeoffs and pitfalls, consider the case of Fannie Mae and Freddie Mac, the government-sponsored mortgage lenders (the GSEs). Unlike private banks, the GSEs were subject to an annual government stress test before the financial crisis. Following a decade of development, the Office of Federal Housing Enterprise Oversight (OFHEO) began conducting tests in 2001. The GSEs always passed, right up until they collapsed at the height of the crisis in September 2008. Frame, Gerardi and Willen trace the ineffectiveness of these early stress tests to their mix of transparency, flexibility and severity. First, there was complete transparency: OFHEO published the models and scenarios in the Federal Register prior to initiating the tests. Second, there was no flexibility: from year to year, neither the parameters nor the macro conditions changed. And third, the stress applied was insufficiently severe: house prices rose for the first 10 quarters of the scenario, before falling only modestly over the full 10-year horizon.

Is any of these three dimensions (transparency, flexibility and severity) more critical to success than the others? The answer is yes. First, if the scenarios are insufficiently dire, there is no point to the test. Second, as we will argue in detail below, flexibility is essential. Without it, the tests are useless. Third, there is considerable room for transparency, but there are limits. Because models change slowly, and the banks can glean considerable information from past tests, disclosure of the Fed’s models is unlikely to be a problem. Premature disclosure of the scenarios is another matter: in contrast to the GSE tests, and in line with the Fed’s current CCAR practice, scenarios should change frequently with disclosure only after the banks’ portfolios are determined. The alternative invites gaming (as it did for the GSEs).

So, stress tests need to be flexible and (unsurprisingly) stressful.

Transparency. This brings us to the details of the proposed changes to CCAR. We applaud the Federal Reserve’s decision to put significant resources into maintaining and improving its own models, which already are state of the art and the envy of supervisors around the world. This commitment assures the integrity of the process, allowing authorities to check that banks’ own systems are producing sensible results.

One advantage of the Fed’s efforts is that they push everyone toward best practice. Transparency about modeling (rather than scenarios) creates a forum for sharing enhancements in risk-management frameworks, creating a race to the top. Given that banks have already reverse engineered key parts of the Fed’s models (see Glasserman and Tangirala), the benefits of increased disclosure almost surely outweigh the costs.

At the same time, authorities could improve how they use the information about banks’ models that they collect in the course of the tests. As one part of the December proposal, the Fed plans to publish a set of hypothetical portfolios together with the corresponding losses implied by its models. In a post several years ago, we extolled the virtues of hypothetical portfolio exercises. We argued that banks as well as supervisors would benefit from knowing each other’s model-implied risk-weighted assets (RWA) from a set of standardized, hypothetical portfolios. Internally, risk managers would know when their models were far from the norm. Externally, supervisors could identify institutions that are routinely doing a poor job of risk assessment. And, from a macro perspective, it would be simple to tell if the banks’ own models were starting to mirror the Fed’s. The solution here is straightforward and low cost: require the banks to follow the lead of the Fed, and publish their loss rates for the exact same hypothetical portfolios.
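To make the hypothetical-portfolio comparison concrete, here is a minimal sketch of how published loss rates might be lined up against a benchmark. Everything in it is an assumption made for illustration: the portfolio labels, the bank names, the loss rates, and the 25% tolerance are invented placeholders, not figures from the Fed or from any bank.

```python
from statistics import mean

# Invented placeholder loss rates for three standardized hypothetical portfolios.
fed_loss_rates = {"P1": 0.040, "P2": 0.065, "P3": 0.120}

# Invented placeholder loss rates that banks might publish for the same portfolios.
bank_loss_rates = {
    "Bank A": {"P1": 0.041, "P2": 0.066, "P3": 0.118},
    "Bank B": {"P1": 0.020, "P2": 0.035, "P3": 0.070},
    "Bank C": {"P1": 0.045, "P2": 0.060, "P3": 0.125},
}


def mean_relative_gap(bank, benchmark):
    """Average of (bank - benchmark) / benchmark across the benchmark's portfolios."""
    return mean((bank[p] - benchmark[p]) / benchmark[p] for p in benchmark)


def flag_outliers(banks, benchmark, tolerance=0.25):
    """Names of banks whose average relative gap exceeds the (hypothetical) tolerance."""
    return sorted(
        name
        for name, rates in banks.items()
        if abs(mean_relative_gap(rates, benchmark)) > tolerance
    )


outliers = flag_outliers(bank_loss_rates, fed_loss_rates)
print("Far from the benchmark:", outliers)
```

The same gap measure, tracked from year to year, is one way a supervisor could watch for banks' models converging on the Fed's: a steady shrinking of every bank's gap toward zero would be the warning sign.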

Flexibility and Severity. The Fed proposal includes small improvements that both reduce the tendency of the scenarios to induce pro-cyclical behavior and allow them to reflect the likelihood of higher funding costs in a period of stress. To reduce pro-cyclicality, the Fed will modify the unemployment and residential price scenarios so that they vary with the state of the cycle: the higher the unemployment rate or the lower the property prices at the start of the test, the smaller their further increase or drop. Separately, the Fed has announced that it will now incorporate short-term wholesale funding rates into the scenarios.
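One simple way to picture a scenario that varies with the state of the cycle is a rule in which the stress raises unemployment toward a target peak, with the size of the rise bounded above and below. The sketch below is only an illustration of that idea. The function name and all three default values are assumptions chosen for the example, not the Fed's calibration.

```python
def unemployment_rise(start_rate: float,
                      target_peak: float = 10.0,   # hypothetical target peak, in percent
                      min_rise: float = 3.0,       # hypothetical floor on the rise, in pp
                      max_rise: float = 4.0) -> float:  # hypothetical cap on the rise, in pp
    """Rise in the unemployment rate (percentage points) applied in the scenario.

    The rise aims for target_peak but is clamped to [min_rise, max_rise],
    so a higher starting rate never produces a larger rise.
    """
    rise = target_peak - start_rate
    return max(min_rise, min(max_rise, rise))


for start in (4.0, 6.5, 9.0):
    rise = unemployment_rise(start)
    print(f"start {start:.1f}% -> rise {rise:.1f} pp -> peak {start + rise:.1f}%")
```

Under a rule like this, a test run late in a boom (low unemployment) applies the largest shock, while a test run when unemployment is already high applies a smaller one, which is what damps the pro-cyclical effect.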

Looking beyond these modest improvements, we encourage the Fed to face more squarely two big challenges to the formulation of the stress test scenarios: incorporating events that are not in the historical record and allowing the list of variables included to change. The authorities are clearly aware of these issues. On the first, they assert that the scenarios are not limited to historical episodes. But, this claim seems inconsistent with their decision to cap the rise of the unemployment rate to four percentage points—or less than the rise in the most recent cycle.

With respect to the addition of new variables, the Board writes in the December 15, 2017 proposal,

“If scenario variables do not capture material risks to capital, or if historical relationships between macroeconomic variables change such that one variable is no longer an appropriate proxy for another, the Board may add variables to a supervisory scenario. The Board may also include additional scenario components or additional scenarios that are designed to capture the effects of different adverse events on revenue, losses, and capital.”

This is critical. Greenwood, Hanson, Stein and Sunderam propose a simple mechanism for figuring out what to add. In essence, they suggest that when you observe a sudden increase in either profitability or the level of activity, put stress on it! And doing so is especially important when the activity is large enough to be of broad importance for the economy. An especially relevant example is asset-backed commercial paper (ABCP) in the mid-2000s. From the beginning of 2001 to the end of 2004, ABCP grew by a cumulative 15%, from $600 billion to $690 billion: that is an average annual nominal growth rate of about 3.6%. Then things took off. In 2005 and 2006, the annual growth rate went to 25%. At its peak in mid-2007, there was over $1.2 trillion outstanding. It would surely have made sense to stress ABCP in early 2006.
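The ABCP arithmetic, and the idea of stressing whatever is suddenly growing fast, can be sketched in a few lines. The dollar levels and the 25% growth rates are the ones quoted above. The 15% trigger is a hypothetical threshold picked for illustration, and the flagging function is a loose rendering of the heuristic, not the mechanism those authors specify.

```python
def annualized_growth(start: float, end: float, years: float) -> float:
    """Compound annual growth rate between two levels."""
    return (end / start) ** (1.0 / years) - 1.0


def flag_for_stress(growth_rates, threshold=0.15):
    """Labels of periods whose annual growth exceeds the (hypothetical) threshold."""
    return [label for label, growth in growth_rates if growth > threshold]


# ABCP outstanding in billions of dollars, as quoted in the text.
abcp_start_2001 = 600.0
abcp_end_2004 = 690.0

cumulative = abcp_end_2004 / abcp_start_2001 - 1.0
annual_2001_2004 = annualized_growth(abcp_start_2001, abcp_end_2004, years=4)

print(f"cumulative growth, 2001-2004: {cumulative:.1%}")
print(f"annualized growth, 2001-2004: {annual_2001_2004:.1%}")

growth_by_period = [
    ("2001-2004 average", annual_2001_2004),
    ("2005", 0.25),  # annual growth rate quoted in the text
    ("2006", 0.25),  # annual growth rate quoted in the text
]
flagged = flag_for_stress(growth_by_period)
print("periods that would trigger a closer look:", flagged)
```

A rule this crude would need a size filter as well, since the point is to stress activities that are both growing quickly and large enough to matter for the economy.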

There is a simple way to make systematic the introduction of new variables and of scenarios that are outside of historical experience. Following the example of the Bank of England, the Board should introduce exploratory scenarios in each year’s exercise (possibly including examples already used by regulators in other advanced economies). Much as standardized testing services incorporate experimental questions to develop benchmark data for future tests, the Fed should be using the stress tests to identify and investigate areas for closer future scrutiny.

Before wrapping up, we note our opposition to shifting the CCAR from its current annual frequency to a two-year cycle. In our view, financial conditions can change too quickly for this to work. Imagine that a stress test had been completed using end-2004 data. The next test at end-2006 would have been too late to catch the rapid ABCP buildup (as well as the worst of what happened in the mortgage market). This is one reason that the Dodd-Frank Act mandates stress tests (DFAST) every six months, albeit using the banks’ internal models.

Conclusion. The Federal Reserve has an effective framework for carrying out its all-important stress tests of the largest U.S. banks. Having started in 2011, the Fed is now embarking on only the eighth CCAR exercise. That means that everyone is still learning how best to structure and execute the tests. The December proposals are clearly in this spirit.

With this same goal in mind, we make the following proposals for enhancing the stress tests and preserving their effectiveness:

  1. Change the scenarios more aggressively and unexpectedly, continuing to disclose them only after banks’ exposures are fixed.
  2. Introduce an experimental scenario (that will not be used in “grading” banks’ relative performance or capital plans) to assess the implications of events outside of historical experience and to probe for weaknesses in the system.
  3. As a way to evaluate banks’ internal models, require publication of loss rates or RWA for the same hypothetical portfolios for which the Fed is disclosing its estimates.
  4. Stick with the annual CCAR cycle.

In closing, we return to Vice Chairman Quarles’s remarks cited above. We agree that transparency is important. For people to obey rules, they need to understand them. But there are limits. When we give our students an exam, we provide an enormous amount of information in advance. We disclose the material covered, the length of the exam, and the method we will use to evaluate performance. We even give students practice exams that mimic the types of questions we are likely to include and the kinds of answers that merit full credit. However, we do not disclose the new test questions, much less their answers. And, we retain the flexibility to change the questions up until the time when we administer the test.

For stress testing, the analog is the scenarios. So long as the Fed does not disclose these before the banks’ portfolios are set, and remains flexible to change them up to the last minute, the test will yield useful information and encourage the banks to maintain a prudent framework for capital planning. Otherwise, as we learned from the pre-crisis experience with Fannie Mae and Freddie Mac, when stress tests are transparent, inflexible and lax, they are useless.

Acknowledgement: We thank our friends Dick Berner, Peter Fisher, Anil Kashyap, Andy Kuritzkes, Til Schuermann, and Jeremy Stein for taking the time to discuss the design of stress tests with us.

Disclosure: Steve took part in a roundtable on supervisory stress test transparency for academics and Federal Reserve representatives in Washington D.C. on January 19, 2018. Vice Chairman Quarles was among the participants at the roundtable.
