MSCI: Backtesting Risk Models
Posted: 1 September 2016 | Source: MSCI
In this half-year update of the Backtesting Review, MSCI began by analyzing how each of four types of simulation models available in RiskMetrics RiskManager—Monte Carlo, historical, filtered historical and weighted historical—performed over the year ended June 30, 2016. These models were tested on 10 indexes, representing different segments of the U.S. and global equity and bond markets.
Risk measures, such as Expected Shortfall and Value at Risk, are designed to calculate the risk level of a portfolio. But some risk models may work better than others for different asset classes and for different periods of time. We ranked four types of models using the MSCI Model Scorecard, an innovative tool that measures how well a model has predicted risk, either with Expected Shortfall (ES) or Value at Risk (VaR).
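As a minimal sketch of the two risk measures named above, the snippet below computes historical VaR and ES from a simulated return series. The function name and parameters are illustrative, not MSCI's; losses are expressed as positive numbers.

```python
import numpy as np

def var_es(returns, alpha=0.95):
    """Historical VaR and Expected Shortfall at confidence level alpha.

    VaR is the loss threshold exceeded with probability 1 - alpha;
    ES is the average loss beyond that threshold.
    """
    losses = -np.asarray(returns)
    var = np.quantile(losses, alpha)       # loss exceeded with prob 1 - alpha
    es = losses[losses >= var].mean()      # mean loss in the tail beyond VaR
    return var, es

# Illustrative fat-tailed daily returns (Student-t), not real index data
rng = np.random.default_rng(0)
rets = rng.standard_t(df=4, size=2500) * 0.01
var95, es95 = var_es(rets, alpha=0.95)
```

By construction ES is at least as large as VaR at the same confidence level, since it averages only the losses beyond the VaR threshold.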
We also performed a traditional VaR backtest, counting the number of times the realized loss of the index exceeded the VaR forecasts of the four models. A model with too many “VaR exceedances” has underestimated risk, while a model with too few has overestimated it. We complemented this analysis with a number of conditional backtesting measures, designed to detect inappropriate clustering of VaR exceedances. In addition to the traditional VaR backtest, we conducted a formal backtest of Expected Shortfall, based on a framework recently developed by MSCI. Finally, we validated the entire forecast distribution through the realized p-values.
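The exceedance count described above can be sketched as follows. This is a generic illustration, not MSCI's implementation: it compares the number of days the realized loss exceeded the VaR forecast with the count expected at the stated confidence level.

```python
import numpy as np

def var_exceedance_backtest(realized_losses, var_forecasts, alpha=0.99):
    """Count VaR exceedances and compare with the expected count (1 - alpha) * T.

    Too many exceedances suggests the model underestimated risk;
    too few suggests it overestimated risk.
    """
    losses = np.asarray(realized_losses)
    var = np.asarray(var_forecasts)
    n_exceedances = int((losses > var).sum())
    expected = (1 - alpha) * len(losses)
    return n_exceedances, expected

# Hypothetical example: one trading year (250 days) of Gaussian losses
# against a constant Gaussian 99% VaR forecast
rng = np.random.default_rng(1)
losses = -rng.normal(0.0, 0.01, 250)
var99_forecast = np.full(250, 0.01 * 2.326)   # 99% quantile of N(0, 0.01)
n_exc, expected = var_exceedance_backtest(losses, var99_forecast, alpha=0.99)
```

At 99% confidence over 250 days, about 2.5 exceedances are expected; counts far above or below that band would flag under- or overestimation, and a clustering test (e.g. a conditional-coverage test) would additionally check that exceedances are spread out in time rather than bunched.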
Our July 2016 model backtests found:
• When any risk model failed the VaR and ES backtests, it was usually the result of either a mild underestimation (yellow zone in Exhibits 5-8) or a mild overestimation of risk (light blue zone). In only one instance did a severe underestimation of risk occur: the mc_fhist5y97 model failed the backtest for 99% VaR on the MSCI Emerging Markets Index.
• As in the previous backtesting report (January 2016), we observed that as we calculated VaR or ES at higher confidence levels and moved deeper into the tails of the distribution, the Gaussian assumption may no longer have been appropriate. In these cases, the filtered historical models tended to perform better. At lower confidence levels, however, this seemed less of an issue, and the Monte Carlo-based models also performed well.
• Although it was difficult to generalize about each model’s performance across different indexes, the MSCI Scorecard suggested that the historical models tended to perform better at higher confidence levels, whereas the Monte Carlo models performed more strongly at lower confidence levels.
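The observation above about the Gaussian assumption in the tails can be illustrated with a simple filtered historical simulation. The sketch below is a generic textbook-style version, not MSCI's model: past returns are rescaled by the ratio of current to historical volatility (a RiskMetrics-style EWMA estimate), preserving the empirical, possibly fat-tailed shape of the return distribution while adapting its scale to current conditions.

```python
import numpy as np

def filtered_historical_var(returns, alpha=0.99, lam=0.94):
    """99% VaR from filtered historical simulation (illustrative sketch).

    Each past return r_t is rescaled by sigma_today / sigma_t, where
    sigma_t is an EWMA volatility estimate with decay lam.  The empirical
    tail shape is kept, which is why such models can outperform a
    Gaussian Monte Carlo deep in the tails.
    """
    r = np.asarray(returns)
    var_t = np.empty_like(r)
    var_t[0] = r[:20].var()                    # seed the EWMA recursion
    for t in range(1, len(r)):
        var_t[t] = lam * var_t[t - 1] + (1 - lam) * r[t - 1] ** 2
    sigma = np.sqrt(var_t)
    sigma_today = np.sqrt(lam * var_t[-1] + (1 - lam) * r[-1] ** 2)
    scaled = r * (sigma_today / sigma)         # volatility-filtered returns
    return np.quantile(-scaled, alpha)         # loss quantile = VaR

# Illustrative fat-tailed return series, not real index data
rng = np.random.default_rng(2)
rets = rng.standard_t(df=4, size=1000) * 0.01
fh_var99 = filtered_historical_var(rets, alpha=0.99)
```

Because the quantile is taken from the rescaled empirical distribution rather than a fitted normal, heavy tails in the data feed directly into the high-confidence VaR estimate.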