RiskTech Forum

IHS Markit: FRTB - Sparking New Approaches for Big Data Analytics

Posted: 21 March 2017  |  Author: Paul Jones  |  Source: IHS Markit


The introduction of the Basel Committee’s Fundamental Review of the Trading Book (FRTB) standards involves a comprehensive overhaul of banks’ market risk capital frameworks. The move from value-at-risk (VaR) to scaled expected shortfall (ES) in order to capture tail risk will significantly increase the number and complexity of the capital calculations that banks need to undertake, as well as the sheer volume of data to be managed.

From a computation perspective, this means that P&L vectors need to be generated per risk class, per liquidity horizon and per risk-factor set. Removing the redundant permutations brings the total number of P&L runs to 63 (some of which can be performed weekly), compared with just two (VaR and stressed VaR) under the current approach.
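To make the combinatorics concrete, the sketch below enumerates the cross product of risk classes, liquidity horizons and risk-factor sets. The particular lists used are illustrative assumptions rather than a restatement of the rule text; the point is simply that the run count is multiplicative, and that pruning the combinations which do not apply is what brings the schedule down to the 63 runs quoted above.

```scala
// Illustrative only: the risk classes, liquidity horizons and risk-factor
// sets below are assumptions used to show how the FRTB run count multiplies,
// not a restatement of the Basel rule text.
object FrtbRunCount {
  // Broad risk classes, plus an "all risk factors" run.
  val riskClasses = Seq("ALL", "IR", "CS", "EQ", "FX", "COMM")

  // Regulatory liquidity horizons, in days.
  val liquidityHorizons = Seq(10, 20, 40, 60, 120)

  // Risk-factor sets: full set (current period), reduced set (current period),
  // reduced set (stressed period).
  val riskFactorSets = Seq("full-current", "reduced-current", "reduced-stressed")

  def main(args: Array[String]): Unit = {
    val runs = for {
      riskClass <- riskClasses
      horizon   <- liquidityHorizons
      factorSet <- riskFactorSets
    } yield (riskClass, horizon, factorSet)

    // Raw cross product: 6 x 5 x 3 = 90 candidate P&L runs. Dropping the
    // combinations that do not apply (for example, liquidity horizons that
    // are not defined for a given risk class) is what reduces the schedule
    // to the 63 runs cited in the article.
    println(s"Candidate P&L runs before pruning: ${runs.size}")
  }
}
```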

Firms are faced with the challenge of performing a significantly increased range of FRTB capital calculations at scale while also managing their costs and risk. The question is: are banks’ current IT risk infrastructures up to the task ahead?

If banks want to achieve proactive, intraday risk management while also managing their capital effectively over the long term, they will require high-performance IT infrastructure that can handle the intensive calculations involved. However, many banks today rely on technologies such as relational databases and in-memory data grids (IMDGs) for risk analytics, aggregation and capital calculations.

IMDGs work by replicating data or logging updates across machines. This requires copying large amounts of data over the cluster network, whose bandwidth is far lower than that of RAM. As a result, IMDGs incur substantial storage overheads, are sub-optimal for pure analytics use cases such as FRTB analytics, and are expensive to run.

In short, banks’ legacy IT architectures will need a significant overhaul when it comes to FRTB and firms are looking for alternative options. One of those options is Apache Spark, an open source processing engine built around speed, ease of use and sophisticated analytics.

Spark has a distributed programming model based on an in-memory data abstraction called Resilient Distributed Datasets (RDDs), which is purpose-built for fast analytics. RDDs are immutable, support coarse-grained transformations and keep track of the lineage of transformations that have been applied to them. Immutability rules out a large class of potential problems caused by concurrent updates from multiple threads, while lineage allows lost RDDs to be reconstructed rather than restored from disk. As a result, checkpointing requirements in Spark are low, and caching, sharing and replication become straightforward. These are significant design wins that give Spark clear advantages over IMDGs.
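To ground these points, here is a minimal Spark sketch (not IHS Markit's engine) showing a coarse-grained RDD pipeline: each transformation produces a new immutable RDD, the result can be cached and shared safely, and the lineage Spark uses to rebuild lost partitions can be inspected directly. The input file name and its column layout are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object RddLineageDemo {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("rdd-lineage-demo").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Hypothetical input: one P&L observation per line, "tradeId,scenario,pnl".
    val lines = sc.textFile("pnl_vectors.csv")

    // Coarse-grained transformations: each step applies to the dataset as a
    // whole and returns a new, immutable RDD instead of mutating the old one.
    val pnlByTrade = lines
      .map(_.split(","))
      .map(fields => (fields(0), fields(2).toDouble))
      .reduceByKey(_ + _)

    // Because RDDs are immutable, the cached copy can be shared safely
    // across concurrent queries without locking.
    pnlByTrade.cache()

    // The lineage (the recorded chain of transformations) is what Spark uses
    // to rebuild lost partitions, keeping checkpointing requirements low.
    println(pnlByTrade.toDebugString)

    sc.stop()
  }
}
```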

In fact, our own studies have demonstrated these advantages in practice, highlighting the power of Spark in terms of performance, scalability and flexibility. For example, we recently completed a proof of concept with a European bank which showed that our capital analytics and aggregation engine can support the FRTB capital charges for both the internal models approach (IMA) and the standardised approach (SA) in single-digit seconds. This was based on a portfolio of one million trades with nine million sensitivities and 18 million P&L vectors, running on hardware costing just USD 20,000.
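For a flavour of what such an aggregation looks like in Spark, the sketch below computes an expected-shortfall-style tail average per desk from a table of simulated P&L. It is a hedged illustration only: the schema, column names, file path and 97.5% confidence level are assumptions made for the example, not a description of our production engine.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions._

object EsAggregationSketch {

  // Expected shortfall per desk: the average loss beyond the tail quantile.
  def expectedShortfall(pnl: DataFrame, confidence: Double = 0.975): DataFrame = {
    val tailProb = 1.0 - confidence

    // Tail cutoff per desk: the approximate (1 - confidence) quantile of P&L.
    val cutoffs = pnl.groupBy("desk")
      .agg(expr(f"approx_percentile(pnl, $tailProb%.4f)").as("cutoff"))

    // Average the P&L observations at or below the cutoff and flip the sign
    // so the result is reported as a positive loss figure.
    pnl.join(cutoffs, "desk")
      .where(col("pnl") <= col("cutoff"))
      .groupBy("desk")
      .agg((avg("pnl") * -1).as("expected_shortfall"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("es-aggregation-sketch")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical input: one row per (desk, scenario) holding the simulated P&L.
    val pnl = spark.read.parquet("pnl_vectors.parquet")
    expectedShortfall(pnl).show()

    spark.stop()
  }
}
```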

As one of the most active Apache open source projects, Spark benefits from thousands of contributors continuously enhancing the platform. In fact, we've seen a 20% year-on-year improvement in Spark aggregation performance since we started building our solutions on the platform in 2016. We're excited to see the improvements that are bound to come in the year ahead!