Big data analytics in the financial sector around the world has become increasingly crucial to improve business efficiency, reduce operational costs and address long-standing business challenges.
The term big data was coined in the early 1990s and, as the name suggests, it is “big.” Not just in size, but also in terms of speed of generation and diversity, which is why traditional computer algorithms are often not able to process big data as efficiently as conventional data.
For example, a massive increase in the number of sales records for revenue calculation in one file does not necessarily make data “big.” On the other hand, a large number of sales records, combined with real-time data on customer requests, past purchases and market trends, all continually updated from multiple sources through a variety of methods, can amount to big data that serves goals such as predicting customer behavior (like the data, goals can also evolve).
Big data means big opportunities that come with big risks, needing big measures
As one of the most data-intensive industries, the financial sector is witnessing great opportunities for new and innovative algorithms in the field of big data analytics. This sector also happens to be one of the most regulated in terms of information security and data protection, given customer interest is increasingly at stake with present day offerings (e.g., Open Banking). Furthermore, according to the IBM Security Intelligence Index, the financial sector has been the topmost industry targeted by threat actors in the last five years.
Big data opportunities in the financial sector are diverse and many. Examples include consumer analytics from processing high volumes of personal data to predict customer demand and expectations, risk analysis from monitoring purchases, credit scores and other financial parameters in real time with limited latency issues, fraud detection through analysis of huge volumes of transactions to identify purchase irregularities and anomalies, and algorithmic trading to formulate new trading techniques by processing complex mathematical equations at high speed.
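To make the fraud detection example more concrete, the following is a minimal sketch of flagging irregular transaction amounts with a robust outlier score (the modified z-score based on the median absolute deviation). The function name `flag_anomalies`, the threshold of 3.5 and the sample amounts are illustrative assumptions, not a description of any particular institution's method. Production fraud detection typically combines many more signals.

```python
from statistics import median

def flag_anomalies(amounts, threshold=3.5):
    """Return amounts whose modified z-score exceeds the threshold.

    The modified z-score uses the median and the median absolute
    deviation (MAD), so a single extreme value does not distort the
    baseline the way it would with a mean and standard deviation.
    """
    med = median(amounts)
    mad = median(abs(a - med) for a in amounts)
    if mad == 0:
        # At least half of the values equal the median; nothing to score against.
        return []
    return [a for a in amounts if 0.6745 * abs(a - med) / mad > threshold]

# Hypothetical card transactions for one customer (made-up values).
amounts = [42.0, 38.5, 45.0, 40.0, 39.5, 41.0, 2500.0]
flagged = flag_anomalies(amounts)
print(flagged)
```

With these sample values, only the 2500.0 transaction stands out from the customer's usual spending pattern.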
In this article, we identify five key topics to keep in mind when implementing big data analytics in the financial sector, based on the U.S. National Institute of Standards and Technology, the European Union Agency for Cybersecurity, European Supervisory Authorities reports, and the current threat landscape of the financial sector. These considerations should help organizations strike a balance between business enablement and data protection.
Confidentiality of data and privacy of data subjects
Confidentiality covers the protection of data at rest, in transit and in use. Compromises to confidentiality can be detrimental. This was evident during a major breach in the financial sector in 2017, caused by a series of misconfigurations and unhandled vulnerabilities, including usernames and passwords stored in plain text that the attackers found and used to escalate privileges and gain deeper access.
Technology and methods for the encryption of data at rest and in transit have evolved over the years, while encryption of data in use is more nascent. The challenges with protecting data in use become more severe with big data because unlike conventional data:
- Big data processing strongly depends on shared computing environments (not local).
- Big data is processed continually (not just when the machine and program are switched on).
- Big data has more longevity — evolving and reincarnating.
Privacy is the right of data subjects to have control over how their personal data is processed. Protection of personal data becomes more challenging with big data as the sheer volume can lead to a higher likelihood of indirect identifiability from data aggregation.
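Indirect identifiability from aggregation can be illustrated with a simple group-size check in the spirit of k-anonymity. The sketch below is only an illustration under stated assumptions: the field names (`postcode`, `birth_year`, `branch`) and the records are made up, and the helper `smallest_group_size` is hypothetical.

```python
from collections import Counter

def smallest_group_size(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same combination of quasi-identifier values (the 'k' in k-anonymity)."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())

# Hypothetical records with no direct identifiers (no name, no account number).
records = [
    {"postcode": "1011", "birth_year": 1980, "branch": "A"},
    {"postcode": "1011", "birth_year": 1980, "branch": "B"},
    {"postcode": "1011", "birth_year": 1975, "branch": "A"},
    {"postcode": "2022", "birth_year": 1980, "branch": "A"},
    {"postcode": "2022", "birth_year": 1980, "branch": "A"},
]

# One attribute alone leaves every person hidden in a group of at least two.
k_single = smallest_group_size(records, ["postcode"])

# Combining attributes from several sources isolates individual records.
k_combined = smallest_group_size(records, ["postcode", "birth_year", "branch"])

print(k_single, k_combined)
```

A smallest group size of 1 means at least one record is unique on the combined attributes, even though no single attribute identifies anyone on its own.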
Privacy-enhancing technologies (PETs) available today, whether hardware-based (e.g., confidential computing), cryptography-based (e.g., homomorphic encryption) or data-based (e.g., pseudonymization), can enable personal data processing with adequate confidentiality safeguards even while the data is in use. PETs are cutting-edge, but literature on their applicability to big data analytics appears to be sparse. Organizations should proactively check with their service providers on the applicability of privacy and confidentiality-enhancing solutions to big data analytics prior to adoption.
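As a small illustration of the data-based category, the sketch below pseudonymizes a customer identifier with a keyed hash (HMAC-SHA256) from the Python standard library. The key value, field names and sample transactions are assumptions made for the example. In practice the key would be held in a key management system separate from the analytics environment, and pseudonymized data generally remains personal data because whoever holds the key can link it back.

```python
import hashlib
import hmac

def pseudonymize(value: str, key: bytes) -> str:
    """Replace an identifier with a keyed hash. The same input and key
    always yield the same pseudonym, so records can still be joined."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()

# Hypothetical key; shown inline only to keep the example self-contained.
SECRET_KEY = b"example-key-held-outside-the-analytics-platform"

# Hypothetical transactions (made-up values).
transactions = [
    {"customer_id": "C-1001", "amount": 120.50},
    {"customer_id": "C-2002", "amount": 75.00},
    {"customer_id": "C-1001", "amount": 19.99},
]

# The analytics data set carries only the pseudonym, not the original ID.
pseudonymized = [
    {"customer_ref": pseudonymize(t["customer_id"], SECRET_KEY), "amount": t["amount"]}
    for t in transactions
]
```

Because the pseudonym is deterministic, the first and third transactions still map to the same `customer_ref`, which keeps per-customer analysis possible without exposing the original identifier to analysts.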
Governance and compliance
Financial regulation refers to the laws that institutions such as banks, credit unions, insurance companies, financial brokers and asset managers must follow. Each jurisdiction establishes its own regulations, overseen by regulators (e.g., the New York Department of Financial Services and the Monetary Authority of Singapore) and committees (e.g., the Financial Policy Committee in the U.K.) that support them.
There are also sector-specific requirements, such as the Payment Card Industry Data Security Standard, a global set of standards for handling credit card information, and the U.S. Bank Secrecy Act, which is aimed at preventing financial organizations from being used for money laundering. Moreover, regulations such as the EU General Data Protection Regulation and ePrivacy Directive set general data protection requirements that all relevant institutions need to comply with, within or beyond the financial sector.
Cross-border data flow has witnessed rigorous and evolving restrictions over the past decades, including the recent turbulence caused by the Court of Justice of the European Union’s “Schrems II” ruling on the framework for international data transfers. The European Banking Federation, together with five other financial industry associations, co-signed a letter in October 2020 calling attention to the impact of the ruling.
The magnitude of regulatory requirements means financial institutions need to be very vigilant when processing confidential and personal data, and even more so with big data.
Industry and global standards such as NIST, Center for Internet Security, International Organization for Standardization, and Statement on Standards for Attestation Engagements help define best practices for the security of conventional data. However, the standards do not explicitly cover big data processing; this could lead to organizations not adopting big data analytics due to a fear of the unknown or a forced mapping of conventional data guidelines to big data, which may not always work.
In January 2020, the European Banking Authority published its Final Report on Big Data and Advanced Analytics, highlighting the increased use of big data in the financial sector across Europe. NIST also has a few publications that offer insight, but far fewer than exist for conventional data.
Organizations should leverage such materials, invest in relevant expertise and develop a robust strategy for big data so benefits are unlocked while ensuring compliance with regulations and standards.
Infrastructure security and management
Security and management of infrastructure includes measures for ensuring availability and integrity of data (through the implementation of vulnerability management, malware protection, network boundaries, resource management, distributed denial-of-service protection, etc.), which complement the data-centric approach to confidentiality described above.
Security vulnerabilities can lead to damaging cyberattacks. In 2015, a hacker managed to take control of 90 servers in a large financial institution, causing a breach of over 80 million accounts. Infrastructure management also requires well-designed control processes to avoid errors. A faulty command issued during a global capacity assessment caused a social media platform outage in 2021 that lasted 14 hours. Infrastructure security and management is even more important for big data due to the domino effect of risks from the higher complexity of processing.
The 2020 European Banking Authority report on big data also devoted an entire chapter to infrastructure security and management. There is a host of technical tools available in the market to achieve secure and manageable infrastructure. Organizations need knowledge and expertise in assessing the applicability of such technology to their big data implementations.
Identification, authentication, authorization, accountability
Identification, authentication, authorization and accountability (IAAA) is about giving the right person (identification and authentication) the right level of access (authorization) in a timely manner, and being able to prove due care in doing so (accountability). IAAA misconfigurations can result in major breaches.
Depending chiefly on manual provisioning for access management in the world of big data is impractical, and financial institutions that opt for big data analytics must invest in an identity and access management (IAM) solution and a suitable log management solution. Many major IAM vendors offer AI-based solutions, referred to as identity analytics, that can be used for such purposes.
Moreover, accountability is one of the GDPR principles, and log management is a requirement of many financial regulations, including the U.S. Gramm-Leach-Bliley Act and the Sarbanes-Oxley Act.
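The link between authorization and accountability can be sketched as a role-based access check that writes an audit record for every decision, whether granted or denied. This is a minimal sketch, not a substitute for an IAM product: the role names, permission strings, user name and in-memory `audit_log` list are all hypothetical, and a real deployment would send these records to a tamper-resistant log management system.

```python
import json
from datetime import datetime, timezone

# Hypothetical mapping of roles to permissions.
ROLE_PERMISSIONS = {
    "fraud_analyst": {"read:transactions"},
    "data_engineer": {"read:transactions", "write:pipelines"},
}

# In-memory stand-in for a log management system.
audit_log = []

def authorize(user: str, role: str, permission: str) -> bool:
    """Decide whether the role holds the permission and record the
    decision, so both granted and denied requests can be evidenced later."""
    allowed = permission in ROLE_PERMISSIONS.get(role, set())
    audit_log.append(json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "role": role,
        "permission": permission,
        "allowed": allowed,
    }))
    return allowed

granted = authorize("alice", "fraud_analyst", "read:transactions")
denied = authorize("alice", "fraud_analyst", "write:pipelines")
```

Logging denials as well as grants matters here, since repeated denied requests are often the earliest visible sign of a misconfiguration or an attempted privilege escalation.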
Data provenance
Data provenance, the description of the origin, creation and propagation process of data, is key in financial risk analysis. For example, data scientists may need to explain the outputs of algorithms from tens of thousands of inputs to regulators in a feasible manner.
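One way to picture a provenance trail is as a chain of records, where each processing step stores a fingerprint of its inputs and outputs plus a fingerprint of the previous record. The sketch below assumes JSON-serializable data. The helper names `fingerprint` and `record_step`, the source labels and the sample accounts are illustrative assumptions rather than an established standard.

```python
import hashlib
import json

def fingerprint(obj) -> str:
    """Hash a JSON-serializable object in a canonical (key-sorted) form."""
    canonical = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()

def record_step(lineage, step_name, source, inputs, outputs):
    """Append a provenance entry that links a step's inputs and outputs
    to the previous entry in the trail."""
    entry = {
        "step": step_name,
        "source": source,
        "input_hash": fingerprint(inputs),
        "output_hash": fingerprint(outputs),
        "previous_entry_hash": fingerprint(lineage[-1]) if lineage else None,
    }
    lineage.append(entry)
    return entry

# Hypothetical two-step pipeline over made-up account data.
lineage = []
raw = [{"account": "A1", "balance": 1200}, {"account": "A2", "balance": -50}]
cleaned = [r for r in raw if r["balance"] >= 0]

record_step(lineage, "ingest", "core-banking-export", [], raw)
record_step(lineage, "drop_negative_balances", "risk-pipeline", raw, cleaned)
```

Since the second entry's `input_hash` equals the first entry's `output_hash`, a reviewer can confirm which data fed which step without retaining full copies of every intermediate data set.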
Automation of big data analytics can make data provenance more efficient but can be challenging (though not impossible) given there is often no clear end to the processing of big data, particularly when coupled with machine learning. In 2021, a multinational insurance firm based in Europe was fined 1.75 million euros by the data protection authority for keeping personal data of millions for an excessive period of time.
In its 2020 final report, the European Banking Authority also included an entire chapter titled “Elements of Trust in Big Data and Advanced Analytics” that covers topics such as interpretability, traceability and data quality. Organizations can leverage such materials to establish and maintain a clear plan for managing big data attributes throughout the data life cycle.
Big data analytics provides immense opportunities for business generation and customer experience, particularly in the financial sector. However, big data needs its own curated standards and guidance (instead of conventional data standards being force-fit onto it) to enable faster and safer adoption by organizations. Financial institutions need to take into account privacy and security considerations that are specific to big data to be able to efficiently manage inherent risks and unlock the myriad benefits of big data.