What Can We Learn From Payment System Failures and Global IT Outage?

As global outages hit banking firms and payments worldwide, are we to be forever doomed by glitches, power outages and cyberattacks? Possible solutions may exist, but what are they?

It’s been a bad week for global payment and IT systems, as the Clearing House Automated Payment System (CHAPS) experienced a major outage causing serious issues in the UK, swiftly followed by a Microsoft outage that caused havoc for banking firms across the globe.

On Thursday 18 July, the CHAPS system, used by UK high-street banks and lenders to send money to one another, experienced an outage, caused by a glitch related to the global network Swift. On average, CHAPS enables around 200,000 payments per day. According to the Bank of England, the average daily value of CHAPS payments in February 2021 was £345 billion.

While the Bank of England confirmed later that day that Swift had “restored service following earlier issues” and that “CHAPS payments are settling as normal”, the disruption could still prove very costly.

By Friday 19 July, a completely separate issue saw a huge IT outage causing chaos for travel companies, alongside the healthcare and banking sectors. The problems were caused by another ‘glitch’, this time in a content update for devices running Microsoft Windows, originating from cybersecurity service provider CrowdStrike.

With so many firms grinding to a halt, across a wide range of sectors, the general level of preparedness appears to have significant shortcomings. Because of this, many will wonder how they can better equip themselves with tools to ensure they can mitigate the impact of problems, even if something on this scale happens again.

Better balancing risk

“Things will always go wrong: it’s a question of when, not if,” says Dafydd Vaughan, CTO at Public Digital and co-founder of the UK Government Digital Service.

“Companies and national governments need to be prepared and take mitigating actions to minimise their exposure. This crisis could have been avoided by companies rolling out computer updates on a few machines first to check they work, rather than sending them to all machines at the same time.

“The government needs to consider the risk that comes with so few companies controlling so much of our essential infrastructure. In all industries, the government should see the value of more competition in their supply chains, and work to increase the number of companies that provide these essential services and avoid monopolies controlling our national infrastructure.

“We get a lot of benefits from systems being connected and sharing information, but that does introduce risk too. We need to balance the gains against the risk and be aware that issues like this can – and increasingly will – happen.”
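Vaughan's point about updating a few machines first can be illustrated with a minimal sketch of a staged rollout. This is not how CrowdStrike or any specific vendor deploys updates; the function name, the stage sizes and the `apply_update` callback are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence


def staged_rollout(machines: Sequence[str],
                   apply_update: Callable[[str], bool],
                   stage_fractions: Sequence[float] = (0.01, 0.1, 1.0)) -> List[str]:
    """Apply an update in growing waves and stop at the first failure.

    `apply_update` is a hypothetical callback that updates one machine and
    returns True if the machine is still healthy afterwards.
    Returns the machines that were updated successfully.
    """
    updated: List[str] = []
    for fraction in stage_fractions:
        # Cumulative target for this wave: at least one machine,
        # at most the whole fleet.
        target = max(1, min(len(machines), round(len(machines) * fraction)))
        for machine in machines[len(updated):target]:
            if not apply_update(machine):
                # Halt here: the rest of the fleet never receives the update.
                return updated
            updated.append(machine)
    return updated
```

With the assumed stages of 1%, 10% and 100%, an update that breaks every machine it touches would reach a single machine in a fleet of 100 before the rollout stops, rather than all of them at once.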

Could DORA be a turning point?

While these system outages and disruptions have crippled firms worldwide, even if only in the short term, they are hardly a surprise. The need for operational resilience is something the European Union is aiming to address with its Digital Operational Resilience Act (DORA).

The act aims to set new standards for financial sector enterprise service resilience, requiring firms to ensure compliance by 17 January 2025.

Alex Reddish, managing director of Tribe Payments, discusses the impact this regulation could have in the future, and whether further action will be needed: “We are now in a period where we have seen large institutional payment rails fail more frequently than we have ever seen before. I cannot imagine a time when technology oversight was more important.

“Although CrowdStrike is not a payment processor, the indirect consequences of its outage impact brand value and reputation for various businesses using its service. What we’re seeing today shows an incredible need for us to look at infrastructure beyond payments.

“I think the DORA regulation is a foundation for solving this problem, though it won’t solve it entirely. DORA covers some aspects, but it will never be enough as we continually push the boundaries of efficiency and economic value. Digital and operational resilience should be a top priority for everyone, regardless of whether they’re critical infrastructure or not.”

Regulation, regulation, regulation

Regulations may well be key in ensuring these types of issues aren’t quite as impactful in the future.

For Alina Timofeeva, board member at BCS, the chartered institute for IT, it is key that regulators and firms take action equally seriously: “Regulators like the Financial Conduct Authority and European Banking Authority do have regulations in place that call out concentration risk in the market and the fact is that companies should be doing much more to mitigate it.

“I believe, going forward, there will be a much bigger push from regulators to mitigate concentration risk, for companies and providers. I anticipate both tighter regulations, but also tighter scrutiny from the regulator should companies prioritise the cost and efficiency over the safety and security of their operations.

“It is key for companies to invest in operational resilience. This would cover technology, data, third parties, processes and people. It is also key to test out the disaster recovery plans, instead of having these as a paper exercise and ensure that all the people, processes and data (and not only the technology) are tried and tested at scale, and there is sufficient preparation in place should such an outage happen in future. It is key to do simulation scenarios and testing of these.”

Time to modernise

While regulations and operational resilience will be crucial in mitigating the risk of future outages, there are also questions about whether existing payment infrastructure in the UK is outdated.

Carol Grunberg, chief business officer at Yuno, a global payment orchestration platform, believes an overhaul is required: “Existing payment infrastructure is showing its age, exemplified by the CHAPS system outage in the UK. These disruptions indicate that many payment systems, designed decades ago, struggle with today’s transaction complexity and volume.

“Solutions to modernise payment systems include upgrading existing systems — regular updates with more resilient software and cloud technologies can enhance performance and reliability. Partnering with modern fintechs specialising in payments infrastructure should also help to effectively manage today’s global payment volumes and complexities. These companies utilise advanced technology stacks and methodologies to ensure seamless and scalable operations.

“A complete overhaul may be necessary for long-term sustainability. This involves re-engineering the architecture towards modular, microservices-based frameworks, enhancing interoperability, and investing in robust cybersecurity. Blockchain technology can offer decentralised alternatives. Robust, stress-tested systems can serve as blueprints for these upgrades and overhauls, ensuring smoother transitions and greater reliability.”

Do we need to quicken implementation?

However, it might not be time to despair completely. As Michael Levens, head of data, technology, automation and testing at Delta Capita, explains, moves to update our systems have been taking place for years.

“While it might not be obvious, modernisation efforts have been underway for many years to ensure payment systems are able to cope with the future demands. The Bank of England’s RTGS Renewal Programme started in 2017, aiming to improve the resiliency and flexibility of the RTGS system. This has included using new technology and messaging standards (ISO 20022) to improve performance, data quality and operational efficiency.

“While focused more on retail payments, the concept of the UK’s New Payment Architecture (NPA) started in 2015 with its main aim to modernise the existing payment infrastructure and provide a more resilient, efficient, and innovative payment system.

“Unfortunately, the development of NPA has been delayed many times and we are still awaiting the delivery of NPA to really propel the UK’s payments systems into the new digital age. So, in summary, it has been acknowledged for some time our existing payment infrastructure is outdated and action is required. Unfortunately, implementations of these initiatives have been slower than hoped.”

Failing to prepare is preparing to fail

“There is a tendency to see the perfect storms like this as implausible, but after the past five years, I think we need to treat almost everything you can think of as plausible,” says Kate Needham-Bennett, senior director of resilience innovation at Fusion Risk Management.

“Financial services firms defend against cyberattacks every second – eventually one will get through; energy supplies can be disrupted by weather, geopolitics, manufacturing errors, etc, so there will be power outages; and there have been system glitches since the industrial revolution.

“All they can do is prepare for when they do happen: get a single pane of glass view of their data, establish what is important (to customers, the firm and the market), map out dependencies, exercise against impact tolerances or recovery time objectives, and then remediate where possible.”
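The dependency mapping and recovery time objective (RTO) exercise Needham-Bennett describes could look something like the sketch below. It is a simplified model, not Fusion Risk Management's method: it assumes a service can only begin recovering once everything it depends on is back, that dependencies recover in parallel, and all function and parameter names are hypothetical.

```python
from typing import Dict, List, Tuple


def recovery_minutes(service: str,
                     depends_on: Dict[str, List[str]],
                     own_minutes: Dict[str, int],
                     _path: Tuple[str, ...] = ()) -> int:
    """Estimate end-to-end recovery time for one service.

    Assumption: a service starts recovering only after its slowest
    dependency is back, then needs its own recovery time on top.
    """
    if service in _path:
        raise ValueError(f"circular dependency involving {service}")
    slowest_dependency = max(
        (recovery_minutes(dep, depends_on, own_minutes, _path + (service,))
         for dep in depends_on.get(service, [])),
        default=0,
    )
    return own_minutes[service] + slowest_dependency


def rto_breaches(rto_minutes: Dict[str, int],
                 depends_on: Dict[str, List[str]],
                 own_minutes: Dict[str, int]) -> Dict[str, int]:
    """Return services whose estimate exceeds their RTO, with the overrun in minutes."""
    breaches: Dict[str, int] = {}
    for service, rto in rto_minutes.items():
        estimate = recovery_minutes(service, depends_on, own_minutes)
        if estimate > rto:
            breaches[service] = estimate - rto
    return breaches
```

In this model, a payments service that takes 15 minutes to restart but sits on a database (30 minutes) that itself needs the network (20 minutes) has an estimated recovery of 65 minutes, which would breach a 60-minute objective even though no single component does. Surfacing that kind of hidden overrun is the purpose of mapping dependencies before an incident.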

‘The financial industry needs to adopt a multifaceted approach’

Many firms may also look to see how they can enhance or implement new technical solutions to ensure they don’t also fall victim to power outages or cyberattacks.

Matt Williamson, SVP and industry principal at Endava, suggests some ways in which firms can safeguard themselves against these threats: “To safeguard against cyberattacks, power outages, and system glitches similar to those we have seen this week, the financial industry needs to adopt a multifaceted approach.

“The first step in improving cybersecurity is implementing sophisticated threat detection and response systems, such as AI-driven solutions, which can assist in quickly identifying and eliminating such threats. Regular security audits, penetration testing, and multi-factor authentication are all methods to further bolster defences.

“Secondly, preparedness for power outages is critical. Implementing uninterruptible power supply (UPS) systems and backup generators will ensure critical operations continue uninterrupted. Integrating cloud-based solutions for data storage and operations adds another layer of resilience, allowing for quick recovery and data redundancy. Patching and system updates on a regular basis remain crucial for vulnerability protection.

“In addition, the financial sector needs to continually promote cooperation and adherence to regulatory standards like GDPR and PCI DSS.

“Ultimately being prepared for anything should be ensured by creating a thorough crisis management strategy, practising frequently, and keeping open lines of communication with all relevant parties.”

The post What Can We Learn From Payment System Failures and Global IT Outage? appeared first on The Fintech Times.