Ray Delany

Cloud on the ground: Nothing is perfect

Updated: Aug 29

On Friday 19 July, the world witnessed chaos unfold across essential services due to a major bug in a software update designed to protect them. Supermarkets struggled to process transactions, internet banking was taken offline, and airlines couldn’t check passengers in. The widespread disruption, exacerbated by initial media reports that were incomplete and often inaccurate, underscored a fundamental truth: in the fog of war, truth is the hardest thing to find.


 

Military officers use that term to describe the confusion and uncertainty that envelops any crisis. The reality is that we are all fighting a war against cyber-crime. My phone regularly rings with suspicious numbers; warnings about innocent-looking texts and emails that are traps cleverly disguised as something routine are ubiquitous; the news is filled with tragic stories of good people being conned out of their life’s savings by international criminals who can’t be touched by local law enforcement even if they can be traced.


In this silent war, CrowdStrike is one of the heroes, or at least they were until Friday night. CrowdStrike combats cyber criminals using techniques similar to the ones they employ, a significant differentiator for a product that is one of the most highly rated cyber-security products in the world.


Part of that protection process is constantly rolling out updates to their Falcon sensor, which is deployed on millions of devices worldwide. These updates quietly take care of emerging risks as the crooks come up with bright new ideas, and most of us aren’t even aware that the sensor is there. It is considerably less conspicuous than more traditional anti-virus software.


What happened on Friday was that one of these updates was defective and crashed the operating system on Windows devices, causing what the IT industry cheerfully calls the “blue screen of death” - so-called because when this happens a rebuild of the device is usually the only fix.


Except it wasn’t in this case. By the time the 6pm news was talking about an unexplained outage, CrowdStrike were already deploying the fix. Out of four computers in our office, three were affected and the fourth, purely by chance, wasn’t, because it received the corrected update before it could fail. The others were speedily recovered with a simple fix. Hats off to the local providers who made this happen rapidly, even for the smallest of their clients.


Because we have people who monitor this area, we were pretty sure of what was happening long before the journalists found their favourite buddy who “knows all about this stuff” and got them to give the wrong information.





Search for perfection


Last week I would not have counselled any of my clients against using CrowdStrike, and I still wouldn’t.


As we’ve written before, it’s mathematically impossible to guarantee software will be free of bugs. There is always a risk of something unexpected, and that risk is exacerbated by speed. Cyber security software providers need to get updates out quickly to keep ahead of the criminals who are at least as numerous as the good guys, and not tethered to the same standard of ethical behaviour and financial performance as corporations are.

Needless to say, the internet is now full of technical reckons talking about how they could have done it better. But from a customer perspective, dealing with a mistake of this nature, while enormously damaging and costly to everyone (including CrowdStrike), is still better than dealing with the consequences of a major successful cyber-attack.


Try asking the people engaged to repair Waikato DHB's systems in 2021. The public still hasn't been fully informed about exactly what happened there and still nobody seems willing to talk about it three years and a major restructure later. Just one example of many that slide below the notice of mainstream media but keep cyber-security engineers awake at night.


I constantly counsel my clients against looking for the perfect product in any category. It doesn’t exist and never will. There’s only a balance of risk: how likely is it that this product will be less risky than the other one?


Events such as last Friday’s are often used as arguments against cloud-based systems and services, but no internal IT department that I’ve run would have been able to come close to the remedial response that CrowdStrike displayed. In a dangerous world, the size of your army matters.


Navigating in the fog: Finding truth and resilience


When a crisis happens, clarity is often the first casualty. Breaking news across major media outlets suggested that Microsoft was the problem, and at one stage the acting Prime Minister repeated this incorrect information.


This incident emphasises the importance of resilience over perfection. Building systems that can adapt to and recover rapidly from failures is crucial. Redundancies, regular testing, and robust contingency plans are essential components of a resilient infrastructure. Perfection is an elusive goal; instead, the focus should be on continuous improvement and adaptability.


Ultimately, this global event is a reminder that there’s no perfect solution to anything. In our highly interconnected world, the complexity of systems means that errors, however small, can have widespread consequences. The fog of war—whether in the context of cyber warfare or the battleground of technological crises—makes finding the truth and the perfect response challenging.


What we can do is embrace the uncertainty, learn from each experience, and strive to build systems and strategies that are robust and flexible. As we navigate through the confusion, it’s our collective resilience and adaptability that will guide us to more stable and reliable solutions, even if they are never perfect.


In the end, acknowledging our limitations and preparing for the unexpected can transform moments of crisis into opportunities for growth and improvement.


Get in touch with the CIO Studio team to discuss how a systematic and tech-informed approach to digital transformation can help your organisation deliver the very best in patient care.

