Highlights:

  • While it’s clear that CrowdStrike’s update caused the outage, there are questions about whether Microsoft should share some of the blame.
  • Microsoft is working with cloud providers like Google Cloud and Amazon Web Services to share industry impact updates and inform discussions with CrowdStrike and customers.

Microsoft Corp. has disclosed that the faulty CrowdStrike Holdings Inc. update affected 8.5 million Windows PCs, resulting in widespread global outages. Both companies are continuing to support affected customers.

The disruption from a CrowdStrike Falcon security software update spread globally on Friday, July 19, affecting banks, airlines, government services, and more, often resulting in the Windows “blue screen of death.” The issue was a faulty software update rather than a cybersecurity breach, but its effects, including ongoing delays and system problems, persisted through Sunday and could continue into the week.

In a blog post, Microsoft stated that although it was “not a Microsoft incident,” the company is assisting its customers with technical guidance and support to restore disrupted systems safely. Alongside its collaboration with CrowdStrike, Microsoft has deployed hundreds of engineers and experts to help customers directly.

Microsoft also mentioned that it is collaborating with other cloud providers and stakeholders, such as Google Cloud Platform and Amazon Web Services Inc., to share insights on the impact observed across the industry. This collaboration aims to “inform ongoing conversations with CrowdStrike and customers.”

“We’re working around the clock and providing ongoing updates and support. Additionally, CrowdStrike has helped us develop a scalable solution that will help Microsoft’s Azure infrastructure accelerate a fix for CrowdStrike’s faulty update,” David Weston, Vice President of Enterprise and OS Security at Microsoft, said in the blog post.

Regarding the affected PCs, Weston pointed out that while software updates sometimes cause disruptions, major incidents like the CrowdStrike outage are “infrequent.” He also mentioned that although fewer than 1% of all Windows machines were impacted, the widespread economic and societal effects are due to CrowdStrike being used in many critical services.

“This incident demonstrates the interconnected nature of our broad ecosystem — global cloud providers, software platforms, security vendors and other software vendors, and customers. It’s also a reminder of how important it is for all of us across the tech ecosystem to prioritize operating with safe deployment and disaster recovery using the mechanisms that exist,” Weston added.

While it’s clear that CrowdStrike’s update caused the outage, there are emerging questions about whether some of the blame should also be directed at Microsoft.

“This incident is Microsoft’s fault, not CrowdStrike’s fault. Yes, CrowdStrike pushed a kernel-level update that causes widespread blue screens. Yes, that should have been caught during QA and I’m sure we will get an after-action report that details why release procedures didn’t catch it. But software bugs happen. They are unavoidable — even for top-tier shops like CrowdStrike,” J.J. Guy, chief executive officer of exposure management company Sevco Security Inc., said in an interview.

“This is a high-impact incident not because there was a blue screen, but because it causes repeated blue screens on reboot and [appears as of now] to require manual, command-line intervention on each box to remediate, and it’s even harder if BitLocker is enabled. That is the result of poor resiliency in the Microsoft Windows operating system. Any software causing repeated failures on boot should not be automatically reloaded. We’ve got to stop crucifying CrowdStrike for one bug, when it is the OS’ behavior that is causing the repeated, systemic failures,” Guy added.