SOC Mistake #1: You're Over-reliant on Protective and Detective Controls (Part 2)

Continuing the topic of over-reliance on protective and detective controls from SOC Mistake #1: You're Over-reliant on Protective and Detective Controls (Part 1), in this post I’m going to talk about several other issues related to this posture and why adding more controls is not really moving the cyber risk needle that much.

All of these protective and detective controls need maintenance, whether they are traditional rules-based or use black-box ML voodoo/magic (a vendor recently couldn’t describe how theirs worked when presenting to me, but assured me it was capable of detecting and preventing all possible attacks). This may take the form of ingesting and contextualising threat intelligence to assess whether there is current coverage of a new attack technique, and creating associated rules generic enough to capture variants of the attack without drowning the analysts in a sea of false positives. Or it may take the form of tuning the false positives and false negatives created by the rules or models to keep pace with changes in the organisation or in adversary behaviour.

Castle Walls and Moats

We’ve focused on building higher walls and wider moats, while adversaries have just built taller ladders and better boats - the adversary always has first mover advantage. The walls will get breached at some point; don’t set the expectation with the organisation that they won’t.

We need agility to be able to adapt to changes in adversary behaviour, the way we deliver IT and the business environment.

Whatever happens, you’re going to need people and processes to do this: the more tools, the more people and the more iterations of the tuning process. With an average of 130 different security tools installed in the enterprise, this is a formidable amount of infrastructure, just when organisations are getting used to using cloud services because they want to focus on delivery of their products and services rather than on managing infrastructure. CISOs are setting expectations that they can prevent incidents “if only” they had more people, more tools, more budget. Then when they have a significant incident, it’s a surprise to the organisation - they’ve had the belief that the 85 - 95% of security budget spending directed at prevention and detection would stop impacts to the organisation: no wonder so many CISOs have very short tenures. I’ve said it before: over a certain bare minimum there starts to be no linear correlation between budget, headcount and operational cyber risk management. Approach (closely related to politics and culture), operationalisation of what you do have and continual improvement are usually the mark of those with effective and highly efficient risk management.

Impact forms half of the risk calculation (but not half the risk) yet it only receives 5 - 15% of the spend due to the moat-and-wall rather than resiliency mindset.

CISOs that can have collaborative conversations with IT and business leaders around resilience are the ones who are living in reality. The cyber risk equation has two factors, likelihood and impact, yet we spend the vast majority of our budgets on only one of them. This obsession with likelihood alone, and seeing a large part of response, and all of recovery, as “IT’s problem”, doesn’t provide the operational resilience an organisation needs in an era of ransomware and wiper attacks.
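To make the two-factor equation concrete, here is a minimal sketch of the arithmetic, assuming the simplest possible form of risk as annual likelihood multiplied by impact. The function name and every figure are made up purely for illustration; they show how each factor feeds the result and are not evidence about any real organisation.

```python
def expected_annual_loss(likelihood: float, impact: float) -> float:
    """Simplest form of the risk equation: probability per year times loss per event."""
    return likelihood * impact


# Hypothetical baseline: a 20% yearly chance of a 5,000,000 incident.
baseline = expected_annual_loss(0.20, 5_000_000)

# Hypothetical likelihood-side spend: one more control trims the chance to 18%.
lower_likelihood = expected_annual_loss(0.18, 5_000_000)

# Hypothetical impact-side spend: rehearsed recovery trims the loss to 3,000,000.
lower_impact = expected_annual_loss(0.20, 3_000_000)

for label, value in [
    ("baseline", baseline),
    ("lower likelihood", lower_likelihood),
    ("lower impact", lower_impact),
]:
    print(f"{label:>16}: {value:,.0f}")
```

Either factor can move the result, which is the point: a budget that only ever touches the likelihood term leaves the other lever unused.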

So these teams are swelling to deploy and manage these different tools, all at a time of a global infosec skills shortage causing inflation in salaries. The headcount related to preventative and detective controls infrastructure, even before we get to triage, investigation and tuning of alerts, is significant.

Then we have the annual hamster wheel game of needing to “spend-the-budget-or-lose-it”: we do our annual risk assessment, we identify a gap in our controls that is conveniently budget sized, we issue an RFI, we down select, we do a proof-of-concept, we deploy... and it’s now two months out from our next risk assessment cycle - leaving little time for operationalisation and integration of the tool - more swivel chairing and more noise in more consoles.

The average enterprise has over 130 security tools installed

All these tools add noise, they add complexity, they create friction with users, they add attack surface and they reduce agility. Can we demonstrate how much each tool is moving the needle? Most CISOs can’t. Most of the problems we face as CISOs are debt from not dealing with foundational issues, and these can’t be fixed in annual budget cycles. We need to move the conversation away from the fiction of a cyber militia that defends the unscalable and indestructible walls of a castle, to building processes, technology and relationships that allow the walls to be scaled and destroyed without losing the castle’s keep or disrupting life too badly inside the castle while the attack is going on. We know the marauding hordes are out there; eventually they’re going to see the silhouette of our ramparts in the distance.

The ability to handle incidents as a business-as-usual activity is the adult conversation to have with the business: it’s not “give me more budget so I can stop the attack”, it is “adversaries have first mover advantage and no control, or combination of controls, provides complete protection even with unlimited budget, so how are we going to prepare for and weather this attack with the minimal amount of impact?”. The impacts we’re seeing across the globe from ransomware aren’t a cause, they’re a symptom: a symptom of a lack of resilience.

CISOs historically haven’t had to worry about primary losses, only secondary.

Historically CISOs really only had to worry about secondary losses: damage to reputation, litigation and regulatory fines. These are largely a sunk cost in that they really don’t get much worse over time after an incident has been discovered: the records are lost and you normally pull the plug and prevent more getting lost. Yes, reputation gets slightly worse over time if you completely balls up the incident response, but not to a significantly impactful degree when looking at customer acquisition and retention a year down the line. For years vendors have used FUD around these three secondary losses to scare boards and CISOs into spending on their prevention and detection magic bullets, but now CISOs are actually facing primary losses: the inability of an organisation to deliver its core services and products.

With the advent of ransomware, the CISO is left holding the toilet chain for the incident, yet it’s IT that provide much of the plumbing to have resilience and minimise impact.

With the advent of ransomware, cyber events are now having the same impacts on business operations as historical business continuity and disaster recovery scenarios - not a place the CISO is used to being in. Now the CISO is holding the toilet chain for ransomware events, but has historically relied on the plumbing of IT to help with response and recovery. Simply throwing a text ticket at IT to recover a system doesn’t cut it when there are tens or hundreds of thousands of machines to restore - oh, and your backup is encrypted, so is your SIEM, your VoIP systems don’t work... and you don’t have any IDAM servers. Cyber resilience needs adult conversations and adult relationships between IT and cyber security.

In order to be cyber resilient, organisations need to pay attention to all functions of the NIST Cybersecurity Framework.

So what is the answer? Well, my belief is that organisations already have too many security tools installed. They should go back to the principles of threat intelligence and threat modelling, map their threats to MITRE ATT&CK and then identify which controls offer the greatest benefit across the relevant techniques. The cost of purchasing and maintaining each control should be considered in this calculation - I’m a big fan of using quantitative or semi-quantitative methods for this, and can’t recommend Doug Hubbard’s book enough for introducing pragmatic ways to do this.
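As a rough illustration of that mapping exercise, the sketch below ranks controls by how many not-yet-covered ATT&CK techniques they address per unit of annual cost. It is a simple greedy heuristic of my own choosing, not a method from ATT&CK or from Hubbard. The technique IDs are real ATT&CK identifiers, but the control names, costs and the control-to-technique mappings are hypothetical placeholders that you would replace with your own threat model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Control:
    name: str
    annual_cost: float      # purchase plus upkeep; hypothetical figures
    techniques: frozenset   # ATT&CK technique IDs the control is assumed to address


def rank_controls(controls, relevant):
    """Greedy ranking: repeatedly pick the control that covers the most
    still-uncovered relevant techniques per unit of annual cost."""
    uncovered = set(relevant)
    remaining = list(controls)
    ranking = []
    while uncovered and remaining:
        best = max(remaining, key=lambda c: len(c.techniques & uncovered) / c.annual_cost)
        gained = best.techniques & uncovered
        if not gained:
            break  # nothing left in the list adds coverage
        ranking.append((best.name, sorted(gained)))
        uncovered -= gained
        remaining.remove(best)
    return ranking, sorted(uncovered)


# Techniques assumed to come out of the threat model (illustrative selection).
relevant = {"T1566", "T1078", "T1110", "T1059", "T1486", "T1490"}

# Hypothetical controls, costs and mappings.
controls = [
    Control("email filtering", 50_000, frozenset({"T1566"})),
    Control("MFA", 40_000, frozenset({"T1078", "T1110"})),
    Control("EDR", 200_000, frozenset({"T1059", "T1486"})),
    Control("immutable backups", 60_000, frozenset({"T1486", "T1490"})),
]

ranking, uncovered = rank_controls(controls, relevant)
for name, gained in ranking:
    print(f"{name}: adds {', '.join(gained)}")
print("still uncovered:", uncovered)
```

Counting techniques treats every technique as equally important, which is a deliberate simplification; weighting each technique by its estimated likelihood and impact would be the natural next step towards a semi-quantitative model.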

Once you’ve got your organisation pointed in this direction, it’s time to address the likelihood and impact imbalance. Adult conversations with the business about responses to incidents and efforts to reduce impact deliver much greater movement on the cyber risk needle, and quicker time-to-value, than the purchasing of your 131st likelihood magic bullet.

In a cyber resilience scenario, additional spending on prevention and detection will, in most organisations, only bring small incremental changes in cyber risk and still leave the organisation exposed to primary losses. Spending on impact mitigation moves the cyber risk needle more and delivers quicker time-to-value without the complexity, friction, headcount and budget demands of further likelihood-reduction tools.

I’m going to put up a post about this journey and its overlap with operational resilience soon.
