SOC Mistake #10: You confuse your SOC with your NOC
Network Operations Centres (NOCs) are responsible for the operational monitoring of infrastructure and services. Their function is to identify, investigate, prioritise and escalate/resolve issues that could, or do, affect performance or availability. A Security Operations Centre (SOC) shares much in common with a NOC: its function is to identify, investigate, prioritise and escalate/resolve issues that could, or do, affect the security of an organisation’s information assets.
It is no surprise then that customers looking to build a SOC frequently ask me, “Why can’t we use our NOC for this function?” I can understand the motivation behind this question: once you’ve stood up your Security Information & Event Management (SIEM) platform, identified your use cases, got the right event sources feeding events into the SIEM and nailed your SOC procedures, the largest cost of running a SOC is typically headcount.
There are, however, a few reasons why a combined SOC and NOC isn’t always a good idea:
They serve different, often conflicting, masters.
Within organisations there is often a conflict between operations and information security teams – information security want to pull the plug on a compromised server that happens to be hosting a critical service; they want vulnerabilities patched as soon as patches are available, often without fully testing the impact on operations; they can’t understand why dealing with an incident isn’t always the top priority for the operations team. Likewise, operations often stand up new pieces of infrastructure without notifying the security team or going through change control; they may not fully harden platforms prior to deployment in order to “meet a tight deadline”, promising to come back and patch later; they may not apply critical patches for lack of a testing environment.
The NOC is often measured and compensated for its ability to meet Service Level Agreements (SLAs) for network and application availability, Mean Time Between Failures and application response time. In contrast, SOCs are measured on how well they protect against malware, how well they protect intellectual property and customer data, and how well they ensure that corporate information assets aren’t misused. The business driver behind both of these is to manage business risks – in a NOC, for instance, the loss of revenue or compensation for breach of an SLA; in a SOC, regulatory fines or loss of customer confidence.
NOCs are about availability and performance; SOCs are about security. Even with the best intentions, having the team responsible for availability and performance make decisions about incident response and the application of controls that will invariably impact the availability and performance of services (even if only through the diversion of human resources) is never going to work well.
NOCs and SOCs certainly should be in close co-ordination. One of the best ways of achieving this is to ensure the NOC has a view of the SIEM platform. I’ve seen SOCs react to “large scale Distributed Denial of Service attacks” that have been the result of legitimate traffic after the launch of a new service, and I’ve seen subtle patterns detected by alert NOC analysts result in uncovering wide-scale penetrations within organisations. When it comes to actually responding to a confirmed incident, operations and information security must work hand-in-hand to investigate, contain, eradicate and recover from the attack with appropriate and proportionate responses. Working together in a collaborative manner as a part of an incident response team, a SOC and NOC help ensure the right balance.
A well-implemented collaboration strategy between a NOC and SOC should establish that the SOC’s function is to analyse security issues and recommend fixes. The NOC then analyses the impact of those fixes on the business, recommends whether to apply each fix, makes the approved changes and documents them.
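As a rough illustration of that hand-off, here is a minimal Python sketch. The stage names, the FixTicket class and the example issue are all hypothetical; nothing here comes from a real ticketing or SIEM product. It simply enforces that each step is completed in order and by the team that owns it.

```python
from dataclasses import dataclass, field

# Hypothetical stage list: (owning team, step), in the order described above.
STAGES = [
    ("SOC", "analyse security issue"),
    ("SOC", "recommend fix"),
    ("NOC", "analyse business impact of fix"),
    ("NOC", "recommend whether to apply fix"),
    ("NOC", "make approved change"),
    ("NOC", "document change"),
]


@dataclass
class FixTicket:
    """Tracks one security issue through the SOC-to-NOC hand-off."""
    issue: str
    completed: list = field(default_factory=list)

    @property
    def closed(self) -> bool:
        return len(self.completed) == len(STAGES)

    def complete(self, team: str, step: str) -> None:
        """Record a step, refusing out-of-order or wrong-team work."""
        if self.closed:
            raise ValueError("ticket is already closed")
        expected = STAGES[len(self.completed)]
        if (team, step) != expected:
            raise ValueError(f"expected {expected}, got {(team, step)}")
        self.completed.append((team, step))


# Walk one illustrative ticket through every stage in order.
ticket = FixTicket("compromised server hosting a critical service")
for team, step in STAGES:
    ticket.complete(team, step)
print(ticket.closed)
```

Whether the ordering is enforced in software or simply written into procedure, the point is that ownership of each step is explicit.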
The skills needed in, and the responses required from, a NOC analyst and SOC analyst are vastly different
NOC analysts require a proficiency in network, systems and application engineering, whereas SOC analysts require skills in security engineering. The tools and processes used for monitoring and investigating events also differ, as does the interpretation of the data they produce: a NOC analyst may interpret a device outage as an indicator of hardware failure, while a SOC analyst may interpret that same event as evidence of a compromised device. Likewise, using the example I gave above, high bandwidth utilisation will cause the NOC to take steps to ensure availability. In contrast, the SOC may first question the cause of the traffic spike, the reputation of its origin and correlations against other known attacks.
One of the biggest differences between a SOC and a NOC is that a SOC is looking for “intelligent adversaries” as opposed to naturally occurring system events such as network outages, system crashes and disk failures. While these naturally occurring system events can, in fact, be caused by the actions of “intelligent adversaries”, the NOC’s concern is the restoration of quality of service as soon as possible – even if this involves the destruction of evidence that would allow the cause to be investigated.
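To make the difference in interpretation concrete, here is a small Python sketch in which one event is read by both teams. The event kinds, the lookup tables and the triage function are illustrative assumptions, not the behaviour of any particular monitoring or SIEM tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str    # hypothetical labels, e.g. "device_outage", "high_bandwidth"
    source: str  # device or link that raised the event


# Assumed first reading of each event kind by each team, taken from the
# two examples in the text above.
NOC_VIEW = {
    "device_outage": "possible hardware failure: restore service",
    "high_bandwidth": "protect availability",
}
SOC_VIEW = {
    "device_outage": "possible compromised device: investigate",
    "high_bandwidth": "question cause, origin reputation, known-attack correlation",
}


def triage(event: Event) -> dict:
    """Return how each team would first read the same event."""
    return {
        "NOC": NOC_VIEW.get(event.kind, "no playbook"),
        "SOC": SOC_VIEW.get(event.kind, "no playbook"),
    }


views = triage(Event("device_outage", "core-switch-01"))
print(views["NOC"])
print(views["SOC"])
```

The data is identical in both cases; only the question asked of it changes.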
Staff attrition is waaaaaay worse in a SOC
Level 1 SOC Analysts, those responsible for the triage of incoming events, burn out with often alarming regularity. The average tenure of a Level 1 SOC Analyst is typically less than two years, and turnover can be as high as 20% per annum. In contrast, the tenure and turnover of NOC staff are typically much better.
This attrition within a SOC needs to be planned for with a suitable feeder pool of new candidates and an effective on-boarding training scheme to teach them about the use of the SIEM platform, the analytical skills needed to investigate incidents and internal procedures. Developing a career progression plan for your analysts will also allow you to retain these valuable resources within your business, potentially moving them to security engineering or incident response positions.
Despite everything I’ve said above, it is possible to run an effective combined SOC/NOC, but it can take more effort, operational expense and better governance than running them as separate functions. The potential benefits lie in the introduction of a single point-of-contact for all security and operational issues, as well as the tight integration between those who discover and react to information security incidents, and those who have to deploy and manage the mitigations post event. Whether you choose to keep the functions separate or integrate them, it is important to understand the differences between the functions.
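As a back-of-the-envelope aid for sizing that feeder pool, the Python sketch below turns an annual turnover figure into a number of replacement hires to plan for. The 20% figure is the upper bound quoted above; the team sizes and the function name are made up for illustration, and the calculation ignores growth, notice periods and ramp-up time.

```python
def replacement_hires_per_year(headcount: int, turnover_percent: int) -> int:
    """Expected leavers per year, rounded up to whole people.

    Integer arithmetic is used so the rounding is exact.
    """
    if headcount < 0 or not 0 <= turnover_percent <= 100:
        raise ValueError("headcount must be >= 0 and turnover within 0-100")
    return (headcount * turnover_percent + 99) // 100


# Hypothetical Level 1 teams at 20% annual turnover.
print(replacement_hires_per_year(10, 20))  # 2
print(replacement_hires_per_year(12, 20))  # 3
```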
Image: Justin Baeder