Security Operations Center - A Complete Guide to Building and Running a SOC
Security Operations Center - SOC models, SIEM/SOAR/XDR stack, L1-L3 team structure, MTTD/MTTR metrics, and OT SOC specifics.
The 2021 ransomware attack on Colonial Pipeline did not reach the pipeline's control systems directly. Attackers entered through a compromised VPN account and encrypted IT systems, and the company halted pipeline operations as a precaution. The incident shows how much depends on detecting an intrusion early and responding before it spreads. This is exactly why a Security Operations Center exists: it provides continuous monitoring, rapid analysis, and coordinated incident response before events escalate into a crisis.
According to the SANS Institute, a SOC is a combination of people, processes, and technology that protect an organization’s information systems through proactive design, configuration, continuous state monitoring, detection of unwanted activities, and mitigation of incident consequences.
This guide was created as a consolidation of a five-part article series, supplemented with current data from 2025-2026, with special emphasis on OT environment specifics - where a SOC requires a different approach than in traditional IT.
Security Operations Center Models
The choice of SOC model depends on organization size, budget, staff availability, and strategic priorities. No single solution fits every company.
| Model | Characteristics | Best for | Drawbacks |
|---|---|---|---|
| SOC-as-a-Service | Full outsourcing to an MSSP | Organizations without their own team and infrastructure | Limited control, vendor dependency |
| Hybrid (co-managed) | In-house team + external MSSP | Companies with limited staff needing 24/7 coverage | Requires clear division of responsibilities |
| Dedicated (in-house) | Own center, own 24/7 staff | Large organizations subject to regular attacks | High cost of maintenance and recruitment |
| Multifunctional SOC/NOC | Combined SOC and Network Operations Center | Telecom, companies with extensive IT infrastructure | Risk of splitting attention between tasks |
| Virtual SOC | Distributed team, no permanent location | Outsourcing, remote organizations | Mainly reactive, weaker coordination |
| Command SOC | Manages and coordinates other SOCs | Multi-branch corporations, global operations | Requires the most highly qualified staff |
TIP
For organizations with OT infrastructure the hybrid model works particularly well: an in-house Tier 2/3 team familiar with industrial processes, supported by an external Tier 1 handling 24/7 monitoring.
SOC Technology Stack
Every SOC relies on a set of tools that collect, correlate, and analyze security data. Below is a full map of technology categories that a mature SOC should cover.
SIEM - The Central Security Data Hub
Security Information and Event Management is the core of every SOC. A SIEM collects logs from systems, network devices, applications, and security tools, correlates events, and generates alerts based on defined rules.
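To make "correlates events and generates alerts based on defined rules" concrete, here is a minimal sketch of one correlation rule: several failed logins from one source followed by a success. The event tuple layout, the function name `detect_bruteforce`, and the thresholds are assumptions for illustration, not the rule syntax of any SIEM product.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

def detect_bruteforce(events, threshold=5, window=timedelta(minutes=5)):
    """Alert when `threshold` failed logins from one source IP are
    followed by a successful login within `window`.

    `events` is an iterable of (timestamp, source_ip, user, outcome)
    tuples - a hypothetical normalized log format.
    """
    failures = defaultdict(deque)  # source_ip -> timestamps of recent failures
    alerts = []
    for ts, src, user, outcome in sorted(events):
        recent = failures[src]
        # Drop failures that have fallen out of the sliding window.
        while recent and ts - recent[0] > window:
            recent.popleft()
        if outcome == "failure":
            recent.append(ts)
        elif outcome == "success" and len(recent) >= threshold:
            alerts.append({
                "rule": "bruteforce_then_success",
                "source_ip": src,
                "user": user,
                "time": ts,
                "failed_attempts": len(recent),
            })
            recent.clear()
    return alerts

# Illustrative data: five failures, then a success from the same address.
t0 = datetime(2025, 1, 1, 12, 0, 0)
sample_events = [
    (t0 + timedelta(seconds=10 * i), "203.0.113.7", "admin", "failure")
    for i in range(5)
]
sample_events.append((t0 + timedelta(seconds=60), "203.0.113.7", "admin", "success"))
sample_events.append((t0 + timedelta(seconds=30), "198.51.100.4", "alice", "success"))
alerts = detect_bruteforce(sample_events)
```

In a real deployment the same logic would be expressed in the platform's own query or rule language; the sketch only shows the sliding-window idea behind it.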
Leading SIEM platforms:
| Platform | Strengths | Notes |
|---|---|---|
| Microsoft Sentinel | Native Azure/M365 integration, AI-powered analytics | Cloud-native, per-GB pricing model |
| Splunk Enterprise Security | Advanced search, flexibility | High costs at large log volumes |
| IBM QRadar | Mature event correlation, OT-ready | On-prem and cloud, good IBM X-Force integration |
| Elastic Security | Open-source foundation, scalability | Requires more configuration effort |
The SIEM market is undergoing a transformation. According to Frost & Sullivan, the global SIEM market will reach $13.55 billion by 2029. At the same time, 73% of security leaders are considering changing their SIEM vendor, and 44% plan a full replacement. This is driven by a consolidation trend - platforms unifying SIEM, EDR, NDR, and SOAR into a single solution are displacing fragmented toolsets.
SOAR - Incident Response Automation
Security Orchestration, Automation and Response automates the reaction to security events, so routine cases can be handled with little or no human involvement. SOAR combines threat data from multiple sources, supports analysis, triage, and prioritization of incidents, and then executes predefined response playbooks.
Key platforms: Cortex XSOAR (Palo Alto), Splunk SOAR (formerly Phantom), IBM QRadar SOAR, Microsoft Sentinel with built-in SOAR.
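A playbook of the kind described above can be sketched as a fixed sequence of steps: enrich, triage, respond. Everything here is hypothetical: the `Incident` fields, the `KNOWN_BAD` set standing in for a threat intelligence lookup, and the severity weights. Real SOAR platforms define playbooks in their own formats.

```python
from dataclasses import dataclass, field

@dataclass
class Incident:
    alert_id: str
    host: str
    indicator: str
    severity: int = 0
    actions: list = field(default_factory=list)

# Assumption: stands in for a call to a threat intelligence service.
KNOWN_BAD = {"203.0.113.7"}

def enrich(incident):
    """Raise severity when the indicator is already known to be malicious."""
    if incident.indicator in KNOWN_BAD:
        incident.severity += 50
    incident.actions.append("enriched")
    return incident

def triage(incident, critical_hosts):
    """Raise severity when the affected host is business-critical."""
    if incident.host in critical_hosts:
        incident.severity += 30
    incident.actions.append("triaged")
    return incident

def respond(incident, auto_threshold=70):
    """Isolate automatically above the threshold, otherwise hand to an analyst."""
    if incident.severity >= auto_threshold:
        incident.actions.append(f"isolate:{incident.host}")
    else:
        incident.actions.append("queue_for_analyst")
    return incident

def run_playbook(incident, critical_hosts=frozenset()):
    """Run the predefined steps in order and return the updated incident."""
    incident = enrich(incident)
    incident = triage(incident, critical_hosts)
    return respond(incident)

high = run_playbook(Incident("A-1", "srv-db-01", "203.0.113.7"), {"srv-db-01"})
low = run_playbook(Incident("A-2", "wks-17", "192.0.2.10"), {"srv-db-01"})
```

The rigid step order is deliberate: it mirrors the predefined nature of classic playbooks.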
EDR/XDR - Endpoint Protection and Extended Correlation
- EDR (Endpoint Detection and Response) - monitors activity on endpoints, detects malicious behavior, enables isolation of infected systems
- XDR (Extended Detection and Response) - extends EDR with network, cloud, and email data, correlating events across layers
2025-2026 trend: XDR is absorbing the functions of traditional SIEM and SOAR. According to Trend Micro, the boundary between these categories is blurring, and the market is ultimately heading toward unified SOC platforms.
NDR - Network Traffic Visibility
Network Detection and Response analyzes network traffic for anomalies, lateral attacker movement, and command-and-control communications. In OT environments NDR is particularly important because many industrial devices do not support EDR agents.
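One way command-and-control traffic can show up in flow data is as connections at suspiciously regular intervals ("beaconing"). The sketch below scores each source/destination pair by the coefficient of variation of its inter-arrival times. The flow tuple layout, the function names, and the 0.1 threshold are assumptions for illustration; commercial NDR products use considerably richer models.

```python
from statistics import mean, pstdev

def beacon_score(timestamps):
    """Coefficient of variation of the gaps between connections.

    Values near 0 mean very regular timing. Returns None when there
    are too few connections to judge.
    """
    if len(timestamps) < 4:
        return None
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average = mean(gaps)
    if average == 0:
        return None
    return pstdev(gaps) / average

def find_beacons(flows, max_cv=0.1):
    """Return (src, dst) pairs whose connection timing looks periodic.

    `flows` is an iterable of (timestamp_seconds, src_ip, dst_ip).
    """
    by_pair = {}
    for ts, src, dst in flows:
        by_pair.setdefault((src, dst), []).append(ts)
    suspects = []
    for pair, stamps in by_pair.items():
        score = beacon_score(stamps)
        if score is not None and score <= max_cv:
            suspects.append(pair)
    return suspects

# Illustrative data: one host connecting every 60 seconds, one browsing irregularly.
regular = [(60 * i, "10.0.0.5", "198.51.100.9") for i in range(10)]
irregular = [(t, "10.0.0.8", "192.0.2.44") for t in (0, 5, 200, 210, 900, 905, 2000)]
suspects = find_beacons(regular + irregular)
```

Because this works purely on observed traffic, the same approach applies to devices that cannot run an agent.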
Threat Intelligence Platform (TIP)
A Cyber Threat Intelligence (CTI) platform aggregates threat information from multiple sources - commercial feeds, OSINT, sector-specific ISACs - and enriches SOC alerts with context. CTI operates at three levels:
- Strategic - who attacks and why, geopolitical context, trends; for the CISO and board
- Operational - how and where they attack, TTPs of specific APT groups; for IR analysts
- Tactical - IoCs (IP addresses, hashes, domains), YARA/Sigma rules; for automated systems
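At the tactical level, enrichment often comes down to matching alert fields against indicator lists. A minimal sketch follows, assuming a hypothetical in-memory `IOC_FEED` and alert field names (`src_ip`, `dst_ip`, `domain`). A real platform would pull indicators from its feeds and handle expiry and confidence scoring.

```python
# Assumption: a tiny in-memory feed; addresses use documentation ranges.
IOC_FEED = {
    "ip": {"203.0.113.7"},
    "domain": {"malicious.example"},
}

def match_domain(domain, bad_domains):
    """True if the domain or any of its parent domains is listed."""
    labels = domain.lower().rstrip(".").split(".")
    return any(".".join(labels[i:]) in bad_domains for i in range(len(labels)))

def enrich_alert(alert, feed):
    """Return a copy of the alert annotated with any IoC matches."""
    hits = []
    for field_name in ("src_ip", "dst_ip"):
        value = alert.get(field_name)
        if value and value in feed["ip"]:
            hits.append(("ip", value))
    domain = alert.get("domain")
    if domain and match_domain(domain, feed["domain"]):
        hits.append(("domain", domain))
    return {**alert, "ioc_hits": hits, "ioc_match": bool(hits)}

flagged = enrich_alert(
    {"src_ip": "10.0.0.5", "dst_ip": "203.0.113.7", "domain": "cdn.malicious.example"},
    IOC_FEED,
)
clean = enrich_alert({"src_ip": "10.0.0.5", "dst_ip": "192.0.2.1"}, IOC_FEED)
```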
NOTE
In OT environments it is essential to use CTI sources specific to industrial systems - such as Dragos reports, CISA ICS advisories (formerly published under ICS-CERT), or the MITRE ATT&CK for ICS framework.
Additional Technology Stack Components
- UEBA (User and Entity Behavior Analytics) - detecting anomalies in user and device behavior, particularly important for insider threats
- IDS/IPS - intrusion detection and prevention systems monitoring network traffic
- CASB (Cloud Access Security Broker) - visibility and access control for cloud applications
- Vulnerability Management - continuous scanning, prioritization, and tracking of vulnerabilities (e.g., Tenable)
- GRC (Governance, Risk and Compliance) - risk and regulatory compliance management
- Firewalls - rule management, log analysis
People - SOC Team Structure
A SOC is first and foremost about people. The full team structure covers four escalation levels, each requiring different competencies.
Tier 1 - Alert Investigator
The most junior position in the SOC hierarchy. Responsible for continuous system monitoring, analyzing alerts generated by the SIEM, determining their significance and urgency, and triaging events. When triage points to a probable incident, the analyst escalates it by creating a ticket for Tier 2.
Required competencies: system administration (Windows/Linux), basic programming (Python, PowerShell), security certifications (Security+, GCIA).
Tier 2 - Incident Responder
Conducts in-depth analysis of alerts escalated from Tier 1 - determines the attack source, threat nature, scope, and affected systems. Develops a containment and remediation strategy for the incident, then implements it.
Required competencies: incident response experience, forensics, malware analysis, threat intelligence. Certifications: GCIH, ECIH.
Tier 3 - Threat Hunter
A security expert who proactively searches for threats not yet detected by automated systems. Uses advanced analytics, knowledge of APT group TTPs, and attack hypotheses to identify hidden adversaries in the network.
Required competencies: advanced forensics, reverse engineering, red teaming experience, deep knowledge of MITRE ATT&CK.
SOC Manager / Director
Manages SOC operations, reports to the CISO, defines processes, metrics, and budget. Responsible for recruitment, team development, and relationships with business stakeholders.
SOC Processes
The primary task of a SOC is identifying and responding to threats. The scope of operational processes depends on organizational maturity.
1. Monitoring and Analysis
Continuous data collection and analysis - user activity, firewall behavior, system events. The SOC team, drawing on current threat intelligence, looks for patterns and anomalies requiring attention. Key principle: document every step of the investigation process.
2. Incident Response, Containment, and Recovery
Detection and response speed directly impacts the scale of damage. The type, scope, and severity of an incident determine how it is handled - from isolating affected systems, through reimaging and patching, to full forensic analysis and threat eradication.
3. Post-Incident Assessment and Audit
After every incident - lessons learned analysis, procedure updates, and detection rule refinement. Incident documentation serves as evidence for audit and regulatory purposes.
4. Asset Inventory and Baseline
You cannot protect what you do not know. Asset inventory includes identifying all devices on the network, installed systems, applications, and services. Equally important is establishing a baseline - the normal operating state of the system, against which anomalies are detected. More about inventory in the OT context is covered in the ICS asset inventory article.
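The idea of a baseline can be illustrated with a simple statistical check: record normal values of a metric, then flag observations that deviate by more than a few standard deviations. The metric (hourly connection count), the function names, and the 3-sigma threshold are assumptions chosen for the sketch. Production baselining usually accounts for seasonality and multiple features.

```python
from statistics import mean, stdev

def build_baseline(samples):
    """Summarize normal behavior as mean and sample standard deviation."""
    return {"mean": mean(samples), "stdev": stdev(samples)}

def is_anomalous(value, baseline, z_threshold=3.0):
    """True when the value lies more than z_threshold deviations from the mean."""
    if baseline["stdev"] == 0:
        return value != baseline["mean"]
    z = abs(value - baseline["mean"]) / baseline["stdev"]
    return z > z_threshold

# Illustrative data: hourly connection counts observed during normal operation.
normal_hours = [100, 104, 98, 101, 97, 103, 99, 102]
baseline = build_baseline(normal_hours)
```

With this baseline, a reading of 105 stays within normal variation, while a reading of 160 is flagged.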
5. Vulnerability Management
A continuous process of identifying, prioritizing, and eliminating vulnerabilities. This is not a one-time activity - it requires automation and integration with patching processes.
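Prioritization is the step that benefits most from automation. One possible scheme, sketched below, weights the CVSS base score by asset criticality and doubles it when exploitation is known. The weights, field names, and vulnerability identifiers are hypothetical and would need tuning to the organization's own risk model.

```python
def risk_score(vuln, asset_criticality):
    """CVSS base score scaled by asset criticality, doubled if actively exploited."""
    score = vuln["cvss"] * asset_criticality.get(vuln["asset"], 1.0)
    if vuln.get("known_exploited"):
        score *= 2
    return score

def prioritize(vulns, asset_criticality):
    """Return vulnerabilities ordered from highest to lowest risk."""
    return sorted(vulns, key=lambda v: risk_score(v, asset_criticality), reverse=True)

# Assumption: illustrative criticality weights and placeholder identifiers.
criticality = {"test-vm": 0.5, "historian": 2.0, "file-srv": 1.0}
findings = [
    {"id": "VULN-A", "asset": "test-vm", "cvss": 9.8},
    {"id": "VULN-B", "asset": "historian", "cvss": 7.5, "known_exploited": True},
    {"id": "VULN-C", "asset": "file-srv", "cvss": 8.1},
]
ordered_ids = [v["id"] for v in prioritize(findings, criticality)]
```

Note that the highest CVSS score (VULN-A) ends up last because it sits on a low-value test machine, which is the point of contextual prioritization.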
6. Training and Awareness Building
Cybersecurity cannot be the exclusive domain of the SOC team. It is a culture that must permeate the entire organization - from IT and OT to the board.
Key SOC Metrics
SOC effectiveness is measured by specific indicators that help identify areas for improvement and report to the board.
| Metric | What it measures | Why it matters |
|---|---|---|
| MTTD (Mean Time to Detect) | Time from event occurrence to confirmation as an incident | Shorter MTTD = less time for the attacker to operate undetected |
| MTTR (Mean Time to Respond) | Time from incident confirmation to full containment and remediation | Determines the scale of damage |
| False positive rate | Percentage of alerts that turn out to be false positives | High FP rate leads to alert fatigue |
| Detection coverage | Percentage of MITRE ATT&CK techniques covered by detection rules | Measures detection maturity |
| Containment time | Time from detection to isolation of the compromised system | Limits the attacker’s lateral movement |
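The metrics in the table reduce to simple arithmetic over incident timestamps and alert counts. A minimal sketch, assuming each incident record carries hypothetical `occurred`, `confirmed`, and `remediated` timestamps matching the definitions above:

```python
from datetime import datetime, timedelta

def mean_delta(pairs):
    """Average duration over a list of (start, end) datetime pairs."""
    deltas = [end - start for start, end in pairs]
    return sum(deltas, timedelta()) / len(deltas)

def soc_metrics(incidents, alerts_total, alerts_false_positive,
                techniques_covered, techniques_in_scope):
    """Compute MTTD, MTTR, false positive rate, and detection coverage."""
    return {
        # MTTD: event occurrence -> confirmation as an incident
        "mttd": mean_delta([(i["occurred"], i["confirmed"]) for i in incidents]),
        # MTTR: confirmation -> full containment and remediation
        "mttr": mean_delta([(i["confirmed"], i["remediated"]) for i in incidents]),
        "false_positive_rate": alerts_false_positive / alerts_total,
        "detection_coverage": techniques_covered / techniques_in_scope,
    }

# Illustrative records, not real measurements.
day = datetime(2025, 3, 1)
incidents = [
    {"occurred": day.replace(hour=8), "confirmed": day.replace(hour=10),
     "remediated": day.replace(hour=16)},
    {"occurred": day.replace(hour=9), "confirmed": day.replace(hour=13),
     "remediated": day.replace(hour=15)},
]
metrics = soc_metrics(incidents, alerts_total=400, alerts_false_positive=120,
                      techniques_covered=45, techniques_in_scope=150)
```

For these two sample incidents, MTTD is 3 hours ((2 h + 4 h) / 2) and MTTR is 4 hours ((6 h + 2 h) / 2).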
WARNING
Alert fatigue is a real operational threat. According to Vectra AI (2026), SOC teams receive an average of 2,992 alerts per day, and 63% of them remain uninvestigated. Triage automation and contextual alert enrichment are prerequisites for effectiveness.
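Triage automation can start small: collapse duplicate alerts and rank what remains by severity and asset importance, so analysts see the few items that matter first. The alert fields, the `asset_weight` mapping, and the scoring formula below are assumptions for illustration only.

```python
def triage_queue(alerts, asset_weight, top_n=3):
    """Deduplicate alerts by (rule, host) and rank them by priority.

    Priority = highest severity seen for the group * weight of the host.
    """
    grouped = {}
    for alert in alerts:
        key = (alert["rule"], alert["host"])
        entry = grouped.setdefault(
            key, {"rule": alert["rule"], "host": alert["host"], "severity": 0, "count": 0}
        )
        entry["count"] += 1
        entry["severity"] = max(entry["severity"], alert["severity"])
    for entry in grouped.values():
        entry["priority"] = entry["severity"] * asset_weight.get(entry["host"], 1)
    return sorted(grouped.values(), key=lambda e: e["priority"], reverse=True)[:top_n]

# Illustrative data: six raw alerts that collapse into three work items.
raw_alerts = (
    [{"rule": "port_scan", "host": "wks-1", "severity": 2}] * 3
    + [{"rule": "ransomware_behavior", "host": "hmi-01", "severity": 8}]
    + [{"rule": "failed_login", "host": "srv-2", "severity": 4}] * 2
)
queue = triage_queue(raw_alerts, asset_weight={"hmi-01": 3})
```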
SOC for OT Environments
Security monitoring of industrial systems requires a different approach than a traditional IT SOC. The differences affect nearly every aspect - from priorities through tools to team competencies.
How an OT SOC Differs from an IT SOC
| Aspect | IT SOC | OT SOC |
|---|---|---|
| Priority | Confidentiality (CIA: C-first) | Availability and safety (Safety-first) |
| System lifecycle | 3-5 years | 15-25 years |
| Patching | Regular updates | Maintenance windows, vendor approval, testing |
| Agent-based monitoring | Standard (EDR on every endpoint) | Often impossible - passive network monitoring |
| Protocols | TCP/IP, HTTP/S, DNS | Modbus, OPC UA, DNP3, IEC 61850, EtherNet/IP |
| False positive tolerance | Medium | Very low - a false alert can cause unnecessary downtime |
| Incident response | Isolation, reimaging | Controlled disconnection preserving process continuity |
OT SOC-Specific Tools
Traditional IT SIEM tools do not understand industrial protocols. That is why a SOC serving OT environments needs dedicated solutions:
- Dragos Platform - OT network monitoring, threat detection, ICS-specific threat intelligence
- Claroty xDome - OT asset visibility, risk analysis, anomaly detection
- Nozomi Networks - NDR for industrial networks, IT SIEM integration
These platforms integrate with traditional SIEMs (Sentinel, Splunk, QRadar), passing OT context to the central SOC. More about the differences between IT and OT security is covered in the article OT and IT security - together or separate.
Organizational Challenges
The biggest barrier is not technology but people and processes. IT security teams and OT engineers have historically operated in silos with different priorities. Many OT environments lack a complete asset inventory due to limited network visibility and the absence of telemetry compatible with modern tools. Aging OT devices often do not tolerate intrusive monitoring, which limits the use of active discovery methods.
TIP
When building a SOC covering OT, start with passive network monitoring - it requires no agents on devices and does not impact industrial processes. Only after achieving full visibility should you expand detection with rules specific to OT protocols.
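As an example of a rule specific to an OT protocol, the sketch below inspects a passively captured Modbus/TCP payload and flags write requests from sources that are not on an allowlist. It parses only the 7-byte MBAP header and the function code. The allowlist, alert format, and function names are assumptions, and a production sensor would also handle TCP reassembly, exception responses, and malformed frames.

```python
import struct

# Standard Modbus function codes that modify device state.
WRITE_FUNCTION_CODES = {
    5: "Write Single Coil",
    6: "Write Single Register",
    15: "Write Multiple Coils",
    16: "Write Multiple Registers",
}

def parse_modbus_tcp(payload):
    """Parse the MBAP header and function code; return None if not Modbus/TCP."""
    if len(payload) < 8:
        return None
    transaction_id, protocol_id, length, unit_id, function_code = struct.unpack(
        ">HHHBB", payload[:8]
    )
    if protocol_id != 0:  # Modbus/TCP always uses protocol identifier 0
        return None
    return {
        "transaction_id": transaction_id,
        "length": length,
        "unit_id": unit_id,
        "function_code": function_code,
    }

def check_write(payload, src_ip, allowed_writers):
    """Return an alert dict for a write request from a non-allowlisted source."""
    frame = parse_modbus_tcp(payload)
    if frame is None:
        return None
    name = WRITE_FUNCTION_CODES.get(frame["function_code"])
    if name and src_ip not in allowed_writers:
        return {
            "alert": "unauthorized_modbus_write",
            "src_ip": src_ip,
            "function": name,
            "unit_id": frame["unit_id"],
        }
    return None

# Illustrative frames: write register 1 = 0x00FF, and read 10 holding registers.
write_frame = bytes.fromhex("000100000006010600010 0ff".replace(" ", ""))
read_frame = bytes.fromhex("00020000000601030000000a")
engineering_stations = {"10.0.0.10"}  # hypothetical allowlist
alert = check_write(write_frame, "10.0.0.99", engineering_stations)
```

Since the check only reads copies of traffic, it does not touch the devices themselves, which fits the passive-first approach.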
The Human Factor - Burnout and the Skills Gap
A SOC is one of the most demanding work environments in cybersecurity.
- […] of SOC analysts experience burnout
- […] are considering leaving their role within a year
- […] of organizations have experienced consequences of the skills gap
- […] of respondents report at least one skills gap
Sources: Tines Voice of the SOC Analyst 2025, ISC2 Cybersecurity Workforce Study 2025
The ISC2 2025 report points to a paradigm shift: the problem is no longer just a shortage of people but a shortage of the right skills. 59% of organizations (up from 44%) report critical or significant skills deficits. The solution is not just recruitment but investment in developing the existing team - cross-training between IT and OT, certifications, and mentoring.
AI in the SOC - 2025-2026 Trends
Artificial intelligence is changing how SOCs operate but is not replacing people.
Gartner formally named “AI SOC Agents” as a category in June 2025. According to their forecast, by the end of 2026 30% or more of SOC processes in large organizations will be performed by AI agents.
Agentic SOC - The Next Generation of Automation
Traditional SOAR operates on rigid, predefined playbooks. The Agentic SOC is the next step - AI agents that can reason, plan, and act dynamically. Instead of executing scripts, the agent evaluates context, connects patterns across disparate data, and independently selects the optimal investigation path.
Real Benefits of AI in the SOC
According to IBM (2025), organizations with high levels of AI and automation adoption:
- save $1.9 million per breach
- reduce the breach lifecycle by 80 days
- cut response time by more than 50%
Limitations and Risks
Satisfaction with AI/ML tools in the SOC ranks last among SOC technologies (SANS 2025). The technology is being adopted but is not yet mature. AI in the SOC excels at automating repetitive tasks (alert enrichment, initial classification) but containment and escalation decisions still require human judgment - especially in OT environments where an incorrect automated response can cause physical consequences.
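The split between automatable tasks and human decisions can be encoded as an explicit approval gate. The sketch below is one possible policy, with hypothetical action names. It lets repetitive tasks run automatically and routes containment, escalation, and anything unrecognized to a person.

```python
# Assumption: illustrative action names, not tied to any product.
AUTOMATABLE = {"enrich_alert", "classify_alert", "deduplicate_alerts"}
HUMAN_DECISIONS = {"isolate_host", "block_account", "escalate_incident"}

def route_action(action, environment="IT"):
    """Decide whether an action may run automatically or needs human approval."""
    if action in AUTOMATABLE:
        return {"action": action, "mode": "auto", "reason": "repetitive task"}
    if action in HUMAN_DECISIONS:
        reason = "containment or escalation decision"
        if environment == "OT":
            reason += " with possible physical consequences"
    else:
        # Fail safe: anything not explicitly listed goes to a person.
        reason = "unrecognized action"
    return {"action": action, "mode": "human_approval", "reason": reason}

enrichment = route_action("enrich_alert")
ot_isolation = route_action("isolate_host", environment="OT")
unknown = route_action("reboot_plc", environment="OT")
```

Defaulting unknown actions to human approval is a deliberate design choice here: in OT, a missed automation opportunity costs less than an unintended process stop.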
Mapping the SOC to Standards and Regulations
A SOC does not operate in a regulatory vacuum. SOC processes and metrics map directly to the requirements of key frameworks and standards.
| Standard / Regulation | SOC Relevance |
|---|---|
| NIST CSF 2.0 | The Detect and Respond functions directly describe SOC processes. MTTD/MTTR metrics support demonstrating effectiveness |
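| ISO/IEC 27001 | Annex A.8 (Technological controls in the 2022 edition; Operations security was A.12 in the 2013 edition) covers logging (8.15), monitoring activities (8.16), and management of technical vulnerabilities (8.8) |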
| IEC 62443 | For OT SOCs - requirements for monitoring security zones and conduits |
| NIS2 | Requirement for incident detection, response, and reporting capabilities - a SOC is the natural way to meet these requirements |
| DORA | For the financial sector - requirement for continuous monitoring and resilience testing, including detection capabilities |
Checklist - Building a SOC
The following checklist will help assess an organization’s readiness to launch or mature a SOC:
- SOC model defined (in-house, hybrid, outsourced) matched to organization size and risks
- Complete IT and OT asset inventory
- SIEM deployed with established correlation rules and baseline
- Incident response playbooks defined for at least the 10 most common scenarios
- Team structure with clear Tier 1/2/3 division and escalation paths
- Threat intelligence integration (commercial feeds + OSINT + sector ISACs)
- MTTD/MTTR metrics measured and reported regularly
- Team skills development plan (certifications, IT/OT cross-training)
- For OT: passive network monitoring + dedicated tool for industrial protocols
- Regular tabletop exercises and incident scenario testing
- SOC process mapping to regulatory requirements (NIST CSF, ISO 27001, NIS2)
- Analyst burnout management and retention program
How SEQRED Supports SOC Capability Building
SEQRED helps organizations design and develop security monitoring capabilities - from OT asset inventory, through selecting detection tools, to developing incident response playbooks. Our experience in securing industrial systems and conducting penetration testing of OT infrastructure translates into SOC processes that account for the specifics of industrial protocols, legacy device constraints, and physical process safety requirements.
Sources
- SANS Institute - SOC Definition and Best Practices
- Frost & Sullivan - Modern SIEM Market to Reach $13.55 Billion by 2029
- Tines - Voice of the SOC Analyst 2025
- ISC2 - 2025 Cybersecurity Workforce Study
- Vectra AI - SOC Operations Guide 2026
- IBM - Security Operations in the AI Era 2025
- Help Net Security - Autonomous SOC Operations 2026
- Industrial Cyber - Rethinking Next-Generation OT SOC
- Trend Micro - XDR, SIEM & SOAR Convergence
- NIST CSF 2.0