Security Operation Centers: Analyzing COVID-19's Work-from-Home Influence on Endpoint Management and Developing a Sociotechnical Metrics Framework
Drew Davidson
Fengjun Li
Bo Luo
John Symons
Security Operations Centers (SOCs) are central components of modern enterprise networks. Organizations in industry, government, and academia deploy SOCs to manage their networks, defend against cyber threats, and maintain regulatory compliance. For reporting, SOC leadership typically uses metrics such as “number of security incidents”, “mean time to remediation/ticket closure”, and “risk analysis”, to name a few. However, these commonly used metrics do not necessarily reflect the effectiveness of a SOC and its supporting tools.
To better understand these environments, we employ ethnographic approaches (e.g., participant observation) and embed a graduate student (the field worker) in a real-world SOC. While the field worker worked in person alongside SOC employees and recorded observations on technological tools, employees, and culture, COVID-19's work-from-home (WFH) phenomenon occurred. In response, this dissertation traces and analyzes the SOC's effort to adapt and reprioritize. By intersecting historical analysis (starting in the 1970s) with ethnographic field notes (352 field notes across 1,000+ hours in a SOC over 34 months), complemented by quantitative interviews (covering 7 other SOCs), we find additional causal forces that, for decades, have pushed SOC network management toward endpoints.
Although endpoint management is not a novel concept to SOCs, COVID-19's WFH phenomenon highlighted the need for flexible, supportive, and customizable metrics. As such, we develop a sociotechnical metrics framework with these qualities in mind and limit its scope to a core SOC function: alert handling. With a similar ethnographic approach (participant observation paired with semi-structured interviews covering 15 SOC employees across 10 SOCs), we develop the framework's foundation by analyzing and capturing the alert handling process (a.k.a. alert triage). This process demonstrates the significance of not only technical expertise (e.g., data exfiltration, command and control) but also social characteristics (e.g., collaboration, communication). In fact, we point out the underlying presence and importance of expert judgment during alert triage, particularly during conclusion development.
In addition to the aforementioned qualities, our sociotechnical metrics framework for alert handling aims to capture current gaps in the alert triage process that, if addressed, could improve SOC employees' effectiveness. Focusing on this process and on the limitations we uncovered that SOCs commonly face today during alert handling, we validate not only the flexibility of our framework but also its accuracy in a real-world SOC.