Security Analytics in Action: Use Cases for Deep Monitoring of Privileged Users

October 13, 2014 Derek Lin

With recent media coverage of data breaches, companies are actively looking for stronger security measures.

Insider threat detection is one of the most interesting and valuable security use cases, particularly when the threat comes from privileged users. Privileged users have the right to access critical resources and perform activities for day-to-day operations. With significant access privileges comes the risk of abuse. For example, the WikiLeaks case involving Manning resulted in the loss of sensitive government documents.

In this post, we address the problem of detecting privilege misuse by help desk administrators and show how data science is applied to monitor activities and alert on anomalies. Beyond our general experience with customers in this area, the approach also describes an actual Pivotal Data Science Lab engagement with a client. The real-world program goes beyond the use of authentication records for user-to-resource access anomaly detection, as covered in an earlier blog post. Here, we examine the actual, most granular command-level user activities and add service ticket information to the analysis. We text mine historical data for behavioral norms and flag anomalous command-level activities. Such deep monitoring over unstructured data presents a newer opportunity to solve problems within security analytics use cases.

Current Problems with Existing Solutions and Scenarios

Help desk administrators respond to support request tickets. At minimum, administrator access privileges, whenever required, must be approved and, at best, are made available on a temporary basis. While there are commercial products supporting this workflow, having the right policy and process for access control is insufficient to prevent privilege misuse. Beyond access control, security workers use simple behavioral rules that trigger alerts based on resource access statistics, looking for events such as a high velocity of failed logins or excessive traffic volume. These rules have limited success because such conspicuous signals are generally not present in the activities of legitimate but ill-intentioned users.

Some enterprises go one step further in their attempt to detect privilege misuse by logging command-level activities and analyzing them. However, it is not straightforward to extract an informational signal from unstructured text data in these cases. One common approach is to index all word tokens in logged commands to enable keyword-matching alerts. These simple keyword-based search rules almost always result in a very high false positive rate, as we saw with a client we worked with. In fact, the client had trouble staffing its Security Operations team at a level that could keep up with the ever-increasing volume of alerts to investigate. Many security use cases rely on indexing and keyword searching over logs as the primary tool for manual, post-incident investigations. These approaches are not suited to the kind of proactive monitoring solution companies are looking for.
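To make the false positive problem concrete, here is a minimal sketch of a keyword-matching rule. The watch list and the log lines are made up for illustration and are not taken from the client engagement.

```python
import re

# Hypothetical watch list; a real deployment would maintain its own terms.
WATCH_KEYWORDS = {"passwd", "shadow", "sudo"}

def keyword_alert(command):
    """Return True if any word token in the logged command is on the watch list."""
    tokens = re.findall(r"[a-z0-9_]+", command.lower())
    return any(tok in WATCH_KEYWORDS for tok in tokens)

# Made-up log lines for illustration only.
logged_commands = [
    "passwd jsmith",                     # routine password reset
    "sudo service httpd restart",        # routine service restart
    "cp /etc/passwd /tmp/.cache/p.bak",  # the copy we actually care about
    "ls -l /var/log",
]

alerts = [cmd for cmd in logged_commands if keyword_alert(cmd)]
for cmd in alerts:
    print("ALERT:", cmd)
```

In this toy log, the rule fires on the two routine commands as well as the suspicious copy, because matching on tokens alone carries no notion of what the administrator was asked to do.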

How Data Science Can Help Solve the Problem

As in many other anomaly detection use cases, data science conceptually solves the problem by establishing what is normal and then identifying the entities that deviate from that norm as anomalies. More concretely, data science can achieve this by first finding informational patterns present in the support tickets and then within the associated, logged command activities, as shown in the command-line screen capture above.

In other words, we can detect privilege misuse from an information signal that exists across logged command-level activities and support request tickets. Since human eyes can discern command-level activities that are not consistent with the actions requested in support tickets, we can also teach a system to identify these inconsistencies. For example, if a service ticket was opened with an English sentence requesting database write access, subsequent command-line activities could be logged and related to the original ticket. By correlating the statistical clusters derived from the service tickets and command activities, a system could, for example, uncover an activity that contains a ‘cp’ command copying the /etc/passwd file to some other location. This type of command would not be within the scope of the original service request for write access, so the activity would raise an alert.

Service request tickets can be clustered into groups in a principled way, and there is a similar approach for clustering logged command activities into groups. Text mining methods can then be used to find structure in the unstructured text of tickets and of commands along with their arguments and parameters. The core methodology is similar to our past work described in the previous blog post. In it, data science is used for automatic IT ticket clustering, which is also discussed in the work of others ^[1]. Effectively, individual tickets and activities have higher-level abstract representations or classes, and we can identify these by analyzing volumes of historical data. For example, based on text information only, service tickets describing the same type of request will likely share common words and fall into one cluster. Likewise, command activities that share similar arguments or parameters would fall into one cluster. With such an analysis, there are a number of possibilities. For example:

A cluster of command activities can have attributes such as the number of administrators who issued these commands or the number of days on which these commands were used. An outlier cluster with a small attribute value, say, in the number of administrators, represents rare activities performed by just a few users in the population and is interesting to identify.
A knowledge base can be built to associate classes of tickets with classes of activities, using the Apriori algorithm to find common associations. Activities that do not conform to the knowledge base relationships become notable.
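The first possibility can be sketched roughly as follows. This is a minimal illustration, not the method used in the engagement: it assumes a greedy single-pass clustering on Jaccard similarity of command tokens, and the threshold, the function names, and the log records are all hypothetical.

```python
import re

def tokens(command):
    """Split a logged command into a set of lowercase word tokens."""
    return set(re.findall(r"[a-z0-9_]+", command.lower()))

def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_commands(records, threshold=0.5):
    """Greedy single-pass clustering (an assumed, simple stand-in):
    a command joins the first cluster whose seed is similar enough,
    otherwise it starts a new cluster."""
    clusters = []  # each cluster: {"seed": token set, "members": [(admin, command)]}
    for admin, command in records:
        toks = tokens(command)
        for cluster in clusters:
            if jaccard(toks, cluster["seed"]) >= threshold:
                cluster["members"].append((admin, command))
                break
        else:
            clusters.append({"seed": toks, "members": [(admin, command)]})
    return clusters

def rare_clusters(clusters, max_admins=1):
    """Return clusters whose commands were issued by at most max_admins administrators."""
    return [
        c for c in clusters
        if len({admin for admin, _ in c["members"]}) <= max_admins
    ]

# Made-up (administrator, command) records for illustration only.
records = [
    ("alice", "service httpd restart"),
    ("bob", "service httpd status"),
    ("carol", "service httpd restart"),
    ("alice", "tail -n 100 /var/log/messages"),
    ("bob", "tail -n 50 /var/log/messages"),
    ("dave", "cp /etc/passwd /tmp/p.bak"),
]

clusters = cluster_commands(records)
for cluster in rare_clusters(clusters):
    print("rare activity:", cluster["members"])
```

The same attribute idea extends to other counts, such as the number of distinct days on which a cluster's commands appear, given timestamps in the records.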

Combined with these two approaches, other contextual information, such as an administrator’s tenure, the rarity of issued commands, and the age of commands, can be used to construct policy rules and find anomalous activities. Once the patterns are trusted, we could use real-time analysis capabilities to flag events as they happen, if needed.
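The knowledge-base idea and a contextual policy rule can be sketched together as below. This is a simplified stand-in for a full Apriori run: it only counts ticket-class and activity-class pairs and keeps those that meet support and confidence thresholds. The class labels, thresholds, the 90-day tenure cutoff, and the alert levels are all assumptions made for illustration.

```python
from collections import Counter

def build_knowledge_base(history, min_support=0.1, min_confidence=0.2):
    """Keep (ticket class, activity class) pairs that are common in the history.

    support    = share of all observations that are this pair
    confidence = share of this ticket class's observations that are this pair
    """
    pair_counts = Counter(history)
    ticket_counts = Counter(ticket_cls for ticket_cls, _ in history)
    total = len(history)
    kb = set()
    for (ticket_cls, activity_cls), n in pair_counts.items():
        support = n / total
        confidence = n / ticket_counts[ticket_cls]
        if support >= min_support and confidence >= min_confidence:
            kb.add((ticket_cls, activity_cls))
    return kb

def alert_level(kb, ticket_cls, activity_cls, tenure_days, new_admin_days=90):
    """Hypothetical policy rule: a pairing outside the knowledge base is notable,
    and is escalated when the administrator has short tenure."""
    if (ticket_cls, activity_cls) in kb:
        return "none"
    return "high" if tenure_days < new_admin_days else "medium"

# Made-up historical observations for illustration only.
history = (
    [("db_write_access", "grant_db_privilege")] * 6
    + [("db_write_access", "restart_db_service")] * 3
    + [("password_reset", "change_user_password")] * 8
    + [("db_write_access", "copy_system_file")] * 1
)

kb = build_knowledge_base(history)
print(alert_level(kb, "db_write_access", "grant_db_privilege", tenure_days=30))
print(alert_level(kb, "db_write_access", "copy_system_file", tenure_days=30))
print(alert_level(kb, "db_write_access", "copy_system_file", tenure_days=400))
```

With these toy counts, the single file-copy observation under a database write-access ticket falls below the support threshold, so it stays out of the knowledge base and is treated as notable.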

[1]: Lin, Raghu, Ramamurthy, Yu, Radhakrishnan, Fernandez, “Unveiling clusters of events for alert and incident management in large-scale enterprise IT”, KDD 2014.

