
Three Minutes Until the Apocalypse

Posted by John Bradshaw on Nov 7, 2017 6:28:00 AM


 This blog covers three main topics:

    • Three Key Questions Needing Answers Within Three Minutes When You Suspect a Breach
    • Using Deception and Endpoint Logs to Backtrack Command and Control
    • Improving SOC Triage Workflow with Prevention Failure Detection

Picture2.png

An adversary has targeted your organization and commenced a campaign to breach your defenses, establish a foothold, and either gather up your proprietary information or encrypt it and hold you hostage until you pay the ransom demand. If your present security solutions provide any warnings, they will be sent to the front pane of glass of your Security Operations Center (SOC). Here, it is up to your team of analysts triaging the alerts to separate the wheat from the chaff, look at the alert details, and determine whether each alert signals something impactful or can be ignored.

What could possibly go wrong? Plenty.

RACE TO THE BOTTOM (OF THE ALERT PILE)

Most organizations use a Security Information and Event Management (SIEM) solution in their SOC to aggregate, correlate and prioritize the alerts presented to the front-line SOC Analyst. Initial triage of alerts is generally handled by Level I Analysts – often the newest and least experienced members of the team. With a network-based IDS often spitting out 40 events per second, along with a myriad of other security solutions and operating system / application logs feeding into the SIEM, keeping up with the alerts on the screen is a daunting task. To further increase the pressure, SOC Analysts are usually expected to triage an alert in three minutes or less. Get it right and you live to triage another day; get it wrong and your stock price tumbles, people lose jobs and your company gets a ton of negative press.
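To put those two figures side by side, here is a quick back-of-the-envelope calculation. It uses only the numbers quoted above (40 events per second from a network IDS and a three-minute triage budget); the variable names are purely illustrative.

```python
# Back-of-the-envelope load from a single network IDS, using the figures above.
EVENTS_PER_SECOND = 40            # IDS event rate quoted in the text
TRIAGE_BUDGET_SECONDS = 3 * 60    # three-minute triage target

# Events that arrive while an analyst works a single alert.
events_per_triage_window = EVENTS_PER_SECOND * TRIAGE_BUDGET_SECONDS

# Events that arrive over a full day at the same rate.
events_per_day = EVENTS_PER_SECOND * 24 * 60 * 60

print(events_per_triage_window)   # 7200
print(events_per_day)             # 3456000
```

In other words, at that rate roughly 7,200 new IDS events arrive in the time it takes to triage one alert, before any other log source is counted.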

Security Teams need to do three things to make their lives better:

      1. Refocus front-line triage on Prevention Failure Detection (PFD)
      2. Prioritize solutions that provide high-fidelity, but low-volume alerts during triage
      3. Enable correlations that answer three key questions every SOC Analyst must know when investigating alerts

PREVENTION FAILURE DETECTION

I first heard the term “Prevention Failure Detection” from a friend of mine, Tim Crothers, Vice-President of Cyber Security for Target Corporation. PFD refocuses your detection capabilities away from trying to detect everything happening in your environment and toward the places where your prevention capabilities are most likely to fail. It also focuses your alerting, visualizations and SOC displays not on everything that has been seen, but only on what your PFD workflows say is important. You don’t need pie charts and bar graphs telling you how many times your AV quarantined a file or your firewall blocked an access attempt – those distract your team from the alerts that matter. You want to establish clear hunting grounds where your SOC and Incident Response teams can focus their efforts.

Picture3.png
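One way to picture the PFD idea is as a filter in front of the triage queue. The sketch below is only an illustration: the `SecurityEvent` type, the `outcome` labels and the sample events are all hypothetical, and real products use their own vocabularies for a successful block or quarantine.

```python
from dataclasses import dataclass

# Hypothetical outcome labels meaning "prevention worked"; adjust to your tooling.
PREVENTED_OUTCOMES = {"blocked", "quarantined", "dropped"}


@dataclass(frozen=True)
class SecurityEvent:
    source: str    # e.g. "av", "firewall", "deception"
    outcome: str   # what the reporting control did about the activity
    summary: str


def needs_triage(event: SecurityEvent) -> bool:
    """Keep an event in the triage queue only if prevention did not stop it."""
    return event.outcome.lower() not in PREVENTED_OUTCOMES


# Made-up sample events for illustration only.
events = [
    SecurityEvent("av", "quarantined", "Known malware removed from a download"),
    SecurityEvent("firewall", "blocked", "Inbound connection attempt denied"),
    SecurityEvent("deception", "observed", "Workstation connected to a decoy"),
]

triage_queue = [e for e in events if needs_triage(e)]
print([e.source for e in triage_queue])   # ['deception']
```

The blocked and quarantined events are still logged for reporting; they simply stay off the screen the Level I Analyst is watching.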

HIGH-FIDELITY ALERTS

If you knew an alert was a true positive every time it fired, how would that impact your workflows and decision process in handling that particular incident?  High-Fidelity alerts essentially mean you can trust and act on the information contained within the alert.  They also tend to be very low in volume (unless you’re having a really bad day).  There are not many solutions out there that can claim zero false positives (and I would be wary of any vendor that does make that claim!); however, let’s consider how Deception Solutions rate when looking at Fidelity and Alert volume.

Deception and High-Fidelity

Deception-based solutions use decoys and misinformation to divert and delay an adversary, giving the SOC / IR teams sufficient time to perform remediation before the adversary can complete his mission. Deception objects are not known to normal end-users and are white-listed against the vulnerability and IT Asset Discovery scanning systems allowed in the organization – so no one should ever touch a deception decoy. Let’s consider the possible ways a decoy could be touched:

      1. Network Misconfiguration – a scanner was missed in the whitelist or some other misconfiguration causes a system to attempt communications with a decoy
      2. Curious Insider – an end-user or system administrator pokes around outside of their normal duties, comes across a decoy and reaches out to see what the system is all about
      3. Malicious Insider – an end-user or system administrator is looking to steal information or cause disruption and stumbles across a decoy while looking for the crown jewels
      4. External Adversary – an adversary of varying skill level and resources has evaded your prevention layers and is now poking around inside your network

In all four cases, some type of action is required that demands immediate attention. The first two are not malicious in nature and will most likely be resolved by groups other than the security teams (most likely Network Operations for the first and Human Resources for the second). The last two are malicious and require immediate escalation and gathering of additional information to learn the full nature of the attack.
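The four cases and their handling can be written down as a small lookup table. This is only a sketch: the `Cause` and `Response` names are hypothetical, and the owner listed for the two malicious cases ("Security / Incident Response") is an assumption, since the text only says those cases require immediate escalation.

```python
from enum import Enum
from typing import NamedTuple


class Cause(Enum):
    NETWORK_MISCONFIGURATION = 1
    CURIOUS_INSIDER = 2
    MALICIOUS_INSIDER = 3
    EXTERNAL_ADVERSARY = 4


class Response(NamedTuple):
    malicious: bool
    owner: str                  # group most likely to resolve the issue
    escalate_immediately: bool


# Owners for the first two rows come from the text; the last two are assumed.
RESPONSES = {
    Cause.NETWORK_MISCONFIGURATION: Response(False, "Network Operations", False),
    Cause.CURIOUS_INSIDER: Response(False, "Human Resources", False),
    Cause.MALICIOUS_INSIDER: Response(True, "Security / Incident Response", True),
    Cause.EXTERNAL_ADVERSARY: Response(True, "Security / Incident Response", True),
}


def route(cause: Cause) -> Response:
    """Look up how a decoy touch with the given root cause should be handled."""
    return RESPONSES[cause]


print(route(Cause.CURIOUS_INSIDER).owner)                     # Human Resources
print(route(Cause.EXTERNAL_ADVERSARY).escalate_immediately)   # True
```

Note that every cause has an entry: there is no "ignore" row, which is the point of treating a decoy touch as a high-fidelity alert.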

Deception and Low-Volume

Deception is a Breach Detection Solution.  By that we mean that Deception is not generally used to detect Intrusion Attempts or even Breach Attempts.  Deception is a great Prevention Failure Detection solution because it focuses detection capabilities on adversaries and malware that have already successfully bypassed your prevention capabilities.

In a typical breach scenario, an adversary spear-phishes an end-user and gets them to click on a malicious attachment or link. A payload is downloaded and/or detonated on the end-user’s system, and command and control is established between the adversary and the compromised system.

Breach Accomplished.  The Doomsday Clock starts ticking.

Many security solutions had to fail for this to happen. This first beachhead is not the adversary’s mission; they want your data or to disrupt your operations. They must establish additional beachheads, reach out to application and database servers, map out your organization’s assets and determine the likely targets. Most intrusion / breach attempts will be blocked by your prevention technologies; you aren’t losing sleep over those. It is the ones that get through that you need to lose sleep over - and this is where Deception solutions step in and present the adversary with inviting targets – targets that only an adversary should be touching.

Because of this, Deception alerts are few and far between (as I said earlier, unless you’re having a really bad day!).

THREE KEY QUESTIONS EVERY TRIAGE ANALYST MUST ANSWER

Whenever I have seen an analyst investigating a network-based alert (an alert that contains only IP addresses, ports and service information), three questions are always at the top of their mind – and having the answers would greatly improve both the speed of triage and the accuracy of any decisions made.

      1. User - Which endpoint user session is responsible for causing this alert? (i.e., which user clicked on something they shouldn’t have?)
      2. Process - Which process and parent process in that user session are responsible for causing this alert? (i.e., was it a user’s interactive program like Chrome or Firefox, or an underlying process that normally doesn’t communicate, like Explorer?)
      3. Network - Who else has this system communicated with in the few minutes before this alert was generated?

Answering Question #1 lets me focus on what roles and permissions that user has so I can determine the potential extent of a breach (does this user have local admin rights?  Is it a domain administrator?).

Answering Question #2 can tell me if this was a user-initiated action (John clicked a website in Chrome and downloaded a malicious payload) or an adversary-initiated action (the adversary, through an injected svchost process, downloaded mimikatz to the workstation). One is pre-breach/pre-detonation; the other is post-breach and requires a different level of urgency.

Answering Question #3 tells me whether the adversary has moved laterally, whether I can identify potential command and control servers, and whether a malware detonation is spreading to other systems.

PREPARING FOR THE APOCALYPSE

Now that we know some key questions needing answers, we turn our attention to high-fidelity alerts that are focused around Prevention Failure Detection.  We need to identify potential base and correlated events that will help us realize our vision.  Once the logs, alerts and workflows have been identified, we can begin building out content that will make alert triage efficient and actionable.

Base Logs and Alerts

This paper will not address every potential log source and operating system; however, the concepts should apply universally to your environment. For purposes of this paper, we will focus on Windows Workstations (because they are one of the most likely points of first breach) for our event source and ArcSight and Splunk (because that’s what I know) for our SIEM correlation efforts. As always, there may be more than one method to accomplish the goal; organizations should explore which methods best fit their environments.

Event Logs

Log Source: Windows Endpoint Log - Sysmon
Description: Sysmon is a Microsoft Sysinternals tool that can be installed on Windows Workstations to provide additional logging capabilities. Sysmon will allow us to log network connections with associated process information. https://docs.microsoft.com/en-us/sysinternals/downloads/sysmon
Key Fields Required: Network Tuple, Process ID, User Session

Log Source: Windows Endpoint Log – Security
Description: The Windows Security Event file contains log entries for all processes created (and terminated) on the endpoint. Process creation/termination auditing must be enabled in the security policy.
Key Fields Required: User Session, Process Name, Process ID, Parent Process ID

Log Source: High-Fidelity Network-based Alert
Description: This can really come from any source so long as you consider it to be a high-fidelity alert that you always want correlated against the other event sources described above. For this paper we will be using Deception-based Network alerts from the Acalvio ShadowPlex solution.
Key Fields Required: Network Tuple

CORRELATING HIGH-FIDELITY ALERTS

The first step in defining solid correlation use cases is being able to define and understand the problem.

Let’s review a (highly simplified) diagram of a typical breach by an adversary.  The attacker sends a spear-phishing email to an end-user who opens the attachment or clicks on the link.  The detonated process establishes persistence to ensure it will run if the system is rebooted or the user logs out/in.  When this process runs, it will usually inject itself into a legitimate process and establish command and control (C&C) back to the adversary.  From his remote system, the attacker will direct activities through the C&C channel and use tools that he downloads, or built-in system programs, to laterally move to other targets in the environment. 

Adversary Breach Scenario.png

If I start with my high-fidelity alert and work backwards, I can establish the correlations required to provide answers to those three most sought-after questions:

Correlate:

  • Deception-based Alert
  • Sysmon Network Process Log

Correlate:

  • Sysmon Network Process (above)
  • Process Creation Log

Correlate:

  • Process Creation Log (above)
  • Parent Process Creation Log 
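The three correlations above can be sketched outside of any particular SIEM. The Python below is a minimal illustration under stated assumptions, not the ArcSight or Splunk content itself: the record types, field names, the five-minute matching tolerance, the IP addresses and the Firefox process ID are all hypothetical. Only the user, the process names and the parent process ID 2340 echo the example discussed in this post.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class DeceptionAlert:
    time: datetime
    src_ip: str   # suspected compromised endpoint
    dst_ip: str   # decoy address


@dataclass(frozen=True)
class SysmonNetworkEvent:
    time: datetime
    src_ip: str
    dst_ip: str
    process_id: int
    user: str


@dataclass(frozen=True)
class ProcessCreation:
    time: datetime
    host_ip: str
    process_id: int
    parent_process_id: int
    process_name: str


def match_network_event(alert, events, max_skew=timedelta(minutes=5)):
    """Correlation 1: find the Sysmon network event sharing the alert's tuple."""
    candidates = [
        e for e in events
        if (e.src_ip, e.dst_ip) == (alert.src_ip, alert.dst_ip)
        and abs(e.time - alert.time) <= max_skew
    ]
    return min(candidates, key=lambda e: abs(e.time - alert.time), default=None)


def match_creation(host_ip, pid, not_after, creations):
    """Correlations 2 and 3: latest creation of `pid` on the host up to `not_after`.

    Process IDs are reused, so the most recent earlier creation is taken.
    """
    candidates = [
        c for c in creations
        if c.host_ip == host_ip and c.process_id == pid and c.time <= not_after
    ]
    return max(candidates, key=lambda c: c.time, default=None)


def backtrack(alert, net_events, creations):
    """Walk from the alert back to the user, process and parent process."""
    net = match_network_event(alert, net_events)
    if net is None:
        return None
    proc = match_creation(net.src_ip, net.process_id, net.time, creations)
    parent = None
    if proc is not None:
        parent = match_creation(net.src_ip, proc.parent_process_id, proc.time, creations)
    return {
        "user": net.user,
        "process": proc.process_name if proc else None,
        "parent_process": parent.process_name if parent else None,
    }


# Made-up sample data for illustration only.
t0 = datetime(2017, 11, 7, 6, 0, 0)
alert = DeceptionAlert(t0, "10.0.0.15", "10.0.0.200")
net_events = [
    SysmonNetworkEvent(t0 - timedelta(seconds=1), "10.0.0.15", "10.0.0.200", 4512, "steve"),
]
creations = [
    ProcessCreation(t0 - timedelta(hours=2), "10.0.0.15", 2340, 1200, "explorer.exe"),
    ProcessCreation(t0 - timedelta(seconds=30), "10.0.0.15", 4512, 2340, "firefox.exe"),
]

result = backtrack(alert, net_events, creations)
print(result)
# {'user': 'steve', 'process': 'firefox.exe', 'parent_process': 'explorer.exe'}
```

Each step narrows the search using a key recovered by the previous step: the network tuple yields a process ID, and the process ID yields a parent process ID.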

ARCSIGHT HIGH-FIDELITY BREACH ALERT CORRELATION EXAMPLE

Here is what the associated logic would look like in the ArcSight Common Condition Editor (CCE):

Picture5.png

SPLUNK HIGH-FIDELITY BREACH ALERT CORRELATION EXAMPLE

Here is what the associated logic would look like in the Splunk Search Interface if sending the events in tagged-field format:

Deception and Matching Endpoint Events

("Network connection detected" AND "Microsoft-Windows-Sysmon/Operational" AND NOT "SourceIsIpv6: true") OR ("Acalvio|ShadowPlex")
| rex "(?i)CEF:1\|Acalvio\|.* src=(?<DeceptionSource>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
| rex "(?i)CEF:1\|Acalvio\|.* dst=(?<DeceptionDestination>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
| rex "(?i)CEF:1\|Microsoft\|.* src=(?<SysmonSource>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
| rex "(?i)CEF:1\|Microsoft\|.* dst=(?<SysmonDestination>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
| rex "(?i)sproc=(?<SourceProcess>\S.*?\ )[A-Z].*?\=?"
| rex "(?i)CEF:1\|Acalvio\|(.*?)\|(.*?)\|(.*?)\|(?<IDSAlert>.*?)\|"
| eval TestSIP=coalesce(DeceptionSource,SysmonSource)
| eval TestDIP=coalesce(DeceptionDestination,SysmonDestination)
| transaction TestSIP TestDIP maxspan=5m
| search "Acalvio|ShadowPlex" sysmon eventcount>1
| eval "Compromised IP"=TestSIP, "Decoy IP"=TestDIP, "Compromised User Session Name"=suser, "Compromised Process Name"=SourceProcess
| table _time IDSAlert, "Compromised IP", "Decoy IP", "Compromised User Session Name", "Compromised Process Name"

 

RELATED COMMUNICATIONS FROM SUSPECT ENDPOINT

Now that I have a high-fidelity alert correlating against endpoint process logs, I also want to be able to gather up any other communications the compromised endpoint transmits.  The goal is to help the analyst identify the attacker’s C&C channel more effectively.

Matching Communications to Breach Alert

Here is an example of the correlation rule in the ArcSight CCE. We are aggregating all matching events within a seven-minute time window:

Picture6.png 
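For readers without ArcSight, the same idea can be sketched in a few lines of Python. This is an illustration under assumptions rather than a translation of the rule above: the `Connection` type and its fields are hypothetical, the sample addresses and process names are made up, and the seven-minute window is treated as a look-back ending at the alert time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Connection:
    time: datetime
    src_ip: str
    dst_ip: str
    process_name: str


def related_connections(alert_time, compromised_ip, decoy_ip, connections,
                        window=timedelta(minutes=7)):
    """Return distinct (dst_ip, process_name) pairs the compromised endpoint
    contacted in the window ending at the alert, excluding the decoy itself."""
    start = alert_time - window
    seen = []
    for c in sorted(connections, key=lambda c: c.time):
        if c.src_ip != compromised_ip or c.dst_ip == decoy_ip:
            continue
        if not (start <= c.time <= alert_time):
            continue
        pair = (c.dst_ip, c.process_name)
        if pair not in seen:
            seen.append(pair)
    return seen


# Made-up sample data for illustration only.
t0 = datetime(2017, 11, 7, 6, 0, 0)
connections = [
    Connection(t0 - timedelta(minutes=10), "10.0.0.15", "10.0.0.6", "chrome.exe"),    # too old
    Connection(t0 - timedelta(minutes=2), "10.0.0.15", "203.0.113.50", "MSDCSC.exe"),
    Connection(t0 - timedelta(minutes=1), "10.0.0.15", "10.0.0.5", "svchost.exe"),
    Connection(t0 - timedelta(minutes=1), "10.0.0.16", "10.0.0.5", "svchost.exe"),    # other host
    Connection(t0, "10.0.0.15", "10.0.0.200", "firefox.exe"),                         # the decoy
]

peers = related_connections(t0, "10.0.0.15", "10.0.0.200", connections)
print(peers)
# [('203.0.113.50', 'MSDCSC.exe'), ('10.0.0.5', 'svchost.exe')]
```

Keeping the process name next to each destination matters: it is the pairing of peer and process that lets an analyst spot the one connection that does not belong.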

PUTTING IT ALL TOGETHER

With this correlation in place, the SOC Analyst has visibility into the systems involved, program/process information, the user session and the parent process ID responsible for the security alert.

QUESTION 1: WHO IS THE RESPONSIBLE USER AND PROCESS?

This correlation clearly tells me it is Steve’s user session running Firefox that reached out and touched the Deception Decoy.

Correlation Alert in ArcSight

Picture7.png

Picture01.png

 

Correlation Alert in Splunk

Picture02.png 

QUESTION 2: WHO ELSE DID THE COMPROMISED HOST RECENTLY COMMUNICATE WITH?

Correlating other communications around the time of the high-fidelity alert can yield a wealth of information. I see this workstation was communicating with approximately nine other systems at the time of the alert, one of which stands out as highly suspicious.

I see there is a process launched from the end-user’s Local/Temp directory, while the other processes are all located in system directories.  This process would get immediately flagged for investigation by the SOC Analyst.
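That check is simple enough to automate. The sketch below is a hedged illustration, not part of the correlation content shown in this post: the function name and the example paths are hypothetical, and it only recognizes the per-user `AppData\Local\Temp` location.

```python
from pathlib import PureWindowsPath


def runs_from_user_temp(image_path: str) -> bool:
    """Return True if the executable sits under a user's AppData\\Local\\Temp."""
    parts = [p.lower() for p in PureWindowsPath(image_path).parts]
    directories = parts[:-1]   # ignore the file name itself
    marker = ["appdata", "local", "temp"]
    return any(
        directories[i:i + 3] == marker
        for i in range(len(directories) - 2)
    )


# Hypothetical example paths.
print(runs_from_user_temp(r"C:\Users\steve\AppData\Local\Temp\MSDCSC.EXE"))   # True
print(runs_from_user_temp(r"C:\Windows\System32\svchost.exe"))                # False
```

A match is a reason to look closer rather than proof of compromise, since legitimate installers also run from temporary directories.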

Correlation Alert in ArcSight

Picture03.png

If I pivot on that correlation rule to see the underlying events, I get further detail about the host involved.  I highly suspect I am looking at the process and host the adversary is using for Command and Control of the compromised endpoint.

Picture04.png

QUESTION 3: WHAT IS THE PARENT PROCESS FOR THE RESPONSIBLE COMMUNICATION?

As we saw in Question 1, the Parent Process ID of the process that generated the alert is 2340. I can pivot in my logs, searching for the Process Creation event for this Process ID, to find out which process is most likely injected with malicious code. Bear in mind this process could have been created when the user session originally started. Active Lists in ArcSight and additional correlation techniques could help locate it faster, but they are a topic for another day.

This pivot shows me that the Explorer.exe process is the parent process responsible for creating the task that reached out and touched my Deception Decoy.

Correlation Alert in ArcSight

Picture11.png

WHAT REALLY HAPPENED

So, did the correlated events present an actual representation of how the attack really happened?  Let’s find out.

To create the attack, I used the DarkComet Remote Access Trojan (RAT) to take command and control of my “victim” system.  I attached the RAT in an email and sent it to my target user (Steve).

When Steve opened the attachment, the RAT detonated and established a C&C link back to the control system (msoutlookg). The DarkComet program planted itself in Steve’s Local/Temp directory using the name MSDCSC.EXE. Through DarkComet’s Remote Desktop capability, I launched Firefox from the user’s desktop (Explorer.exe) and reached out to an HTTP Deception Decoy – generating the high-fidelity alert.

As you can see, all these activities were correlated and captured providing clear context to the analyst as to what happened.  And it all started from a single high-fidelity alert.

CONCLUSION

It is not enough for organizations to keep pumping all types of security events into SIEMs and hoping they get correlated and prioritized appropriately for the Level I Analyst.  The triage process needs to focus on Prevention Failure Detection utilizing high-fidelity alerts combined with use case focused correlations that answer the key questions accurately and efficiently.  Knowing the user session involved in the breach, processes responsible for communications, and other network communications involving a breached system are critical to rapidly isolating and remediating the compromise.

Utilizing Deception-based alerts with endpoint logs, SIEM can deliver on its capability to correlate alerts that matter.

The Apocalypse has been averted.


John Bradshaw, Sr. Director, Solutions Engineering at Acalvio Technologies, has more than 25 years of experience in the Cyber Security industry focusing on advanced, targeted threats. He held senior leadership roles at Mandiant, ArcSight, Internet Security Systems, Lastline, and UUNET. 

