New to Google SecOps: Using Metrics in YARA-L Rules (Part 4)

jstoner

Our previous blogs on metrics (Part 1, Part 2, Part 3) have taken us step by step deeper into metrics and the function that unlocks their capabilities within a YARA-L rule. Today, we are going to cover the final portions of the metrics function: grouping and filtering.

After defining the type of metric (first_seen, event_count_sum, et al.) and the aggregation (max, min, sum, avg, etc.), a decision needs to be made about how the metric is grouped. Currently, there is a defined list of UDM fields that can be used for this grouping. In the future, this may broaden further. So far in all of our examples, we’ve only grouped on a single UDM field. Grouping on a single field like principal.asset.ip or target.user.userid is a great start, but determining the first_seen for the combination of target.user.userid and target.application, for example, provides greater precision in your detection. Some metrics can group across three UDM fields as well.

$first_seen_execution_window = max(metrics.file_executions_total(
      period:1d, window:30d,
      metric:first_seen,
      agg:min,
      metadata.event_type:$event_type, principal.asset.hostname: $hostname, principal.process.file.sha256: $sha256
  ))

In this example, we are trying to determine when an executable was first executed within the past 30 days. If we grouped only on the hostname, we would lose the specificity of the executable. If we grouped only on the hash, we would not be able to establish which host executed it. To overcome this, we group on three UDM fields: metadata.event_type, principal.asset.hostname, and principal.process.file.sha256. Notice that each grouping is written as <UDM field name>:<placeholder variable>.
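To make the effect of the grouping fields concrete, here is a minimal Python sketch. It is not YARA-L and it is not how Google SecOps computes metrics internally; it simply models first_seen as the earliest timestamp per group key. The event records, the field values, and the first_seen helper are all hypothetical.

```python
# Hypothetical execution events: (timestamp, event_type, hostname, sha256).
events = [
    (100, "PROCESS_LAUNCH", "host-a", "hash-1"),
    (200, "PROCESS_LAUNCH", "host-b", "hash-1"),
    (300, "PROCESS_LAUNCH", "host-a", "hash-2"),
]

def first_seen(events, key_fields):
    """Earliest timestamp per group, keyed on the chosen field positions."""
    earliest = {}
    for event in events:
        key = tuple(event[i] for i in key_fields)
        ts = event[0]
        if key not in earliest or ts < earliest[key]:
            earliest[key] = ts
    return earliest

# Grouping on hostname alone (position 2) versus all three fields.
by_host = first_seen(events, key_fields=(2,))
by_all = first_seen(events, key_fields=(1, 2, 3))
```

Grouped only by hostname, the first execution of hash-2 on host-a (timestamp 300) is hidden behind the earlier hash-1 execution (timestamp 100). Grouped on all three fields, each host and hash combination keeps its own first_seen value.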

Let’s take our metric function and fold it into a YARA-L detection rule. If you’ve been following this mini-series on metrics, the first thing you will notice is that our events section contains more criteria and additional placeholder variables. Our example is focused on Microsoft Sysmon events that contain a sha256 hash. Each of the three ‘group by’ fields in our metric is represented by a placeholder variable defined in the events section. We will also be using the hostname and the sha256 hash as match variables for our rule.

rule metric_examples_execution {

meta:
  author = "Google Cloud Security"

events:
  $process.metadata.vendor_name = "Microsoft"
  $process.metadata.product_name = "Microsoft-Windows-Sysmon"
  $process.metadata.event_type = $event_type
  $process.principal.hostname = $hostname
  $process.principal.process.file.sha256 != ""
  $process.principal.process.file.sha256 = $sha256

match:
  $hostname, $sha256 over 1d

outcome:
  $first_seen_execution_window = max(metrics.file_executions_total(
      period:1d, window:30d,
      metric:first_seen,
      agg:min,
      metadata.event_type:$event_type, principal.asset.hostname: $hostname, principal.process.file.sha256: $sha256
  ))
  $principal_process = array_distinct($process.principal.process.file.full_path)
  $principal_command_line = array_distinct($process.principal.process.command_line)

condition:
  $process and $first_seen_execution_window = 0
}

Notice how those placeholder variables are applied to our function. We’ve added a few additional outcome variables for context around the executable we are attempting to detect. Much like our other condition statements, we are using our metric function to identify when the hash has not been seen on the system in the past 30 days. In this case, our detection indicates that win-server has a process module load event for the LogMeIn console, yet we haven’t seen it used in the past 30 days.

The final option in our metric function is filtering. Why filter within the metric? Suppose we have some additional tuning of our metric that we want to perform before producing a result. Filtering within a metric isn’t designed to narrow results to just a target.user.userid, for example. That filtering takes place in the events section of the rule and the placeholder variable that is associated with the target.user.userid is applied in the group by portion of the metric function.

This filtering uses metrics like first_seen, last_seen, event_count_sum and value_sum and filters before aggregation occurs. We haven’t shown any examples yet that use filtering in this manner, so let’s take a quick look at what this might look like and then apply it to a YARA-L detection.

In this example, notice how after the group by statement of principal.asset.ip:$ip an additional comma is added, followed by filter: and then the filter itself. Our example calculates the average outbound byte count for the 30 day window using only the days where the number of events (event_count_sum) is greater than 5000 and the sum of the bytes sent (value_sum) is greater than 40,000,000.

$filtered_avg_byte_count_per_day = max(metrics.network_bytes_outbound(
       period:1d, window:30d,
       metric:value_sum,
       agg:avg,
       principal.asset.ip:$ip,
       filter: event_count_sum > 5000 and value_sum > 40000000
))

This filtering allows us to handle outlier data that might impact the metrics that we are looking to calculate. In this example we will apply the function above. Yes, those numbers may seem a bit contrived, but let’s take a look at what that might look like below.

rule metric_examples_network {

 meta:
   author = "Google Cloud Security"

 events:
   $net.metadata.event_type = "NETWORK_CONNECTION"
   $net.network.sent_bytes > 0
   net.ip_in_range_cidr($net.principal.ip, "10.128.0.0/24")
   $net.principal.ip = $ip

 match:
   $ip over 1d

 outcome:
   $filtered_avg_byte_count_window = max(metrics.network_bytes_outbound(
       period:1d, window:30d,
       metric:value_sum,
       agg:avg,
       principal.asset.ip:$ip,
       filter: event_count_sum > 5000 and value_sum > 40000000
   ))
   $avg_byte_count_window = max(metrics.network_bytes_outbound(
       period:1d, window:30d,
       metric:value_sum,
       agg:avg,
       principal.asset.ip:$ip
   ))
   $daily_event_count = count($net.metadata.id)
   $daily_byte_sum = sum($net.network.sent_bytes)
  
 condition:
   $net
}

We aren’t using our metric to narrow our results down, just highlighting how the use of filters in a metric function can impact the value. Notice how the IP address of 10.128.0.10 has an average byte count over the 30 day window of 780. However, the filtered average is Unknown. Unknown in this case means we don’t have a value; that is, none of the daily values in the time window meet our filter condition to be included in the calculation. The two columns on the far right are the event count and the sum of network.sent_bytes for the previous day. That clearly isn’t every period in the window, but it is a sample that illustrates that this is a low logging system and therefore would not have a filtered average.
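The difference between the two averages can be sketched in a few lines of Python. This is only a model of "filter first, then aggregate", not the Google SecOps implementation. The daily values are made up (chosen so the unfiltered average comes out to 780, like the example), and the avg_value_sum helper and its parameters are hypothetical.

```python
# Hypothetical per-day values for one IP: (event_count_sum, value_sum).
daily = [(12, 900), (8, 650), (15, 790)]

def avg_value_sum(days, min_events=None, min_bytes=None):
    """Average the daily value_sum, dropping days that fail the filter first."""
    kept = [
        value_sum
        for event_count, value_sum in days
        if (min_events is None or event_count > min_events)
        and (min_bytes is None or value_sum > min_bytes)
    ]
    # No qualifying days means there is nothing to average;
    # None stands in for the Unknown shown in the results.
    return sum(kept) / len(kept) if kept else None

unfiltered = avg_value_sum(daily)
filtered = avg_value_sum(daily, min_events=5000, min_bytes=40_000_000)
```

Here unfiltered is 780.0 while filtered is None, because no day has more than 5000 events and more than 40,000,000 bytes sent. A busier system with a few days above both thresholds would get a filtered average built only from those days.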

I’ve got one final example for you that cuts across filters and an additional metric we didn’t discuss earlier. The metric, num_unique_filter_values, is a bit different from all of the others because it isn’t pre-computed within Google SecOps; it is computed when the rule executes. This gives us a way to output an outcome variable that groups by one value, like a user or asset, and then counts the distinct values of a second field within that group, like IP addresses, applications, or countries.

For example, we could compute how many countries a specific user has successfully authenticated from. Here we are identifying user login events in Office 365 where the IP address is not geolocated to the United States and we are aggregating by the userid on a daily basis.

rule metric_examples_success_cloud_authentication {

meta:
  author = "Google Cloud Security"

events:
   $login.metadata.event_type = "USER_LOGIN"
   $login.security_result.action = "ALLOW"
   $login.metadata.product_name = "Office 365"
   $login.target.user.userid != /^Sync_/ nocase
   $login.target.user.userid != /^admin/ nocase
   $login.target.user.userid != "Not Available"
   $login.principal.ip_geo_artifact.location.country_or_region != "United States"
   $login.target.user.userid = $userid

match:
  $userid over 1d

outcome:
  $logins_country_count_30d = max(metrics.auth_attempts_success(
       period: 1d, window: 30d,
       metric: num_unique_filter_values,
       agg: max,
       target.user.userid: $userid,
       principal.ip_geo_artifact.location.country_or_region : *
   ))
   $systems_logged_in_from = array_distinct($login.principal.ip)
   $other_countries_logged_in_from = array_distinct($login.principal.ip_geo_artifact.location.country_or_region)

condition:
  $login and $logins_country_count_30d > 1
}

In the outcome section, our metric function generates a count, per target.user.userid, of the distinct countries from which that user successfully logged in over the 30 day window. We’ve added a few outcome variables based on the UDM events that met our criteria, and we set our condition to detect when the country count is greater than one, which in this case means the user has logged in from somewhere besides the United States.
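The counting idea behind num_unique_filter_values can be modeled with a short Python sketch. It deliberately ignores the period, window, and agg arguments and only shows "group by one field, count distinct values of another". The login records and the unique_values_per_group helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical successful logins: (userid, country).
logins = [
    ("alice", "United States"),
    ("alice", "Canada"),
    ("alice", "Canada"),
    ("bob", "United States"),
]

def unique_values_per_group(rows):
    """Count distinct second-field values for each first-field value."""
    seen = defaultdict(set)
    for group, value in rows:
        seen[group].add(value)
    return {group: len(values) for group, values in seen.items()}

country_counts = unique_values_per_group(logins)

# Same idea as the rule's condition: flag users seen in more than one country.
flagged = sorted(user for user, n in country_counts.items() if n > 1)
```

In this toy data, alice has two distinct countries and is flagged, while bob has one and is not. Repeated logins from the same country do not increase the count.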

When we test our rule, we can see that tim.smith_admin logged in from Canada twice, along with the specific IP address that was used. Depending upon the account, a single additional country might not be a big deal, but these concepts can be applied more broadly as well.

Let’s bring this mini-series on metric functions to a close with a few reminders.

If we add a UDM field to our function that isn’t available to that metric, the YARA-L syntax editor will provide an error message.
If the metric does not exist, the value returned will be 0. Remember you can use this in the condition section. For example, we did this with first_seen: if that value isn’t seen within our time window, the metric returns 0.
Pay attention to aggregations. last_seen is usually paired with max, and first_seen with min.
Metric functions are only used in the outcome section of a YARA-L rule. Because they are going to be used with YARA-L rules that encompass multiple events, an aggregation function must be prepended to the metric function. Max is generally a safe bet.
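Two of the reminders above, the aggregation pairing and the 0 returned when no metric exists, can be illustrated with a small Python sketch. It assumes the window value is an aggregate over per-period values, which is a simplification for illustration; the timestamps and the window_value helper are hypothetical.

```python
# Hypothetical per-day timestamps for one group, one entry per daily
# period in which the group had activity.
daily_first_seen = [1_700_000_500, 1_700_090_000, 1_700_180_000]
daily_last_seen = [1_700_003_000, 1_700_095_000, 1_700_185_000]

def window_value(per_period, agg):
    """Aggregate per-period values; return 0 when there is no data at all."""
    return agg(per_period) if per_period else 0

# min picks the earliest first_seen, max picks the latest last_seen.
first_seen_window = window_value(daily_first_seen, min)
last_seen_window = window_value(daily_last_seen, max)

# A group with no activity in the window yields 0, which a rule
# condition such as "= 0" can then test for.
never_seen = window_value([], min)
```

Swapping the aggregations would return the first_seen of the most recent day or the last_seen of the oldest day, which is rarely what a detection wants.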

While we only used a few of the different metrics in our mini-series, keep in mind that additional metrics for DNS Bytes Outbound, DNS Queries, HTTP Queries and Workspace are also available and they all follow this same general layout.