Traffic pattern analysis inside out

The term network traffic pattern analysis (or simply traffic pattern analysis) rings a bell with most information security specialists, yet many have only a vague idea of what is behind it. Let’s piece together the scattered facts about this method for a more complete picture.

Overview

Traffic pattern analysis is the process of monitoring network traffic for anomalies in order to detect advanced persistent threats (APTs), abnormal or excessive communication patterns and various malware activities.

In general, traffic pattern analysis draws on information about the applications involved, the ports and protocols used, the communication endpoints, and the direction and volume of traffic.

The ways to collect traffic flows

Network traffic flows are collected from the network equipment that connects the network’s endpoints: routers, switches and firewalls. The collected traffic flows go to a SIEM system (for example, IBM® Security QRadar® SIEM), where traffic patterns are built and analyzed. From here on, we will use QRadar as an example of a SIEM system.

There are two ways for QRadar to receive traffic flows: port mirroring and flow records collection (via a flow export protocol such as NetFlow, J-Flow or sFlow). Both have to be configured on the source system.

Port mirroring means that the source is configured to send a complete copy of network traffic to a QRadar QFlow Collector (a component that collects traffic flows). With this approach, the SIEM system can conduct application (Layer 7) analysis to determine the type of application that generates the traffic. By contrast, NetFlow, J-Flow and sFlow report only the source IP, destination IP, ports, protocols and the number of bytes. The advantage of the QRadar QFlow Collector is that it analyzes network packets and identifies signatures of suspicious protocols, for example, P2P and IRC, which are widely used for botnet communication.

Yet the technique is costly: it requires spending on the mirroring itself, on extra network infrastructure (to handle the added network load), and on a flow license for QRadar.

QRadar can also receive traffic flows from network equipment (firewalls, routers, switches) via a flow export protocol. This technique is called flow records collection, and unlike the previous one, it is fairly simple to implement. However, although flow records provide the source and destination IP addresses, direction, ports and traffic volume, they can’t help QRadar determine the type of application.
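
To make the contents of a flow record more concrete, here is a minimal Python sketch. The FlowRecord class, its field names and the sample addresses are hypothetical illustrations, not the NetFlow, J-Flow or sFlow wire format and not a QRadar API. It only shows that the fields listed above are enough to aggregate traffic volume by protocol and port, while saying nothing about the application.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    """Hypothetical flow record holding the fields named in the text."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str    # e.g. "tcp" or "udp"
    byte_count: int


def bytes_per_destination_port(flows):
    """Sum transferred bytes per (protocol, destination port) pair."""
    totals = defaultdict(int)
    for flow in flows:
        totals[(flow.protocol, flow.dst_port)] += flow.byte_count
    return dict(totals)


# Made-up sample data for illustration only.
sample_flows = [
    FlowRecord("10.0.0.5", "10.0.1.9", 51000, 443, "tcp", 1200),
    FlowRecord("10.0.0.6", "10.0.1.9", 51001, 443, "tcp", 800),
    FlowRecord("10.0.0.5", "10.0.1.7", 51002, 53, "udp", 90),
]

totals = bytes_per_destination_port(sample_flows)
print(totals)  # {('tcp', 443): 2000, ('udp', 53): 90}
```

Note that nothing in such a record tells us whether the traffic on port 443 is a web browser or a tunneled botnet channel; that distinction requires the packet contents that port mirroring provides.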

How it works

Having received traffic flows, QRadar builds traffic patterns and stores them for later analysis by a security administrator. To monitor data exchange between network devices, the administrator has to configure behavioral rules derived from the company’s security policy. Behavioral rules are a subtype of anomaly detection rules: they compare real traffic against a baseline to detect volume changes in regular traffic patterns.

Baseline creation

The baseline is a data set representing the average values of the monitored properties in a particular event or flow search. To create one, a security administrator defines an event or flow search that specifies the properties to be monitored and a time frame.

Properties and a time frame

Properties are the parameters of an event or a flow, such as IPs, ports, total bytes and packets, flow types, etc. An administrator can select any number of the more than 90 default event and flow properties (for example, bytes in/out, source/destination IPs, applications) and create custom ones. Thus, different rules can be applied to track traffic on different ports, inbound and outbound traffic, and other parameters separately.

Time frame options for event or flow searches include specific intervals (if there’s a need to check a definite time interval on a particular day) and recent intervals (from the last minute to the last week). In the latter case, QRadar keeps updating the accumulated data.
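
The idea of a baseline as a set of averages can be illustrated with a short Python sketch. The build_baseline function, the property names and the sample values are assumptions made for this example; how QRadar computes and stores its baselines internally is not described here.

```python
from statistics import mean


def build_baseline(samples):
    """Average each monitored property over the chosen time frame.

    samples maps a property name to the values observed in each interval
    of the time frame. Properties without observations are skipped.
    """
    return {prop: mean(values) for prop, values in samples.items() if values}


# Made-up observations: one value per interval of the time frame.
observed = {
    "bytes_out": [1000, 1200, 800],
    "packets_out": [10, 14, 12],
    "bytes_in": [],  # nothing observed, so no baseline is produced
}

baseline = build_baseline(observed)
print(baseline)
```

With a recent-interval time frame, the lists of observations would keep changing as new data arrives, so the averages would have to be recomputed.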

Behavioral rules in action

When creating a behavioral rule, a security administrator should choose one traffic property out of those involved in the event or flow search (e.g., the number of IPs, ports or protocols, or total bytes) and set an appropriate season.

Choosing a season

A season is the time period during which a behavioral rule is active. It may range from one day to four weeks. Season length should be adjusted to the peculiarities of the monitored traffic. To reduce the number of false positives, security administrators should choose a season during which no traffic spikes are expected.

Let’s consider an example where we need to monitor employees’ traffic activity. For those who work the day shift, we’ll configure a behavioral rule that defines a baseline for their traffic activity during the working day (e.g., from 9 a.m. to 6 p.m.). For night-shift workers, the rule will be different, with the season covering night hours (e.g., from 6 p.m. to 3 a.m.). Likewise, we need to create separate rules for weekdays and weekends, as the baselines for these periods differ considerably.

After the season and the property are configured, a security administrator sets the threshold (X percent of the baseline) for each behavioral rule, and QRadar checks the actual data exchange against the baseline using this threshold. If the average value of the accumulated property within the predefined season exceeds the threshold, QRadar generates an offense.
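
The split into shifts and day types can be sketched as a small Python function that decides which rule a given moment belongs to. The season_for function and its labels are hypothetical, and the shift hours are the example values from the text. As a simplification, hours after midnight are attributed to the calendar day of the timestamp rather than to the day on which the night shift started.

```python
from datetime import datetime


def season_for(timestamp):
    """Return a (day_type, shift) pair for the given moment.

    shift is None for hours that fall outside both example shifts.
    """
    day_type = "weekend" if timestamp.weekday() >= 5 else "weekday"
    hour = timestamp.hour
    if 9 <= hour < 18:
        shift = "day"      # 9 a.m. to 6 p.m.
    elif hour >= 18 or hour < 3:
        shift = "night"    # 6 p.m. to 3 a.m.
    else:
        shift = None       # 3 a.m. to 9 a.m. is not covered by either rule
    return day_type, shift


print(season_for(datetime(2024, 1, 8, 10, 0)))   # a Monday morning
print(season_for(datetime(2024, 1, 13, 23, 0)))  # a Saturday night
```

Each distinct pair would then get its own baseline, so that quiet weekend nights are never compared against busy weekday mornings.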
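
The threshold check itself comes down to simple arithmetic. The sketch below assumes that the threshold is the allowed percentage increase over the baseline; this interpretation, the exceeds_threshold function and the numbers are illustrative assumptions rather than QRadar’s actual implementation.

```python
def exceeds_threshold(baseline, observed_average, threshold_percent):
    """Return True if the observed average is more than
    threshold_percent above the baseline."""
    if baseline <= 0:
        # No meaningful baseline: treat any traffic as a deviation.
        return observed_average > 0
    deviation_percent = (observed_average - baseline) / baseline * 100
    return deviation_percent > threshold_percent


# Baseline of 1000 bytes per interval, threshold of 40 percent.
print(exceeds_threshold(1000, 1500, 40))  # True: 50 percent above the baseline
print(exceeds_threshold(1000, 1300, 40))  # False: 30 percent above the baseline
```

A rule that should also react to unusually low traffic (for example, a server that has gone silent) would compare the absolute value of the deviation instead.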

Some use cases for behavioral rules

Deviations from a baseline are evidence of potential security breaches. Imagine your SIEM solution alerts you that the network’s critical servers generate a lot of traffic during off-peak time (for example, at night). It may signify that these servers are compromised and have established communication with a botnet. One more case: hackers may pull regular video-capture feeds to learn about the activities of your company’s employees. Video streams produce traffic that exceeds the baseline, which a SIEM solution with configured behavioral rules can detect.

The rules can also define who is allowed to communicate. Let’s consider the following example. There is a firewall that forbids communication between networks A and B, and QRadar has a rule that generates an offense if such communication occurs. If a malicious administrator changes the firewall rules to allow communication between the networks, QRadar will spot the unexpected communication pattern and generate an offense.

Traffic pattern analysis based on behavioral rules can also spot the signs of an APT. For example, a firewall may register inbound traffic over a specific non-HTTPS protocol. This may indicate that a botnet is sending targeted requests to an internal server that has already been compromised.
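
A check of this kind can be sketched with Python’s standard ipaddress module. The address ranges for networks A and B and the crosses_forbidden_boundary function are made up for the example; in QRadar such a rule would be configured in the product rather than written as code.

```python
import ipaddress

# Hypothetical address ranges for the two networks from the example.
NETWORK_A = ipaddress.ip_network("10.1.0.0/16")
NETWORK_B = ipaddress.ip_network("10.2.0.0/16")


def crosses_forbidden_boundary(src_ip, dst_ip):
    """Return True if a flow connects network A and network B
    in either direction."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    a_to_b = src in NETWORK_A and dst in NETWORK_B
    b_to_a = src in NETWORK_B and dst in NETWORK_A
    return a_to_b or b_to_a


print(crosses_forbidden_boundary("10.1.4.7", "10.2.0.9"))  # True
print(crosses_forbidden_boundary("10.1.4.7", "10.1.9.9"))  # False
```

Because the check relies on observed flows rather than on the firewall configuration, it still fires after the firewall rules have been changed.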

Some limitations

Like any technique, traffic pattern analysis has its downsides. If a SIEM system builds a baseline to monitor users’ network activities, it may generate a lot of false positives, because users’ traffic patterns are erratic. QRadar will notice that the traffic volume far exceeds that of the previous day and will create an offense, but it won’t establish the cause of the leap. Yet if the SIEM system receives traffic flows through port mirroring, it can identify the applications that pose the real threat.

Traffic pattern analysis is usually used to monitor servers. Most servers behave predictably, which makes it easy to build a baseline of their traffic. For example, if QRadar notices an abnormal increase in outbound traffic from a database server on port 5432 (the default PostgreSQL port), it is a strong sign that database information may be leaking. Similarly, increased inbound traffic over certain ports may indicate that an internal server has been compromised.

On a final note

Traffic pattern analysis deserves to be the talk of the town. Certainly, used on its own, the technique can’t provide robust network security, but it can considerably help in the battle against malware and other security threats by detecting excessive traffic volume over specific protocols and suspicious directions of communication.
