UEBA in action
About a year ago, Netmonastery began a collaborative effort with students of the Centre of Excellence lab at VJTI College (the CoE-CNDS lab, for short). This amazing initiative eventually led to one of the students from the group joining the company right out of the lab. To give you some extra context, I’m part of the group that was featured in an earlier blog on modeling behavioral patterns, and I’m now part of Netmonastery!
The previous blog was all about interactions with users, malicious or legitimate, from outside of the network. But what about threats from within the network? What if someone decides to go rogue and causes irreparable damage to their organization? The answer to this is a small experiment conducted in our lab, focusing on user and entity behavior analysis (UEBA). UEBA is a family of technologies for observing the normal conduct of users and entities, making it possible to detect anomalous behavior when a user deviates from established patterns.
From these observations, UEBA establishes a baseline of what constitutes normal behavior. By understanding what is normal for each user and entity, UEBA can detect when something unusual occurs. For example, if a user suddenly logs in from a foreign location and accesses a server they don’t usually access, the software will notice this and generate an alert.
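The login example above can be sketched in a few lines of Python. This is only a minimal illustration of the baseline idea, not how DNIF or any UEBA product implements it: the field names (user, location, server), the function names and the sample data are all hypothetical, and a real system would use far richer features than set membership.

```python
from collections import defaultdict


def build_baseline(history):
    """Collect, per user, the locations and servers seen in past events."""
    baseline = defaultdict(lambda: {"locations": set(), "servers": set()})
    for event in history:
        profile = baseline[event["user"]]
        profile["locations"].add(event["location"])
        profile["servers"].add(event["server"])
    return baseline


def deviations(baseline, event):
    """Return the ways in which an event departs from the user's baseline."""
    profile = baseline.get(event["user"])
    if profile is None:
        return ["unknown user"]
    reasons = []
    if event["location"] not in profile["locations"]:
        reasons.append("new location")
    if event["server"] not in profile["servers"]:
        reasons.append("new server")
    return reasons


# Illustrative data only; not taken from the lab's logs.
history = [
    {"user": "alice", "location": "Mumbai", "server": "files01"},
    {"user": "alice", "location": "Mumbai", "server": "mail01"},
]
baseline = build_baseline(history)
new_event = {"user": "alice", "location": "Berlin", "server": "db02"}
print(deviations(baseline, new_event))
```

An empty list means the event matches what has been seen before for that user; any entry in the list is a candidate reason for raising an alert.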
UEBA in action
To monitor the users in an organization, logs from host systems are forwarded to the central DNIF server. These logs are ingested and parsed into JSON-formatted files. The parsed data is then represented as rows in the output of a DNIF Query Language (DQL) query.
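To make the "JSON to rows" step concrete, here is a rough Python sketch. It assumes the parsed logs are available as one JSON object per line, which is an assumption on my part and not a description of DNIF's internal format. The field names (source_ip, event_id, event_type) and the function name are hypothetical.

```python
import json


def parse_records(lines):
    """Turn JSON-formatted log lines into a list of row dictionaries."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if isinstance(record, dict):
            rows.append(record)
    return rows


# Illustrative lines only; the field names are assumptions.
sample_lines = [
    '{"source_ip": "192.168.1.4", "event_id": 10, "event_type": "Information"}',
    "not a json line",
    '{"source_ip": "192.168.1.2", "event_id": 114, "event_type": "Error"}',
]
rows = parse_records(sample_lines)
for row in rows:
    print(row["source_ip"], row["event_id"], row["event_type"])
```

Malformed lines are dropped silently here to keep the sketch short; in practice you would probably want to count or log them.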
Windows host systems generate event logs containing event IDs, which denote various types of normally occurring events. These IDs help to build a baseline of desirable and undesirable (or malicious) events. Each event ID has an associated description of the action that occurred and any countermeasures required.
An event logging service like nxlog or rsyslog records events from various sources and stores them in a single collection called an event log. This is handled in the Event Store in the Management tab of the DNIF console, which retains event data and returns it when queried. Up to 100 GB of data may be stored here, enough for more than 1,000,000 records. To begin analyzing these events, we use a DQL query to identify the host systems across the network whose logs have been recorded.
In Figure A, hosts are identified by their IP addresses. The result shows that three hosts appear in the log data. Of these, the host with IP address 192.168.1.4 has the highest number of records, at 1,237,399.
Figure B, below, is a graphical depiction of the same query. IP addresses are plotted on the Y axis, while record counts are plotted on the X axis. It shows that the host systems identified by IP addresses 192.168.1.2, 192.168.1.4 and 192.168.1.5 appear in 43,304, 1,237,399 and 7,110 records, respectively.
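The per-host grouping behind Figures A and B can be approximated outside DNIF as well. The sketch below is plain Python, not DQL: records_per_host and the source_ip field are hypothetical names, while the three counts are the ones reported in Figure B.

```python
from collections import Counter


def records_per_host(rows, field="source_ip"):
    """Count how many rows each host contributes, keyed by IP address."""
    return Counter(row[field] for row in rows if field in row)


# Counts as reported in Figure B.
figure_b = Counter({
    "192.168.1.2": 43304,
    "192.168.1.4": 1237399,
    "192.168.1.5": 7110,
})

# most_common() sorts hosts by record count, highest first.
top_host, top_count = figure_b.most_common(1)[0]
print(top_host, top_count)
for host, count in figure_b.most_common():
    print(f"{host:<12} {count:>9,}")
```

The same function works on any list of row dictionaries, so it can be pointed at a sample of exported rows to sanity-check what the console shows.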
Figure C shows a graphical representation of the result of a DQL query for the different event IDs. Event ID 10 occurs in about 92.42% of all events. Event ID 10 denotes the normal operation of applications with parent and child processes.
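A percentage such as the one quoted for event ID 10 is simply that ID's count divided by the total number of events. Here is a small Python sketch of the calculation; the function name is hypothetical and the sample list is made up for illustration, so it does not reproduce the figure's numbers.

```python
from collections import Counter


def event_id_shares(event_ids):
    """Return each event ID's share of all events, as a percentage."""
    counts = Counter(event_ids)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {event_id: 100.0 * n / total for event_id, n in counts.items()}


# Made-up sample: nine events with ID 10 and one with ID 114.
sample = [10] * 9 + [114]
shares = event_id_shares(sample)
for event_id, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"event ID {event_id}: {share:.2f}%")
```

An event ID that dominates the distribution like this is a natural candidate for "normal" in the baseline, while rare IDs are the ones worth a closer look.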
Figure D shows a graphical representation of event IDs for the host with IP address 192.168.1.4, whose hostname is Idea-PC. This is one of the hosts from Figure A. Event ID 114 occurs in 500 of the roughly 1,000,000 events that we fetched for this host. It is followed by event ID 10, at 300 occurrences.
Figures E and F show the distribution of logged event types. The majority of the events are of the Information type. This is expected, since we are looking to establish a baseline for events. However, the Error event type is close behind, which shows that the model is using these events as labels for detecting deviations from normal behavior and distinguishing suspicious behavior.