_buffer is a query directive that filters rows from a result set based on the time of the previous occurrence of a specified field value. Rows that are filtered out are not passed to the next query function.
Assume that you want to send out email alerts if you encounter certain types of IP addresses. However, you may not want to send repeated alerts every hour for the same IP address. Here, _buffer can help you eliminate those IP addresses for which alerts have been sent recently by ‘remembering’ them. Email alerts are then sent only for the remaining IP addresses. Internally, the _buffer directive maintains a profile (store) created using the _store directive to keep track of recently processed data.
The _buffer query directive typically forms a part of a workbook (scheduler), scheduled to run at regular intervals. A set of pipelined query functions is used to retrieve specific data (for example, IP addresses) from the data store and process it to identify the IP addresses for which alerts need to be generated. The _buffer directive (added in the same pipeline) checks whether this data (the IP addresses) has been ‘alerted for’ (processed) recently. Only rows (IP addresses) that have not been alerted for recently are passed on to the remaining query functions in the pipeline, typically a _raise or a _trigger, to send alerts or call APIs.
The generic syntax of the _buffer directive is as given below:
_buffer <buffer_name> $Field $Duration
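- buffer_name: Refers to the store/profile created internally by the _buffer directive
- $Field: Refers to the field whose values are tracked in the store (in the example below, $SrcIP)
- $Duration: Refers to the period for which a value is ‘remembered’ after it was last passed through (in the example below, 1h)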
Take a look at the example given below:
_fetch * from event where NOT $SrcCN=IN AND $AtkClass=bad-unknown AND $LogType=DPI AND $Duration=1h group count_unique $SrcIP limit 100 >>_checkif int_compare count_unique > 10 include >>_buffer test1_ik_alerts $SrcIP 1h >>_raise notify_email <emailid>
NOTE: This query is a workbook query scheduled to run every hour. The images below only show sample output.
1. The _fetch directive retrieves all the fields for each event in the event index that have been received and stored in the last one hour and where:
- $SrcCN (source country) is not IN (India)
- $AtkClass is bad-unknown and
- $LogType is DPI
The result set is grouped by unique values of $SrcIP along with a count (count_unique) for each IP address. The result set is sorted in the descending order of count_unique (by default). It is then limited to 100 rows. The output is as shown below:
2. In the pipelined query function (f2), the _checkif directive uses the int_compare keyword to check whether count_unique is greater than 10. Only groups where this condition is satisfied are included in the result set. The output is as shown below:
Only one IP address has a count greater than 10.
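The grouping and threshold steps described above can be sketched in Python. This is only an illustration of the logic, not how the query engine implements _fetch or _checkif; the function names top_sources and check_if_greater, the sample IP addresses, and the in-memory list of events are all assumptions made for the example.

```python
from collections import Counter

def top_sources(events, limit=100):
    """Count events per SrcIP, sort by count (descending), keep at most `limit` rows."""
    counts = Counter(event["SrcIP"] for event in events)
    # most_common() returns (value, count) pairs ordered from highest count to lowest
    return [{"SrcIP": ip, "count_unique": n} for ip, n in counts.most_common(limit)]

def check_if_greater(rows, threshold=10):
    """Keep only the groups whose count exceeds the threshold (the 'include' behaviour)."""
    return [row for row in rows if row["count_unique"] > threshold]

# Hypothetical sample data: one noisy source and one quiet source
events = [{"SrcIP": "203.0.113.7"}] * 12 + [{"SrcIP": "198.51.100.4"}] * 3
rows = check_if_greater(top_sources(events))
print(rows)
```

With this sample data, only the address that appears more than 10 times survives the filter, which mirrors the single-row result described above.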
3. In the pipelined query function (f3), the _buffer directive checks whether each $SrcIP has already been ‘processed’. It ‘remembers’ this using the test1_ik_alerts store (the buffer_name given in the query) that it creates and updates.
- If a $SrcIP is not present in the store, it adds it along with the current timestamp. This row is included in the output result set.
- Else if a $SrcIP is present in the store, it checks whether the difference between the current timestamp and the timestamp against the $SrcIP (StoreTstamp) is greater than the duration (here, 1 hour) specified using the _buffer query directive.
- If it has exceeded the duration, then the StoreTstamp against that $SrcIP is updated in the store (as the current timestamp value) and that row is included in the output result set.
- If it has not exceeded the duration, then no updates are made and the row is not included in the output result set. The output is as shown below:
This resultant IP address is stored in a store/profile along with the StoreTstamp. In the next iteration, this timestamp will be used for processing (as explained above).
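The decision rules for the _buffer step can be summarised in a short Python sketch. It is a simplified model under stated assumptions, not the directive's actual implementation: the store is modelled as a plain dictionary mapping each field value to its StoreTstamp, the function name buffer_filter is hypothetical, and timestamps are plain seconds.

```python
import time

def buffer_filter(rows, store, field, duration_s, now=None):
    """Pass a row only if its field value is new or was last passed more than duration_s ago."""
    now = time.time() if now is None else now
    passed = []
    for row in rows:
        key = row[field]
        last_seen = store.get(key)  # StoreTstamp for this value, or None if unseen
        if last_seen is None or now - last_seen > duration_s:
            store[key] = now        # add the value, or refresh its timestamp
            passed.append(row)
        # otherwise: seen too recently, so no store update and the row is dropped
    return passed

store = {}
rows = [{"SrcIP": "203.0.113.7"}]  # hypothetical sample row

first = buffer_filter(rows, store, "SrcIP", 3600, now=0)      # new value: passed
second = buffer_filter(rows, store, "SrcIP", 3600, now=1800)  # within 1 hour: suppressed
third = buffer_filter(rows, store, "SrcIP", 3600, now=3601)   # duration exceeded: passed again
```

Note that a suppressed row does not refresh the stored timestamp, so the duration is always measured from the last time the value was actually passed on.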
4. In the pipelined query function (f4), the _raise directive sends a notification email to the specified email id for each $SrcIP (row) in the result set returned by the _buffer query directive. The output is as shown below:
The emails received are as shown below:
NOTE: Click here to understand how multiple query functions work in a pipeline.
NOTE: The store created by the _buffer directive can be viewed using the _retrieve directive.