As you may have seen in a previous blog post, I wrote a JSON FlexConnector to parse the alerts output from OSSEC (an open-source, log-based host intrusion detection system). This blog post covers the important step that should always follow writing a FlexConnector: ensuring all message types are categorized correctly, in line with the existing ArcSight Event Categorization. For additional details, this document from back in 2014 is a great help: ArcSightCategorizationWhitePaper.pdf
What is Categorization?
ArcSight Categorization is a layer of abstraction on top of the Common Event Format (CEF). CEF is produced when SmartConnectors parse and tokenise events into known fields. Whilst this is incredibly powerful, it still leaves differences between vendors when it comes to, for example, identifying a login to a Windows desktop, a Cisco router or an SSH session.
Categorization solves this by providing a taxonomy of fields and known values for numerous actions. Mapping all the individual signatures to this taxonomy then allows us to write sensor-independent content.
To summarise, the ArcSight taxonomy provides the following benefits:
- Vendor independence, mainly for content creation
- Analysts do not need to remember specific nomenclatures for all the devices in the environment.
- ArcSight Taxonomy immediately captures event impact
- Content generation is easier and more effective (Rules, Data Monitors, Forensic Analysis, Reports, Pattern Discovery)
- Content is generic (to support a new IDS, none of the rules have to be rewritten, because they utilize the categorized events)
- More powerful content can be written, for example, correlation rules can reason about “failures” and “successes” as opposed to relying upon the reporting devices
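To make the vendor-independence point concrete, here is a minimal Python sketch of a check that reasons about "failures" using only category values. The `Event` class, the vendor list and the event names are made up for illustration; in ESM this logic would live in a filter or rule, not in Python.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    # Hypothetical, simplified event: only the fields this example needs.
    vendor: str
    name: str
    category_behavior: str
    category_outcome: str


def is_failed_login(event: Event) -> bool:
    """Match failed logins by category, without looking at vendor or name."""
    return (
        event.category_behavior.startswith("/Authentication/Verify")
        and event.category_outcome == "/Failure"
    )


# Illustrative events from three different sources (names are invented).
events = [
    Event("Microsoft", "An account failed to log on", "/Authentication/Verify", "/Failure"),
    Event("Cisco", "Login failed", "/Authentication/Verify", "/Failure"),
    Event("OSSEC", "SSHD authentication success", "/Authentication/Verify", "/Success"),
]

failed = [e.vendor for e in events if is_failed_login(e)]
print(failed)
```

The check never mentions a vendor-specific signature, so adding a new log source only requires categorizing its events.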
Give us some examples
The ArcSight Taxonomy uses seven dimensions (fields) to characterize an event.
- Object - Events are always about a certain object. An object can, for example, be an application, the operating system, a database, a file, or the memory of a server
- Behavior - Events not only refer to certain objects, but there is generally an action or a behavior associated with an event. What is being done to an object? Behaviors include access, execution, or modification
- Outcome - With the first two dimensions, we know what object is being referred to and what action targeted the object. However, we do not know whether the behavior was successful or not. Therefore, the outcome is a success, a failure, or an attempt.
- Technique - The type of event with respect to a security domain. Is the event about a denial of service, a brute force attack, or an IDS evasion?
- Device Group - Many devices serve a multitude of purposes in one product. Intrusion Prevention Systems, for example, generate events associated with their firewall capabilities, as well as their intrusion detection capabilities. Routers can generate events associated with user authentication, etc. To distinguish between these types of events, we introduced a dimension called deviceGroup. This dimension lets us query, for example, all the firewall-type events as opposed to all the events generated by a firewall.
- Device Type - The type of device that reported the event, such as a firewall or an operating system. This differs from Device Group, which describes the type of event rather than the device that produced it.
- Significance - We need the capability to separate normal events from hostile events. We also need to know whether certain activity reported by the device impacts the availability, confidentiality, or integrity of our systems. All this information is captured in the significance.
|Event Description||Object||Behavior||Technique||Device Group||Outcome||Significance|
|Network communication was allowed||/Host/Application||/Communicate|| ||/Firewall||/Success||/Informational|
|A process started successfully||/Host/Resource/Process||/Execute/Start|| ||/Application||/Success||/Informational|
|Successful login||/Host/Operating System||/Authentication/Verify|| ||/Operating System||/Success||/Informational|
|Failed login||/Host/Operating System||/Authentication/Verify|| ||/Operating System||/Failure||/Informational/Warning|
|A vulnerability exploit was detected||/Host/Application||/Communicate||/Exploit/Vulnerability||/IDS/Network||/Attempt||/Compromise|
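Because a single typo in a category value silently breaks any content that relies on it, it can help to sanity-check entries before using them. The sketch below is only an illustration: the dictionary keys are hypothetical names, the allowed outcomes come from the Outcome description above, and the "values start with a slash" rule is inferred from the example table rather than taken from ArcSight documentation.

```python
from typing import Dict, List

# Outcomes named in the taxonomy description: success, failure, or attempt.
VALID_OUTCOMES = {"/Success", "/Failure", "/Attempt"}


def check_entry(entry: Dict[str, str]) -> List[str]:
    """Return a list of problems found in one categorization entry."""
    problems = []
    for field, value in entry.items():
        # Empty values are allowed (for example, when no Technique applies).
        if value and not value.startswith("/"):
            problems.append(f"{field} should start with '/': {value}")
    if entry.get("outcome") not in VALID_OUTCOMES:
        problems.append(f"outcome must be one of {sorted(VALID_OUTCOMES)}")
    return problems


# The "Failed login" row from the table, using hypothetical key names.
failed_login = {
    "object": "/Host/Operating System",
    "behavior": "/Authentication/Verify",
    "technique": "",
    "device_group": "/Operating System",
    "outcome": "/Failure",
    "significance": "/Informational/Warning",
}

# The same entry with a mistyped outcome.
typo = dict(failed_login, outcome="/Failed")

print(check_entry(failed_login))
print(check_entry(typo))
```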
Where to begin
As part of writing a FlexConnector, it is important to think about how you are going to populate the deviceEventClassId field. This is usually the field used to set the categorization of an event, as it is expected to uniquely identify the specific event type. In the OSSEC example this is an easy choice, as each alert already carries a unique identifier. For less well-written device logs, the deviceEventClassId can be constructed by concatenating other fields in the log.
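The concatenation idea can be sketched in a few lines of Python. This only illustrates the logic and is not FlexConnector syntax; the field names `facility` and `msg_type` and the `:` separator are assumptions made up for the example.

```python
from typing import Dict, Sequence


def build_device_event_class_id(
    record: Dict[str, object], fields: Sequence[str], sep: str = ":"
) -> str:
    """Join the chosen log fields into one identifier for the event type."""
    # Strip whitespace so "auth " and "auth" do not produce two different IDs.
    parts = [str(record[name]).strip() for name in fields]
    return sep.join(parts)


# Hypothetical parsed log record with no ready-made unique identifier.
record = {"facility": "auth ", "msg_type": "LOGIN_FAILED", "user": "alice"}

class_id = build_device_event_class_id(record, ("facility", "msg_type"))
print(class_id)
```

Only fields that describe the kind of event belong in the identifier. Including something like the user name would create a new deviceEventClassId for every user and make categorization impractical.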
After writing the FlexConnector I allowed it to run for a few days in my environment to build up a list of deviceEventClassIds to categorize as an example here. In reality it is better to find a documented list of all potential field values, as you could be waiting a while for them to fire in real life.
I have shown two methods for identifying the unique deviceEventClassIds. By far the easiest is to use the ESM Command Center with a quick search.
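If the events are available outside the console, for example as an export, the same list can be approximated with a short script. This is a sketch under assumptions: each event is taken to be a dictionary with a `deviceEventClassId` key, and the sample values are invented.

```python
from collections import Counter
from typing import Dict, Iterable


def count_event_class_ids(events: Iterable[Dict[str, str]]) -> Counter:
    """Count how often each deviceEventClassId appears, skipping events without one."""
    return Counter(
        event["deviceEventClassId"]
        for event in events
        if event.get("deviceEventClassId")
    )


# Invented sample of exported events.
exported = [
    {"deviceEventClassId": "5716", "name": "SSHD authentication failed."},
    {"deviceEventClassId": "5501", "name": "Login session opened."},
    {"deviceEventClassId": "5716", "name": "SSHD authentication failed."},
    {"name": "event with no class id"},
]

counts = count_event_class_ids(exported)
for class_id, seen in counts.most_common():
    print(class_id, seen)
```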
Applying the Categorization
Categorization happens on the ArcSight SmartConnector. The connector contains a mapping table (a categorization file) for each device. A categorization file consists of a header line followed by all the categorization entries. The header line looks as follows:
This tells the connector to look out for the deviceEventClassId field and whenever a match is found, it is to set the following seven category fields.
To build a categorization file it is therefore necessary to know as many of the possible deviceEventClassIds as you can. Those deviceEventClassIds then have to be added to the categorization file along with the correct category entries.
Looking at the output of my deviceEventClassId query above, the following might be a suitable first attempt at categorizing these fields:
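As a rough illustration of assembling such a file, the sketch below writes a header line and one row per deviceEventClassId. The column names (`event.deviceEventClassId` followed by `set.event.<categoryField>`), their order, and the example IDs and category values are assumptions for this sketch; check them against the FlexConnector documentation for your connector version before relying on them.

```python
import csv
import io
from typing import Dict

# Assumed category field names and column order (verify against your connector docs).
CATEGORY_FIELDS = [
    "categoryObject",
    "categoryBehavior",
    "categoryTechnique",
    "categoryDeviceGroup",
    "categoryDeviceType",
    "categorySignificance",
    "categoryOutcome",
]
HEADER = ["event.deviceEventClassId"] + [f"set.event.{name}" for name in CATEGORY_FIELDS]


def build_categorization_csv(entries: Dict[str, Dict[str, str]]) -> str:
    """Render a header line plus one row per deviceEventClassId."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(HEADER)
    for class_id in sorted(entries):
        categories = entries[class_id]
        # Category fields that do not apply are left empty.
        writer.writerow([class_id] + [categories.get(name, "") for name in CATEGORY_FIELDS])
    return buffer.getvalue()


# Illustrative entries only; IDs and categories are examples, not a vetted mapping.
entries = {
    "5716": {
        "categoryObject": "/Host/Operating System",
        "categoryBehavior": "/Authentication/Verify",
        "categoryDeviceGroup": "/Operating System",
        "categorySignificance": "/Informational/Warning",
        "categoryOutcome": "/Failure",
    },
    "5715": {
        "categoryObject": "/Host/Operating System",
        "categoryBehavior": "/Authentication/Verify",
        "categoryDeviceGroup": "/Operating System",
        "categorySignificance": "/Informational",
        "categoryOutcome": "/Success",
    },
}

csv_text = build_categorization_csv(entries)
print(csv_text)
```

Generating the file from a dictionary keeps the mapping easy to review and extend as new deviceEventClassIds turn up.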
Once the file is generated, it has to be placed under:
In my case this is:
After restarting the connector and allowing enough time for new events to be created, my query now looks like this: