Events and Incidents in Entuity Network Analytics (ENA)
Learn about the concepts of incidents and events in Entuity Network Analytics, and how they are generated in ENA to help with proactive network management.
What You’ll Learn
- Incidents and Events
- Incident Dashboard
- Entuity Network Analytics
- Network Management
Before I explain the purpose of incidents and how they’re generated, it would be worth comparing the concept of an incident with that of an event. Whereas events are raised to notify a user that something has happened, incidents are raised in response to one or more events to indicate that there’s a situation that needs attention. It’s also worth noting that not all events result in incidents being raised. An important goal of the event system is to reduce the number of separate incidents presented to operators while increasing the value and relevance of each one.
I’m going to use some of the standard provided incident definitions to illustrate this concept. When I select the incidents dashboard, the list of open incidents is displayed.
Let’s take this port status problem. It’s warning the user that a port on a device is in the operationally down state. Even though the system may have detected this from either an incoming trap or a regular polling operation or both, there’s only one incident describing the current situation.
Let’s wait a moment to see if anything changes. The port has now transitioned into the operationally up state and so the incident has automatically closed. It has therefore been removed from the list.
The port has now gone operationally down again. You’ll notice that the message has been updated and that the incident is reporting that the port is now in the flapping state. That’s warning that it is regularly oscillating between the operationally up and down states.
Individual state transitions are no longer causing an incident to be opened and closed. This incident warning of the flapping situation will continue to be open until the port settles in either the up or the down state. If it settles in the down state, the incident will be updated to show the down state. If it settles in the up state, the incident will be closed and removed from the list. That port has now settled in the down state.
It’s no longer flapping up and down, and I can explore the history of events that contributed to its state changes using the incident details option in the context menu.
Here’s the most recent event at the top of the list, which was a port down event. Next is the port flapping event which was internally derived within the event system as a result of seeing several port up and down events over a short duration. Those port up-and-down events were themselves discarded and replaced by this single port flapping event. Further down there was the port up event that had caused it to be seen in the up state previously. All these different events have been gathered together and represented as a single incident.
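As a rough illustration of how several port up and down events over a short duration could be replaced by a single flapping event, here is a minimal sketch. It is not ENA's implementation: the `FlapDetector` class, the 60-second window, and the threshold of four transitions are all assumptions chosen for the example.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class FlapDetector:
    """Hypothetical helper that derives a 'flapping' event from rapid state changes."""
    window_seconds: float = 60.0   # assumed length of the "short duration"
    threshold: int = 4             # assumed number of transitions that counts as flapping
    _times: deque = field(default_factory=deque)

    def observe(self, timestamp: float, state: str) -> str:
        """Return the event to pass on: 'flapping', or the raw 'up'/'down' state."""
        self._times.append(timestamp)
        # Forget transitions that fall outside the sliding window.
        while self._times and timestamp - self._times[0] > self.window_seconds:
            self._times.popleft()
        if len(self._times) >= self.threshold:
            # Enough transitions in the window: replace the raw event.
            return "flapping"
        return state


detector = FlapDetector()
# (timestamp in seconds, new port state); the values are made up for the example.
events = [(0, "down"), (10, "up"), (20, "down"), (30, "up"), (200, "down")]
derived = [detector.observe(t, s) for t, s in events]
print(derived)  # ['down', 'up', 'down', 'flapping', 'down']
```

The last transition arrives long after the others, so the window has emptied and the port is reported as simply down, which mirrors the port settling in the down state.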
Other incidents, such as this one warning of a device reboot, may also warn of a sequence of related but different events.
These can be seen if the details are displayed. This is an example of an incident which doesn’t have an explicit closing event, so it’s automatically closed after a time defined in its age out setting.
The age out is an optional setting and many incidents don’t have one. An incident is considered to be open until either its age out time expires (if it has this setting), another event explicitly closes it, or an operator manually closes it.
There’s then a period of time, referred to as the expiry period, during which the same incident will be reopened if an opening event is received. If another event updates the incident before the expiry timer has expired, the timer starts again.
Once the expiry period has been exceeded, any new opening events will open a new incident instead, but the details of the original incident are maintained in the database for a further seven days so that it can be viewed when investigating the history of a problem.
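The open, close, reopen, and expire behaviour described above can be sketched as a small state model. This is only an illustration under stated assumptions, not how ENA implements it: `IncidentTracker`, its method names, and the timings are hypothetical, and the sketch assumes the age out is measured from the incident's last update.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Incident:
    incident_id: int
    opened_at: float
    closed_at: Optional[float] = None  # None while the incident is open
    last_update: float = 0.0           # the expiry timer restarts from here


class IncidentTracker:
    """Hypothetical model of one incident type on one component."""

    def __init__(self, expiry_seconds: float, age_out_seconds: Optional[float] = None):
        self.expiry_seconds = expiry_seconds
        self.age_out_seconds = age_out_seconds  # optional setting
        self.current: Optional[Incident] = None
        self._next_id = 1

    def _apply_age_out(self, now: float) -> None:
        # Close an open incident whose age out time has passed (assumption:
        # the age out is counted from the last update).
        inc = self.current
        if inc and inc.closed_at is None and self.age_out_seconds is not None:
            if now - inc.last_update >= self.age_out_seconds:
                inc.closed_at = inc.last_update + self.age_out_seconds
                inc.last_update = inc.closed_at

    def opening_event(self, now: float) -> Incident:
        self._apply_age_out(now)
        inc = self.current
        if inc is not None:
            if inc.closed_at is None:
                inc.last_update = now      # still open: just update it
                return inc
            if now - inc.last_update <= self.expiry_seconds:
                inc.closed_at = None       # within the expiry period: reopen
                inc.last_update = now
                return inc
        # No incident yet, or the expiry period has passed: open a new one.
        self.current = Incident(self._next_id, opened_at=now, last_update=now)
        self._next_id += 1
        return self.current

    def closing_event(self, now: float) -> None:
        self._apply_age_out(now)
        inc = self.current
        if inc is not None and inc.closed_at is None:
            inc.closed_at = now
            inc.last_update = now


tracker = IncidentTracker(expiry_seconds=600)
a = tracker.opening_event(0)
tracker.closing_event(100)
b = tracker.opening_event(300)    # 200 s after closing: same incident reopened
tracker.closing_event(400)
c = tracker.opening_event(2000)   # 1600 s after closing: a new incident
print(a.incident_id, b.incident_id, c.incident_id)  # 1 1 2
```

The seven-day retention of expired incidents is deliberately left out; in this sketch an expired incident is simply replaced by the new one.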
Next is how an operator can manually close an incident without waiting for it to be closed automatically.
If the toggle isn’t clicked, the incident is simply closed but will be reopened if a new opening event is received. If the toggle is clicked, the incident will be considered to have already expired and won’t be reopened. Details of this incident will still be available through a view for a further seven days.
The history of events can also be displayed using the events dashboard.
Events and changes in the state of incidents can also be forwarded to an external event handling system such as BMC TrueSight Operations Management.
The list of incidents and events displayed in the viewer can be controlled in several ways.
Let’s go back and look at the incidents again. If a view is selected in the Explorer, then only those incidents relating to components within that view are displayed in the incidents dashboard. In a similar manner, selecting a device or port in the Explorer causes the list of incidents to be further filtered to only those originating from that component. These are the incidents specifically associated with the device hq02.
If I focus on this specific port, then only the incident associated with that port is displayed.
The incidents generated for each view are controlled by an incident filter, and different incident filters can be created and assigned to different views. If an incident doesn’t pass the incident filter, it will not be forwarded to an external incident handler.
A similar facility is provided for events. However, it should be noted that although the incidents are opened and closed by events, this happens independently of any event filter configuration. Regardless of which view or component is selected in the Explorer, a further filter option is also available. These filters only control which incidents and events appear in the dashboard and have no impact on the forwarding of events or incidents to external event or incident handlers.
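To show how the two levels of filtering differ, here is a minimal sketch. It is an illustration only: the function names, the port name `Gi0/1`, the device `branch01`, and the incident names are hypothetical, and only the device name hq02 comes from the walkthrough. The view's incident filter decides which incidents exist for the view and can be forwarded, while the Explorer selection and the dashboard filter only narrow what is displayed.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass(frozen=True)
class Incident:
    name: str
    device: str
    port: str = ""


def incidents_for_view(all_incidents: List[Incident],
                       view_devices: Set[str],
                       incident_filter: Callable[[Incident], bool]) -> List[Incident]:
    """Incidents raised for a view: component is in the view AND passes its incident filter."""
    return [i for i in all_incidents if i.device in view_devices and incident_filter(i)]


def dashboard_rows(view_incidents: List[Incident],
                   selected_device: Optional[str] = None,
                   selected_port: Optional[str] = None,
                   display_filter: Callable[[Incident], bool] = lambda i: True) -> List[Incident]:
    """Display-only narrowing; it never changes what is forwarded."""
    rows = view_incidents
    if selected_device is not None:
        rows = [i for i in rows if i.device == selected_device]
    if selected_port is not None:
        rows = [i for i in rows if i.port == selected_port]
    return [i for i in rows if display_filter(i)]


incidents = [
    Incident("Port status problem", "hq02", "Gi0/1"),
    Incident("Device reboot", "hq02"),
    Incident("Port status problem", "branch01", "Gi0/3"),
]

# The view contains only hq02 and its incident filter lets everything through.
forwarded = incidents_for_view(incidents, {"hq02"}, lambda i: True)

# Focusing on one port in the Explorer narrows the dashboard, not the forwarding.
rows = dashboard_rows(forwarded, selected_device="hq02", selected_port="Gi0/1")
print(len(forwarded), len(rows))  # 2 1
```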