Key information involved in Event Management includes the following:
SNMP messages, which are a standard way of communicating technical information about the status of components of an IT Infrastructure.
Management Information Bases (MIBs) of IT devices. An MIB is the database on each device that contains information about that device, including its operating system, BIOS version, configuration of system parameters, etc. The ability to interrogate MIBs and compare them to a norm is critical to being able to generate events.
Vendors' monitoring tools and agent software.
Correlation Engines, which contain detailed rules to determine the significance of events and the appropriate response to them. Details are provided in paragraph 4.1.5.6.
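The idea of interrogating a MIB and comparing it to a norm can be illustrated with a minimal sketch. It assumes the device values have already been retrieved (for example by an SNMP query) into a plain dictionary. The names `compare_to_norm` and `Deviation`, the parameter names and all values are hypothetical and are not taken from any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deviation:
    """One parameter whose polled value differs from the expected norm."""
    parameter: str
    expected: object
    actual: object


def compare_to_norm(polled: dict, norm: dict) -> list:
    """Return one Deviation for every norm parameter that does not match."""
    deviations = []
    for parameter, expected in norm.items():
        actual = polled.get(parameter)  # None if the device did not report it
        if actual != expected:
            deviations.append(Deviation(parameter, expected, actual))
    return deviations


# Hypothetical norm and polled values; real values would come from the device's MIB.
norm = {"os_version": "10.2", "bios_version": "1.4", "max_connections": 500}
polled = {"os_version": "10.2", "bios_version": "1.3", "max_connections": 500}

deviations = compare_to_norm(polled, norm)
for d in deviations:
    print(f"{d.parameter}: expected {d.expected}, got {d.actual}")
```

Each deviation found this way is a candidate event. Whether it is significant, and what response it needs, would be decided by the rules in the Correlation Engine.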
There is no standard Event Record for all types of event. The exact contents and format of the record depend on the tools being used and on what is being monitored (e.g. a server and the Change Management tools will have very different data and will probably use different formats). However, some key data is usually required from each event for it to be useful in analysis. It should typically include the:
Device
Component
Type of failure
Date/time
Parameters in exception
Value.
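As an illustration only, the key data listed above could be held in a simple record structure. The sketch below is one possible shape, not a standard format. The class name `EventRecord`, the field names, the numeric type chosen for the value and the sample data are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class EventRecord:
    device: str          # device that raised the event
    component: str       # component within the device
    failure_type: str    # type of failure
    timestamp: datetime  # date/time of the event
    parameter: str       # parameter in exception
    value: float         # observed value (assumed numeric for this sketch)

    def summary(self) -> str:
        """Render the record as a single line for logs or reports."""
        return (f"{self.timestamp.isoformat()} {self.device}/{self.component} "
                f"{self.failure_type}: {self.parameter}={self.value}")


# Hypothetical example data
record = EventRecord(
    device="srv-app-01",
    component="disk0",
    failure_type="threshold exceeded",
    timestamp=datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc),
    parameter="used_percent",
    value=92.5,
)
print(record.summary())
```

Keeping the same six fields for every event source, even when each tool adds its own extra data, makes later analysis across tools easier.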
For each measurement period in question, the metrics to check on the effectiveness and efficiency of the Event Management process should include the following:
Number of events by category
Number of events by significance
Number and percentage of events that required human intervention and whether this was performed
Number and percentage of events that resulted in incidents or changes
Number and percentage of events caused by existing problems or Known Errors. This may result in a change to the priority of work on that problem or Known Error
Number and percentage of repeated or duplicated events. This will help in the tuning of the Correlation Engine to eliminate unnecessary event generation and can also be used to assist in the design of better event generation functionality in new services
Number and percentage of events indicating performance issues (for example, growth in the number of times an application exceeded its transaction thresholds over the past six months)
Number and percentage of events indicating potential availability issues (e.g. failovers to alternative devices, or excessive workload swapping)
Number and percentage of each type of event per platform or application
Number and ratio of events compared with the number of incidents.
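A few of the metrics above can be computed from an event log with very little code. The following is a minimal sketch under stated assumptions: events are plain dictionaries, the keys `category`, `significance`, `human_intervention` and `raised_incident` are hypothetical, and the incident count for the period is assumed to come from the Incident Management records.

```python
from collections import Counter


def count_and_percentage(events, predicate):
    """Return (number, percentage) of events for which predicate is true."""
    matching = sum(1 for e in events if predicate(e))
    percentage = 100.0 * matching / len(events) if events else 0.0
    return matching, percentage


# Hypothetical event log for one measurement period
events = [
    {"category": "hardware", "significance": "warning",
     "human_intervention": True, "raised_incident": True},
    {"category": "hardware", "significance": "informational",
     "human_intervention": False, "raised_incident": False},
    {"category": "application", "significance": "exception",
     "human_intervention": True, "raised_incident": True},
    {"category": "network", "significance": "warning",
     "human_intervention": False, "raised_incident": False},
]
incidents_in_period = 5  # assumed total taken from Incident Management

by_category = Counter(e["category"] for e in events)
by_significance = Counter(e["significance"] for e in events)
intervention = count_and_percentage(events, lambda e: e["human_intervention"])
led_to_incident = count_and_percentage(events, lambda e: e["raised_incident"])
events_per_incident = len(events) / incidents_in_period

print("Events by category:", dict(by_category))
print("Events by significance:", dict(by_significance))
print("Required human intervention: %d (%.1f%%)" % intervention)
print("Resulted in incidents: %d (%.1f%%)" % led_to_incident)
print("Events per incident: %.2f" % events_per_incident)
```

The same helper can be reused for the other "number and percentage" metrics by supplying a different predicate, for example one that flags repeated or duplicated events.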