Centreon Enterprise Server Documentation Release 3.0 Merethis
December 19, 2014

MERETHIS 12 AVENUE RASPAIL FR94290 GENTILLY

The Plugins Packs are a set of templates and plugins developed by Merethis. They simplify and optimize the monitoring of your IT infrastructure. The pre-configured templates allow a quick and efficient deployment. This documentation explains how to use them.

CHAPTER 1 Overview

1.1 Plugins Packs and Centreon Plugins

Plugins Packs are sets of plugins preconfigured to be easy to install and use on a Centreon installation. These plugins are:

• either existing community plugins selected and validated by Merethis because they are known to work
• or plugins written by Merethis that are distributed as free software and available on the Forge: https://forge.centreon.com/projects/centreon-plugins

The added value of Plugins Packs lies in the validation and preconfiguration of the plugins. They come with Centreon templates for commands, services and hosts that are installed with the Plugin Pack. Once a pack is installed, all you have to do is refer to the associated help (in case anything needs to be configured or activated) and create your hosts and services based on these templates.

Note: Plugins from Centreon Plugins are free software (GPL). You can contribute to this project or use the plugins WITHOUT being a Centreon Enterprise Server (CES) client. Plugins Packs, however, make them easier to use through templates, macros and documentation.

Plugins Packs are packaged as RPM files. There are 2 kinds of RPMs: one contains the plugins, the other contains the templates. More information is available in the installation section.

1.2 Connectors

In addition to the plugins available on Centreon Plugins, a Plugins Packs subscription gives you access to specific connectors or agents used by the plugins that are not available elsewhere.
Here are the corresponding connectors:

Connector          Description
AS400              Java-based connector allowing you to execute checks on AS400
JMX                Java-based connector allowing you to monitor Java application servers through JMX
WMI                C/C++ connector allowing you to monitor Windows environments through WMI
ESXD               Perl daemon using the VMware SDK to monitor VMware platforms
NSClient++ (TODO)  Merethis packaging of the NSClient++ project, ready to use with embedded plugins
NRPE (TODO)        Merethis packaging of the NRPE server with the patches needed to comply with Plugins Packs

Please refer to the documentation of each connector on http://documentation.centreon.com for more information.

1.3 Templates description

Each Plugins Pack, once installed, creates commands, service templates and host templates ready to use in Centreon.

As an example, here is a diagram of the hierarchy between the different Centreon objects.

Note: Hosts and services illustrated here are templates, not actual hosts or services.

In this example, the Plugin Pack ships 2 commands, 2 service templates (one of which has an extra inheritance level) and 1 host template. To benefit from this Plugin Pack, you need to define your own hosts and services based on these templates.

Generic templates are provided by a base pack required by all other packs. Here is the structure of the base pack:

Warning: Templates provided by the Plugins Packs must NOT be edited, except those ending with “-custom”, which are not overridden during upgrades.

Warning: Only service templates have a “-custom” template at the moment. Host templates are overridden during upgrades; this will be improved in a future version.
CHAPTER 2 Prerequisites

2.1 General requirements

Plugins Packs are available to clients of Centreon Enterprise Server (CES) Advanced or Complete. Please refer to the following page to compare versions: http://www.centreon.com/Content-Products-Entreprise-Server/ces-comparisonchart

Here are the software dependencies needed to use Plugins Packs:

• Centreon Enterprise Server >= 2.2 (or RPM installed on RHEL >= 5, CentOS >= 5)
• Centreon >= 2.4.4
• Centreon Clapi >= 1.5
• Centreon Plugins Packs Manager >= 1.x

Note: Plugins Packs Manager is a Centreon module in charge of the installation and removal of Plugins Packs. It is a key component that receives regular updates, so it is recommended to keep it up to date.

2.2 Plugins Packs YUM Repository

Merethis provides RPMs for its products through Centreon Enterprise Server (CES). Plugins Packs are also packaged as RPMs and distributed through a dedicated YUM repository.

2.2.1 Installation

To install the Plugins Packs repository, install a specific RPM that deploys the .repo file pointing to the YUM repository containing the Plugins Packs RPMs. This must be done on all Centreon servers (central, pollers) that need access to the Plugins Packs.

Once you have retrieved the RPM file from the Merethis support, run the following command:

    yum install plugin-packs-1.0-1.noarch.rpm

The repository is now installed.

2.3 Dependencies

Dependencies can now be installed on the central server. All dependencies are provided by the YUM repository.

2.3.1 Installation

The Plugins Packs need two Centreon modules to work properly: centreon-clapi and centreon-pp-manager.

To install them, run the following commands:

    yum install centreon-clapi
    yum install centreon-pp-manager

Once the RPM files are installed, you need to finish the modules setup through the web frontend.
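Before moving on to the web frontend, it can help to confirm that an installed module meets the minimum version listed in the general requirements (for example Centreon Clapi >= 1.5). The following is a minimal sketch, not part of the Plugins Packs tooling: the helper names version_ge and check_minimum are hypothetical, the comparison only handles plain dotted numeric versions of up to three fields, and it is not a full RPM version comparison.

```shell
#!/bin/sh
# Sketch only: compare dotted numeric versions (up to three fields) with POSIX sort.
# version_ge A B succeeds when A >= B.
version_ge() {
    # Sort both versions numerically field by field and keep the lowest one.
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -t. -k1,1n -k2,2n -k3,3n | head -n 1)
    # A >= B exactly when B is the lowest of the two.
    [ "$lowest" = "$2" ]
}

# check_minimum NAME INSTALLED MINIMUM prints a one-line verdict.
check_minimum() {
    if version_ge "$2" "$3"; then
        printf '%s %s: OK (>= %s)\n' "$1" "$2" "$3"
    else
        printf '%s %s: too old (need >= %s)\n' "$1" "$2" "$3"
    fi
}

# Example with literal versions. On a real server the installed version
# could be read with: rpm -q --qf '%{VERSION}' centreon-clapi
check_minimum centreon-clapi 1.5.1 1.5
check_minimum centreon-clapi 1.4 1.5
```

With the literal versions above, the first call reports OK and the second reports that the version is too old. The minimum values are passed as arguments so that the same helper can be reused for the other requirements.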
Go to Administration > Modules. You get a page listing all modules on the filesystem, whether they are installed or not.

Click on the gears icon of the module. On the page that opens, click on “Install Module”, then click on “Back”.

Note: The Modules section of Centreon 2.4 has been renamed Extensions in Centreon 2.5.

Repeat the same procedure for the centreon-clapi module.

CHAPTER 3 Installation

3.1 Search for available packages

Merethis provides several Plugins Packs. To list all Plugins Packs available on your server, run the following command:

    yum search ces-pack

3.2 Plugins Packs installation

3.2.1 Pack installation

First, install the set of Centreon templates provided by the Plugins Pack. Use the following command on the central server:

    yum install ces-pack-$PLUGIN-PACK$

Where $PLUGIN-PACK$ is the name of the Pack as listed by the yum search command.

3.2.2 Plugin installation

Then install the plugins referenced by the templates installed in the previous step. Use the following command on all pollers that will execute the checks:

    yum install ces-plugins-$PLUGIN-PACK$

Where $PLUGIN-PACK$ is the name of the Pack as listed by the yum search command.

3.2.3 Best practices

It is up to you to decide whether to install the plugins on all pollers or only on the pollers that will perform the checks. Keep in mind that if a poller does not have the plugin, you may get errors if you later move a monitored host from a poller that has the plugin to one that does not.

Moreover, if you update packs on the central server, it is highly recommended to also update the associated plugins on the pollers. New checks are sometimes added to the packs, and they will not work on a poller whose plugins do not provide the corresponding command.
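The two installation steps above (templates on the central server, plugins on every poller that runs the checks) can be summarized in a small dry-run sketch. It only prints the yum commands and does not execute them. The function names, the pack name "example-pack" and the poller names are hypothetical placeholders; use the pack name returned by the yum search command.

```shell
#!/bin/sh
# Sketch only: build the RPM names used for the pack (templates) and the plugins.
pack_rpm()    { printf 'ces-pack-%s\n' "$1"; }
plugins_rpm() { printf 'ces-plugins-%s\n' "$1"; }

# print_install_plan PACK POLLER... prints (does not run) the commands:
# the templates RPM goes on the central server, the plugins RPM on each poller.
print_install_plan() {
    pack=$1
    shift
    printf 'central: yum install %s\n' "$(pack_rpm "$pack")"
    for poller in "$@"; do
        printf '%s: yum install %s\n' "$poller" "$(plugins_rpm "$pack")"
    done
}

# Hypothetical pack and poller names, for illustration only.
print_install_plan example-pack poller-1 poller-2
```

Listing every poller in the plan follows the best practice above: a host can then be moved between pollers without hitting a missing plugin.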
3.3 Plugins Packs management

Installed Plugins Packs can be managed through the web interface provided by the Plugins Packs Manager Centreon module.

Go to Administration > Modules > Plugin packs > Setup. You will see the list of all the Plugins Packs installed on your server.

Each Plugins Pack displays the following information:

• Version: version of the Plugin Pack, matches the RPM version
• Number of host templates: number of host templates provided by the Plugin Pack; click on the question mark to get the corresponding names
• Number of service templates: number of service templates provided by the Plugin Pack; click on the question mark to get the corresponding names
• Number of commands: number of commands provided by the Plugin Pack; click on the question mark to get the corresponding names
• Actions: available actions on the Plugin Pack

You can click on the icon of a Plugin Pack to get help; this opens the documentation in a new window.

CHAPTER 4 Uninstallation

4.1 Pack uninstallation

First you have to remove the Pack.

Warning: Before removing the RPM, you MUST remove the pack from the web interface, otherwise the database will not be cleaned up.

Warning: Before removing the RPM, you need to ensure that none of the hosts or services you have defined are still linked to templates (hosts, services, commands) provided by this Pack.

A pack must be uninstalled through the management page. For each installed Plugins Pack, you will find an icon located in the Actions column. Click on the icon to uninstall the corresponding Pack.

You will get a confirmation dialog. Click Yes to confirm.

If you still have hosts or services linked to templates provided by this pack, you will not be able to uninstall the pack, and a dialog will tell you so. To work around this problem, you can:

1. Delete the hosts or services linked to the templates provided by the pack
2. Or unlink the hosts or services linked to the templates provided by the pack

Once this is done, you will be able to uninstall the pack.

Finally, you can remove the pack from the server:

    $ yum erase ces-pack-$PLUGIN-PACK$

Where $PLUGIN-PACK$ is the name of the Plugin Pack.

4.2 Plugin uninstallation

Once the pack is uninstalled, you can safely remove the plugins on all pollers using:

    $ yum erase ces-plugins-$PLUGIN-PACK$

Where $PLUGIN-PACK$ is the name of the Plugin Pack.

CHAPTER 5 Monitoring checks list

5.1 Infrastructure services monitoring

5.1.1 Infra-DHCP

Template to check a DHCP system

Name  Description
DHCP  Check the availability of the DHCP server.

5.1.2 Infra-IMAP

Template to check an IMAP system

Name  Description
IMAP  Check the availability of the IMAP service for a specific address.

5.1.3 Infra-POP

Template to check a POP system

Name  Description
POP   Check the availability of the POP service for a specific address.

5.1.4 Infra-SSH

Template to check an SSH system

Name  Description
SSH   Check the availability of the SSH service for a specific address.

5.2 Operating systems monitoring

5.2.1 OS-AIX-SNMP

Template to check AIX server using SNMP protocol

Name                Description
Cpu                 Check the CPU utilization rate of the machine. This check can give the average CPU utilization rate and the rate per CPU for multi-core CPUs.
Swap                Check virtual memory usage (SWAP).
Cpu                 Check the CPU utilization rate of the machine. This check can give the average CPU utilization rate and the rate per CPU for multi-core CPUs.
ProcessGeneric      Check that a Linux process/service is working.
Swap                Check virtual memory usage (SWAP).
DiskGeneric-Id      Check the rate of free space on the disk. For each check, the name of the disk appears (« label ») rather than the assigned letter. Thresholds can be expressed as a percentage or as remaining free space.
DiskGenericName     Check the rate of free space on the disk. For each check, the mount point of the disk appears (« label »). Thresholds can be expressed as a percentage or as remaining free space.
Disk-Global         Check the rate of free space on the disks. For each check, the mount point of the disks appears (« label »). Thresholds can be expressed as a percentage or as remaining free space.
TrafficGeneric-Id   Check the bandwidth of the interface. For each check, the name of the interface appears (« label », a shortcut describing the interface).
TrafficGenericName  Check the bandwidth of the interface. For each check, the name of the interface appears (« label », a shortcut describing the interface).
TrafficGlobal       Check the bandwidth of the interfaces. For each check, the name of the interface appears (« label », a shortcut describing the interface).

5.2.2 OS-AS400

Template to check AS/400 server using Centreon-Connector-as400 protocol

Name                            Description
Asp1-Usage                      Check the occupation rate of ASP1. Report an OK state if the occupation rate is below the WARNING threshold. Report a WARNING state if the occupation rate is above the WARNING threshold. Report a CRITICAL state if the occupation rate is above the CRITICAL threshold.
CPU-Usage                       Check the time during which the AS/400 CPU was used. Report an OK state if the CPU time is below the WARNING threshold. Report a WARNING state if the CPU time exceeds the WARNING threshold. Report a CRITICAL state if the CPU time exceeds the CRITICAL threshold. It may exceed 100% on an uncapped partition.
Disk-State                      Check the running state of all physical disks. A physical disk can take the following states: no control, active, failed, hardware failure, rebuild, not ready, protected, busy, not operational, unknown state (13 states in total). Report an OK state if all disks are active. Report a CRITICAL state if at least one disk is in a state other than active.
Disk-Usage-Repartition          Compare the utilization rates of the different physical disks. Calculate the gap between the minimum and the maximum utilization rate across all physical disks. Report an OK state if the gap is below the WARNING threshold. Report a WARNING state if the gap is above the WARNING threshold. Report a CRITICAL state if the gap is above the CRITICAL threshold.
Page-Fault                      Check the page fault rate. Retrieve the page rate per second and the page fault rate per second, database and non-database. Calculate the percentage of page faults relative to pages. Report an OK state if the percentage is below the WARNING threshold. Report a WARNING state if the percentage is above the WARNING threshold. Report a CRITICAL state if the percentage is above the CRITICAL threshold.
Disk-Usage                      Check the occupation rate of the physical disk. Report an OK state if the occupation rate is below the WARNING threshold. Report a WARNING state if the occupation rate is above the WARNING threshold. Report a CRITICAL state if the occupation rate is above the CRITICAL threshold.
Job-Exist                       Check job existence. This indicator does not check the state of the job. Report an OK state if the job exists. Report a CRITICAL state if the job does not exist.
Job-Has-No-Msgw                 Check job existence. Check that the job has no MSGW. Report an OK state if the job exists. Report a CRITICAL state if the job does not exist or has a MSGW.
Job-Queue-Status                Check job queue existence and state. A job queue can be in the RELEASED or HELD state. Report an OK state if the job queue is in the RELEASED state. Report a CRITICAL state if the job queue is in the HELD state or does not exist.
Job-Queue-Wait-Job-Count        Check the number of jobs waiting in a job queue, ignoring job priorities. Jobs in a job queue can be in different states: ACTIVE, HELD or SCHEDULED. Report an OK state if the number of jobs in the HELD state is below the WARNING threshold. Report a WARNING state if the number of jobs in the HELD state is above the WARNING threshold. Report a CRITICAL state if the number of jobs in the HELD state is above the CRITICAL threshold.
Message-Queue-Size              Check the message queue size. Retrieve all messages whose severity is greater than or equal to the defined argument. Report an OK state if the number of messages is below the WARNING threshold. Report a WARNING state if the number of messages is above the WARNING threshold. Report a CRITICAL state if the number of messages is above the CRITICAL threshold.
SubSystem-Exist                 Check the presence and the state of a subsystem. A subsystem can be in the ACTIVE, ENDING, INACTIVE, RESTRICTED or STARTING state. Report an OK state if the subsystem state is ACTIVE. Report a CRITICAL state if the subsystem state is other than ACTIVE, or if the subsystem was not found.
Job-Has-No-Msgw                 Check job existence. Check that the job has no MSGW. Report an OK state if the job exists. Report a CRITICAL state if the job does not exist or has a MSGW.
Message-Queue-Filtered-Size     Check the message queue size. Retrieve all messages whose severity is greater than or equal to the defined argument. Report an OK state if the number of messages is below the WARNING threshold. Report a WARNING state if the number of messages is above the WARNING threshold. Report a CRITICAL state if the number of messages is above the CRITICAL threshold.
Jobs-Have-No-Msgw-In-SubSystem  Check job existence. Check that the job has no MSGW. Report an OK state if the job exists. Report a CRITICAL state if the job does not exist or has a MSGW.
Job-Running-In-Subsystem        Check job existence. Check that the job has no MSGW. Report an OK state if the job exists. Report a CRITICAL state if the job does not exist or has a MSGW.
Job-In-Subsystem                Check job existence. Check that the job has no MSGW. Report an OK state if the job exists. Report a CRITICAL state if the job does not exist or has a MSGW.

5.2.3 OS-FreeBSD-SNMP

Template to check FreeBSD server using SNMP protocol

Name                Description
Cpu                 Check the CPU utilization rate of the machine. This check can give the average CPU utilization rate and the rate per CPU for multi-core CPUs.
Load                Check the server load average.
Memory              Check the utilization rate of memory (RAM).
Swap                Check virtual memory usage (SWAP).
DiskGeneric-Id      Check the rate of free space on the disk. For each check, the name of the disk appears (« label ») rather than the assigned letter. Thresholds can be expressed as a percentage or as remaining free space.
DiskGenericName     Check the rate of free space on the disk. For each check, the mount point of the disk appears (« label »). Thresholds can be expressed as a percentage or as remaining free space.
Disk-Global         Check the rate of free space on the disks. For each check, the mount point of the disks appears (« label »). Thresholds can be expressed as a percentage or as remaining free space.
Disk-IO             Check disk access. For each check, the name of the disk appears (“label”) rather than the assigned letter.
ProcessGeneric      Check that a Unix process/service is working.
TrafficGeneric-Id   Check the bandwidth of the interface. For each check, the name of the interface appears (« label », a shortcut describing the interface).
TrafficGenericName  Check the bandwidth of the interface. For each check, the name of the interface appears (« label », a shortcut describing the interface).
TrafficGlobal       Check the bandwidth of the interfaces. For each check, the name of the interface appears (« label », a shortcut describing the interface).
Uptime              Time since the server has been working and available.

5.2.4 OS-Linux-SNMP

Template to check Linux server using SNMP protocol
Name                Description
Cpu                 Check the CPU utilization rate of the machine. This check can give the average CPU utilization rate and the rate per CPU for multi-core CPUs.
Load                Check the server load average.
Memory              Check the utilization rate of memory (RAM).
Swap                Check virtual memory usage (SWAP).
ProcessGeneric      Check that a Linux process/service is working.
Disk-IO             Check disk access. For each check, the name of the disk appears (“label”) rather than the assigned letter.
Uptime              Time since the server has been working and available.
TrafficGlobal       Check the bandwidth of the interfaces. For each check, the name of the interface appears (« label », a shortcut describing the interface).
TrafficGeneric-Id   Check the bandwidth of the interface. For each check, the name of the interface appears (« label », a shortcut describing the interface).
TrafficGenericName  Check the bandwidth of the interface. For each check, the name of the interface appears (« label », a shortcut describing the interface).
Disk-Global         Check the rate of free space on the disks. For each check, the mount point of the disks appears (« label »). Thresholds can be expressed as a percentage or as remaining free space.
DiskGeneric-Id      Check the rate of free space on the disk. For each check, the name of the disk appears (« label ») rather than the assigned letter. Thresholds can be expressed as a percentage or as remaining free space.
DiskGenericName     Check the rate of free space on the disk. For each check, the mount point of the disk appears (« label »). Thresholds can be expressed as a percentage or as remaining free space.
CpuDetailed         Check the detailed CPU utilization rate of the machine. This check can give the average CPU utilization rate and the rate per CPU for multi-core CPUs.

5.2.5 OS-Windows-NRPE

Template to check Windows server using NRPE protocol (NSClientpp 0.4.2)
Name                  Description
Cpu                   Check the CPU utilization rate of the machine. This check can give the average CPU utilization rate and the rate per CPU for multi-core CPUs.
Disks                 Check Windows disk usage.
Memory                Check the utilization rate of memory (RAM).
Services-Auto         Check that all auto-start services are running.
Swap                  Check the utilization rate of virtual memory (SWAP).
Disks                 Check Windows disk usage.
CounterGeneric        Check the CPU utilization rate of the machine. This check can give the average CPU utilization rate and the rate per CPU for multi-core CPUs.
Processgeneric        Check processes (state, memory size, numbers, ...).
Files-Generic         Check files/directories (size, last access, ...).
EventlogGeneric       Check event log errors.
ActiveSessions        Check active sessions.
LogfilesGeneric       Check log files.
Ntp                   Check the synchronization with an NTP server.
ServicesGeneric-Name  Check Windows services states.
Task-Generic          Check Windows scheduled tasks.
Uptime                Check Windows uptime.

5.2.6 OS-Windows-SNMP

Template to check Windows server using SNMP protocol

Name                Description
Cpu                 Check the CPU utilization rate of the machine. This check can give the average CPU utilization rate and the rate per CPU for multi-core CPUs.
Memory              Check the utilization rate of memory (RAM) and of the paging file. The paging file is a partial copy of the RAM stored as a file; it frees memory by copying the least-used elements to the file.
Swap                Check the utilization rate of virtual memory (SWAP).
ProcessGeneric      Check that Windows processes are started.
TrafficGeneric-Id   Check the bandwidth of the interface. For each check, the name of the interface appears (« label », a shortcut describing the interface).
DiskGenericName     Check the rate of free space on the disk. For each check, the name of the disk appears (« label ») rather than the assigned letter. Thresholds can be expressed as a percentage or as remaining free space.
Uptime              Check the uptime of the Windows server since the last reboot. It is only an indication, with no threshold.
TrafficGenericName  Check the bandwidth of the interface. For each check, the name of the interface appears (« label », a shortcut describing the interface).
TrafficGlobal       Check the bandwidth of the interface. For each check, the name of the interface appears (« label », a shortcut describing the interface).
Disk-Global         Check the rate of free space on the disk. For each check, the name of the disk appears (« label ») rather than the assigned letter. Thresholds can be expressed as a percentage or as remaining free space.
DiskGeneric-Id      Check the rate of free space on the disk. For each check, the name of the disk appears (« label ») rather than the assigned letter. Thresholds can be expressed as a percentage or as remaining free space.

5.2.7 OS-Windows-WMI

Template to check Windows server using WMI protocol

Name            Description
BootServices    Check the state of all services that are started at boot.
CPU             Check the CPU utilization rate of the machine. This check can give the average CPU utilization rate and the rate per CPU for multi-core CPUs.
Disk-C          Check the rate of free space on disk C. Thresholds can be expressed as a percentage or as remaining free space.
PhysicalMemory  Check the utilization of the physical memory of the Windows server.
VirtualMemory   Check the utilization of the virtual memory of the Windows server.
ProcessGeneric  Check that Windows processes are started.
ServiceGeneric  Check that Windows services are started.
Uptime          Check the uptime of the Windows server since the last reboot. It is only an indication, with no threshold.
TrafficGeneric  Check the bandwidth of the interface. For each check, the name of the interface appears (« label », a shortcut describing the interface).
DiskGeneric     Check the rate of free space on the disk. For each check, the name of the disk appears (« label ») rather than the assigned letter. Thresholds can be expressed as a percentage or as remaining free space.

5.3 Network equipment monitoring

5.3.1 Net-Cisco-WaaS-SNMP

Template to check Cisco WaaS using SNMP protocol

Name          Description
Sessions-Tfo  Return the number of TCP connections in the passthrough and optimized states with the Cisco WaaS tool.

5.3.2 Net-FW-Arkoon-SNMP

Template to check Arkoon firewall using SNMP protocol

Name                        Description
Uptime                      Get uptime.
Memory                      Check memory usage.
Load                        Check CPU usage.
Swap                        Check virtual memory usage (SWAP).
Traffic-Global              Check traffic of multiple network interfaces.
Traffic-Generic-Name        Check traffic of a network interface.
Process-Generic             Check that an Arkoon process/service is working.
Packet-Errors-Generic-Name  Check packet errors/discards of a network interface.
Packet-Errors-Global        Check packet errors/discards of multiple network interfaces.

5.3.3 Net-FW-Cisco-Asa-SNMP

Template to check Cisco ASA firewall using SNMP protocol

Name                       Description
Cpu                        Check the CPU utilization rate of the machine. This check can give the average CPU utilization rate.
Memory                     Check machine memory usage.
Sessions                   Check current/average connections.
Traffic-Generic-Name       Check traffic of a network interface.
Traffic-Generic-Id         Check traffic of a network interface.
Traffic-Global             Check traffic of multiple network interfaces.
Failover                   Check failover status.
Packet-Errors-GenericId    Check packet errors/discards of a network interface.
Packet-Errors-GenericName  Check packet errors/discards of a network interface.
Packet-Errors-Global       Check packet errors/discards of multiple network interfaces.
5.3.4 Net-Juniper-MAG

Template to check Juniper MAG using SNMP protocol

Name                 Description
CPU                  Check the CPU utilization rate of the machine. This check can give the average CPU utilization rate.
Memory               Check machine memory usage.
Logfile              Check log file usage.
Swap                 Check machine swap usage.
BladeTemperature     Check current blade temperature.
TrafficGeneric-ID    Check the bandwidth of the interface. For each check, the name of the interface appears (« label », a shortcut describing the interface).
TrafficGeneric-Name  Check the bandwidth of the interface. For each check, the name of the interface appears (« label », a shortcut describing the interface).
Traffic-Global       Check the bandwidth of the interfaces. For each check, the name of the interface appears (« label », a shortcut describing the interface).
Users                Check currently connected users.
Disk                 Check disk usage.

5.3.5 Net-Juniper-SA

Template to check Juniper SA using SNMP protocol

Name                 Description
CPU                  Check the CPU utilization rate of the machine. This check can give the average CPU utilization rate.
Logfile              Check log file usage.
Memory               Check machine memory usage.
Swap                 Check machine swap usage.
CPU-Detailed         Check the CPU utilization rate of the machine. This check can give the average CPU utilization rate.
Disk                 Check disk usage.
Users                Check currently connected users.
TrafficGeneric-ID    Check the bandwidth of the interface. For each check, the name of the interface appears (« label », a shortcut describing the interface).
TrafficGeneric-Name  Check the bandwidth of the interface. For each check, the name of the interface appears (« label », a shortcut describing the interface).
Traffic-Global       Check the bandwidth of the interfaces. For each check, the name of the interface appears (« label », a shortcut describing the interface).
5.3.6 Net-Juniper-SRX

Template to check Juniper SRX using SNMP protocol

Name                Description
Hardware            Check hardware.
Cpu-Routing         Check CPU usage of the routing engine.
CpuForwarding       Check CPU usage of the packet forwarding engine.
MemoryRouting       Check memory usage of the routing engine.
MemoryForwarding    Check memory usage of the packet forwarding engine.
FlowSessions        Check Packet Forwarding Engine sessions usage.
Cp-Sessions         Check CP (‘central point’) sessions usage.
TrafficGeneric-Id   Check the bandwidth of the interface. For each check, the name of the interface appears (« label », a shortcut describing the interface).
TrafficGenericName  Check the bandwidth of the interface. For each check, the name of the interface appears (« label », a shortcut describing the interface).
TrafficGlobal       Check the bandwidth of the interfaces. For each check, the name of the interface appears (« label », a shortcut describing the interface).
DiskGeneric-Id      Check the rate of free space on the disk (use the ID). Thresholds can be expressed as a percentage or as remaining free space.
DiskGenericName     Check the rate of free space on the disk (use the Name. Difficult to use: prefer ‘Id’ or ‘Global’ to filter). Thresholds can be expressed as a percentage or as remaining free space.
Disk-Global         Check the rate of free space on the disk (use the Name. Difficult to use: prefer ‘Id’ or ‘Global’ to filter). Thresholds can be expressed as a percentage or as remaining free space.

5.3.7 Net-Juniper-SSG

Template to check Juniper SSG using SNMP protocol

Name                 Description
Sessions             Check current active sessions.
Cpu                  Check the CPU utilization rate of the machine. This check can give the average CPU utilization rate.
Memory               Check machine memory usage.
Hardware             Check hardware.
Sessions             Check current active sessions.
TrafficGeneric-Id    Check the bandwidth of the interface. For each check, the name of the interface appears (« label », a shortcut describing the interface).
TrafficGeneric-Name  Check the bandwidth of the interface. For each check, the name of the interface appears (« label », a shortcut describing the interface).
Traffic-Global       Check the bandwidth of the interfaces. For each check, the name of the interface appears (« label », a shortcut describing the interface).
Cpu                  Check the CPU utilization rate of the machine. This check can give the average CPU utilization rate.
Memory               Check machine memory usage.
Hardware             Check hardware.

5.3.8 Net-Fortinet-Fortigate-SNMP

Template to check Fortinet Fortigate using SNMP protocol

Name           Description
Cpu            Check the CPU utilization rate of the machine. This check can give the average CPU utilization rate.
Memory         Check memory usage.
Sessions       Check current active sessions.
Hardware       Check hardware sensors.
VirusGlobal    Check blocked and detected viruses on multiple virtual domains.
Virus-Name     Check blocked and detected viruses on a virtual domain.
TrafficGlobal  Check traffic of multiple network interfaces.
TrafficName    Check traffic of a network interface.
Traffic-Id     Check traffic of a network interface.
Disk           Check system disk usage.
ClusterStatus  Check cluster status.

5.3.9 Net-PaloAlto-500-SNMP

Host Template to check Palo Alto Firewall using SNMP protocol

Name                         Description
CPU                          Check CPU utilization.
Hardware                     Check hardware components health through standard RFC MIB.
Packets-Errors-and-Discards  Check packet errors/discards of multiple network interfaces.
Traffic-Generic-Name         Check traffic of a network interface.
Traffic-Generic-Id           Check traffic of a network interface.

5.3.10 Net-FW-Pfsense-SNMP

Template to check Pfsense firewall using SNMP protocol

Name                    Description
Blocked-Packets-Global  Check the count of packets blocked on multiple network interfaces.
Memory-Dropped-Packets  Check the count of packets dropped due to memory limitations.
Runtime                 Time since the Pfsense service has been running and available.
Blocked-Packets-ID      Check the count of packets blocked on a network interface.
Blocked-Packets-ID      Check the count of packets blocked on a network interface.

5.3.11 Net-Stonesoft

Template to check Stonesoft Firewall using SNMP protocol

Name                  Description
Memory                Check the utilization rate of memory.
CPU                   Check the utilization rate of CPU.
Rejected-Packets      Check the count of rejected packets.
Dropped-Packets       Check the count of dropped packets.
Disk-Global           Check the utilization rate of disks.
Disk-Name             Check the utilization rate of disks.
Cluster-State         Check the state of the cluster.
Cluster-Load          Check the load of the cluster.
Traffic-Global        Check the bandwidth of the interface.
Traffic-Generic-Name  Check the bandwidth of the interface.
Traffic-Generic-ID    Check the bandwidth of the interface.

5.3.12 Net-Bluecoat-SNMP

Template to check Bluecoat using SNMP protocol

Name                Description
Client-Connections  Check current client connections on Bluecoat.
Client-Requests     Check current HTTP client requests on Bluecoat.
Client-Traffic      Check bytes/s received from and delivered to clients on Bluecoat.
Cpu                 Check CPU usage on Bluecoat.
Hardware            Check hardware sensors on Bluecoat.
Memory              Check memory usage on Bluecoat.
Server-Connections  Check current server connections on Bluecoat.
Disk                Check disk usage on Bluecoat.

5.3.13 Net-F5-Bigip-SNMP

Template to check F5 BIG-IP equipment using SNMP protocol

Name                               Description
Virtualserver-Status-Global        Check virtual servers status.
Connections                        Check current connections.
Hardware-Global                    Check hardware status (‘fan’, ‘temperature’, ‘power supply’).
Virtualserver-Status-Generic-Name  Check a virtual server status.
Node-Status-Generic-Name           Check a node status.
Node-Status-Global                 Check nodes status.
Pool-Status-Generic-Name           Check a pool status.
Pool-Status-Global
Hardware-Fan
Hardware-Psu
Hardware-Temperature
Check pools status. Check fan status on hardware. Check fan status on hardware. Check hardware temperatures. 5.3.14 Net-Citrix-Netscaler-MPX8000-SNMP Template to check Citrix Netscaler MPX8000 Series using SNMP protocol Name Cpu Memory Health Storage Traffic-Generic-Id Traffic-Generic-Name Traffic-Global Ha-State Vserver-StatusGeneric-Name Vserver-Status-Global Description Check the rate of utilization of CPU for the machine. This check can give the average utilization rate of CPU. Check machine memory usage. Check hardware environment (Fanspeed, Power Supplies, Temperatures, Voltages). Check the rate of utilization of storages. Check traffic of an network interface. Check traffic of an network interface. Check traffic of multiple network interfaces. Check High Availability Status. Check a vserver status and health. Check vservers status and health. 5.3.15 Net-Ruggedcom-SNMP Template to check Ruggedcom devices using SNMP protocol Name HardwareGlobal Memory Temperature Errors TrafficGeneric-Id TrafficGeneric-Name Traffic-Global Description Check all sensors (‘fan’, ‘power supply’). Check the rate of the utilization of memory (RAM). Check hardware temperature. Check hardware errors. Check the bandwidth of the interface. For each checks the name of the interface will appear (« label » shortcut describing the interface). Check the bandwidth of the interface. For each checks the name of the interface will appear (« label » shortcut describing the interface). Check the bandwidth of interfaces. For each checks the name of the interface will appear (« label » shortcut describing the interface). 5.3.16 Net-Alcatel-OmniSwitch-6850 Template to check Alcatel OmniSwitch 6850 using SNMP protocol 24 Chapter 5. 
Name | Description
Cpu | Check CPU usage.
Flash-Memory | Check Flash memory usage.
Hardware | Check hardware state.
Memory | Check memory usage.
Spanning-Tree | Check Spanning Tree state on interfaces.
Traffic-Generic-Id | Check traffic of a network interface.
Traffic-Generic-Name | Check traffic of a network interface.
Traffic-Global | Check traffic of multiple network interfaces.
Packet-Errors-Generic-Id | Check packets on errors/discards of a network interface.
Packet-Errors-Generic-Name | Check packets on errors/discards of a network interface.
Packet-Errors-Global | Check packets on errors/discards of multiple network interfaces.

5.3.17 Net-Aruba-7200
Template to check Aruba 7200 series using SNMP protocol
Name | Description
Cpu | Check the rate of utilization of CPU for the machine. This check can give the average utilization rate of CPU.
Memory | Check machine memory usage.
Storage | Check storage device usage.
Hardware-Global | Check hardware status (‘fan’, ‘module’, ‘power supply’).
Traffic-Generic-Id | Check traffic of a network interface.
Traffic-Generic-Name | Check traffic of a network interface.
Traffic-Global | Check traffic of multiple network interfaces.
Packet-Errors-Global | Check packets on errors/discards of multiple network interfaces.
Packet-Errors-GenericId | Check packets on errors/discards of a network interface.
Packet-Errors-GenericName | Check packets on errors/discards of a network interface.
Hardware-Fan | Check fan status on hardware.
Hardware-Module | Check module status on hardware.
Hardware-Psu | Check power supply status on hardware.

5.3.18 Net-Brocade-SNMP
Template to check Brocade switch using SNMP protocol
Name | Description
Cpu | Check the rate of utilization of CPU for the machine.
Hardware | Check hardware state.
Memory | Check memory usage.
TrafficGeneric-ID | Check the bandwidth of the interface. For each check, the name of the interface is shown (a short ‘label’ describing the interface).
TrafficGeneric-Name | Check the bandwidth of the interface. For each check, the name of the interface is shown (a short ‘label’ describing the interface).
Traffic-Global | Check the bandwidth of interfaces. For each check, the name of the interface is shown (a short ‘label’ describing the interface).

5.3.19 Net-Cisco-2800
Template to check Cisco 2800 Router using SNMP protocol
Name | Description
Cpu | Check the rate of utilization of CPU for the machine. This check can give the average utilization rate of CPU.
Environment | Check hardware environment (fans, power supplies, temperatures, voltages).
Memory | Check machine memory usage.
Stack | Check Cisco “Stack” state.
Traffic-Generic-Name | Check traffic of a network interface.
Traffic-Generic-Id | Check traffic of a network interface.
Spanning-Tree | Check Spanning Tree state on interfaces.
Traffic-Global | Check traffic of multiple network interfaces.
Packet-Errors-GenericId | Check packets on errors/discards of a network interface.
Packet-Errors-GenericName | Check packets on errors/discards of a network interface.
Packet-Errors-Global | Check packets on errors/discards of multiple network interfaces.

5.3.20 Net-Cisco-2900
Template to check Cisco 2900 Switch using SNMP protocol
Name | Description
Cpu | Check the rate of utilization of CPU for the machine. This check can give the average utilization rate of CPU.
Environment | Check hardware environment (fans, power supplies, temperatures, voltages).
Memory | Check machine memory usage.
Stack | Check Cisco “Stack” state.
Traffic-Generic-Name | Check traffic of a network interface.
Traffic-Generic-Id | Check traffic of a network interface.
Spanning-Tree | Check Spanning Tree state on interfaces.
Traffic-Global | Check traffic of multiple network interfaces.
Packet-Errors-GenericId | Check packets on errors/discards of a network interface.
Packet-Errors-GenericName | Check packets on errors/discards of a network interface.
Packet-Errors-Global | Check packets on errors/discards of multiple network interfaces.

5.3.21 Net-Cisco-3750
Template to check Cisco 3750 Switch using SNMP protocol
Name | Description
Cpu | Check the rate of utilization of CPU for the machine. This check can give the average utilization rate of CPU.
Environment | Check hardware environment (fans, power supplies, temperatures, voltages).
Memory | Check machine memory usage.
Stack | Check Cisco “Stack” state.
Traffic-Generic-Name | Check traffic of a network interface.
Traffic-Generic-Id | Check traffic of a network interface.
Spanning-Tree | Check Spanning Tree state on interfaces.
Traffic-Global | Check traffic of multiple network interfaces.
Packet-Errors-GenericId | Check packets on errors/discards of a network interface.
Packet-Errors-GenericName | Check packets on errors/discards of a network interface.
Packet-Errors-Global | Check packets on errors/discards of multiple network interfaces.

5.3.22 Net-Cisco-3850
Template to check Cisco 3850 Switch using SNMP protocol
Name | Description
Cpu | Check the rate of utilization of CPU for the machine. This check can give the average utilization rate of CPU.
Environment | Check hardware environment (fans, power supplies, temperatures, voltages).
Memory | Check machine memory usage.
Stack | Check Cisco “Stack” state.
Traffic-Generic-Name | Check traffic of a network interface.
Traffic-Generic-Id | Check traffic of a network interface.
Spanning-Tree | Check Spanning Tree state on interfaces.
Traffic-Global | Check traffic of multiple network interfaces.
Packet-Errors-GenericId | Check packets on errors/discards of a network interface.
Packet-Errors-GenericName | Check packets on errors/discards of a network interface.
Packet-Errors-Global | Check packets on errors/discards of multiple network interfaces.

5.3.23 Net-Cisco-Nexus-5000
Template to check Cisco Nexus 5000 Switch using SNMP protocol
Name | Description
Cpu | Check the rate of utilization of CPU for the machine. This check can give the average utilization rate of CPU.
Environment | Check hardware environment (fans, power supplies, temperatures, voltages).
Memory | Check machine memory usage.
Traffic-Generic-Name | Check traffic of a network interface.
Traffic-Generic-Id | Check traffic of a network interface.
Spanning-Tree | Check Spanning Tree state on interfaces.
Traffic-Global | Check traffic of multiple network interfaces.
Packet-Errors-GenericId | Check packets on errors/discards of a network interface.
Packet-Errors-GenericName | Check packets on errors/discards of a network interface.
Packet-Errors-Global | Check packets on errors/discards of multiple network interfaces.

5.3.24 Net-Hp-Procurve-SNMP
Template to check HP Procurve Switches using SNMP protocol
Name | Description
Cpu | Check the rate of utilization of CPU for the machine.
Memory | Check machine memory usage.
Environment | Check hardware environment (fans, power supplies, temperatures).
Traffic-Global | Check traffic of multiple network interfaces.
Traffic-Generic-Id | Check traffic of a network interface.
Traffic-Generic-Name | Check traffic of a network interface.

5.4 Software monitoring

5.4.1 App-Appservers-JMX-JDK6
Template to check JMX JDK6
Name | Description
jmx-threads | Check thread activity (3 metrics). OK if thread and daemon thread counts are below the warning threshold; WARNING if they are between the warning and critical thresholds; CRITICAL if they are above the critical threshold.
jmx-MemoryHeapMemoryUsage | Check the heap memory’s state.
jmx-Memory-NonHeapMemoryUsage | Check the non-heap memory’s state.
jmx-MemoryPool-CMS-Old-Gen | Check the memory pool CMS Old Gen (ConcurrentMarkSweep).
jmx-MemoryPool-CMS-Perm-Gen | Check the memory pool CMS Perm Gen (ConcurrentMarkSweep).
jmx-MemoryPool-Code-Cache | Memory size (in megabytes) used for “PS Code Cache”. JMX information: Name: java.lang:type=MemoryPool,name=CodeCache; Attribute: Usage.
jmx-MemoryPool-Par-Eden-Space | Memory size (in megabytes) used by “PS Eden Space”. JMX information: Name: java.lang:type=MemoryPool,name=PS Eden Space; Attribute: Usage.
jmx-MemoryPool-Par-Survivor-Space | Memory size (in megabytes) used for “PS Survivor Space”. JMX information: Name: java.lang:type=MemoryPool,name=PS Survivor Space; Attribute: Usage.
jmx-fd | Check the number of file descriptors opened by an application. The following state can be reported: OK when the percentage of open file descriptors is below the alert threshold.

5.4.2 App-Appservers-JMX-JDK7
Template to check JMX JDK7
Name | Description
jmx-classes | Number of classes loaded in JVM memory.
jmx-threads | Check thread activity (3 metrics). OK if thread and daemon thread counts are below the warning threshold; WARNING if they are between the warning and critical thresholds; CRITICAL if they are above the critical threshold.
jmx-MemoryHeapMemoryUsage | Check the heap memory’s state.
jmx-Memory-NonHeapMemoryUsage | Check the non-heap memory’s state.
jmx-MemoryPool-Code-Cache | Memory size (in megabytes) used for “PS Code Cache”. JMX information: Name: java.lang:type=MemoryPool,name=CodeCache; Attribute: Usage.
jmx-MemoryPool-PS-Eden-Space | Check the memory pool eden space (PS MarkSweep, PS Scavenge).
jmx-MemoryPool-PS-Old-Gen | Memory size (in megabytes) used for “PS Old Gen”. JMX information: Name: java.lang:type=MemoryPool,name=PS Old Space; Attribute: Usage.
jmx-MemoryPool-PS-Perm-Gen | Check the memory pool PermGen (PS MarkSweep).
jmx-MemoryPool-PS-Survivor-Space | Check the memory pool Survivor Space (PS MarkSweep, PS Scavenge).
jmx-fd | Check the number of file descriptors opened by an application. The following state can be reported: OK when the percentage of open file descriptors is below the alert threshold.
jmx-traffic | Check the traffic on a specific port of a Tomcat server.
jmx-memory | Check Java memory state (4 metrics). Report OK if the percentage of memory used/committed is below the maximum WARNING threshold and above the minimum WARNING threshold. Report WARNING if it is between the maximum WARNING and maximum CRITICAL thresholds, or between the minimum CRITICAL and minimum WARNING thresholds. Report CRITICAL if it is above the maximum CRITICAL threshold or below the minimum CRITICAL threshold.
jmx-simpletype | Check an attribute that returns a single metric. Report OK if the metric is below the maximum WARNING threshold and above the minimum WARNING threshold. Report WARNING if it is between the maximum WARNING and maximum CRITICAL thresholds, or between the minimum CRITICAL and minimum WARNING thresholds. Report CRITICAL if it is above the maximum CRITICAL threshold or below the minimum CRITICAL threshold.

5.4.3 App-Backup-EMC-RecoveryPoint-SSH
Template to check RecoveryPoint Appliance using SSH protocol
Name | Description
Monitored-Parameters | Check monitored parameters by EMC RecoveryPoint appliance.
System-Status | Check system status of EMC RecoveryPoint appliance.

5.4.4 App-DB-MSSQL
Template to check MSSQL Database
Name | Description
ConnectionTime | Check the connection time to the server. This time is given in seconds.
BlockedProcesses | Check blocked processes on the server. This service may fail because of an SQL request, depending on your MS SQL Server version.
Deadlocks | Check deadlocks per second of the server.
Failed-Jobs | Check MSSQL failed jobs.
DatabaseUsed | Check used space of databases on the server.
ConnectedUsers | Check number of connected users on the database.
Backup-Age | Check database backups of the server. This time is given in minutes.
DatabaseFree | Check free space of databases on the server.
CacheHitratio | Check the “Data Buffer Cache Hit Ratio” of the server. No alerts by default.
Transactions | Check transactions per second of the server. No alerts by default.
Locks-Waits | Check locks-waits per second of the server.

5.4.5 App-DB-MySQL
Template to check MySQL Database
Name | Description
Connection-Time | Check the connection time to the server. This time is given in seconds.
ConnectionsNumber | Check the number of open connections.
Database-Size | Check size of databases. No alerts by default.
Innodb-Bufferpool | Check the hit rate of the InnoDB buffer.
Myisam-Keycache | Check the hit rate of the MyISAM tables.
Slowqueries | Check the number of slow queries since the last check. Gives the average rate per second.
Open-Files | Check the number of files that are currently open.
Queries | Check the average number of queries per second.
Uptime | Check how long the server has been running. This time is given in minutes.

5.4.6 App-DB-Oracle
Template to check Oracle Database
Name | Description
Connection-Time | Check the connection time to the server. This time is given in seconds.
Corrupted-Blocks | Check the number of corrupted blocks on the server.
Tnsping | Check the connection to a remote listener.
Tablespace-Usage | Check the tablespace usage of the server.
Connection-Number | Check connection number to the Oracle server.
Datacache-Hitratio | Check the ‘Data Buffer Cache Hit Ratio’ of the server. No alerts by default.
Rman-Backup-Problems | Check RMAN backup errors of the server during the last three days.

5.4.7 App-DB-Postgres
Template to check Postgres Database
Name | Description
Connection-Number | Check connection number to the Postgres server.
Connection | Check connection to the Postgres server.
Cache-Hitratio | Check the “buffer cache hitratio” of the Postgres server.
Locks | Check locks of the Postgres server.
Query-Time | Check request time of the Postgres server.
Time-Sync | Check time between poller and the Postgres server.
Tablespace-Size | Check time between poller and the Postgres server.
Vacuum | Check the execution of Vacuum on a DB for a given amount of days.

5.4.8 App-Lm-Sensors-SNMP
Template to check LM Sensors using SNMP protocol
Name | Description
Hardware-Fan | Check fan sensors.
Hardware-Misc | Check misc sensors.
Hardware-Temperature | Check temperature sensors.
Hardware-Voltage | Check voltage sensors.

5.4.9 App-Mail-Bluemind
Template to check Bluemind Server
Name | Description
Bluemind-Process-Core | Check bm-core process.
Bluemind-Process-EAS | Check bm-eas process.
Bluemind-Process-HPS | Check bm-hps process.
Bluemind-Process-IPS | Check bm-ips process.
Bluemind-Process-LMTP | Check bm-lmtpd process.
Bluemind-Process-MQ | Check bm-mq process.
Proc-Postfix | Check postfix process.
Bluemind-Process-Cyrus | Check cyrus process.
Hprof-File | Check the presence of hprof file.
Incoming-Mails | Check incoming mails.

5.4.10 App-Monitoring-Centreon-Central
Template to check Centreon Central Server
Name | Description
proc-centcore | Check centcore process.
proc-snmptrapd | Check snmptrapd process.
proc-crond | Check crond process.
proc-httpd | Check Apache process.
proc-ntpd | Check NTP process.
proc-broker-rrd | Check Broker RRD process.
proc-centengine | Check centreon-engine process.
proc-sshd | Check sshd process.
proc-broker-sql | Check Broker SQL process.

5.4.11 App-Monitoring-Centreon-Poller
Template to check Centreon Poller Server
Name | Description
proc-snmptrapd | Check snmptrapd process.
proc-ntpd | Check NTP process.
proc-centengine | Check centreon-engine process.
proc-sshd | Check sshd process.

5.4.12 App-Protocol-DNS
Template to check a DNS Server
Name | Description
DNS-Request | Check requests to a DNS server.

5.4.13 App-Protocol-FTP
Template to check several things on a remote FTP Server
Name | Description
FTP-Login | Check connection to a remote FTP server with username and password.
FTP-Date | Check date of files within a directory or of a specific file on a remote FTP server.
FTP-Commands | Check execution of several commands on a remote FTP server.
FTP-FilesCount | Count files in a remote FTP directory (recursive or not).

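As an illustration of what a check such as FTP-FilesCount measures, the following is a minimal Python sketch of a file count on an FTP directory, recursive or not. It is not the plugin's implementation: the function name count_files is hypothetical, and the sketch assumes the server supports the MLSD listing command (RFC 3659), which Python's ftplib exposes as FTP.mlsd.

```python
import posixpath


def count_files(ftp, path=".", recursive=False):
    """Count regular files under `path` on an FTP server.

    `ftp` is assumed to behave like ftplib.FTP: its mlsd(path) method
    yields (name, facts) pairs, where facts["type"] describes the entry.
    """
    total = 0
    for name, facts in ftp.mlsd(path):
        kind = facts.get("type", "").lower()
        if kind == "file":
            total += 1
        elif kind == "dir" and recursive:
            # "cdir" and "pdir" (current and parent directory) are ignored,
            # so the walk never loops back on itself.
            total += count_files(ftp, posixpath.join(path, name), recursive=True)
    return total


# Possible usage against a real server (host and credentials are placeholders):
#   from ftplib import FTP
#   with FTP("ftp.example.org") as ftp:
#       ftp.login("user", "password")
#       print(count_files(ftp, "/incoming", recursive=True))
```

A monitoring check would then compare the returned count with its warning and critical thresholds.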
5.4.14 App-Protocol-HTTP
Template to check an HTTP Server
Name | Description
HTTP-Response-Time | Check response time of a web page.
HTTP-Expected-Content | Check the presence of a string in a web page.

5.4.15 App-Protocol-LDAP
Template to check an LDAP Server
Name | Description
LDAP-Login | Check login to an LDAP server.
LDAP-Search | Check search on an LDAP server.

5.4.16 App-Protocol-NTP
Template to check an NTP Server
Name | Description
NTP-Response-Time | Check response time of NTP server.

5.4.17 App-Protocol-SMTP
Template to check an SMTP Server
Name | Description
SMTP-Login | Check connection to an SMTP server.
SMTP-Message | Check sending a message to an SMTP server.

5.4.18 App-Webserver-Apache-ServerStatus
Template to check Apache Server using Server Status web page
Name | Description
Apache-Requests | Check request information.
Apache-ResponseTime | Check response time of ‘mod_status’ page.
Apache-SlotStates | Check slot information.
Apache-Cpuload | Check Apache Cpuload.
Apache-Workers | Check Apache busy processes.

5.4.19 App-Webserver-Nginx-ServerStatus
Template to check Nginx Server using ‘stub_status_module’ web page
Name | Description
Nginx-ResponseTime | Check response time of ‘stub_status_module’ page.
Nginx-Requests | Check request information.
Nginx-Connections | Check current connections.

5.4.20 App-Webserver-Tomcat6-Webmanager
Template to check Tomcat 6 Server using Tomcat Webmanager
Name | Description
Tomcat-Requestinfo-Global | Check Tomcat metrics (request count, error count,...).
Tomcat-Traffic-Global | Check traffic for each connector.
Tomcat-Applications-Global | Check status of Tomcat applications.
Tomcat-Sessions-Global | Check number of sessions per application.
Tomcat-Threads-Global | Check threads for each connector.
Tomcat-Memory | Check Tomcat memory.

5.4.21 App-Webserver-Tomcat7-Webmanager
Template to check Tomcat 7 Server using Tomcat Webmanager
Name | Description
Tomcat-Requestinfo-Global | Check Tomcat metrics (request count, error count,...).
Tomcat-Traffic-Global | Check traffic for each connector.
Tomcat-Applications-Global | Check status of Tomcat applications.
Tomcat-Sessions-Global | Check number of sessions per application.
Tomcat-Threads-Global | Check threads for each connector.
Tomcat-Memory | Check Tomcat memory.

5.5 Virtualisation monitoring

5.5.1 Virt-VMWare-ESX
Template to check VMWare ESX Server using Centreon-ESXD connector
Name | Description
Esx-Cpu | Check CPU usage of an ESX Server.
Esx-Memory | Check memory usage of an ESX Server.
Esx-Swap | Check if a virtual machine is swapping on the ESX server.
Esx-Health | Check hardware and CPU sensors of an ESX Server.
Esx-Global-Status | Check global status of an ESX Server.
Esx-Traffic-Generic | Check traffic usage of a physical network interface. Thresholds are in percent.
Esx-Datastores-Latency | Check datastores latency of an ESX Server.
Esx-Vm-Count | Check virtual machines running on an ESX Server.
Esx-is-Maintenance | Check maintenance mode of an ESX Server.
Esx-Uptime | Get uptime in days of an ESX Server.
Datastore-Usage-Generic | Check datastore usage.
Datastore-Io-Generic | Check datastore usage in Kbps.
Datastore-Snapshots-Generic | Check snapshots usage on a datastore.
Esx-Traffic-Global | Check traffic usage of multiple physical network interfaces. Thresholds are in percent.
Datastore-Iops-Generic | Check average IOPS of a datastore.

5.5.2 Virt-VMWare-VCenter-5
Template to check VCenter 5 using Centreon-ESXD connector
Name | Description
Vm-Snapshot-Global | Check snapshot age of multiple virtual machines (Vsphere 5).
Vm-Limit-Global | Check limit definition (cpu, memory, disk) on multiple virtual machines.

5.5.3 Virt-VMWare-VCenter
Template to check VCenter using Centreon-ESXD connector
Name | Description
Vm-Tools-Global | Check vmtools state of multiple virtual machines.
Datastore-Usage-Global | Check multiple datastores usage.
Datastore-Iops-Global | Check average IOPS of multiple datastores.
Datastore-Snapshots-Global | Check snapshots usage on multiple datastores.
Vm-Thinprovisioning-Global | Check if a virtual machine has a disk in mode ‘thinprovisioning’ or not.

5.5.4 Virt-VMWare-VM
Template to check VMWare Virtual Machine using Centreon-ESXD connector
Name | Description
Vm-Cpu | Check CPU usage of a virtual machine.
Vm-Memory | Check memory usage of a virtual machine.
Vm-Datastores-Iops | Check datastores IOPS linked to the virtual machine (IOPS done by the virtual machine).
Vm-Swap | Check if a virtual machine is swapping.
Vm-Tools | Check vmtools state of a virtual machine.
Vm-Snapshot | Check snapshot age of a virtual machine.
Vm-Thinprovisioning | Check if a virtual machine has a disk in mode ‘thinprovisioning’ or not.
Vm-Limit | Check limit definition (cpu, memory) on a virtual machine.

5.6 Hardware monitoring

5.6.1 HW-Printer-standard-rfc3805
Template to check printer according to RFC 3805
Name | Description
Cover-Status | Check printer cover status based on components present in the “Cover” table of the RFC 3805 MIB.
Printer-Errors | Check errors on printer such as paper low or jammed.
Printer-Hardware | Check several hardware health indicators through a single service.
Impressions | Check the number of impressions between two checks.
MarkerSupplyUsage | Check marker supply usage.
PaperTray-Usage | Check utilization rate of paper trays.

5.6.2 HW-Sensor-Sensorip-SNMP
Template to check SensorIP equipment using SNMP
Name | Description
Sensors-Global | Check all sensors (global status, temperatures, humidity, switch) of equipment.
Sensors-Temperature | Check temperature sensors of equipment.
Sensors-Humidity | Check humidity sensors of equipment.
Sensors-Sp | Check sensor probe status of equipment.
Sensors-Switch | Check switch sensors of equipment.

5.6.3 HW-Sensor-Sensormetrix-Em01-Web
Template to check Sensormetrix Em01 series using HTTP protocol
Name | Description
Humidity | Check humidity sensor.
Illumination | Check illumination sensor.
Temperature | Check temperature sensor.
Thermistor-Temperature | Check thermistor temperature sensor.
Voltage | Check voltage sensor.
Contact | Check contact sensor.
Flood | Check flood sensor.

5.6.4 HW-Server-Cisco-Ucs
Template to check Cisco UCS using SNMP protocol
Name | Description
Audit-Logs | Check audit logs.
Equipment | Check hardware state.
Faults | Check faults.

5.6.5 HW-Server-Dell-iDrac-SNMP
Template to check Dell server through iDrac card using SNMP protocol
Name | Description
GlobalStatus | Check global status of the equipment.

5.6.6 HW-Server-Dell-Openmanage-SNMP
Template to check Dell server through Openmanage using SNMP protocol
Name | Description
Hardware-Global | Check hardware (‘fan’, ‘cpu’, ‘power supply’, ‘temperature’, ‘battery’,...) of Dell Server.
Hardware-Globalstatus | Check global status of Dell Server.
Hardware-Fan | Check fans of Dell Server.
Hardware-Psu | Check power supplies of Dell Server.
Hardware-Temperature | Check temperatures of Dell Server.
Hardware-Cachebattery | Check cache batteries of Dell Server.
Hardware-Memory | Check memories of Dell Server.
Hardware-Physicaldisk | Check physical drives of Dell Server.
Hardware-Logicaldrive | Check logical drives of Dell Server.
Hardware-Esmlog | Check event logs of Dell Server.
Hardware-Battery | Check batteries of Dell Server.
Hardware-Controller | Check controllers of Dell Server.
Hardware-Connector | Check connector of Dell Server.
Hardware-Storage | Check storage of Dell Server.

5.6.7 HW-Server-Hp-Bladechassis-SNMP
Template to check HP Blade Chassis using SNMP protocol
Name | Description
HardwareGlobal | Check hardware (‘enclosure’, ‘manager’, ‘fan’, ‘blade’, ‘network’, ‘power supply’, ‘temperature’, ‘fuse’) of HP Blade chassis.
Hardware-Fan | Check ‘fan’ hardware of HP Blade chassis.
HardwareEnclosure | Check ‘enclosure’ hardware of HP Blade chassis.
HardwareManager | Check ‘manager’ hardware of HP Blade chassis.
Hardware-Blade | Check ‘blades’ hardware of HP Blade chassis.
HardwareNetwork | Check ‘network’ hardware of HP Blade chassis.
Hardware-Psu | Check ‘power supply’ hardware of HP Blade chassis.
HardwareTemperature | Check ‘temperatures’ hardware of HP Blade chassis.
Hardware-Fuse | Check ‘fuse’ hardware of HP Blade chassis.

5.6.8 HW-Server-Hp-Server-SNMP
Template to check HP server through HP Insight Management agent using SNMP protocol
Name | Description
Hardware-Global | Check all hardware (‘cpu’, ‘fan’, ‘temperature’, ‘power supply’, ...).

5.6.9 HW-Server-IBM-IMM-SNMP
Template to check IBM server through IMM card using SNMP protocol
Name | Description
Environment-Global | Check all sensors (‘globalstatus’, ‘fan’, ‘temperature’, ‘voltage’) of IBM IMM card.
Eventlog | Check event logs of IBM IMM card.
Environment-GlobalStatus | Check ‘globalstatus’ sensor of IBM IMM card.
Environment-Temperature | Check ‘temperature’ sensors of IBM IMM card.
Environment-Voltage | Check ‘voltage’ sensors of IBM IMM card.
Environment-Fan | Check ‘fan’ sensors of IBM IMM card.

5.6.10 HW-Server-Sun-Alom-TELNET
Template to check Sun server vXXX (v240, v440, v245,...) through ALOM card using Telnet protocol
Name | Description
Hardware | Check Sun vXXX (v240, v440, v245,...) hardware (through ALOM).

5.6.11 HW-Server-Sun-Sfxxxx-TELNET
Template to check Sun server sfXXXX (sf6900, sf6800, sf3800,...) through ScApp card using Telnet protocol
Name | Description
Hardware | Check Sun SFxxxx (sf6900, sf6800, sf3800,...) hardware through ScApp.

5.6.12 HW-Server-Sun-V8xx-TELNET
Template to check Sun server v8xx (v890, v880) through RSC card using Telnet protocol
Name | Description
Hardware | Check Sun v890 and v880 hardware through RSC card.

5.6.13 HW-Server-Sun-V4xx-TELNET
Template to check Sun server v4xx (v490, v480) through RSC card using Telnet protocol
Name | Description
Hardware | Check Sun v480 and v490 hardware through RSC card.

5.6.14 HW-Server-Sun-Sf2xx-TELNET
Template to check Sun server sf280 through RSC card using Telnet protocol
Name | Description
Hardware | Check Sun sf280 hardware through RSC card.

5.6.15 HW-Server-Sun-Alom4v-SSH
Template to check Sun server (T1xxx, T2xxx) through ALOM4v card using SSH protocol
Name | Description
Hardware | Check Sun ‘T1xxx’, ‘T2xxx’ and ‘T5xxx’ hardware through ALOM4v.

5.6.16 HW-Server-Sun-Ilom-SSH
Template to check Sun server (T3-x, T4-x, T5xxx) through ILOM card using SSH protocol
Name | Description
Hardware | Check Sun ‘T3-x’, ‘T4-x’ and ‘T5xxx’ hardware through ILOM.

5.6.17 HW-Server-Sun-Ilom-IPMITOOL
Template to check Sun server (x4600, x4500, x4100,...) through ILOM card using IPMI protocol
Name | Description
Chassis-Status | Check Sun ‘x4600’, ‘x4100’,... global chassis status through ILOM.

5.6.18 HW-Server-Sun-Mseries-SSH
Template to check Sun server Mxxxx (M4000, M5000, M8000,...) through XSCF using SSH protocol
Name | Description
Hardware | Check Sun ‘Mxxx’ hardware through XSCF.

5.6.19 HW-Server-Sun-Mseries-SNMP
Template to check Sun server Mxxxx (M4000, M5000, M8000,...) using SNMP protocol
Name | Description
Hardware | Check Sun M-series hardware.
Domains | Check status of Sun domains.

5.6.20 HW-Server-Sun-Sfxxk-PSSH
Template to check Sun server sfxxk (sf12k, sf15k, sf20k, sf25k) using SSH protocol (no plugins on system controller)
Name | Description
Failover-Status | Check system controller failover status.
Boards | Check Sun ‘sfxxk’ boards.
Environment | Check Sun ‘sfxxk’ environment.

5.6.21 HW-Storage-Dell-MD3000-Cli

Template to check Dell MD3000 series using SMcli

Name            Description
Health-Status   Check storage health status.

5.6.22 HW-Storage-Dell-TL2000-SNMP

Template to check Dell TL2000 Tape Library using SNMP protocol

Name           Description
GlobalStatus   Check global status of the equipment.

5.6.23 HW-Storage-EMC-Clariion-Navisphere

Template to check EMC Clariion using Navisphere client

Name              Description
Disks             Check disk status and performance.
Hardware-Global   Check all hardware status (fan, psu, cpu, memory, cable, io module, sp, lcc).
Cache             Check cache state.
Controller        Check global controller (busy usage, iops).
Faults            Check faults on the array.
Port-State        Check SP port state.
Hba-State         Check connection state of servers.

5.6.24 HW-Storage-EMC-DataDomain-SNMP

Template to check EMC DataDomain using SNMP

Name                Description
Hardware-Global     Check all hardware (fans, power supplies, temperatures, disks, nvram batteries) of storage.
Filesystem-Global   Check the rate of free space on filesystems.
Hardware-Fan        Check fans of storage.
Hardware-Psu        Check power supplies of storage.
Hardware-Disk       Check disks of storage.
Hardware-Battery    Check nvram batteries of storage.

5.6.25 HW-Storage-Hp-Msa2000-SNMP

Template to check HP MSA2000 using SNMP protocol

Name                   Description
Hardware               Check hardware state.
Traffic-Generic-Id     Check traffic of a network interface.
Traffic-Generic-Name   Check traffic of a network interface.
Traffic-Global         Check traffic of multiple network interfaces.

5.6.26 HW-Storage-Hp-P2000-Xmlapi

Template to check HP P2000 using XML API

Name                  Description
Health                Check health state.
Volume-Stats-Global   Check volumes statistics.
Volume-Stats-Name     Check volume statistics.
5.6.27 HW-Storage-IBM-DS3000-Cli

Template to check IBM DS3000 series using SMcli

Name            Description
Health-Status   Check storage health status.

5.6.28 HW-Storage-IBM-DS4000-Cli

Template to check IBM DS4000 series using SMcli

Name            Description
Health-Status   Check storage health status.

5.6.29 HW-Storage-IBM-DS5000-Cli

Template to check IBM DS5000 series using SMcli

Name            Description
Health-Status   Check storage health status.

5.6.30 HW-Storage-IBM-TS3100-SNMP

Template to check IBM TS3100 Tape Library using SNMP protocol

Name           Description
GlobalStatus   Check global status of the equipment.

5.6.31 HW-Storage-IBM-TS3200-SNMP

Template to check IBM TS3200 Tape Library using SNMP protocol

Name           Description
GlobalStatus   Check global status of the equipment.

5.6.32 HW-Storage-NetApp-SNMP

Template to check NetApp Storage using SNMP protocol

Name                     Description
Cpu-Load                 Check CPU usage.
Disk-Failed              Check the current number of broken disks.
Fan                      Check if fans have failed.
Global-status            Check the overall status of the appliance.
Nvram                    Check current status of the NVRAM batteries.
Psu                      Check if power supplies have failed.
Shelf                    Check Shelves hardware.
Temperature              Check if hardware is currently operating outside of its recommended temperature range.
Ndmpsessions             Check current total of ndmp sessions opened.
Fs-Global                Check filesystem usage.
Volume-Options-Generic   Check options from volumes.
Partner-Status           Check status of clustered failover partner.
Fs-Generic               Check filesystem usage.

5.6.33 HW-Storage-Violin-3000-SNMP

Template to check Violin 3000 using SNMP

Name                    Description
Hardware-Global         Check all hardware (Fans, Power Supplies, Temperatures, Chassis alarm, vimm, global fc, local fc) of storage.
Hardware-Fan            Check fans of storage.
Hardware-Psu            Check power supplies of storage.
Hardware-ChassisAlarm   Check chassis alarm of storage.
Hardware-Vimm           Check vimm of storage.
Hardware-Temperature    Check temperatures of storage.
Hardware-Global-Fc      Check global fc of storage.
Hardware-Local-Fc       Check local fc of storage.

5.6.34 HW-UPS-Standard-Rfc1628-SNMP

Template to check UPS hardware using the RFC 1628 standard over the SNMP protocol

Name             Description
Alarms           Check if alarms are present.
Output-Source    Check output source status.
Output-Lines     Check output lines metrics.
Input-Lines      Check input lines metrics.
Battery-Status   Check battery status and battery charge remaining.
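To illustrate what a Battery-Status check against an RFC 1628 UPS has to decide, here is a minimal Python sketch. It is not the plugin shipped with the pack. The enumeration values and OIDs come from RFC 1628 (UPS-MIB) and the exit codes follow the usual Nagios plugin convention, but the function name `battery_state`, the severity mapping and the default thresholds (50 % and 20 %) are assumptions chosen for the example.

```python
# Nagios-style plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# upsBatteryStatus (1.3.6.1.2.1.33.1.2.1) values as defined in RFC 1628.
BATTERY_STATUS = {
    1: "unknown",
    2: "batteryNormal",
    3: "batteryLow",
    4: "batteryDepleted",
}


def battery_state(status, charge_pct, warn_pct=50, crit_pct=20):
    """Map RFC 1628 battery values to (exit_code, message).

    status     -- integer read from upsBatteryStatus
    charge_pct -- integer read from upsEstimatedChargeRemaining
                  (1.3.6.1.2.1.33.1.2.4), in percent
    warn_pct, crit_pct -- hypothetical thresholds, not the pack defaults
    """
    name = BATTERY_STATUS.get(status, "unknown")
    message = f"battery status is {name}, {charge_pct}% charge remaining"
    if name == "unknown":
        return UNKNOWN, message
    if name == "batteryDepleted" or charge_pct <= crit_pct:
        return CRITICAL, message
    if name == "batteryLow" or charge_pct <= warn_pct:
        return WARNING, message
    return OK, message


if __name__ == "__main__":
    # Values would normally be fetched over SNMP; here they are hard-coded.
    code, text = battery_state(status=2, charge_pct=90)
    print(code, text)
```

In a real deployment the two values would be read over SNMP and the thresholds would be set through the service template macros rather than hard-coded.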