URL: https://www.zabbix.com/integrations/smart

S.M.A.R.T. monitoring and integration with Zabbix



S.M.A.R.T.

S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology; often written as SMART) is a monitoring system included in computer hard disk drives (HDDs), solid-state drives (SSDs), and eMMC drives.

Available solutions




This template is for Zabbix version: 6.2
Also available for: 6.0 5.4 5.0

Source: https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/templates/module/smart_agent2?at=release/6.2

SMART by Zabbix agent 2

Overview

For Zabbix version: 6.2 and higher.
This template monitors S.M.A.R.T. attributes of physical disks and works without any external scripts. It collects metrics through Zabbix agent 2 version 5.0 and later with Smartmontools version 7.1 and later. The Disk discovery LLD rule finds all HDD, SSD, and NVMe disks with S.M.A.R.T. enabled. The Attribute discovery LLD rule has predefined Vendor Specific Attributes for each disk; each one is discovered only if the attribute is present.

This template was tested on:

  • Smartmontools, version 7.1 and later

Setup

See Zabbix template operation for basic instructions.

Install Zabbix agent 2 and Smartmontools 7.1 or later. Grant Zabbix agent 2 superuser/administrator privileges for the smartctl utility.

Linux example:

sudo dnf install smartmontools
sudo visudo

Then add the following line to the sudoers file:

zabbix ALL=(ALL) NOPASSWD:/usr/sbin/smartctl
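
Because the template expects Smartmontools 7.1 or later, it can help to check the installed version before linking the template. The following is a minimal sketch, assuming the first line of `smartctl --version` output begins with `smartctl <major>.<minor>`; the sample banner and the helper names are illustrative and not part of the template.

```python
import re
from typing import Tuple

MIN_VERSION = (7, 1)  # minimum Smartmontools version stated by the template


def parse_smartctl_version(banner: str) -> Tuple[int, int]:
    """Extract (major, minor) from a smartctl version banner (assumed format)."""
    match = re.search(r"smartctl\s+(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError("no smartctl version found in banner")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(banner: str, minimum: Tuple[int, int] = MIN_VERSION) -> bool:
    """Return True if the banner reports a version at or above the minimum."""
    return parse_smartctl_version(banner) >= minimum


# Hypothetical banner used for illustration only.
SAMPLE_BANNER = "smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0] (local build)"
print(meets_minimum(SAMPLE_BANNER))
```

On a real host the banner would come from running `smartctl --version` as the zabbix user, which also confirms that the sudoers entry works.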

Plugin parameters list

Zabbix configuration

No specific Zabbix configuration is required.

Macros used

Name | Description | Default
{$SMART.DISK.NAME.MATCHES} | This macro is used in the filter of attribute and disk discoveries. It can be overridden on the host or linked template level. | ^.*$
{$SMART.DISK.NAME.NOT_MATCHES} | This macro is used in the filter of attribute and disk discoveries. It can be overridden on the host or linked template level. | CHANGE_IF_NEEDED
{$SMART.TEMPERATURE.MAX.CRIT} | This macro is used in trigger expressions. It can be overridden on the host or linked template level. | 65
{$SMART.TEMPERATURE.MAX.WARN} | This macro is used in trigger expressions. It can be overridden on the host or linked template level. | 50

Template links

There are no template links in this template.

Discovery rules

Name Description Type Key and additional info
Disk discovery

Discovery of SMART disks.

ZABBIX_PASSIVE smart.disk.discovery

Filter:

AND

- {#NAME} MATCHES_REGEX {$SMART.DISK.NAME.MATCHES}

- {#NAME} NOT_MATCHES_REGEX {$SMART.DISK.NAME.NOT_MATCHES}

Overrides:

Self-test
- {#DISKTYPE} MATCHES_REGEX nvme
- ITEM_PROTOTYPE LIKE Self-test - NO_DISCOVER

Not NVMe
- {#DISKTYPE} NOT_MATCHES_REGEX nvme
- ITEM_PROTOTYPE REGEXP `Media
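
The Disk discovery filter above keeps a disk only when its name matches {$SMART.DISK.NAME.MATCHES} and does not match {$SMART.DISK.NAME.NOT_MATCHES}. A minimal sketch of that AND logic follows, using Python's `re` module as a stand-in for the regular expression engine used by Zabbix; the function name and the disk names are hypothetical.

```python
import re


def disk_is_discovered(name, matches=r"^.*$", not_matches=r"CHANGE_IF_NEEDED"):
    """Apply both filter conditions; defaults mirror the template macro defaults."""
    return (re.search(matches, name) is not None
            and re.search(not_matches, name) is None)


# Hypothetical {#NAME} values for illustration.
disks = ["sda", "sdb", "nvme0n1"]

# With the default macro values every disk passes the filter.
print([d for d in disks if disk_is_discovered(d)])

# Overriding NOT_MATCHES on the host excludes the NVMe disk.
print([d for d in disks if disk_is_discovered(d, not_matches=r"^nvme")])
```

The default NOT_MATCHES value CHANGE_IF_NEEDED is just a placeholder string that no ordinary disk name contains, so nothing is excluded until the macro is overridden.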

Items collected

Group Name Description Type Key and additional info
Zabbix raw items SMART [{#NAME}]: Smartctl error

This metric will contain smartctl errors.

DEPENDENT smart.disk.error[{#NAME}]

Preprocessing:

- JSONPATH: $.error

- DISCARD_UNCHANGED_HEARTBEAT: 1h

Zabbix raw items SMART [{#NAME}]: Get disk attributes

-

ZABBIX_PASSIVE smart.disk.get[{#PATH},"{#RAIDTYPE}"]
Zabbix raw items SMART [{#NAME}]: Device model

-

DEPENDENT smart.disk.model[{#NAME}]

Preprocessing:

- JSONPATH: $.model_name

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix raw items SMART [{#NAME}]: Serial number

-

DEPENDENT smart.disk.sn[{#NAME}]

Preprocessing:

- JSONPATH: $.serial_number

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix raw items SMART [{#NAME}]: Self-test passed

Whether the disk has passed the SMART self-test.

DEPENDENT smart.disk.test[{#NAME}]

Preprocessing:

- JSONPATH: $.self_test_passed

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix raw items SMART [{#NAME}]: Temperature

Current drive temperature.

DEPENDENT smart.disk.temperature[{#NAME}]

Preprocessing:

- JSONPATH: $.temperature

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix raw items SMART [{#NAME}]: Power on hours

Count of hours in power-on state. The raw value of this attribute shows total count of hours (or minutes, or seconds, depending on manufacturer) in power-on state. "By default, the total expected lifetime of a hard disk in perfect condition is defined as 5 years (running every day and night on all days). This is equal to 1825 days in 24/7 mode or 43800 hours." On some pre-2005 drives, this raw value may advance erratically and/or "wrap around" (reset to zero periodically). https://en.wikipedia.org/wiki/S.M.A.R.T.#Known_ATA_S.M.A.R.T._attributes

DEPENDENT smart.disk.hours[{#NAME}]

Preprocessing:

- JSONPATH: $.power_on_time

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix raw items SMART [{#NAME}]: Percentage used

Contains a vendor specific estimate of the percentage of NVM subsystem life used based on the actual usage and the manufacturer's prediction of NVM life. A value of 100 indicates that the estimated endurance of the NVM in the NVM subsystem has been consumed, but may not indicate an NVM subsystem failure. The value is allowed to exceed 100. Percentages greater than 254 shall be represented as 255. This value shall be updated once per power-on hour (when the controller is not in a sleep state).

DEPENDENT smart.disk.percentage_used[{#NAME}]

Preprocessing:

- JSONPATH: $.percentage_used

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix raw items SMART [{#NAME}]: Critical warning

This field indicates critical warnings for the state of the controller.

DEPENDENT smart.disk.critical_warning[{#NAME}]

Preprocessing:

- JSONPATH: $.critical_warning

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix raw items SMART [{#NAME}]: Media errors

Contains the number of occurrences where the controller detected an unrecovered data integrity error. Errors such as uncorrectable ECC, CRC checksum failure, or LBA tag mismatch are included in this field.

DEPENDENT smart.disk.media_errors[{#NAME}]

Preprocessing:

- JSONPATH: $.media_errors

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix raw items SMART [{#NAME}]: Exit status

The smartctl exit status is a bitmask reported as a decimal value. The eight bits in the exit status have the following meanings for ATA disks; some of these values may also be returned for SCSI disks.

Bit 0: Command line did not parse.

Bit 1: Device open failed, device did not return an IDENTIFY DEVICE structure, or device is in a low-power mode (see the smartctl '-n' option).

Bit 2: Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure (see the smartctl '-b' option).

Bit 3: SMART status check returned "DISK FAILING".

Bit 4: We found prefail Attributes <= threshold.

Bit 5: SMART status check returned "DISK OK" but we found that some (usage or prefail) Attributes have been <= threshold at some time in the past.

Bit 6: The device error log contains records of errors.

Bit 7: The device self-test log contains records of errors. [ATA only] Failed self-tests outdated by a newer successful extended self-test are ignored.

DEPENDENT smart.disk.es[{#NAME}]

Preprocessing:

- JSONPATH: $.exit_status

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix raw items SMART [{#NAME}]: Raw_Read_Error_Rate

Stores data related to the rate of hardware read errors that occurred when reading data from a disk surface. The raw value has different structure for different vendors and is often not meaningful as a decimal number. For some drives, this number may increase during normal operation without necessarily signifying errors.

DEPENDENT smart.disk.attribute.raw_read_error_rate[{#NAME}]

Preprocessing:

- JSONPATH: $.raw_read_error_rate.value

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix raw items SMART [{#NAME}]: Spin_Up_Time

Average time of spindle spin up (from zero RPM to fully operational [milliseconds]).

DEPENDENT smart.disk.attribute.spin_up_time[{#NAME}]

Preprocessing:

- JSONPATH: $.spin_up_time.value

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix raw items SMART [{#NAME}]: Start_Stop_Count

A tally of spindle start/stop cycles. The spindle turns on, and hence the count increases, both when the hard disk is powered on after being completely off (disconnected from the power source) and when it wakes from sleep mode.

DEPENDENT smart.disk.attribute.start_stop_count[{#NAME}]

Preprocessing:

- JSONPATH: $.start_stop_count.value

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix raw items SMART [{#NAME}]: Power_Cycle_Count

This attribute indicates the count of full hard disk power on/off cycles.

DEPENDENT smart.disk.attribute.power_cycle_count[{#NAME}]

Preprocessing:

- JSONPATH: $.power_cycle_count.value

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix raw items SMART [{#NAME}]: Reported_Uncorrect

The count of errors that could not be recovered using hardware ECC.

DEPENDENT smart.disk.attribute.reported_uncorrect[{#NAME}]

Preprocessing:

- JSONPATH: $.reported_uncorrect.value

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix raw items SMART [{#NAME}]: Seek_Error_Rate

Rate of seek errors of the magnetic heads. If there is a partial failure in the mechanical positioning system, then seek errors will arise. Such a failure may be due to numerous factors, such as damage to a servo, or thermal widening of the hard disk. The raw value has different structure for different vendors and is often not meaningful as a decimal number. For some drives, this number may increase during normal operation without necessarily signifying errors.

DEPENDENT smart.disk.attribute.seek_error_rate[{#NAME}]

Preprocessing:

- JSONPATH: $.seek_error_rate.value

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix raw items SMART [{#NAME}]: Bad_Block_Rate

Percentage of used reserve blocks divided by total reserve blocks.

DEPENDENT smart.disk.attribute.bad_block_rate[{#NAME}]

Preprocessing:

- JSONPATH: $.bad_block_rate.value

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix raw items SMART [{#NAME}]: Program_Fail_Count_Chip

The total number of flash program operation failures since the drive was deployed.

DEPENDENT smart.disk.attribute.program_fail_count_chip[{#NAME}]

Preprocessing:

- JSONPATH: $.program_fail_count_chip.value

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix raw items SMART [{#NAME}]: Reallocated_Sector_Ct

Disk discovered attribute.

DEPENDENT smart.disk.attribute.reallocated_sector_ct[{#NAME}]

Preprocessing:

- JSONPATH: $.reallocated_sector_ct.value

- DISCARD_UNCHANGED_HEARTBEAT: 6h
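
The Exit status item described in the table above stores a decimal value whose bits each carry a meaning. A minimal sketch of decoding such a value follows; the labels are shortened from the bit descriptions, and the function name and the sample value 68 are chosen for illustration.

```python
# Bit position -> short label, following the Exit status description.
EXIT_STATUS_BITS = {
    0: "Command line did not parse",
    1: "Device open failed",
    2: "Some command to the disk failed",
    3: 'Check returned "DISK FAILING"',
    4: "Some prefail Attributes <= threshold",
    5: "Some Attributes have been <= threshold",
    6: "Error log contains records",
    7: "Self-test log contains records",
}


def decode_exit_status(status: int) -> list:
    """Return the labels of all bits set in a smartctl exit status."""
    return [label for bit, label in EXIT_STATUS_BITS.items()
            if status & (1 << bit)]


# 68 = 64 + 4, so bits 6 and 2 are set.
print(decode_exit_status(68))
```

The masks 1, 2, 4, ... 128 used by the trigger expressions correspond to bits 0 through 7 here.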

Triggers

Name Description Expression Severity Dependencies and additional info
SMART [{#NAME}]: Disk has been replaced

Device serial number has changed. Acknowledge to close the problem manually.

last(/SMART by Zabbix agent 2/smart.disk.sn[{#NAME}],#1)<>last(/SMART by Zabbix agent 2/smart.disk.sn[{#NAME}],#2) and length(last(/SMART by Zabbix agent 2/smart.disk.sn[{#NAME}]))>0 INFO

Manual close: YES

SMART [{#NAME}]: Disk self-test is not passed

-

last(/SMART by Zabbix agent 2/smart.disk.test[{#NAME}])="false" HIGH
SMART [{#NAME}]: Average disk temperature is too high

-

avg(/SMART by Zabbix agent 2/smart.disk.temperature[{#NAME}],5m)>{$SMART.TEMPERATURE.MAX.WARN} WARNING

Depends on:

- SMART [{#NAME}]: Average disk temperature is critical

SMART [{#NAME}]: Average disk temperature is critical

-

avg(/SMART by Zabbix agent 2/smart.disk.temperature[{#NAME}],5m)>{$SMART.TEMPERATURE.MAX.CRIT} AVERAGE
SMART [{#NAME}]: NVMe disk percentage using is over 90% of estimated endurance

-

last(/SMART by Zabbix agent 2/smart.disk.percentage_used[{#NAME}])>90 AVERAGE
SMART [{#NAME}]: Command line did not parse

Command line did not parse.

( count(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2) = 1 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),1) = 1 ) or ( bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),1) = 1 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),1) > bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2),1) ) HIGH

Manual close: YES

SMART [{#NAME}]: Device open failed

Device open failed, device did not return an IDENTIFY DEVICE structure, or device is in a low-power mode.

( count(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2) = 1 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),2) = 2 ) or ( bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),2) = 2 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),2) > bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2),2) ) HIGH

Manual close: YES

SMART [{#NAME}]: Some command to the disk failed

Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure.

( count(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2) = 1 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),4) = 4 ) or ( bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),4) = 4 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),4) > bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2),4) ) HIGH

Manual close: YES

SMART [{#NAME}]: Check returned "DISK FAILING"

SMART status check returned "DISK FAILING".

( count(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2) = 1 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),8) = 8 ) or ( bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),8) = 8 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),8) > bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2),8) ) HIGH

Manual close: YES

SMART [{#NAME}]: Some prefail Attributes <= threshold

We found prefail Attributes <= threshold.

( count(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2) = 1 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),16) = 16 ) or ( bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),16) = 16 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),16) > bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2),16) ) HIGH

Manual close: YES

SMART [{#NAME}]: Some Attributes have been <= threshold

SMART status check returned "DISK OK" but we found that some (usage or prefail) Attributes have been <= threshold at some time in the past.

( count(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2) = 1 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),32) = 32 ) or ( bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),32) = 32 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),32) > bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2),32) ) HIGH

Manual close: YES

SMART [{#NAME}]: Error log contains records

The device error log contains records of errors.

( count(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2) = 1 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),64) = 64 ) or ( bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),64) = 64 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),64) > bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2),64) ) HIGH

Manual close: YES

SMART [{#NAME}]: Self-test log contains records

The device self-test log contains records of errors. [ATA only] Failed self-tests outdated by a newer successful extended self-test are ignored.

( count(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2) = 1 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),128) = 128 ) or ( bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),128) = 128 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),128) > bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2),128) ) HIGH

Manual close: YES
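
The eight exit status triggers above share one expression pattern and differ only in the bit mask. Read literally, the expression is true when the bit is set in the only collected value, or when the bit is set in the latest value and was not set in the previous one. The sketch below restates that reading in Python; it is an interpretation of the expression, not Zabbix code, and the function name and history lists are hypothetical.

```python
def bit_trigger_fires(history, mask):
    """Evaluate the exit status trigger pattern for one bit mask.

    history: collected exit status values, oldest first.
    mask: single-bit mask such as 1, 2, 4, ... 128.
    """
    if not history:
        return False
    last = history[-1] & mask
    if len(history) == 1:
        # count(...,#2) = 1 and bitand(last(...),mask) = mask
        return last == mask
    previous = history[-2] & mask
    # bitand(last) = mask and bitand(last) > bitand(previous value)
    return last == mask and last > previous


print(bit_trigger_fires([8], 8))      # first value already has the bit set
print(bit_trigger_fires([0, 8], 8))   # bit newly set
print(bit_trigger_fires([8, 8], 8))   # bit unchanged between the two values
```

Comparing against the previous value means the expression reacts to a bit turning on rather than to a bit that simply stays set.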

Feedback

Please report any issues with the template at https://support.zabbix.com

You can also provide feedback, discuss the template or ask for help with it at ZABBIX forums.

References

https://www.smartmontools.org/

This template is for Zabbix version: 6.0
Also available for: 6.2 5.4 5.0

Source: https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/templates/module/smart_agent2?at=release/6.0

SMART by Zabbix agent 2

Overview

This template is designed for the effortless deployment of SMART monitoring by Zabbix via Zabbix agent 2 and doesn't require any external scripts.

It collects metrics through Zabbix agent 2 version 5.0 and later with Smartmontools version 7.1 and later. The Disk discovery LLD rule finds all HDD, SSD, and NVMe disks with S.M.A.R.T. enabled. The Attribute discovery LLD rule has predefined Vendor Specific Attributes for each disk; each one is discovered only if the attribute is present.

Requirements

Zabbix version: 6.0 and higher.

Tested versions

This template has been tested on:

  • Smartmontools 7.1 and later

Configuration

Zabbix should be configured according to the instructions in the Templates out of the box section.

Setup

Install Zabbix agent 2 and Smartmontools 7.1 or later. Grant Zabbix agent 2 superuser/administrator privileges for the smartctl utility.

Linux example:

sudo dnf install smartmontools
sudo visudo

Then add the following line to the sudoers file:

zabbix ALL=(ALL) NOPASSWD:/usr/sbin/smartctl

Plugin parameters list

Macros used

Name | Description | Default
{$SMART.TEMPERATURE.MAX.WARN} | This macro is used in trigger expressions. It can be overridden on the host or linked template level. | 50
{$SMART.TEMPERATURE.MAX.CRIT} | This macro is used in trigger expressions. It can be overridden on the host or linked template level. | 65
{$SMART.DISK.NAME.MATCHES} | This macro is used in the filter of attribute and disk discoveries. It can be overridden on the host or linked template level. | ^.*$
{$SMART.DISK.NAME.NOT_MATCHES} | This macro is used in the filter of attribute and disk discoveries. It can be overridden on the host or linked template level. | CHANGE_IF_NEEDED

LLD rule Disk discovery

Name Description Type Key and additional info
Disk discovery

Discovery of SMART disks.

Zabbix agent smart.disk.discovery

Item prototypes for Disk discovery

Name Description Type Key and additional info
SMART [{#NAME}]: Smartctl error

This metric will contain smartctl errors.

Dependent item smart.disk.error[{#NAME}]

Preprocessing

  • JSON Path: $.error

  • Discard unchanged with heartbeat: 1h

SMART [{#NAME}]: Get disk attributes Zabbix agent smart.disk.get[{#PATH},"{#RAIDTYPE}"]
SMART [{#NAME}]: Device model Dependent item smart.disk.model[{#NAME}]

Preprocessing

  • JSON Path: $.model_name

  • Discard unchanged with heartbeat: 6h

SMART [{#NAME}]: Serial number Dependent item smart.disk.sn[{#NAME}]

Preprocessing

  • JSON Path: $.serial_number

  • Discard unchanged with heartbeat: 6h

SMART [{#NAME}]: Self-test passed

Whether the disk has passed the SMART self-test.

Dependent item smart.disk.test[{#NAME}]

Preprocessing

  • JSON Path: $.self_test_passed

  • Discard unchanged with heartbeat: 6h

SMART [{#NAME}]: Temperature

Current drive temperature.

Dependent item smart.disk.temperature[{#NAME}]

Preprocessing

  • JSON Path: $.temperature

  • Discard unchanged with heartbeat: 6h

SMART [{#NAME}]: Power on hours

Count of hours in power-on state. The raw value of this attribute shows total count of hours (or minutes, or seconds, depending on manufacturer) in power-on state. "By default, the total expected lifetime of a hard disk in perfect condition is defined as 5 years (running every day and night on all days). This is equal to 1825 days in 24/7 mode or 43800 hours." On some pre-2005 drives, this raw value may advance erratically and/or "wrap around" (reset to zero periodically). https://en.wikipedia.org/wiki/S.M.A.R.T.#Known_ATA_S.M.A.R.T._attributes

Dependent item smart.disk.hours[{#NAME}]

Preprocessing

  • JSON Path: $.power_on_time

  • Discard unchanged with heartbeat: 6h

SMART [{#NAME}]: Percentage used

Contains a vendor specific estimate of the percentage of NVM subsystem life used based on the actual usage and the manufacturer's prediction of NVM life. A value of 100 indicates that the estimated endurance of the NVM in the NVM subsystem has been consumed, but may not indicate an NVM subsystem failure. The value is allowed to exceed 100. Percentages greater than 254 shall be represented as 255. This value shall be updated once per power-on hour (when the controller is not in a sleep state).

Dependent item smart.disk.percentage_used[{#NAME}]

Preprocessing

  • JSON Path: $.percentage_used

  • Discard unchanged with heartbeat: 6h

SMART [{#NAME}]: Critical warning

This field indicates critical warnings for the state of the controller.

Dependent item smart.disk.critical_warning[{#NAME}]

Preprocessing

  • JSON Path: $.critical_warning

  • Discard unchanged with heartbeat: 6h

SMART [{#NAME}]: Media errors

Contains the number of occurrences where the controller detected an unrecovered data integrity error. Errors such as uncorrectable ECC, CRC checksum failure, or LBA tag mismatch are included in this field.

Dependent item smart.disk.media_errors[{#NAME}]

Preprocessing

  • JSON Path: $.media_errors

  • Discard unchanged with heartbeat: 6h

SMART [{#NAME}]: Exit status

The smartctl exit status is a bitmask reported as a decimal value. The eight bits in the exit status have the following meanings for ATA disks; some of these values may also be returned for SCSI disks.

Bit 0: Command line did not parse.

Bit 1: Device open failed, device did not return an IDENTIFY DEVICE structure, or device is in a low-power mode (see the smartctl '-n' option).

Bit 2: Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure (see the smartctl '-b' option).

Bit 3: SMART status check returned "DISK FAILING".

Bit 4: We found prefail Attributes <= threshold.

Bit 5: SMART status check returned "DISK OK" but we found that some (usage or prefail) Attributes have been <= threshold at some time in the past.

Bit 6: The device error log contains records of errors.

Bit 7: The device self-test log contains records of errors. [ATA only] Failed self-tests outdated by a newer successful extended self-test are ignored.

Dependent item smart.disk.es[{#NAME}]

Preprocessing

  • JSON Path: $.exit_status

  • Discard unchanged with heartbeat: 6h

SMART [{#NAME}]: Raw_Read_Error_Rate

Stores data related to the rate of hardware read errors that occurred when reading data from a disk surface. The raw value has different structure for different vendors and is often not meaningful as a decimal number. For some drives, this number may increase during normal operation without necessarily signifying errors.

Dependent item smart.disk.attribute.raw_read_error_rate[{#NAME}]

Preprocessing

  • JSON Path: $.raw_read_error_rate.value

  • Discard unchanged with heartbeat: 6h

SMART [{#NAME}]: Spin_Up_Time

Average time of spindle spin up (from zero RPM to fully operational [milliseconds]).

Dependent item smart.disk.attribute.spin_up_time[{#NAME}]

Preprocessing

  • JSON Path: $.spin_up_time.value

  • Discard unchanged with heartbeat: 6h

SMART [{#NAME}]: Start_Stop_Count

A tally of spindle start/stop cycles. The spindle turns on, and hence the count increases, both when the hard disk is powered on after being completely off (disconnected from the power source) and when it wakes from sleep mode.

Dependent item smart.disk.attribute.start_stop_count[{#NAME}]

Preprocessing

  • JSON Path: $.start_stop_count.value

  • Discard unchanged with heartbeat: 6h

SMART [{#NAME}]: Power_Cycle_Count

This attribute indicates the count of full hard disk power on/off cycles.

Dependent item smart.disk.attribute.power_cycle_count[{#NAME}]

Preprocessing

  • JSON Path: $.power_cycle_count.value

  • Discard unchanged with heartbeat: 6h

SMART [{#NAME}]: Reported_Uncorrect

The count of errors that could not be recovered using hardware ECC.

Dependent item smart.disk.attribute.reported_uncorrect[{#NAME}]

Preprocessing

  • JSON Path: $.reported_uncorrect.value

  • Discard unchanged with heartbeat: 6h

SMART [{#NAME}]: Seek_Error_Rate

Rate of seek errors of the magnetic heads. If there is a partial failure in the mechanical positioning system, then seek errors will arise. Such a failure may be due to numerous factors, such as damage to a servo, or thermal widening of the hard disk. The raw value has different structure for different vendors and is often not meaningful as a decimal number. For some drives, this number may increase during normal operation without necessarily signifying errors.

Dependent item smart.disk.attribute.seek_error_rate[{#NAME}]

Preprocessing

  • JSON Path: $.seek_error_rate.value

  • Discard unchanged with heartbeat: 6h

SMART [{#NAME}]: Bad_Block_Rate

Percentage of used reserve blocks divided by total reserve blocks.

Dependent item smart.disk.attribute.bad_block_rate[{#NAME}]

Preprocessing

  • JSON Path: $.bad_block_rate.value

  • Discard unchanged with heartbeat: 6h

SMART [{#NAME}]: Program_Fail_Count_Chip

The total number of flash program operation failures since the drive was deployed.

Dependent item smart.disk.attribute.program_fail_count_chip[{#NAME}]

Preprocessing

  • JSON Path: $.program_fail_count_chip.value

  • Discard unchanged with heartbeat: 6h

SMART [{#NAME}]: Reallocated_Sector_Ct

Disk discovered attribute.

Dependent item smart.disk.attribute.reallocated_sector_ct[{#NAME}]

Preprocessing

  • JSON Path: $.reallocated_sector_ct.value

  • Discard unchanged with heartbeat: 6h
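
Most item prototypes above end with a "Discard unchanged with heartbeat" preprocessing step. The idea is that a value is stored when it differs from the last stored value, or when the heartbeat period has passed since the last stored value. The sketch below illustrates that idea only; the class name is hypothetical, timestamps are plain seconds, and the exact boundary handling inside Zabbix is not reproduced.

```python
class DiscardUnchangedWithHeartbeat:
    """Keep a value if it changed or if the heartbeat period has elapsed."""

    def __init__(self, heartbeat_seconds):
        self.heartbeat = heartbeat_seconds
        self.last_value = None
        self.last_stored_at = None

    def process(self, value, now):
        """Return True if the value should be stored, False if discarded."""
        first = self.last_stored_at is None
        changed = not first and value != self.last_value
        expired = not first and now - self.last_stored_at >= self.heartbeat
        if first or changed or expired:
            self.last_value = value
            self.last_stored_at = now
            return True
        return False


# A 6h heartbeat, as used by most item prototypes in the template.
step = DiscardUnchangedWithHeartbeat(6 * 3600)
print(step.process(35, now=0))              # first value is stored
print(step.process(35, now=60))             # unchanged, discarded
print(step.process(36, now=120))            # changed, stored
print(step.process(36, now=120 + 7 * 3600)) # unchanged but heartbeat elapsed
```

This keeps history small for slowly changing values such as the serial number or device model while still writing a periodic sample.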

Trigger prototypes for Disk discovery

Name Description Expression Severity Dependencies and additional info
SMART [{#NAME}]: Disk has been replaced

Device serial number has changed. Acknowledge to close the problem manually.

last(/SMART by Zabbix agent 2/smart.disk.sn[{#NAME}],#1)<>last(/SMART by Zabbix agent 2/smart.disk.sn[{#NAME}],#2) and length(last(/SMART by Zabbix agent 2/smart.disk.sn[{#NAME}]))>0 Info Manual close: Yes
SMART [{#NAME}]: Disk self-test is not passed last(/SMART by Zabbix agent 2/smart.disk.test[{#NAME}])="false" High
SMART [{#NAME}]: Average disk temperature is too high avg(/SMART by Zabbix agent 2/smart.disk.temperature[{#NAME}],5m)>{$SMART.TEMPERATURE.MAX.WARN} Warning Depends on:
  • SMART [{#NAME}]: Average disk temperature is critical
SMART [{#NAME}]: Average disk temperature is critical avg(/SMART by Zabbix agent 2/smart.disk.temperature[{#NAME}],5m)>{$SMART.TEMPERATURE.MAX.CRIT} Average
SMART [{#NAME}]: NVMe disk percentage using is over 90% of estimated endurance last(/SMART by Zabbix agent 2/smart.disk.percentage_used[{#NAME}])>90 Average
SMART [{#NAME}]: Command line did not parse

Command line did not parse.

( count(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2) = 1 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),1) = 1 ) or ( bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),1) = 1 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),1) > bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2),1) ) High Manual close: Yes
SMART [{#NAME}]: Device open failed

Device open failed, device did not return an IDENTIFY DEVICE structure, or device is in a low-power mode.

( count(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2) = 1 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),2) = 2 ) or ( bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),2) = 2 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),2) > bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2),2) ) High Manual close: Yes
SMART [{#NAME}]: Some command to the disk failed

Some SMART or other ATA command to the disk failed,
or there was a checksum error in a SMART data structure.

( count(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2) = 1 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),4) = 4 ) or ( bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),4) = 4 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),4) > bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2),4) ) High Manual close: Yes
SMART [{#NAME}]: Check returned "DISK FAILING"

SMART status check returned "DISK FAILING".

( count(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2) = 1 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),8) = 8 ) or ( bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),8) = 8 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),8) > bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2),8) ) High Manual close: Yes
SMART [{#NAME}]: Some prefail Attributes <= threshold

We found prefail Attributes <= threshold.

( count(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2) = 1 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),16) = 16 ) or ( bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),16) = 16 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),16) > bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2),16) ) High Manual close: Yes
SMART [{#NAME}]: Some Attributes have been <= threshold

SMART status check returned "DISK OK" but we found that some (usage
or prefail) Attributes have been <= threshold at some time in the past.

( count(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2) = 1 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),32) = 32 ) or ( bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),32) = 32 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),32) > bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2),32) ) High Manual close: Yes
SMART [{#NAME}]: Error log contains records

The device error log contains records of errors.

( count(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2) = 1 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),64) = 64 ) or ( bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),64) = 64 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),64) > bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2),64) ) High Manual close: Yes
SMART [{#NAME}]: Self-test log contains records

The device self-test log contains records of errors. [ATA only]
Failed self-tests outdated by a newer successful extended self-test are ignored.

( count(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2) = 1 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),128) = 128 ) or ( bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),128) = 128 and bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}]),128) > bitand(last(/SMART by Zabbix agent 2/smart.disk.es[{#NAME}],#2),128) ) High Manual close: Yes
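
The two temperature trigger prototypes above compare a 5-minute average with {$SMART.TEMPERATURE.MAX.WARN} and {$SMART.TEMPERATURE.MAX.CRIT}, and the warning trigger depends on the critical one. A minimal sketch of how the two interact follows, assuming the samples passed in are the values collected within the 5-minute window; the function name and sample values are hypothetical.

```python
from statistics import mean


def temperature_problems(samples, warn=50, crit=65):
    """Return the trigger names that would be in a problem state.

    samples: temperature values from the last 5 minutes (assumed non-empty).
    warn, crit: defaults mirror the template macro defaults.
    """
    average = mean(samples)
    critical = average > crit
    # The warning trigger depends on the critical one, so it is
    # suppressed while the critical trigger is in a problem state.
    warning = average > warn and not critical
    problems = []
    if warning:
        problems.append("Average disk temperature is too high")
    if critical:
        problems.append("Average disk temperature is critical")
    return problems


print(temperature_problems([45, 46, 47]))
print(temperature_problems([52, 55, 58]))
print(temperature_problems([66, 68, 70]))
```

Because of the dependency, a disk above the critical threshold raises one problem instead of two.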

Feedback

Please report any issues with the template at https://support.zabbix.com

You can also provide feedback, discuss the template, or ask for help at ZABBIX forums.

This template is for Zabbix version: 5.4
Also available for: 6.2 6.0 5.0

Source: https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/templates/module/smart_agent2?at=release/5.4

SMART by Zabbix agent 2

Overview

For Zabbix version: 5.4 and higher.
This template monitors S.M.A.R.T. attributes of physical disks and works without any external scripts. It collects metrics through Zabbix agent 2 version 5.0 and later with Smartmontools version 7.1 and later. The Disk discovery LLD rule finds all HDD, SSD, and NVMe disks with S.M.A.R.T. enabled. The Attribute discovery LLD rule finds all Vendor Specific Attributes for each disk. To skip some attributes, set regular expressions with disk names in the {$SMART.DISK.NAME.MATCHES} macro and with attribute IDs in the {$SMART.ATTRIBUTE.ID.MATCHES} macro on the host level.

This template was tested on:

  • Smartmontools, version 7.1 and later

Setup

See Zabbix template operation for basic instructions.

Install Zabbix agent 2 and Smartmontools 7.1 or later.

Zabbix configuration

No specific Zabbix configuration is required.

Macros used

Name | Description | Default
{$SMART.ATTRIBUTE.ID.MATCHES} | This macro is used in overrides of attribute discovery for filtering IDs. It can be overridden on the host or linked template level. | CHANGE_IF_NEEDED
{$SMART.DISK.NAME.MATCHES} | This macro is used in overrides of attribute discovery for filtering IDs. It can be overridden on the host or linked template level. | CHANGE_IF_NEEDED
{$SMART.TEMPERATURE.MAX.CRIT} | This macro is used for trigger expression. It can be overridden on the host or linked template level. | 65
{$SMART.TEMPERATURE.MAX.WARN} | This macro is used for trigger expression. It can be overridden on the host or linked template level. | 50

Template links

There are no template links in this template.

Discovery rules

Name Description Type Key and additional info
Disk discovery

Discovers SMART disks.

ZABBIX_PASSIVE smart.disk.discovery

Overrides:

Self-test
- {#DISKTYPE} MATCHES_REGEX nvme
- ITEM_PROTOTYPE LIKE Self-test - NO_DISCOVER

Not NVMe
- {#DISKTYPE} NOT_MATCHES_REGEX nvme
- ITEM_PROTOTYPE REGEXP `Media

Attribute discovery

Discovers SMART vendor-specific attributes of disks.

ZABBIX_PASSIVE smart.attribute.discovery

Overrides:

ID filter
- {#ID} MATCHES_REGEX {$SMART.ATTRIBUTE.ID.MATCHES}
- {#NAME} MATCHES_REGEX {$SMART.DISK.NAME.MATCHES}
- ITEM_PROTOTYPE REGEXP `` - NO_DISCOVER
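The "ID filter" override above can be read as a predicate over the attribute ID and the disk name. The Python sketch below assumes that both conditions must match before NO_DISCOVER applies and that matching is an unanchored regular-expression search. Python's `re` module only approximates the regex engine Zabbix uses, and the function name and macro values are hypothetical.

```python
import re


def attribute_is_skipped(disk_name, attr_id, name_regex, id_regex):
    """Return True when the override would apply NO_DISCOVER.

    Assumption: the two conditions are combined with AND.
    """
    id_matches = re.search(id_regex, str(attr_id)) is not None
    name_matches = re.search(name_regex, disk_name) is not None
    return id_matches and name_matches


# Hypothetical host-level macro values: skip attributes 5 and 199 on sda only.
NAME_MATCHES = r"^sda$"  # {$SMART.DISK.NAME.MATCHES}
ID_MATCHES = r"^(5|199)$"  # {$SMART.ATTRIBUTE.ID.MATCHES}
```

With the default value CHANGE_IF_NEEDED in both macros, no realistic disk name or attribute ID matches, so nothing is skipped until the macros are overridden.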

Items collected

Group Name Description Type Key and additional info
Zabbix_raw_items SMART: Get attributes

-

ZABBIX_PASSIVE smart.disk.get
Zabbix_raw_items SMART [{#NAME}]: Device model

-

DEPENDENT smart.disk.model[{#NAME}]

Preprocessing:

- JSONPATH: $[?(@.disk_name=='{#NAME}')].model_name.first()

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix_raw_items SMART [{#NAME}]: Serial number

-

DEPENDENT smart.disk.sn[{#NAME}]

Preprocessing:

- JSONPATH: $[?(@.disk_name=='{#NAME}')].serial_number.first()

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix_raw_items SMART [{#NAME}]: Self-test passed

Indicates whether the disk has passed the SMART self-test.

DEPENDENT smart.disk.test[{#NAME}]

Preprocessing:

- JSONPATH: $[?(@.disk_name=='{#NAME}')].ata_smart_data.self_test.status.passed.first()

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix_raw_items SMART [{#NAME}]: Temperature

Current drive temperature.

DEPENDENT smart.disk.temperature[{#NAME}]

Preprocessing:

- JSONPATH: $[?(@.disk_name=='{#NAME}')].temperature.current.first()

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix_raw_items SMART [{#NAME}]: Power on hours

Count of hours in power-on state. The raw value of this attribute shows total count of hours (or minutes, or seconds, depending on manufacturer) in power-on state. "By default, the total expected lifetime of a hard disk in perfect condition is defined as 5 years (running every day and night on all days). This is equal to 1825 days in 24/7 mode or 43800 hours." On some pre-2005 drives, this raw value may advance erratically and/or "wrap around" (reset to zero periodically). https://en.wikipedia.org/wiki/S.M.A.R.T.#Known_ATA_S.M.A.R.T._attributes

DEPENDENT smart.disk.hours[{#NAME}]

Preprocessing:

- JSONPATH: $[?(@.disk_name=='{#NAME}')].power_on_time.hours.first()

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix_raw_items SMART [{#NAME}]: Percentage used

Contains a vendor specific estimate of the percentage of NVM subsystem life used based on the actual usage and the manufacturer's prediction of NVM life. A value of 100 indicates that the estimated endurance of the NVM in the NVM subsystem has been consumed, but may not indicate an NVM subsystem failure. The value is allowed to exceed 100. Percentages greater than 254 shall be represented as 255. This value shall be updated once per power-on hour (when the controller is not in a sleep state).

DEPENDENT smart.disk.percentage_used[{#NAME}]

Preprocessing:

- JSONPATH: $[?(@.disk_name=='{#NAME}')].nvme_smart_health_information_log.percentage_used.first()

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix_raw_items SMART [{#NAME}]: Critical warning

This field indicates critical warnings for the state of the controller.

DEPENDENT smart.disk.critical_warning[{#NAME}]

Preprocessing:

- JSONPATH: $[?(@.disk_name=='{#NAME}')].nvme_smart_health_information_log.critical_warning.first()

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix_raw_items SMART [{#NAME}]: Media errors

Contains the number of occurrences where the controller detected an unrecovered data integrity error. Errors such as uncorrectable ECC, CRC checksum failure, or LBA tag mismatch are included in this field.

DEPENDENT smart.disk.media_errors[{#NAME}]

Preprocessing:

- JSONPATH: $[?(@.disk_name=='{#NAME}')].nvme_smart_health_information_log.media_errors.first()

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix_raw_items SMART [{#NAME}]: ID {#ID} {#ATTRNAME}

-

DEPENDENT smart.disk.error[{#NAME},{#ID}]

Preprocessing:

- JSONPATH: $[?(@.disk_name=='{#NAME}')].ata_smart_attributes.table[?(@.id=={#ID})].value.first()

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix_raw_items SMART [{#NAME}]: ID {#ID} {#ATTRNAME} raw value

-

DEPENDENT smart.disk.attr.raw[{#NAME},{#ID}]

Preprocessing:

- JSONPATH: $[?(@.disk_name=='{#NAME}')].ata_smart_attributes.table[?(@.id=={#ID})].raw.string.first()

- DISCARD_UNCHANGED_HEARTBEAT: 6h
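Each dependent item above applies two preprocessing steps to the output of the master item smart.disk.get: a JSONPath lookup that selects one disk by disk_name, then a discard-unchanged step with a 6h heartbeat. The Python sketch below imitates both steps. The sample payload is invented and heavily trimmed (only the field names come from the JSONPath expressions above), and the helper names are hypothetical, so treat it as an approximation of the behaviour rather than the agent's implementation.

```python
import json

# Hypothetical payload in the shape the JSONPath steps expect.
SAMPLE = json.loads("""
[
  {"disk_name": "sda",
   "temperature": {"current": 34},
   "ata_smart_attributes": {"table": [
     {"id": 5, "value": 100, "raw": {"string": "0"}},
     {"id": 194, "value": 66, "raw": {"string": "34"}}
   ]}},
  {"disk_name": "nvme0",
   "temperature": {"current": 41}}
]
""")


def disk_field(payload, name, *path):
    """Rough equivalent of $[?(@.disk_name=='NAME')].<path>.first()."""
    for disk in payload:
        if disk.get("disk_name") != name:
            continue
        node = disk
        for key in path:
            if not isinstance(node, dict) or key not in node:
                break
            node = node[key]
        else:
            return node
    return None  # no match; a real preprocessing step would report an error


def attribute_value(payload, name, attr_id):
    """Rough equivalent of ...ata_smart_attributes.table[?(@.id==ID)].value.first()."""
    table = disk_field(payload, name, "ata_smart_attributes", "table") or []
    for row in table:
        if row.get("id") == attr_id:
            return row.get("value")
    return None


class DiscardUnchangedWithHeartbeat:
    """Drop a value equal to the last stored one unless the heartbeat has elapsed."""

    def __init__(self, heartbeat_seconds):
        self.heartbeat = heartbeat_seconds
        self.last_value = None
        self.last_stored_at = None

    def accept(self, value, now):
        unchanged = self.last_stored_at is not None and value == self.last_value
        if unchanged and now - self.last_stored_at < self.heartbeat:
            return False  # discarded
        self.last_value, self.last_stored_at = value, now
        return True  # stored
```

For example, `disk_field(SAMPLE, "sda", "temperature", "current")` returns 34, and a `DiscardUnchangedWithHeartbeat(6 * 3600)` instance stores that reading once and then drops identical readings until six hours have passed.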

Triggers

Name Description Expression Severity Dependencies and additional info
SMART [{#NAME}]: Disk has been replaced (new serial number received)

Device serial number has changed. Ack to close.

last(/SMART by Zabbix agent 2/smart.disk.sn[{#NAME}],#1)<>last(/SMART by Zabbix agent 2/smart.disk.sn[{#NAME}],#2) and length(last(/SMART by Zabbix agent 2/smart.disk.sn[{#NAME}]))>0 INFO

Manual close: YES

SMART [{#NAME}]: Disk self-test is not passed

-

last(/SMART by Zabbix agent 2/smart.disk.test[{#NAME}])="false" HIGH
SMART [{#NAME}]: Average disk temperature is too high (over {$SMART.TEMPERATURE.MAX.WARN}°C for 5m)

-

avg(/SMART by Zabbix agent 2/smart.disk.temperature[{#NAME}],5m)>{$SMART.TEMPERATURE.MAX.WARN} WARNING

Depends on:

- SMART [{#NAME}]: Average disk temperature is critical (over {$SMART.TEMPERATURE.MAX.CRIT}°C for 5m)

SMART [{#NAME}]: Average disk temperature is critical (over {$SMART.TEMPERATURE.MAX.CRIT}°C for 5m)

-

avg(/SMART by Zabbix agent 2/smart.disk.temperature[{#NAME}],5m)>{$SMART.TEMPERATURE.MAX.CRIT} AVERAGE
SMART [{#NAME}]: NVMe disk percentage using is over 90% of estimated endurance

-

last(/SMART by Zabbix agent 2/smart.disk.percentage_used[{#NAME}])>90 AVERAGE
SMART [{#NAME}]: Attribute {#ID} {#ATTRNAME} is failed

The value should be greater than THRESH.

last(/SMART by Zabbix agent 2/smart.disk.error[{#NAME},{#ID}]) <= {#THRESH} WARNING
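The two temperature triggers above share one item and differ only in the threshold macro, and the warning depends on the critical trigger so that only one problem is shown at a time. The attribute trigger compares the attribute value with the {#THRESH} value from discovery. The Python sketch below illustrates both rules; the function names and return labels are hypothetical, and the thresholds are the macro defaults listed earlier.

```python
from statistics import mean

WARN_MAX = 50  # default of {$SMART.TEMPERATURE.MAX.WARN}
CRIT_MAX = 65  # default of {$SMART.TEMPERATURE.MAX.CRIT}


def temperature_problem(samples_last_5m, warn=WARN_MAX, crit=CRIT_MAX):
    """Return the problem shown for a 5-minute window of temperature readings.

    The critical check comes first, which mirrors the trigger dependency:
    while the critical trigger is in a problem state, the warning is suppressed.
    """
    if not samples_last_5m:
        return None
    average = mean(samples_last_5m)
    if average > crit:
        return "critical"
    if average > warn:
        return "too high"
    return None


def attribute_failed(value, thresh):
    """The attribute trigger fires when the value is at or below THRESH."""
    return value <= thresh
```

Because both comparisons are strict, an average of exactly 50 °C with the default macros does not open a problem.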

Feedback

Please report any issues with the template at https://support.zabbix.com

You can also provide feedback, discuss the template, or ask for help at ZABBIX forums.

References

https://www.smartmontools.org/

This template is for Zabbix version: 5.0
Also available for: 6.2 6.0 5.4

Source: https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/templates/module/smart_agent2?at=release/5.0

Template Module SMART by Zabbix agent 2

Overview

For Zabbix version: 5.0 and higher
This template monitors S.M.A.R.T. attributes of physical disks without any external scripts. It collects metrics through Zabbix agent 2 version 5.0 or later with Smartmontools version 7.1 or later. The disk discovery LLD rule finds all HDD, SSD, and NVMe disks with S.M.A.R.T. enabled. The attribute discovery LLD rule has predefined vendor-specific attributes for each disk; an attribute is discovered only if it is present on the disk.

This template was tested on:

  • Smartmontools, version 7.1 and later

Setup

See Zabbix template operation for basic instructions.

Install Zabbix agent 2 and Smartmontools 7.1 or later. Grant Zabbix agent 2 superuser/administrator privileges for the smartctl utility.

Linux example:

sudo dnf install smartmontools
sudo visudo

zabbix ALL=(ALL) NOPASSWD:/usr/sbin/smartctl

Zabbix configuration

No specific Zabbix configuration is required.

Macros used

Name Description Default
{$SMART.DISK.NAME.MATCHES}

This macro is used in the filter of attribute and disk discoveries. It can be overridden on the host or linked template level.

^.*$
{$SMART.DISK.NAME.NOT_MATCHES}

This macro is used in the filter of attribute and disk discoveries. It can be overridden on the host or linked template level.

CHANGE_IF_NEEDED
{$SMART.TEMPERATURE.MAX.CRIT}

This macro is used in the trigger expression. It can be overridden on the host or linked template level.

65
{$SMART.TEMPERATURE.MAX.WARN}

This macro is used in the trigger expression. It can be overridden on the host or linked template level.

50

Template links

There are no template links in this template.

Discovery rules

Name Description Type Key and additional info
Disk discovery

Discovers SMART disks.

ZABBIX_PASSIVE smart.disk.discovery

Filter:

AND

- A: {#NAME} MATCHES_REGEX {$SMART.DISK.NAME.MATCHES}

- B: {#NAME} NOT_MATCHES_REGEX {$SMART.DISK.NAME.NOT_MATCHES}

Overrides:

Self-test
- {#DISKTYPE} MATCHES_REGEX nvme
- ITEM_PROTOTYPE LIKE Self-test - NO_DISCOVER

Not NVMe
- {#DISKTYPE} NOT_MATCHES_REGEX nvme
- ITEM_PROTOTYPE REGEXP `Media
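The disk discovery filter above keeps a disk only when its name matches {$SMART.DISK.NAME.MATCHES} and does not match {$SMART.DISK.NAME.NOT_MATCHES}. A small Python sketch of that AND filter is given below, using the macro defaults as parameter defaults. Python's `re` module only approximates Zabbix regex matching, and the function name and disk names are hypothetical.

```python
import re


def discovered_disks(names, matches=r"^.*$", not_matches=r"CHANGE_IF_NEEDED"):
    """Apply conditions A and B of the discovery filter to a list of disk names."""
    return [
        name
        for name in names
        if re.search(matches, name) and not re.search(not_matches, name)
    ]
```

With the defaults every disk passes; calling `discovered_disks(["sda", "sdb", "nvme0"], not_matches=r"^sdb$")` drops only sdb.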

Items collected

Group Name Description Type Key and additional info
Zabbix_raw_items SMART [{#NAME}]: Smartctl error

This metric will contain smartctl errors.

DEPENDENT smart.disk.error[{#NAME}]

Preprocessing:

- JSONPATH: $.error

- DISCARD_UNCHANGED_HEARTBEAT: 1h

Zabbix_raw_items SMART [{#NAME}]: Get disk attributes

-

ZABBIX_PASSIVE smart.disk.get[{#PATH},"{#RAIDTYPE}"]
Zabbix_raw_items SMART [{#NAME}]: Device model

-

DEPENDENT smart.disk.model[{#NAME}]

Preprocessing:

- JSONPATH: $.model_name

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix_raw_items SMART [{#NAME}]: Serial number

-

DEPENDENT smart.disk.sn[{#NAME}]

Preprocessing:

- JSONPATH: $.serial_number

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix_raw_items SMART [{#NAME}]: Self-test passed

Indicates whether the disk has passed the SMART self-test.

DEPENDENT smart.disk.test[{#NAME}]

Preprocessing:

- JSONPATH: $.self_test_passed

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix_raw_items SMART [{#NAME}]: Temperature

Current drive temperature.

DEPENDENT smart.disk.temperature[{#NAME}]

Preprocessing:

- JSONPATH: $.temperature

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix_raw_items SMART [{#NAME}]: Power on hours

Count of hours in power-on state. The raw value of this attribute shows total count of hours (or minutes, or seconds, depending on manufacturer) in power-on state. "By default, the total expected lifetime of a hard disk in perfect condition is defined as 5 years (running every day and night on all days). This is equal to 1825 days in 24/7 mode or 43800 hours." On some pre-2005 drives, this raw value may advance erratically and/or "wrap around" (reset to zero periodically). https://en.wikipedia.org/wiki/S.M.A.R.T.#Known_ATA_S.M.A.R.T._attributes

DEPENDENT smart.disk.hours[{#NAME}]

Preprocessing:

- JSONPATH: $.power_on_time

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix_raw_items SMART [{#NAME}]: Percentage used

Contains a vendor specific estimate of the percentage of NVM subsystem life used based on the actual usage and the manufacturer's prediction of NVM life. A value of 100 indicates that the estimated endurance of the NVM in the NVM subsystem has been consumed, but may not indicate an NVM subsystem failure. The value is allowed to exceed 100. Percentages greater than 254 shall be represented as 255. This value shall be updated once per power-on hour (when the controller is not in a sleep state).

DEPENDENT smart.disk.percentage_used[{#NAME}]

Preprocessing:

- JSONPATH: $.percentage_used

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix_raw_items SMART [{#NAME}]: Critical warning

This field indicates critical warnings for the state of the controller.

DEPENDENT smart.disk.critical_warning[{#NAME}]

Preprocessing:

- JSONPATH: $.critical_warning

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix_raw_items SMART [{#NAME}]: Media errors

Contains the number of occurrences where the controller detected an unrecovered data integrity error. Errors such as uncorrectable ECC, CRC checksum failure, or LBA tag mismatch are included in this field.

DEPENDENT smart.disk.media_errors[{#NAME}]

Preprocessing:

- JSONPATH: $.media_errors

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix_raw_items SMART [{#NAME}]: Exit status

The smartctl exit status is a bitmask reported as a decimal value. The eight bits in the exit status have the following meanings for ATA disks; some of these values may also be returned for SCSI disks.

Bit 0: Command line did not parse.

Bit 1: Device open failed, device did not return an IDENTIFY DEVICE structure, or device is in a low-power mode (see '-n' option above).

Bit 2: Some SMART or other ATA command to the disk failed, or there was a checksum error in a SMART data structure (see '-b' option above).

Bit 3: SMART status check returned "DISK FAILING".

Bit 4: We found prefail Attributes <= threshold.

Bit 5: SMART status check returned "DISK OK" but we found that some (usage or prefail) Attributes have been <= threshold at some time in the past.

Bit 6: The device error log contains records of errors.

Bit 7: The device self-test log contains records of errors. [ATA only] Failed self-tests outdated by a newer successful extended self-test are ignored.

DEPENDENT smart.disk.es[{#NAME}]

Preprocessing:

- JSONPATH: $.exit_status

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix_raw_items SMART [{#NAME}]: Raw_Read_Error_Rate

Stores data related to the rate of hardware read errors that occurred when reading data from a disk surface. The raw value has different structure for different vendors and is often not meaningful as a decimal number. For some drives, this number may increase during normal operation without necessarily signifying errors.

DEPENDENT smart.disk.attribute.raw_read_error_rate[{#NAME}]

Preprocessing:

- JSONPATH: $.raw_read_error_rate.value

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix_raw_items SMART [{#NAME}]: Spin_Up_Time

Average spindle spin-up time, from zero RPM to fully operational, in milliseconds.

DEPENDENT smart.disk.attribute.spin_up_time[{#NAME}]

Preprocessing:

- JSONPATH: $.spin_up_time.value

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix_raw_items SMART [{#NAME}]: Start_Stop_Count

A tally of spindle start/stop cycles. The spindle starts, and the count increases, both when the hard disk is powered on after being completely off (disconnected from the power source) and when it wakes from sleep mode.

DEPENDENT smart.disk.attribute.start_stop_count[{#NAME}]

Preprocessing:

- JSONPATH: $.start_stop_count.value

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix_raw_items SMART [{#NAME}]: Power_Cycle_Count

This attribute indicates the count of full hard disk power on/off cycles.

DEPENDENT smart.disk.attribute.power_cycle_count[{#NAME}]

Preprocessing:

- JSONPATH: $.power_cycle_count.value

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix_raw_items SMART [{#NAME}]: Reported_Uncorrect

The count of errors that could not be recovered using hardware ECC.

DEPENDENT smart.disk.attribute.reported_uncorrect[{#NAME}]

Preprocessing:

- JSONPATH: $.reported_uncorrect.value

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix_raw_items SMART [{#NAME}]: Seek_Error_Rate

Rate of seek errors of the magnetic heads. If there is a partial failure in the mechanical positioning system, then seek errors will arise. Such a failure may be due to numerous factors, such as damage to a servo, or thermal widening of the hard disk. The raw value has different structure for different vendors and is often not meaningful as a decimal number. For some drives, this number may increase during normal operation without necessarily signifying errors.

DEPENDENT smart.disk.attribute.seek_error_rate[{#NAME}]

Preprocessing:

- JSONPATH: $.seek_error_rate.value

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix_raw_items SMART [{#NAME}]: Bad_Block_Rate

Percentage of used reserve blocks divided by total reserve blocks.

DEPENDENT smart.disk.attribute.bad_block_rate[{#NAME}]

Preprocessing:

- JSONPATH: $.bad_block_rate.value

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix_raw_items SMART [{#NAME}]: Program_Fail_Count_Chip

The total number of flash program operation failures since the drive was deployed.

DEPENDENT smart.disk.attribute.program_fail_count_chip[{#NAME}]

Preprocessing:

- JSONPATH: $.program_fail_count_chip.value

- DISCARD_UNCHANGED_HEARTBEAT: 6h

Zabbix_raw_items SMART [{#NAME}]: Reallocated_Sector_Ct

Disk discovered attribute.

DEPENDENT smart.disk.attribute.reallocated_sector_ct[{#NAME}]

Preprocessing:

- JSONPATH: $.reallocated_sector_ct.value

- DISCARD_UNCHANGED_HEARTBEAT: 6h
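The "Exit status" item in the table above stores the smartctl bitmask as a single decimal number, so several conditions can be present in one value. The Python sketch below splits such a value into the bit meanings listed in that item's description. The descriptions are shortened, and the function and dictionary names are hypothetical.

```python
# Shortened bit descriptions, taken from the "Exit status" item above.
EXIT_STATUS_BITS = {
    0: "Command line did not parse",
    1: "Device open failed, no IDENTIFY DEVICE structure, or low-power mode",
    2: "A SMART or other ATA command failed, or checksum error in SMART data",
    3: 'SMART status check returned "DISK FAILING"',
    4: "Prefail attributes <= threshold found",
    5: "Some attributes have been <= threshold at some time in the past",
    6: "The device error log contains records of errors",
    7: "The device self-test log contains records of errors",
}


def decode_exit_status(status):
    """Return the descriptions of all bits set in a decimal exit status."""
    return [
        text
        for bit, text in EXIT_STATUS_BITS.items()
        if status & (1 << bit)
    ]
```

For instance, an exit status of 192 (64 + 128) decodes to the two log-related entries, bits 6 and 7.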

Triggers

Name Description Expression Severity Dependencies and additional info
SMART [{#NAME}]: Disk has been replaced (new serial number received)

Device serial number has changed. Ack to close.

{TEMPLATE_NAME:smart.disk.sn[{#NAME}].diff()}=1 and {TEMPLATE_NAME:smart.disk.sn[{#NAME}].strlen()}>0 INFO

Manual close: YES

SMART [{#NAME}]: Disk self-test is not passed

-

{TEMPLATE_NAME:smart.disk.test[{#NAME}].last()}="false" HIGH
SMART [{#NAME}]: Average disk temperature is too high (over {$SMART.TEMPERATURE.MAX.WARN}°C for 5m)

-

{TEMPLATE_NAME:smart.disk.temperature[{#NAME}].avg(5m)}>{$SMART.TEMPERATURE.MAX.WARN} WARNING

Depends on:

- SMART [{#NAME}]: Average disk temperature is critical (over {$SMART.TEMPERATURE.MAX.CRIT}°C for 5m)

SMART [{#NAME}]: Average disk temperature is critical (over {$SMART.TEMPERATURE.MAX.CRIT}°C for 5m)

-

{TEMPLATE_NAME:smart.disk.temperature[{#NAME}].avg(5m)}>{$SMART.TEMPERATURE.MAX.CRIT} AVERAGE
SMART [{#NAME}]: NVMe disk percentage using is over 90% of estimated endurance

-

{TEMPLATE_NAME:smart.disk.percentage_used[{#NAME}].last()}>90 AVERAGE
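The "Disk has been replaced" trigger in the table above uses the older expression syntax: diff() returns 1 when the last and previous values differ, and strlen() returns the length of the last value. The Python sketch below shows the same check on a list of collected serial numbers; the function name and the sample serial numbers are hypothetical.

```python
def disk_replaced(serial_history):
    """Mimic diff()=1 and strlen()>0 on serial numbers, oldest first."""
    if len(serial_history) < 2:
        return False  # nothing to compare against yet
    previous, latest = serial_history[-2], serial_history[-1]
    return latest != previous and len(latest) > 0
```

The length check keeps an empty serial number, for example from a failed read, from being reported as a replaced disk.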

Feedback

Please report any issues with the template at https://support.zabbix.com

You can also provide feedback, discuss the template, or ask for help at ZABBIX forums.

References

https://www.smartmontools.org/
