Last Updated September 19, 2023

Solidigm™ (Formerly Intel®) SSD Data Center Family SMART Attributes

Summarizes SMART Attributes For Data Center Products

Summary

This article summarizes the SMART (Self-Monitoring, Analysis and Reporting Technology) attributes of Solidigm Data Center Solid State Drives for PCIe*.

Note: Products released by Solidigm in 2022 may have modified SMART attributes.

Please check the Solidigm website for additional documentation for each individual product.

Overview

SMART is an open standard used by drives and hosts to monitor drive health and report potential problems. This document lists and describes the SMART attributes supported by Solidigm Data Center Solid State Drives (SSDs) for PCIe*.

SMART Attributes

The following table lists the SMART attributes supported by the Solidigm Data Center SSDs for PCIe*.

Each entry below gives the SMART attribute's starting byte (LoByte), its size in bytes, the attribute name, and a description.

Byte 0 (1 byte): Critical Warning
These bits, if set, flag various warning sources.
Bit 0: Available Spare is below Threshold
Bit 1: Temperature has exceeded Threshold
Bit 2: Reliability is degraded due to excessive media or internal errors
Bit 3: Media is placed in Read-Only Mode
Bit 4: Volatile Memory Backup System has failed (e.g., enhanced power loss capacitor test failure)
Bits 5-7: Reserved
Any of the critical warnings can be tied to asynchronous event notification. The Drive Health Indicator defined under bytes 3095-3076 of Identify Controller may still indicate “healthy” status when the critical warning flag is set.

Byte 1 (2 bytes): Temperature
Overall device current temperature in Kelvin.
This reports media temperature. For AIC, it reports the NAND temperature; for the 2.5” form factor (FF), it is the case temperature.

Byte 3 (1 byte): Available Spare
Contains a normalized percentage (0 to 100%) of the remaining spare capacity available.
Available Spare starts at 100% and decrements.

Byte 4 (1 byte): Available Spare Threshold
Available Spare Threshold is set at 0%.

Byte 5 (1 byte): Percentage Used Estimate (value allowed to exceed 100%)
A value of 100 indicates that the estimated endurance of the device has been consumed but may not indicate a device failure. The value is allowed to exceed 100. Percentages greater than 254 shall be represented as 255. This value shall be updated once per power-on hour (when the controller is not in a sleep state). If the value reaches or exceeds 105, the drive enters a write-protect mode in which write bandwidth drops below 10 MB/sec.

Byte 32 (16 bytes): Data Units Read (in LBAs)
Contains the number of 512-byte data units the host has read from the controller; this value does not include metadata. This value is reported in thousands (i.e., a value of 1 corresponds to 1000 units of 512 bytes read) and is rounded up. When the LBA size is a value other than 512 bytes, the controller shall convert the amount of data read to 512-byte units.

Byte 48 (16 bytes): Data Units Written (in LBAs)
Contains the number of 512-byte data units the host has written to the controller; this value does not include metadata. This value is reported in thousands (i.e., a value of 1 corresponds to 1000 units of 512 bytes written) and is rounded up. When the LBA size is a value other than 512 bytes, the controller shall convert the amount of data written to 512-byte units. For the NVM command set, logical blocks written as part of Write operations shall be included in this value. Write Uncorrectable commands shall not impact this value.

Byte 64 (16 bytes): Host Read Commands
Contains the number of read commands issued to the controller.

Byte 80 (16 bytes): Host Write Commands
Contains the number of write commands issued to the controller.

Byte 96 (16 bytes): Controller Busy Time (in minutes)
Contains the amount of time the controller is busy with I/O commands. The controller is busy when there is a command outstanding to an I/O Queue (specifically, a command was issued by way of an I/O Submission Queue Tail doorbell write and the corresponding completion queue entry has not been posted yet to the associated I/O Completion Queue). This value is reported in minutes.

Byte 112 (16 bytes): Power Cycles
Contains the number of power cycles.

Byte 128 (16 bytes): Power On Hours
Contains the number of power-on hours. This does not include time that the controller was powered and in a low power state condition.

Byte 144 (16 bytes): Unsafe Shutdowns
Contains the number of unsafe shutdowns. This count is incremented when a shutdown notification (CC.SHN) is not received prior to loss of power.

Byte 160 (16 bytes): Media Errors
Contains the number of occurrences where the controller detected an unrecovered data integrity error. Errors such as uncorrectable ECC, CRC checksum failure, or LBA tag mismatch are included in this field.

Byte 176 (16 bytes): Number of Error Information Log Entries
Contains the number of Error Information log entries over the life of the controller.

Byte 192 (4 bytes): Warning Composite Temperature Time
Contains the amount of time in minutes that the controller is operational and the Composite Temperature is greater than or equal to the Warning Composite Temperature Threshold (WCTEMP) field and less than the Critical Composite Temperature Threshold (CCTEMP) field in the Identify Controller data structure. (P3100)

Byte 196 (4 bytes): Critical Composite Temperature Time
Contains the amount of time in minutes that the controller is operational and the Composite Temperature is greater than the Critical Composite Temperature Threshold (CCTEMP) field in the Identify Controller data structure.
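
The byte layout above can be turned into a small decoder. The following Python sketch is illustrative only and is not Solidigm tooling: the function and key names (`parse_smart_log`, `le_uint`, `bytes_read`, and so on) are hypothetical, the buffer is assumed to be a raw copy of the SMART log page obtained by whatever means the host provides, and multi-byte fields are assumed to be little-endian (low byte first). It decodes the Critical Warning bits, converts the Kelvin temperature to Celsius, and converts the data-unit counters (thousands of 512-byte units) to bytes.

```python
# Bit positions of the Critical Warning byte (byte 0), as listed in the table.
CRITICAL_WARNING_BITS = {
    0: "Available Spare is below Threshold",
    1: "Temperature has exceeded Threshold",
    2: "Reliability is degraded due to excessive media or internal errors",
    3: "Media is placed in Read-Only Mode",
    4: "Volatile Memory Backup System has failed",
}

# Starting byte of each 16-byte counter, as listed in the table.
COUNTER_OFFSETS = {
    "data_units_read": 32,
    "data_units_written": 48,
    "host_read_commands": 64,
    "host_write_commands": 80,
    "controller_busy_minutes": 96,
    "power_cycles": 112,
    "power_on_hours": 128,
    "unsafe_shutdowns": 144,
    "media_errors": 160,
    "error_log_entries": 176,
}


def le_uint(buf, offset, size):
    """Read an unsigned integer of `size` bytes at `offset` (assumed little-endian)."""
    return int.from_bytes(buf[offset:offset + size], "little")


def parse_smart_log(buf):
    """Decode the fields described in the table from a raw log page buffer."""
    if len(buf) < 200:
        raise ValueError("expected at least 200 bytes (fields run through byte 199)")
    warning = buf[0]
    parsed = {
        "critical_warnings": [
            text for bit, text in CRITICAL_WARNING_BITS.items() if warning & (1 << bit)
        ],
        "temperature_kelvin": le_uint(buf, 1, 2),
        "available_spare_pct": buf[3],
        "available_spare_threshold_pct": buf[4],
        "percentage_used": buf[5],
        "warning_temp_minutes": le_uint(buf, 192, 4),
        "critical_temp_minutes": le_uint(buf, 196, 4),
    }
    for name, offset in COUNTER_OFFSETS.items():
        parsed[name] = le_uint(buf, offset, 16)

    parsed["temperature_celsius"] = round(parsed["temperature_kelvin"] - 273.15, 2)
    # Data units are reported in thousands of 512-byte units and rounded up,
    # so these byte totals are upper estimates.
    parsed["bytes_read"] = parsed["data_units_read"] * 1000 * 512
    parsed["bytes_written"] = parsed["data_units_written"] * 1000 * 512
    # Per the Percentage Used description, write protect applies at 105 or more.
    parsed["write_protect_expected"] = parsed["percentage_used"] >= 105
    return parsed


# Synthetic buffer for illustration only; these are not readings from a real drive.
sample = bytearray(512)
sample[0] = 0b00001010                          # bits 1 and 3 set
sample[1:3] = (310).to_bytes(2, "little")       # 310 K
sample[3] = 97                                  # Available Spare
sample[5] = 12                                  # Percentage Used
sample[32:48] = (2000).to_bytes(16, "little")   # Data Units Read
report = parse_smart_log(bytes(sample))
print(report["critical_warnings"], report["temperature_celsius"], report["bytes_read"])
```

Because the 16-byte counters exceed the width of native 64-bit integers, the sketch relies on Python's arbitrary-precision `int` rather than fixed-width unpacking.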

Additional SMART Attributes (Log Identifier CAh)


Each entry below gives the starting byte, its size in bytes, the attribute, and a description.

Byte 0 (1 byte): AB (Program Fail Count)
Raw value: shows total count of program fails.
Normalized value: beginning at 100, shows the percent remaining of allowable program fails.
Byte 3 (1 byte): Normalized Value
Byte 5 (6 bytes): Current Raw Value

Byte 12 (1 byte): AC (Erase Fail Count)
Raw value: shows total count of erase fails.
Normalized value: beginning at 100, shows the percent remaining of allowable erase fails.
Byte 15 (1 byte): Normalized Value


17
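
The entries listed for Log Identifier CAh suggest a repeating record: an attribute code at the record's first byte, a normalized value 3 bytes in, and a raw value 5 bytes in. The Python sketch below decodes records on that pattern. It rests on several assumptions that the table does not confirm: a 12-byte record stride (inferred from the two starting bytes 0 and 12), a 6-byte raw value for the second record at byte 17 (the table lists no size there), and little-endian byte order. The names `parse_additional_smart` and `ATTRIBUTE_NAMES` are hypothetical.

```python
# Attribute codes named in the table.
ATTRIBUTE_NAMES = {
    0xAB: "Program Fail Count",
    0xAC: "Erase Fail Count",
}

RECORD_SIZE = 12        # assumption: inferred from entries starting at bytes 0 and 12
NORMALIZED_OFFSET = 3   # from the table: bytes 3 and 15
RAW_OFFSET = 5          # from the table: bytes 5 and 17
RAW_SIZE = 6            # from the table for byte 5; assumed for byte 17


def parse_additional_smart(buf, count=2):
    """Decode `count` records from a raw copy of the CAh log page."""
    if len(buf) < count * RECORD_SIZE:
        raise ValueError("buffer too short for the requested number of records")
    records = []
    for index in range(count):
        base = index * RECORD_SIZE
        code = buf[base]
        raw_start = base + RAW_OFFSET
        records.append({
            "code": code,
            "name": ATTRIBUTE_NAMES.get(code, "unknown"),
            # Normalized value begins at 100 and shows the percent remaining.
            "normalized": buf[base + NORMALIZED_OFFSET],
            # Raw value is the total count of fails (assumed little-endian).
            "raw": int.from_bytes(buf[raw_start:raw_start + RAW_SIZE], "little"),
        })
    return records


# Synthetic buffer for illustration only; these are not readings from a real drive.
page = bytearray(24)
page[0], page[3] = 0xAB, 100
page[5:11] = (7).to_bytes(6, "little")
page[12], page[15] = 0xAC, 99
page[17:23] = (3).to_bytes(6, "little")
entries = parse_additional_smart(bytes(page))
for entry in entries:
    print(entry["name"], entry["normalized"], entry["raw"])
```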

