by Dr.Flay » Tue Jul 19, 2016 7:56 am
This illustrates my complaint of many years: people do not use SMART to prevent downtime, but use it during downtime to diagnose how broken the drive is.
2 common problems:
1) No OS is set up to use SMART fully, and will only inform you of disk errors when the disk is failing and SMART is giving up on the drive.
2) Most drives do not have all 3 SMART options enabled by default, and people never check (have you, on your PC?).
These 3 switches are usually: 1) automatic self-test, 2) make a log, 3) keep previous logs.
If the ability to keep logs is not enabled, you have no history of the drive failure info.
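For anyone who wants to check this from a script, here is a rough sketch in Python. It assumes smartmontools is installed and that the 3 switches above roughly correspond to smartctl's --smart, --offlineauto and --saveauto options. That mapping is my assumption, not something every drive or tool agrees on, and the last two options apply to ATA drives only. The function name and the device path are just placeholders. It only builds the command, it does not run it.

```python
def smart_enable_command(device):
    """Build (but do not run) a smartctl command line for one drive.

    'device' is a placeholder path such as /dev/sda; adjust for your system.
    """
    return [
        "smartctl",
        "--smart=on",        # turn SMART itself on
        "--offlineauto=on",  # automatic offline data collection (ATA only)
        "--saveauto=on",     # autosave of attribute values (ATA only)
        device,
    ]
```

To actually apply it you would pass the list to subprocess.run() with root privileges, once per drive. Whether the drive keeps the settings across power cycles depends on the drive, so it is worth re-checking afterwards.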
The best use of SMART for a server (or any PC) is to predict failure based on constant monitoring.
Spikes in temperature, or an acceleration in block retirement, can be good indicators before your drives actually fail.
With monitoring software you can set alerts to be sent when certain thresholds are passed.
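As a rough illustration of what such an alert rule could look like, here is a small Python sketch. It assumes you already collect periodic readings of temperature and retired (reallocated) block count per drive. The function name, the sample format and all the threshold values are made up for the example and are not taken from any particular monitoring package.

```python
def check_drive(samples, temp_limit_c=50, temp_spike_c=10, realloc_accel=2.0):
    """Return a list of alert strings for one drive's history.

    samples: list of (temperature_c, retired_blocks) tuples, oldest first,
    taken at regular intervals. All thresholds are illustrative defaults.
    """
    alerts = []
    temps = [t for t, _ in samples]
    retired = [r for _, r in samples]

    # Absolute threshold: latest temperature at or above the limit.
    if temps and temps[-1] >= temp_limit_c:
        alerts.append(f"temperature {temps[-1]}C at or above {temp_limit_c}C")

    # Spike: jump between the last two samples.
    if len(temps) >= 2 and temps[-1] - temps[-2] >= temp_spike_c:
        alerts.append(f"temperature rose {temps[-1] - temps[-2]}C since last sample")

    # Acceleration: growth in the latest interval compared with the one before.
    if len(retired) >= 3:
        prev_growth = retired[-2] - retired[-3]
        last_growth = retired[-1] - retired[-2]
        if last_growth > 0 and last_growth >= realloc_accel * max(prev_growth, 1):
            alerts.append(f"block retirement accelerating: +{last_growth} vs +{prev_growth}")

    return alerts
```

The point of comparing intervals instead of only checking a fixed limit is that a drive can still be well inside its SMART thresholds while the rate of retirement is already climbing. Real monitoring tools will have their own way of expressing this.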
Personally, I would ask that my servers be SMART-monitored proactively rather than retroactively, and ask what the status of the 3 SMART settings is on each of the drives.