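On Windows, HD Tune lets you check disk status so that you do not find out about a failing disk only after it has died. On CentOS, smartmontools reads SMART (Self-Monitoring, Analysis and Reporting Technology) data to run the same kind of disk health checks.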
http://www.smartmontools.org/
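The examples below use a Dell R720 server: /dev/sda is an ordinary 1 TB disk with a SCSI interface, and /dev/sdd is a RAID 5 array built from three disks.
# df -h    # show mounted filesystems and the disk device names behind them
# dmesg | grep sdd    # show the boot messages about the disk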
sd 0:2:0:0: [sdd] Attached SCSI disk
# hdparm -I /dev/sda    # show disk hardware details and enabled features; the output is very detailed
Next, use smartctl to check the disk status:
# yum install smartmontools    # install smartmontools
# smartctl -H /dev/sdd    # check disk health
smartctl 5.43 2012-06-30 r3573 [x86_64-linux-3.10.56-11.el6.centos.alt.x86_64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

SMART Health Status: OK
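For scripting, the health line can be picked out of the output instead of being read by eye. The following is a minimal sketch, not part of smartmontools: `health_from_output` is a hypothetical helper name, and it assumes the two status lines smartctl prints for SCSI/SAS devices ("SMART Health Status: OK", as above) and for ATA/SATA devices ("SMART overall-health self-assessment test result: PASSED").

```shell
#!/bin/sh
# health_from_output: classify `smartctl -H` output read from stdin.
# Prints OK, FAIL, or UNKNOWN (no status line found). Hypothetical helper.
health_from_output() {
    while IFS= read -r line; do
        case "$line" in
            # SCSI/SAS wording, as in the output shown above
            "SMART Health Status: OK"*)
                echo OK; return 0 ;;
            # ATA/SATA wording
            *"self-assessment test result: PASSED"*)
                echo OK; return 0 ;;
            # A status line is present but does not report a healthy disk
            "SMART Health Status:"*|*"self-assessment test result:"*)
                echo FAIL; return 0 ;;
        esac
    done
    # No health line at all, e.g. "Device does not support SMART"
    echo UNKNOWN
    return 0
}

# Usage (needs root and smartmontools installed):
#   smartctl -H /dev/sdd | health_from_output
```

The function only reads text from stdin, so it can be tried on saved output without touching a disk.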
# smartctl -a /dev/sda    # or: smartctl --all /dev/sda; prints all SMART information for the disk
# smartctl -a /dev/sdd
smartctl 5.43 2012-06-30 r3573 [x86_64-linux-3.10.56-11.el6.centos.alt.x86_64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

Vendor:               DELL
Product:              PERC H310
Revision:             2.12
User Capacity:        598,879,502,336 bytes [598 GB]
Logical block size:   512 bytes
Logical Unit id:
Serial number:
Device type:          disk
Local Time is:        Wed Jan 14 15:37:39 2015 CST
Device does not support SMART

Error Counter logging not supported
Device does not support Self Test logging
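The output says "Device does not support SMART" because /dev/sdd is the logical volume presented by the PERC H310 RAID controller, so query the physical disks behind the controller instead.
Check the status of the first disk in the RAID 5 array: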
# smartctl -a -d megaraid,0 /dev/sdd
Check the second and third disks the same way, and add these checks to your Nagios or Zabbix alerting as your monitoring setup requires:
# smartctl -a -d megaraid,1 /dev/sdd
# smartctl -a -d megaraid,2 /dev/sdd
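The three per-disk commands can be wrapped in a loop that returns Nagios-style exit codes (0 for OK, 2 for CRITICAL). This is only a sketch under stated assumptions: `check_megaraid_disks` and the `SMARTCTL` variable are illustrative names, the disk numbers are assumed to run from 0 to COUNT-1 as in the commands above, and any disk whose output lacks a healthy status line is treated as a problem.

```shell
#!/bin/sh
# SMARTCTL is an illustrative override hook so the command can be swapped out.
SMARTCTL=${SMARTCTL:-smartctl}

# check_megaraid_disks DEVICE COUNT
# Runs `smartctl -H -d megaraid,N DEVICE` for N = 0 .. COUNT-1.
# Prints one summary line; returns 0 if every disk reports healthy, else 2.
check_megaraid_disks() {
    dev=$1
    count=$2
    bad=""
    i=0
    while [ "$i" -lt "$count" ]; do
        # smartctl's exit status is a bitmask, so judge by the text instead
        out=$("$SMARTCTL" -H -d "megaraid,$i" "$dev" 2>&1) || true
        case "$out" in
            *"SMART Health Status: OK"*|*"test result: PASSED"*) ;;
            *) bad="$bad megaraid,$i" ;;
        esac
        i=$((i + 1))
    done
    if [ -n "$bad" ]; then
        echo "CRITICAL: unhealthy or unreadable:$bad"
        return 2
    fi
    echo "OK: $count disks behind $dev report healthy"
    return 0
}

# Usage for the three-disk RAID 5 in this article (run as root):
#   check_megaraid_disks /dev/sdd 3
```

Matching on the output text rather than the exit status keeps the check simple, at the cost of treating a missing or unreadable disk the same as a failing one.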
For everything else, the smartctl help output describes the usage in detail:
# smartctl -h
Usage: smartctl [options] device

============================================ SHOW INFORMATION OPTIONS =====
  -h, --help, --usage
         Display this help and exit
  -V, --version, --copyright, --license
         Print license, copyright, and version information and exit
  -i, --info
         Show identity information for device
  -g NAME, --get=NAME
         Get device setting: all, aam, apm, lookahead, security, wcache
  -a, --all
         Show all SMART information for device
  -x, --xall
         Show all information for device
  --scan
         Scan for devices
  --scan-open
         Scan for devices and try to open each device

================================== SMARTCTL RUN-TIME BEHAVIOR OPTIONS =====
  -q TYPE, --quietmode=TYPE                                           (ATA)
         Set smartctl quiet mode to one of: errorsonly, silent, noserial
  -d TYPE, --device=TYPE
         Specify device type to one of: ata, scsi, sat[,auto][,N][+TYPE],
         usbcypress[,X], usbjmicron[,x][,N], usbsunplus, marvell, areca,N/E,
         3ware,N, hpt,L/M/N, megaraid,N, cciss,N, auto, test
  -T TYPE, --tolerance=TYPE                                           (ATA)
         Tolerance: normal, conservative, permissive, verypermissive
  -b TYPE, --badsum=TYPE                                              (ATA)
         Set action on bad checksum to one of: warn, exit, ignore
  -r TYPE, --report=TYPE
         Report transactions (see man page)
  -n MODE, --nocheck=MODE                                             (ATA)
         No check if: never, sleep, standby, idle (see man page)

============================== DEVICE FEATURE ENABLE/DISABLE COMMANDS =====
  -s VALUE, --smart=VALUE
         Enable/disable SMART on device (on/off)
  -o VALUE, --offlineauto=VALUE                                       (ATA)
         Enable/disable automatic offline testing on device (on/off)
  -S VALUE, --saveauto=VALUE                                          (ATA)
         Enable/disable Attribute autosave on device (on/off)
  -s NAME[,VALUE], --set=NAME[,VALUE]
         Enable/disable/change device setting: aam,[N|off], apm,[N|off],
         lookahead,[on|off], security-freeze, standby,[N|off|now],
         wcache,[on|off]

======================================= READ AND DISPLAY DATA OPTIONS =====
  -H, --health
         Show device SMART health status
  -c, --capabilities                                                  (ATA)
         Show device SMART capabilities
  -A, --attributes
         Show device SMART vendor-specific Attributes and values
  -f FORMAT, --format=FORMAT                                          (ATA)
         Set output format for attributes: old, brief, hex[,id|val]
  -l TYPE, --log=TYPE
         Show device log. TYPE: error, selftest, selective, directory[,g|s],
         xerror[,N][,error], xselftest[,N][,selftest], background,
         sasphy[,reset], sataphy[,reset], scttemp[sts,hist], scttempint,N[,p],
         scterc[,N,M], devstat[,N], ssd, gplog,N[,RANGE], smartlog,N[,RANGE]
  -v N,OPTION , --vendorattribute=N,OPTION                            (ATA)
         Set display OPTION for vendor Attribute N (see man page)
  -F TYPE, --firmwarebug=TYPE                                         (ATA)
         Use firmware bug workaround: none, samsung, samsung2, samsung3, swapid
  -P TYPE, --presets=TYPE                                             (ATA)
         Drive-specific presets: use, ignore, show, showall
  -B [+]FILE, --drivedb=[+]FILE                                       (ATA)
         Read and replace [add] drive database from FILE
         [default is /etc/smart_drivedb.h
         and then /usr/share/smartmontools/drivedb.h]

============================================ DEVICE SELF-TEST OPTIONS =====
  -t TEST, --test=TEST
         Run test. TEST: offline, short, long, conveyance, force, vendor,N,
         select,M-N, pending,N, afterselect,[on|off]
  -C, --captive
         Do test in captive mode (along with -t)
  -X, --abort
         Abort any non-captive test on device

=================================================== SMARTCTL EXAMPLES =====
  smartctl --all /dev/hda                    (Prints all SMART information)
  smartctl --smart=on --offlineauto=on --saveauto=on /dev/hda
                                              (Enables SMART on first disk)
  smartctl --test=long /dev/hda          (Executes extended disk self-test)
  smartctl --attributes --log=selftest --quietmode=errorsonly /dev/hda
                                      (Prints Self-Test & Attribute errors)
  smartctl --all --device=3ware,2 /dev/sda
  smartctl --all --device=3ware,2 /dev/twe0
  smartctl --all --device=3ware,2 /dev/twa0
  smartctl --all --device=3ware,2 /dev/twl0
          (Prints all SMART info for 3rd ATA disk on 3ware RAID controller)
  smartctl --all --device=hpt,1/1/3 /dev/sda
          (Prints all SMART info for the SATA disk attached to the 3rd PMPort
           of the 1st channel on the 1st HighPoint RAID controller)
  smartctl --all --device=areca,3/1 /dev/sg2
          (Prints all SMART info for 3rd ATA disk of the 1st enclosure
           on Areca RAID controller)
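The `--scan` option listed in the help can supply the device names and `-d` types for a script, so they do not have to be typed by hand. The sketch below assumes scan lines of the form `/dev/sda -d scsi # /dev/sda, SCSI device`; `scan_to_args` is a hypothetical helper name, and whether disks behind a RAID controller show up in the scan depends on the smartmontools version, so check the output on your own machine first.

```shell
#!/bin/sh
# scan_to_args: read `smartctl --scan` output on stdin and print one
# "DEVICE TYPE" pair per line, dropping the trailing "# ..." comment.
# Lines that do not follow the "DEVICE -d TYPE" layout are skipped.
scan_to_args() {
    while read -r dev flag type rest; do
        if [ "$flag" = "-d" ]; then
            echo "$dev $type"
        fi
    done
    return 0
}

# Usage (run as root): check the health of every device the scan reports.
#   smartctl --scan | scan_to_args | while read -r dev type; do
#       smartctl -H -d "$type" "$dev"
#   done
```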