The gs_check tool helps users check the cluster running status (cluster, HA, and CM status), cluster deployment inspection items (directory permissions, database version, environment variables, parameters, and so on), runtime inspection items (connection status, number of locks, number of cursors, number of connections, and so on), and database objects, to ensure that the database is in a normal, usable state.
Prerequisites
- The cluster has been pre-installed successfully.
- The cluster has been installed successfully.
- Cross-node gs_check inspection requires that mutual trust between all hosts is working. If you are unsure of the mutual trust status, refer to gs_sshexkey to check and create mutual trust (a quick manual check is sketched below).
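Before a cross-node run, mutual trust can also be spot-checked by hand. The following is a minimal sketch, assuming the three hosts named in the examples of this section (plat1, plat2, plat3) and a non-interactive SSH setup; it is not part of gs_check itself.

```bash
#!/bin/bash
# Minimal sketch: verify passwordless SSH (mutual trust) from the current node
# to every other cluster host before running a cross-node gs_check.
# Host names follow the examples in this section; adjust them to your cluster.
for host in plat1 plat2 plat3; do
    # BatchMode=yes makes ssh fail immediately instead of prompting for a password.
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
        echo "$host: mutual trust OK"
    else
        echo "$host: mutual trust NOT established (see gs_sshexkey)"
    fi
done
```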
Syntax
- Health check (run as the user who installed the cluster)
gs_check -i ITEM [...] [-U USER] [-L] [-X XMLFILE] [-l LOGFILE] [-o OUTPUTDIR] [--skip-root-items] [--format]
gs_check -e SCENE_NAME [-U USER] [-L] [-X XMLFILE] [-l LOGFILE] [-o OUTPUTDIR] [--skip-root-items] [--format] [--time-out=SECS]
- Display help information
gs_check { -? | --help }
- Display version information
gs_check { -V | --version }
Parameter Description
- -U
Name of the user who runs the cluster.
Value range: a user name of this cluster. If this parameter is not specified, the current cluster user is used by default.
- -i
Specifies the check item(s). The value of -i is case-sensitive. Multiple items can be specified at the same time, separated by commas (","). Format: -i item or -i item1,item2. For details about the cluster runtime check items, see Table 1 Cluster status checks. A combined invocation example is given after this parameter list.
- -e
Specifies the check scenario (group). The value of -e is case-sensitive, and only one scenario can be queried at a time. Format: -e SCENE_NAME. For details about the check scenarios, see Table 1 Cluster status checks.
- -X
Specifies the cluster configuration file.
Format: -X XMLFILE.
- -l
Specifies the log file.
If this parameter is not specified, logs are stored in $GPHOME/script/gspylib/inspection/output/log by default. No logs are collected when the check completes normally.
- -o
Specifies the output directory of the check report.
If this parameter is not specified, the default output directory is $GPHOME/script/gspylib/inspection/output.
- -L
Runs the check only on the local node.
- --time-out
Sets the timeout, in seconds.
- --format
Sets the output format. The default value is default, which is also the only supported value.
- --cid
Specifies the check ID; used internally by the inspection framework.
- -?, --help
Displays help information.
- --skip-root-items
Skips check items that require root privileges.
- -V, --version
Displays version information.
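The options above can be combined freely. The following is a minimal sketch of a combined invocation; the item names come from Table 1, while the paths and the 300-second timeout are purely illustrative.

```bash
# Run two items on the local node only, with a custom configuration file,
# report directory, log file, and timeout. Paths and the timeout value are
# illustrative; the flags themselves are documented above.
gs_check -i CheckCpuUsage,CheckMemUsage -L \
    -X /opt/software/gaussdb/clusterconfig.xml \
    -o /home/omm/gs_check_reports \
    -l /home/omm/gs_check_reports/gs_check.log \
    --time-out=300
```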
Table 1 Cluster status checks

| Check category | Check scenario (SCENE_NAME) | Check item | Description (OK: as expected; NG: not as expected) | Condition for returning NG |
|---|---|---|---|---|
| Running environment | os | CheckTimeZone | Checks time zone consistency. | Time zones are inconsistent. |
| | | CheckEncoding | Checks the encoding format. | Encoding formats are inconsistent. |
| | | CheckFirewall | Checks the firewall status. | The firewall is disabled. |
| | | CheckKernelVer | Checks the kernel version. | OS kernel versions are inconsistent. |
| | | CheckMaxHandle | Checks the maximum file handle setting. | The file handle configuration does not meet requirements. |
| | | CheckSysParams | Checks system parameters. | OS parameter settings differ from the recommended values. |
| | | CheckNTPD | Checks the NTPD service. | The NTPD service is not running. |
| | | CheckPing | Checks network connectivity. | The network between nodes is unreachable. |
| | | CheckDirPermissions | Checks directory permissions. | Directory permissions do not meet the standard. |
| | | CheckEnvProfile | Checks environment variables. | An environment variable is missing or incorrectly configured. |
| | | CheckProcessCount | Checks the number of system processes. | The configuration does not meet the definition in the parameter specification script. |
| | | CheckClusterVer | Checks the database version. | The database version does not match the patch matrix. |
| | | CheckCpuUsage | Checks CPU usage. | CPU usage is greater than or equal to 80%. |
| | | CheckDiskUsage | Checks system disk usage. | Usage of any database-related mounted disk exceeds 85%. |
| | | CheckHandleUsage | Checks file handle usage. | File handle usage is greater than or equal to 80%. |
| | | CheckMemUsage | Checks memory usage. | Memory usage is greater than or equal to 80%. |
| | | CheckProcessStatus | Checks database instance process status. | A zombie process exists for a database instance. |
| | | CheckSwapUsage | Checks swap usage. | N/A |
| | | CheckSyslog | Checks the syslog service status. | The syslog status checked with the "service syslog status" command is not running. |
| | | CheckTime | Checks time synchronization. | The time difference between nodes exceeds 2 seconds. |
| | | CheckNetPort | Checks network port status. | Any one of three abnormal network port conditions exists. |
| | | CheckMemConsistency | Checks memory consistency. | The memory size difference between nodes is 512 MB or more. |
| | | CheckNodes | Checks node status. | A node is in an abnormal state. |
| | | CheckPhyMem | Checks physical memory. | Any one of three abnormal physical memory conditions exists. |
| Cluster status | cluster | CheckClusterState | Checks the cluster status. | The cluster status is abnormal or a process does not exist. |
| | | CheckDnVersion | Checks the database version. | Primary and standby database versions are inconsistent. |
| | | CheckClusterBalance | Checks primary/standby balance. | The cluster primary/standby distribution is unbalanced. |
| Instance status | instStatus | CheckConnCount | Checks the number of instance connections. | Returns the actual number of connections and the maximum number of connections. |
| Database status | dbStatus | CheckTableCount | Checks the number of database tables. | Returns the number of database tables. |
| | | CheckUncommXacts | Checks pending transactions. | Returns the number of pending transactions. |
| | | CheckBufferUsage | Checks the buffer pool hit ratio. | Returns the buffer usage. |
| | | CheckBackup | Checks database backups. | No backup has been made in the past week (not checked if the installation is less than one week old). |
| | | CheckDBConn | Checks database connectivity (local/remote). | All CNs and DNs cannot be connected locally or remotely. |
| | | CheckDBStatus | Checks the database status. | A CN or DN is not in the OPEN state, the open status of a primary DN or CN is not READ WRITE, or the open status of a standby DN is not READ ONLY. |
| | | CheckDBUser | Checks database user status. | A database user is in an abnormal state, or its password expires in less than 7 days. |
| Database objects | dbInst | CheckTablespaceUsage | Checks tablespace usage. | Returns the tablespace usage. |
| | | CheckLockNum | Checks the number of database locks. | Returns the number of locks. |
| | | CheckInstMemUsage | Checks instance memory usage. | N/A |
| | | CheckLogLogicalLimit | Checks the database log logical limit. | N/A |
| | | CheckArchiveLogSpace | Checks the archive log space setting. | N/A |
| RAID controller check | raid | CheckRaidHlth | Checks RAID controller health. | Hlth in the RAID controller System Overview information is not Opt, or State in the RAID group Virtual Drives information is not Optl. |
| | | CheckCacheHlthDetail | Checks detailed supercapacitor health information. | The Firmware_Status information of a RAID controller does not satisfy all of: NVCache State is ok, Replacement required is no, No space to cache offload is no, and Module microcode update required is no. |
| | | CheckCacheDischarge | Checks supercapacitor charge/discharge information. | The last charge/discharge cycle was abnormal. |
| | | CheckCacheHlth | Checks RAID supercapacitor health. | State in the capacitor's Cachevault_Info information is not Optimal. |
| | | CheckDiskFault | Checks for failed hard disks. | SMART Health Status: FAILURE is returned. |
| | | CheckNetwork | Checks network port performance. | A GE port running in full-duplex mode works at less than 1000 Mbit/s. |
| | | CheckCPU | Checks CPU status. | A server has an abnormal CPU status (the normal CPU status code is 0x8080). |
| | | CheckFan | Checks fan module status. | A node has fewer than 4 fans, or a fan is in an abnormal state (the normal fan status code is 0x0080). |
| | | CheckPower | Checks power module status. | A node does not have exactly two power modules, or a power module status code is not 0x0180. |
| | | CheckSDR | Checks the intelligent (IPMI) interface. | The ipmitool sdr command fails. |
| | | CheckSESVersion | Checks the SES version. | - |
| | | CheckTemperature | Checks inlet and outlet temperatures. | - |
| | | CheckDiskConsistency | Checks disk consistency. | The type and size of disks other than SSDs are inconsistent. |
| | | CheckDiskPlace | Checks the system disks. | The dual system disks are abnormal, or a disk is in an abnormal state (physical presence status code 0x0180; service presence: healthStatus "Optl", runningStatus "Onln"). |
| | | CheckDiskStatus | Checks disk status. | A disk is in an abnormal state (in the normal state, healthStatus is Normal and runningStatus is Online). |
| | | CheckFWDVersion | Checks firmware (NIC, RAID controller, BMC) versions. | - |
| | | CheckFWVersion | Checks the firmware version. | - |
| | | CheckSASVersion | Checks SAS version consistency. | SAS firmware versions of the servers are inconsistent. |
| | | CheckSSD | Checks SSDs. | When a node has SSDs, an SSD is abnormal (in the normal state, healthStatus is Normal and runningStatus is Online) or is not inserted in slot 0. |
| | | CheckECC | Checks ECC errors. | An ECC error has occurred, or more than 25,000 corrected memory errors are recorded. |
| | | CheckAPR | Analyzes network ARP. | A node has abnormal ARP analysis results. |
| | | CheckOptMod | Checks optical modules. | No optical module is available, or an available optical module is in an abnormal state. |
| | | CheckPCIe | Checks PCIe. | The actual PCIe negotiated rate or NIC bit width differs from the standard PCIe negotiated rate or NIC bit width. |
| | | CheckNegotiationRate | Checks the negotiated rate. | A port's negotiated rate is lower than its physical bandwidth. |
| - | - | CheckRaidEnv | Checks whether the environment meets the RAID check conditions. | The RAID controller check preconditions are not met. |
- NG: not as expected; OK: as expected.
- Node-level checks (running environment and RAID controller checks) return results per node; cluster-level checks (cluster status) return results per cluster; instance-level checks (instance status, database status, and database objects) return results per instance object.
- RAID controller check: before running the items in the raid group, gs_check first verifies the following preconditions. If all of them are met, the raid group checks continue; if any of them is not met, the check ends and the result is returned (a manual precondition check is sketched after this list).
    - The server model is TaiShan 2280 V2. If not, the check ends with result OK.
    - The CPU is ARM-based. If not, the check ends with result OK.
    - The operating system is EulerOS 2.0 SP8 or later. If not, the check ends with result OK.
    - The RAID controller model is SAS3508. If not, the check ends with result NG.
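For reference, these preconditions can also be verified by hand with standard Linux tools. The sketch below is illustrative only and is not the logic gs_check runs internally; dmidecode requires root privileges, and the lspci pattern is an assumption about how the controller is reported.

```bash
#!/bin/bash
# Illustrative manual check of the RAID preconditions listed above
# (not the gs_check implementation; dmidecode needs root privileges).

# 1. Server model (expected: TaiShan 2280 V2)
dmidecode -s system-product-name

# 2. CPU architecture (expected: aarch64, i.e. ARM)
uname -m

# 3. Operating system (expected: EulerOS 2.0 SP8 or later)
grep -E '^(NAME|VERSION)=' /etc/os-release

# 4. RAID controller model (expected: SAS3508)
lspci | grep -iE 'raid|sas3508'
```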
Examples
- Run a check on the specified items with -i (a manual equivalent of the encoding probe is sketched after this example):
omm@plat1:/opt/software/gaussdb/script> gs_check -i CheckEncoding -X /opt/software/gaussdb/clusterconfig.xml
Parsing the check items config file successfully
Distribute the context file to remote hosts successfully
Start to health check for the cluster. Total Items:1 Nodes:3
plat1 [=========================] 1/1
plat2 [=========================] 1/1
plat3 [=========================] 1/1
Start to analysis the check result
CheckEncoding...............................OK
{"description": "The encoding of each node in the cluster is consistent.", "item": "CheckEncoding", "error_msg": [], "type": 2, "result": "OK", "time": "2019-06-04 14:09:27", "data": {"plat1": {"encoding": "LANG=en_US.UTF-8"}, "plat2": {"encoding": "LANG=en_US.UTF-8"}, "plat3": {"encoding": "LANG=en_US.UTF-8"}}}
==============================================
Success. All check items run completed. Total:1 Success:1
For more infomation please refer to /home/dbdata/zenith/om/script/inspection/output/CheckReport_2018020973511.tar.gz
omm@plat1:/opt/software/gaussdb/script> gs_check -i CheckEncoding -X /opt/software/gaussdb/clusterconfig.xml -L
start check ping
successful check ping.
[HOST] plat1
[NAM] CheckEncoding
[RST] OK
[VAL]
{"encoding": "LANG=en_US.UTF-8"}
[DESC]
The encoding of each node in the cluster is consistent.
[TIME]
2019-06-04 14:11:05
[RAW]
bash -c "unset LANG;unset SSH_SENDS_LOCALE;source /etc/profile.d/lang.sh > /dev/null 2>&1;source /etc/profile > /dev/null 2>&1;source ~/.bashrc; > /dev/null 2>&1;source ~/.profile > /dev/null 2>&1;source ~/.bash_profile > /dev/null 2>&1;locale | grep '^LANG='"
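The [RAW] line above shows the probe gs_check runs for this item. The same comparison can be reproduced by hand; the sketch below only assumes the host names used in the examples of this section.

```bash
#!/bin/bash
# Reproduce the CheckEncoding probe manually across the nodes, using the same
# "locale | grep '^LANG='" command shown in the [RAW] output above.
for host in plat1 plat2 plat3; do
    printf '%s: ' "$host"
    ssh "$host" "locale | grep '^LANG='"
done
```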
- Run a check group with -e (a quick way to skim the local-mode output is sketched after this example):
omm@plat1:/opt/software/gaussdb/script> gs_check -e dbStatus -X /opt/software/gaussdb/clusterconfig.xml
Parsing the check items config file successfully
Distribute the context file to remote hosts successfully
Start to health check for the cluster. Total Items:3 Nodes:3
plat1 [=========================] 3/3
plat2 [=========================] 3/3
plat3 [=========================] 3/3
Start to analysis the check result
CheckTableCount.............................OK
{"description": "The number of database tables in the cluster does not exceed 1000.", "item": "CheckTableCount", "result": "OK", "time": "2019-07-15 19:54:43", "data": {"cn_401": {"GAUSS": "0"}}, "error_msg": []}
==============================================
CheckUncommXacts............................OK
{"description": "There are no pending transactions for CN instances in the cluster.", "item": "CheckUncommXacts", "result": "OK", "time": "2019-07-15 19:54:43", "data": {"cn_401": {"uncommitActs": }}, "error_msg": []}
==============================================
CheckBufferUsage............................OK
{"description": "The database buffer pool hit rate in the cluster is not less than 80%.", "item": "CheckBufferUsage", "result": "OK", "time": "2019-07-15 19:54:43", "data": {"DB2_3":
{"USAGE": "96.69%", "BUFFER_GETS": "72624", "DISK_READS": "2402"}, "DB1_1": {"USAGE": "96.70%", "BUFFER_GETS": "72721", "DISK_READS": "2401"}}, "error_msg": []}
==============================================
CheckBackup.................................OK
{"description": "Check if the backup service process starts.", "item": "CheckBackup", "result": "OK", "time": "2019-07-15 19:54:43", "data": {"DB2_3": "", "DB2_4": "", "DB1_2": "", "DB1_1": "", "cn_401": ""}, "error_msg": []}
==============================================
CheckDBConn.................................OK
{"description": "Check the database connection status.", "item": "CheckDBConn", "result": "OK", "time": "2019-07-15 19:54:43", "data": {"plat1": {"DB2_3": "connectable", "DB2_4": "connectable", "cn_401": "connectable", "DB1_1": "connectable", "DB1_2": "connectable"}, "plat1": {"DB2_3": "connectable", "DB2_4": "connectable",
"cn_401": "connectable", "DB1_1": "connectable", "DB1_2": "connectable"}, "plat3": {"DB2_3": "connectable", "DB2_4": "connectable", "cn_401": "connectable", "DB1_1": "connectable", "DB1_2": "connectable"}}, "error_msg": []}
==============================================
CheckDBStatus...............................OK
{"description": "The database is OPEN and master DN and CN open_status are READ WRITE, and the standby DN status is READ ONLY.", "item": "CheckDBStatus", "result": "OK", "time": "2019-07-15 19:54:43",
"data": {"DB2_3": {"GAUSS": {"status": "OPEN", "open_status": "READ WRITE"}}, "DB2_4": {"GAUSS": {"status": "OPEN", "open_status": "READ ONLY"}}, "DB1_2": {"GAUSS": {"status": "OPEN", "open_status": "READ ONLY"}},
"DB1_1": {"GAUSS": {"status": "OPEN", "open_status": "READ WRITE"}}, "cn_401": {"GAUSS": {"status": "OPEN", "open_status": "READ WRITE"}}}, "error_msg": []}
==============================================
CheckDBUser.................................OK
{"description": "Check database user status.", "item": "CheckDBUser", "result": "OK", "time": "2019-07-15 19:54:43", "data": {"DB2_3": {"SYS": {"account_status": "OPEN", "cryptoperiod": "+0000179"},
"PERFADM": {"account_status": "OPEN", "cryptoperiod": "+0000179"}, "PUBLIC": {"account_status": "OPEN", "cryptoperiod": "+0000179"}}, "DB2_4": {"SYS": {"account_status": "OPEN", "cryptoperiod": "+0000179"},
"PERFADM": {"account_status": "OPEN", "cryptoperiod": "+0000179"}, "PUBLIC": {"account_status": "OPEN", "cryptoperiod": "+0000179"}}, "DB1_2": {"SYS": {"account_status": "OPEN", "cryptoperiod": "+0000179"},
"PERFADM": {"account_status": "OPEN", "cryptoperiod": "+0000179"}, "PUBLIC": {"account_status": "OPEN", "cryptoperiod": "+0000179"}}, "DB1_1": {"SYS": {"account_status": "OPEN", "cryptoperiod": "+0000179"},
"PERFADM": {"account_status": "OPEN", "cryptoperiod": "+0000179"}, "PUBLIC": {"account_status": "OPEN", "cryptoperiod": "+0000179"}}, "cn_401": {"SYS": {"account_status": "OPEN", "cryptoperiod": "+0000179"},
"PERFADM": {"account_status": "OPEN", "cryptoperiod": "+0000179"}, "PUBLIC": {"account_status": "OPEN", "cryptoperiod": "+0000179"}}}, "error_msg": []}
==============================================
Analysis the check result successfully
Success. All check items run completed. Total:7 Success:7
For more information please refer to /home/dbdata/zenith/om/script/gspylib/inspection/output/CheckReport_dbStatus_201907157167126183.tar.gz
omm@plat1:/opt/software/gaussdb/script> gs_check -e dbStatus -X /opt/software/gaussdb/clusterconfig.xml -L
[HOST] plat1
[NAM] CheckTableCount
[RST] NONE
[VAL]
[DESC]
The number of database tables in the cluster does not exceed 1000.
[TIME]
2019-07-15 20:07:24
[RAW]
[HOST] plat1
[NAM] CheckUncommXacts
[RST] NONE
[VAL]
[DESC]
There are no pending transactions for CN instances in the cluster.
[TIME]
2019-07-15 20:07:24
[RAW]
[HOST] plat1
[NAM] CheckBufferUsage
[RST] OK
[VAL]
{"DB2_3": {"USAGE": "97.21%", "BUFFER_GETS": "86370", "DISK_READS": "2410"}}
[DESC]
The database buffer pool hit rate in the cluster is not less than 80%.
[TIME]
2019-07-15 20:07:24
[RAW]
SELECT SUM(DISK_READS), SUM(BUFFER_GETS) FROM DV_SESSIONS;
[HOST] plat1
[NAM] CheckBackup
[RST] OK
[VAL]
{"DB2_3": "", "DB1_2": ""}
[DESC]
Check if the backup service process starts.
[TIME]
2019-07-15 20:07:24
[RAW]
SELECT START_TIME FROM SYS_BACKUP_SETS;
SELECT CREATED FROM ADM_USERS WHERE USERNAME = 'SYS';
[HOST] plat1
[NAM] CheckDBConn
[RST] OK
[VAL]
{"plat1": {"DB2_3": "connectable", "DB2_4": "connectable", "cn_401": "connectable", "DB1_1": "connectable", "DB1_2": "connectable"}}
[DESC]
Check the database connection status.
[TIME]
2019-07-15 20:07:24
[RAW]
SELECT * FROM DV_VERSION;
[HOST] plat1
[NAM] CheckDBStatus
[RST] OK
[VAL]
{"DB2_3": {"GAUSS": {"status": "OPEN", "open_status": "READ WRITE"}}, "DB1_2": {"GAUSS": {"status": "OPEN", "open_status": "READ ONLY"}}}
[DESC]
The database is OPEN and master DN and CN open_status are READ WRITE, and the standby DN status is READ ONLY.
[TIME]
2019-07-15 20:07:24
[RAW]
SELECT NAME,STATUS,OPEN_STATUS FROM DV_DATABASE;
[HOST] plat1
[NAM] CheckDBUser
[RST] OK
[VAL]
{"DB2_3": {"SYS": {"account_status": "OPEN", "cryptoperiod": "+0000179"}, "PERFADM": {"account_status": "OPEN", "cryptoperiod": "+0000179"}, "PUBLIC": {"account_status": "OPEN", "cryptoperiod": "+0000179"}},
"DB1_2": {"SYS": {"account_status": "OPEN", "cryptoperiod": "+0000179"}, "PERFADM": {"account_status": "OPEN", "cryptoperiod": "+0000179"}, "PUBLIC": {"account_status": "OPEN", "cryptoperiod": "+0000179"}}}
[DESC]
Check database user status.
[TIME]
2019-07-15 20:07:24
[RAW]
SELECT DB_USERS.USERNAME,ADM_USERS.ACCOUNT_STATUS,DB_USERS.CRYPTOPERIOD FROM DB_USERS,ADM_USERS WHERE ADM_USERS.USERNAME = DB_USERS.USERNAME;
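The tagged local-mode (-L) output above is plain text and can be skimmed with standard tools. The sketch below assumes the output has been redirected to a hypothetical file named check_local.txt.

```bash
# Summarize a saved local-mode (-L) run: show the host, item name, and result tags.
# check_local.txt is a hypothetical file containing output like the listing above.
grep -E '^\[(HOST|NAM|RST)\]' check_local.txt

# List only the items that did not pass, together with the preceding item name.
grep -B1 '^\[RST\] NG' check_local.txt
```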
- Run the cluster status check with -e (unpacking the generated report is sketched after this example):
omm@plat1:/opt/software/gaussdb/script> gs_check -e cluster -X /opt/software/gaussdb/clusterconfig.xml
Parsing the check items config file successfully
Distribute the context file to remote hosts successfully
Start to health check for the cluster. Total Items:3 Nodes:3
plat1 [=========================] 3/3
plat2 [=========================] 3/3
plat3 [=========================] 3/3
Start to analysis the check result
CheckClusterState...........................OK
{"description": "The cluster status is normal.", "item": "CheckClusterState", "error_msg": [], "type": 1, "result": "OK", "time": "2019-06-04 14:20:10", "data": {"clusterStatus": "OK"}}
==============================================
CheckDnVersion..............................OK
{"description": "The database is in the same version.", "item": "CheckDnVersion", "error_msg": [], "type": 1, "result": "OK", "time": "2019-06-04 14:20:10", "data": {"DB1_1": "3e2b1c1", "DB1_2": "3e2b1c1", "DB2_1":
"3e2b1c1", "DB2_2": "3e2b1c1"}}
==============================================
CheckClusterBalance.........................OK
{"description": "Cluster-based backup state.", "item": "CheckClusterBalance", "error_msg": [], "type": 1, "result": "OK", "time": "2019-06-04 14:20:10", "data": {"balanced": "OK"}}
==============================================
Analysis the check result successfully
Success. All check items run completed. Total:3 Success:3
For more infomation please refer to /home/dbdata/zenith/om/script/inspection/output/CheckReport_cluster_2018020973893.tar.gz
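Each run packs its detailed report into the .tar.gz file whose path is printed on the last line (under the default output directory, or the directory given with -o). A minimal sketch of unpacking it for inspection, using the archive path from the run above and a hypothetical scratch directory:

```bash
# Extract the packed check report produced by the run above and browse it.
# /tmp/gs_check_report is an arbitrary scratch directory.
REPORT=/home/dbdata/zenith/om/script/inspection/output/CheckReport_cluster_2018020973893.tar.gz
mkdir -p /tmp/gs_check_report
tar -xzf "$REPORT" -C /tmp/gs_check_report
ls -R /tmp/gs_check_report
```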
- Run a check as the cluster installation user specified with -U (if -U specifies a user who is not a user of this cluster, the check fails; if -U is omitted, the current cluster user is used by default):
omm@plat1:/opt/software/gaussdb/script> gs_check -i CheckEncoding -X /opt/software/gaussdb/clusterconfig.xml -U omm
Parsing the check items config file successfully
Distribute the context file to remote hosts successfully
Start to health check for the cluster. Total Items:1 Nodes:3
plat1 [=========================] 1/1
plat2 [=========================] 1/1
plat3 [=========================] 1/1
Start to analysis the check result
CheckEncoding...............................OK
{"description": "The encoding of each node in the cluster is consistent.", "item": "CheckEncoding", "error_msg": [], "type": 2, "result": "OK", "time": "2019-06-04 14:22:24", "data": {"plat1": {"encoding": "LANG=en_US.UTF-8"},
"plat2": {"encoding": "LANG=en_US.UTF-8"}, "plat3": {"encoding": "LANG=en_US.UTF-8"}}}
==============================================
Analysis the check result successfully
Success. All check items run completed. Total:1 Success:1
For more infomation please refer to /home/dbdata/zenith/om/script/inspection/output/CheckReport_2018020974137.tar.gz
- Use --skip-root-items to skip the root check items among the specified items (assuming that CheckFirewall is a root check item):
omm@plat1:/opt/software/gaussdb/script> gs_check -i CheckProcessCount,CheckTimeZone,CheckFirewall -X /opt/software/gaussdb/clusterconfig.xml --skip-root-items
Parsing the check items config file successfully
Distribute the context file to remote hosts successfully
Start to health check for the cluster. Total Items:2 Nodes:3
plat1 [=========================] 2/2
plat2 [=========================] 2/2
plat3 [=========================] 2/2
Start to analysis the check result
CheckProcessCount...........................OK
{"description": "The current number of processes and the maximum number of processes on each node in the cluster.", "item": "CheckProcessCount", "time": "2019-06-04 14:25:53", "data": {"plat1": {"maxuproc": "62712", "pscount": "27"}, "plat2":
{"maxuproc": "62712", "pscount": "14"}, "plat3": {"maxuproc": "62712", "pscount": "24"}}, "error_msg": [], "result": "OK"}
==============================================
CheckTimeZone...............................OK
{"description": "The time zones of each node in the cluster are consistent.", "item": "CheckTimeZone", "time": "2019-06-04 14:25:53", "data": {"plat1": {"time_zone": "+0800"}, "plat2": {"time_zone": "+0800"}, "plat3":
{"time_zone": "+0800"}}, "result": "OK", "error_msg": []}
==============================================
Analysis the check result successfully
Success. All check items run completed. Total:2 Success:2
For more infomation please refer to /home/dbdata/zenith/om/script/inspection/output/CheckReport_201809274251549598.tar.gz
- Hard disk failure check
--Query the RAID information list:
omm@plat1:/opt/software/gaussdb/script> gs_check -i CheckDiskFault -X /opt/software/gaussdb/clusterconfig.xml
CheckDiskFault...............................NG
{"description": "Check the hard disk health.", "item": "CheckDiskFault", "result": "NG", "time": "2019-07-22 21:56:50", "data": {"plat1": {"exception": "NG raid environment does not match"}}, "plat2": {"v0": {"3":
{"SMART Health Status":"OK"},"2":{"SMART Health Status":"FAILURE","Warning":"the /dev/sda have been damaged,please repair it"},"os_driver_name" : "/dev/sda","raid_did" :["2","3"]}}
--Replace the disk reported as FAILURE (a confirmation sketch follows).
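Before replacing the disk, the reported SMART status can be confirmed directly on the affected node. The sketch below assumes the smartmontools package is installed and usually requires root; /dev/sda is the device named in the output above.

```bash
# Confirm the SMART health status of the disk flagged above (requires root;
# assumes smartmontools is installed).
smartctl -H /dev/sda

# Optionally dump the full SMART report, including the attribute table.
smartctl -a /dev/sda
```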