A customer reported that their Ceph cluster was in an abnormal state, so I connected to the cluster to check.
root@ceph03:~# ceph -s
cluster:
id: 8b60db4b-5df4-45e8-a6c1-03395e17eda3
health: HEALTH_ERR
6 scrub errors
Possible data damage: 1 pg inconsistent
169 pgs not deep-scrubbed in time
169 pgs not scrubbed in time
services:
mon: 4 daemons, quorum ceph02,ceph03,ceph04,ceph05 (age 12d)
mgr: ceph02(active, since 12d), standbys: ceph03, ceph04
osd: 40 osds: 39 up (since 5h), 39 in (since 5h); 66 remapped pgs
data:
pools: 2 pools, 2048 pgs
objects: 18.48M objects, 70 TiB
usage: 239 TiB used, 220 TiB / 459 TiB avail
pgs: 1220585/55429581 objects misplaced (2.202%)
1976 active+clean
60 active+remapped+backfill_wait
6 active+remapped+backfilling
5 active+clean+scrubbing+deep
1 active+clean+inconsistent
io:
client: 4.0 KiB/s rd, 1.5 MiB/s wr, 14 op/s rd, 103 op/s wr
recovery: 80 MiB/s, 20 objects/s
root@ceph03:~#
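The status output points to three problems:
There are 6 scrub errors, which indicate possible data inconsistency.
1 PG is flagged inconsistent, meaning its replicas disagree and data may be damaged or lost.
169 PGs have not completed their deep scrub or regular scrub on schedule, so latent errors in those PGs may be going undetected.
Run ceph health detail for a closer look: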
root@ceph03:~# ceph health detail
HEALTH_ERR 6 scrub errors; Possible data damage: 1 pg inconsistent; 169 pgs not deep-scrubbed in time; 169 pgs not scrubbed in time
[ERR] OSD_SCRUB_ERRORS: 6 scrub errors
[ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
pg 2.19b is active+clean+inconsistent, acting [55,47,54]
[WRN] PG_NOT_DEEP_SCRUBBED: 169 pgs not deep-scrubbed in time
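To script the follow-up checks, the damaged PG and its acting set can be pulled out of this output. The following is a minimal sketch, not part of the original investigation; the function name inconsistent_pgs is made up for illustration, and it assumes the line layout shown above ("pg <id> is ...inconsistent, acting [...]").

```shell
# Print "<pgid> <acting-set>" for every inconsistent PG found in
# `ceph health detail` output read from stdin.
# inconsistent_pgs is a hypothetical helper name, not a Ceph command.
inconsistent_pgs() {
    awk '$1 == "pg" && /inconsistent/ && /acting/ {
        acting = $NF                 # last field, e.g. [55,47,54]
        gsub(/\[/, "", acting)       # strip the opening bracket
        gsub(/\]/, "", acting)       # strip the closing bracket
        print $2, acting
    }'
}

# Usage on a live cluster:
#   ceph health detail | inconsistent_pgs
```

Reading from stdin keeps the helper usable on saved output as well as on a live cluster.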
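Approach:
Repair the inconsistent PG:
Run ceph pg 2.19b query to view the PG's details and help identify the cause of the inconsistency.
Try ceph pg repair 2.19b to repair the inconsistent PG. Note that a repair can increase I/O load, so it is best run during off-peak hours.
Make sure every OSD is running normally, and check whether any OSD is down or unresponsive.
Check whether the disks behind osd.55, osd.47 and osd.54 (the acting set of pg 2.19b) show any faults.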
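The checks above can be sketched as one helper. This is an illustration under assumptions, not the exact commands used in the original investigation: inspect_pg is a made-up name, rados list-inconsistent-obj needs recent scrub results for the PG to return anything, and the "devices" key in ceph osd metadata may be absent on older releases.

```shell
# Show which objects in a PG failed scrub, then where each OSD of its
# acting set lives. inspect_pg is a hypothetical helper name.
# Arguments: the PG id, followed by the OSD ids of the acting set.
inspect_pg() {
    pgid=$1
    shift
    # Per-object scrub findings (read errors, digest mismatches, ...).
    rados list-inconsistent-obj "$pgid" --format=json-pretty
    for osd in "$@"; do
        # Host and CRUSH location of the OSD.
        ceph osd find "$osd"
        # Hostname and backing device as reported by the OSD daemon.
        ceph osd metadata "$osd" | grep -E '"(hostname|devices)"'
    done
}

# Usage:
#   inspect_pg 2.19b 55 47 54
```

Knowing the host and device for each OSD tells you which node's kernel log to read next.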
The check found that osd.55 on node ceph03 has a serious fault:
root@ceph03:~# dmesg |grep error
[ 7.740481] EXT4-fs (dm-9): re-mounted. Opts: errors=remount-ro. Quota mode: none.
[831126.792545] sd 0:0:4:0: [sde] tag#1534 Add. Sense: Unrecovered read error
[831126.792565] blk_update_request: critical medium error, dev sde, sector 12613179976 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[831130.200831] sd 0:0:4:0: [sde] tag#1529 Add. Sense: Unrecovered read error
[831130.200846] blk_update_request: critical medium error, dev sde, sector 12613182688 op 0x0:(READ) flags 0x0 phys_seg 14 prio class 0
[831134.814213] sd 0:0:4:0: [sde] tag#1483 Add. Sense: Unrecovered read error
[831134.814225] blk_update_request: critical medium error, dev sde, sector 12613189064 op 0x0:(READ) flags 0x0 phys_seg 33 prio class 0
[831178.101373] sd 0:0:4:0: [sde] tag#1498 Add. Sense: Unrecovered read error
[831178.101394] blk_update_request: critical medium error, dev sde, sector 10611476496 op 0x0:(READ) flags 0x0 phys_seg 24 prio class 0
[831351.913095] sd 0:0:4:0: [sde] tag#1570 Add. Sense: Unrecovered read error
[831351.913107] blk_update_request: critical medium error, dev sde, sector 15419569648 op 0x0:(READ) flags 0x0 phys_seg 25 prio class 0
[831399.002169] sd 0:0:4:0: [sde] tag#1758 Add. Sense: Unrecovered read error
[831399.002186] blk_update_request: critical medium error, dev sde, sector 10616047688 op 0x0:(READ) flags 0x0 phys_seg 97 prio class 0
[831407.091442] sd 0:0:4:0: [sde] tag#1818 Add. Sense: Unrecovered read error
[831407.091461] blk_update_request: critical medium error, dev sde, sector 10616091160 op 0x0:(READ) flags 0x0 phys_seg 7 prio class 0
[831521.899028] sd 0:0:4:0: [sde] tag#1879 Add. Sense: Unrecovered read error
[831521.899044] blk_update_request: critical medium error, dev sde, sector 12620843200 op 0x0:(READ) flags 0x0 phys_seg 2 prio class 0
[831530.016826] sd 0:0:4:0: [sde] tag#1815 Add. Sense: Unrecovered read error
[831530.016838] blk_update_request: critical medium error, dev sde, sector 12620884800 op 0x0:(READ) flags 0x0 phys_seg 34 prio class 0
[831594.377511] sd 0:0:4:0: [sde] tag#1521 Add. Sense: Unrecovered read error
[831594.377531] blk_update_request: critical medium error, dev sde, sector 10619805632 op 0x0:(READ) flags 0x0 phys_seg 50 prio class 0
[831599.211869] sd 0:0:4:0: [sde] tag#1857 Add. Sense: Unrecovered read error
[831599.211874] blk_update_request: critical medium error, dev sde, sector 10619820608 op 0x0:(READ) flags 0x0 phys_seg 2 prio class 0
[831607.107088] sd 0:0:4:0: [sde] tag#1884 Add. Sense: Unrecovered read error
[831607.107097] blk_update_request: critical medium error, dev sde, sector 10619867680 op 0x0:(READ) flags 0x0 phys_seg 6 prio class 0
[831610.973597] sd 0:0:4:0: [sde] tag#1892 Add. Sense: Unrecovered read error
[831610.973611] blk_update_request: critical medium error, dev sde, sector 10619871136 op 0x0:(READ) flags 0x0 phys_seg 38 prio class 0
[831670.636650] sd 0:0:4:0: [sde] tag#1895 Add. Sense: Unrecovered read error
[831670.636667] blk_update_request: critical medium error, dev sde, sector 12623684528 op 0x0:(READ) flags 0x0 phys_seg 52 prio class 0
[836560.516472] sd 0:0:4:0: [sde] tag#1949 Add. Sense: Unrecovered read error
[836560.516477] blk_update_request: critical medium error, dev sde, sector 21702402384 op 0x0:(READ) flags 0x0 phys_seg 8 prio class 0
[836911.696770] sd 0:0:4:0: [sde] tag#2359 Add. Sense: Unrecovered read error
[836911.696787] blk_update_request: critical medium error, dev sde, sector 21711922008 op 0x0:(READ) flags 0x0 phys_seg 29 prio class 0
[836949.301804] sd 0:0:4:0: [sde] tag#2325 Add. Sense: Unrecovered read error
[836949.301821] blk_update_request: critical medium error, dev sde, sector 21712645720 op 0x0:(READ) flags 0x0 phys_seg 45 prio class 0
[836953.466236] sd 0:0:4:0: [sde] tag#2288 Add. Sense: Unrecovered read error
[836953.466242] blk_update_request: critical medium error, dev sde, sector 21712652592 op 0x0:(READ) flags 0x0 phys_seg 17 prio class 0
[836958.247583] sd 0:0:4:0: [sde] tag#2312 Add. Sense: Unrecovered read error
[836958.247600] blk_update_request: critical medium error, dev sde, sector 21712668824 op 0x0:(READ) flags 0x0 phys_seg 21 prio class 0
[836965.522676] sd 0:0:4:0: [sde] tag#2353 Add. Sense: Unrecovered read error
[836965.522681] blk_update_request: critical medium error, dev sde, sector 21712726416 op 0x0:(READ) flags 0x0 phys_seg 22 prio class 0
[836968.794844] sd 0:0:4:0: [sde] tag#2334 Add. Sense: Unrecovered read error
[836968.794863] blk_update_request: critical medium error, dev sde, sector 21712726144 op 0x0:(READ) flags 0x0 phys_seg 24 prio class 0
[837135.238193] sd 0:0:4:0: [sde] tag#2374 Add. Sense: Unrecovered read error
[837135.238211] blk_update_request: critical medium error, dev sde, sector 21717129944 op 0x0:(READ) flags 0x0 phys_seg 13 prio class 0
[837139.553614] sd 0:0:4:0: [sde] tag#2369 Add. Sense: Unrecovered read error
[837139.553630] blk_update_request: critical medium error, dev sde, sector 21717138816 op 0x0:(READ) flags 0x0 phys_seg 8 prio class 0
[837143.809629] sd 0:0:4:0: [sde] tag#2422 Add. Sense: Unrecovered read error
[837143.809636] blk_update_request: critical medium error, dev sde, sector 21717152808 op 0x0:(READ) flags 0x0 phys_seg 3 prio class 0
[837378.533201] sd 0:0:4:0: [sde] tag#2323 Add. Sense: Unrecovered read error
[837378.533219] blk_update_request: critical medium error, dev sde, sector 21722984968 op 0x0:(READ) flags 0x0 phys_seg 7 prio class 0
[837385.343446] sd 0:0:4:0: [sde] tag#2326 Add. Sense: Unrecovered read error
[837385.343451] blk_update_request: critical medium error, dev sde, sector 21723035760 op 0x0:(READ) flags 0x0 phys_seg 10 prio class 0
[837486.727594] sd 0:0:4:0: [sde] tag#2375 Add. Sense: Unrecovered read error
[837486.727613] blk_update_request: critical medium error, dev sde, sector 21725617184 op 0x0:(READ) flags 0x0 phys_seg 20 prio class 0
[995605.782476] sd 0:0:4:0: [sde] tag#3292 Add. Sense: Unrecovered read error
[995605.782495] blk_update_request: critical medium error, dev sde, sector 8347884512 op 0x0:(READ) flags 0x0 phys_seg 15 prio class 0
[995787.012868] sd 0:0:4:0: [sde] tag#3300 Add. Sense: Unrecovered read error
[995787.012876] blk_update_request: critical medium error, dev sde, sector 8359010136 op 0x0:(READ) flags 0x0 phys_seg 32 prio class 0
[996584.983876] sd 0:0:4:0: [sde] tag#3074 Add. Sense: Unrecovered read error
[996584.983881] blk_update_request: critical medium error, dev sde, sector 8400422928 op 0x0:(READ) flags 0x0 phys_seg 25 prio class 0
[996611.976025] sd 0:0:4:0: [sde] tag#3114 Add. Sense: Unrecovered read error
[996611.976050] blk_update_request: critical medium error, dev sde, sector 8401488288 op 0x0:(READ) flags 0x0 phys_seg 55 prio class 0
[996684.078459] sd 0:0:4:0: [sde] tag#3081 Add. Sense: Unrecovered read error
[996684.078471] blk_update_request: critical medium error, dev sde, sector 8404591280 op 0x0:(READ) flags 0x0 phys_seg 37 prio class 0
[996711.054747] sd 0:0:4:0: [sde] tag#3113 Add. Sense: Unrecovered read error
[996711.054765] blk_update_request: critical medium error, dev sde, sector 8405833160 op 0x0:(READ) flags 0x0 phys_seg 2 prio class 0
root@ceph03:~#
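Every medium error above is on sde. On a node with many disks it can help to tally these errors per device rather than read the log by eye. A minimal sketch, assuming the blk_update_request line format shown above; medium_errors_by_dev is a made-up helper name.

```shell
# Count "critical medium error" lines per block device from dmesg-style
# text on stdin, printing "<device> <count>" sorted by device name.
# medium_errors_by_dev is a hypothetical helper name.
medium_errors_by_dev() {
    awk '/critical medium error/ {
        for (i = 1; i < NF; i++) {
            if ($i == "dev") {
                dev = $(i + 1)       # e.g. "sde,"
                sub(/,$/, "", dev)   # drop the trailing comma
                count[dev]++
            }
        }
    }
    END { for (d in count) print d, count[d] }' | sort
}

# Usage:
#   dmesg | medium_errors_by_dev
```

If smartmontools is installed, smartctl -a /dev/sde is one way to cross-check the disk's own error counters before replacing it.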
Mark osd.55 out of the cluster, wait for the data to rebalance, then check the cluster again.
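One way to script this step is sketched below. It is an assumption-laden illustration rather than the exact procedure used here: drain_osd and its polling interval are made up, and it relies on ceph osd safe-to-destroy returning a non-zero exit status until the OSD no longer holds data the cluster needs.

```shell
# Mark an OSD out, then poll until Ceph reports that it can be removed
# without reducing data durability. drain_osd is a hypothetical helper.
# Arguments: OSD id, optional seconds between checks (default 60).
drain_osd() {
    osd_id=$1
    interval=${2:-60}
    # Stop placing data on the OSD; backfill moves its PGs elsewhere.
    ceph osd out "$osd_id"
    # Fails while PGs still depend on this OSD.
    until ceph osd safe-to-destroy "osd.$osd_id"; do
        sleep "$interval"
    done
}

# Usage:
#   drain_osd 55
#   ceph -s                  # confirm backfill has finished
#   ceph pg repair 2.19b     # only if the PG is still inconsistent
```

Polling safe-to-destroy avoids guessing from the misplaced-object percentage when the failing disk can be pulled.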