With a clean install of Ceph version 18.2.4, I tried to add a new OSD to a HEALTH_OK cluster without triggering data movement.
Setting the “norebalance” flag did not prevent Ceph from immediately moving data to the new OSD.
Here is the trace of the commands I used:
<code># initial state: 3 hosts, 4 OSDs
$ ceph orch host ls
HOST ADDR LABELS STATUS
ceph01 192.168.101.10 _admin
ceph02 192.168.101.11 _admin
ceph03 192.168.101.12 _admin
3 hosts in cluster
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.02197 root default
-3 0.01099 host ceph01
0 hdd 0.00549 osd.0 up 1.00000 1.00000
1 hdd 0.00549 osd.1 up 1.00000 1.00000
-5 0.01099 host ceph02
2 hdd 0.00549 osd.2 up 1.00000 1.00000
3 hdd 0.00549 osd.3 up 1.00000 1.00000
$ ceph -s
cluster:
id: 1ae2e354-6b87-11ef-aa6e-738d3ffe6d7d
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph01,ceph03,ceph02 (age 6m)
mgr: ceph01.mbuuym(active, since 7m), standbys: ceph03.acyoah
osd: 4 osds: 4 up (since 6m), 4 in (since 6m)
data:
pools: 2 pools, 33 pgs
objects: 111 objects, 404 MiB
usage: 1007 MiB used, 21 GiB / 22 GiB avail
pgs: 33 active+clean
$ ceph osd df
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
0 hdd 0.00549 1.00000 5.6 GiB 219 MiB 158 MiB 1 KiB 61 MiB 5.4 GiB 3.83 0.87 12 up
1 hdd 0.00549 1.00000 5.6 GiB 293 MiB 245 MiB 1 KiB 48 MiB 5.3 GiB 5.12 1.16 21 up
2 hdd 0.00549 1.00000 5.6 GiB 173 MiB 130 MiB 1 KiB 44 MiB 5.4 GiB 3.03 0.69 11 up
3 hdd 0.00549 1.00000 5.6 GiB 321 MiB 273 MiB 1 KiB 48 MiB 5.3 GiB 5.62 1.28 22 up
TOTAL 22 GiB 1007 MiB 806 MiB 6.2 KiB 201 MiB 21 GiB 4.40
MIN/MAX VAR: 0.69/1.28 STDDEV: 1.03
# adding a new OSD
$ ceph osd set norebalance
$ ceph orch daemon add osd ceph03:/dev/sdb
Created osd(s) 4 on host 'ceph03'
# final state: 3 hosts, 5 OSDs
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.02747 root default
-3 0.01099 host ceph01
0 hdd 0.00549 osd.0 up 1.00000 1.00000
1 hdd 0.00549 osd.1 up 1.00000 1.00000
-5 0.01099 host ceph02
2 hdd 0.00549 osd.2 up 1.00000 1.00000
3 hdd 0.00549 osd.3 up 1.00000 1.00000
-7 0.00549 host ceph03
4 hdd 0.00549 osd.4 up 1.00000 1.00000
$ ceph -s
cluster:
id: 1ae2e354-6b87-11ef-aa6e-738d3ffe6d7d
health: HEALTH_WARN
norebalance flag(s) set
Reduced data availability: 3 pgs inactive, 1 pg peering
services:
mon: 3 daemons, quorum ceph01,ceph03,ceph02 (age 11m)
mgr: ceph01.mbuuym(active, since 12m), standbys: ceph03.acyoah
osd: 5 osds: 5 up (since 7s), 5 in (since 19s)
flags norebalance
data:
pools: 2 pools, 33 pgs
objects: 104 objects, 378 MiB
usage: 988 MiB used, 27 GiB / 28 GiB avail
pgs: 9.091% pgs unknown
15.152% pgs not active
23 active+clean
4 peering
3 unknown
2 active+undersized+remapped
1 remapped+peering
$ sleep 5m
$ ceph osd df
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
0 hdd 0.00549 1.00000 5.6 GiB 182 MiB 121 MiB 1 KiB 61 MiB 5.4 GiB 3.19 0.87 9 up
1 hdd 0.00549 1.00000 5.6 GiB 221 MiB 173 MiB 1 KiB 48 MiB 5.4 GiB 3.86 1.05 14 up
2 hdd 0.00549 1.00000 5.6 GiB 163 MiB 102 MiB 1 KiB 61 MiB 5.4 GiB 2.85 0.78 8 up
3 hdd 0.00549 1.00000 5.6 GiB 246 MiB 198 MiB 1 KiB 48 MiB 5.3 GiB 4.30 1.17 18 up
4 hdd 0.00549 1.00000 5.6 GiB 240 MiB 214 MiB 1 KiB 26 MiB 5.4 GiB 4.20 1.14 17 up
TOTAL 28 GiB 1.0 GiB 808 MiB 7.8 KiB 245 MiB 27 GiB 3.68
MIN/MAX VAR: 0.78/1.17 STDDEV: 0.57
# new OSD #4 is already filled despite the "norebalance" flag
</code>
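For completeness, here is the sequence I had expected to need in order to keep data from moving. This is an untested sketch, not part of the trace above: the extra “nobackfill”/“norecover” flags and the “osd_crush_initial_weight” setting are my own assumptions about how to hold back data movement, and the final reweight value simply mirrors the weight of the other OSDs.
<code># untested sketch (my assumption, not something I ran): hold back all data movement
$ ceph osd set norebalance
$ ceph osd set nobackfill
$ ceph osd set norecover
# assumption: have new OSDs join with zero CRUSH weight so no PGs map to them yet
$ ceph config set osd osd_crush_initial_weight 0
$ ceph orch daemon add osd ceph03:/dev/sdb
# later, bring the new OSD in gradually and clear the flags
$ ceph osd crush reweight osd.4 0.00549
$ ceph osd unset norecover
$ ceph osd unset nobackfill
$ ceph osd unset norebalance
</code>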
I’m particularly interested in understanding what happens in Ceph when a new
OSD is added to a healthy cluster, and why the “norebalance” flag doesn’t prevent
data movement in this situation.
Here are a few questions I’d like answered:
- what “events” (rebalancing, backfilling, recovering…) are triggered when adding a new OSD to a healthy cluster?
- why are some PGs in the “undersized+remapped” state just after adding the new OSD, whereas they were not “undersized” before?
- what does “rebalancing” really mean in Ceph? In which situations does it occur?
- why doesn’t the “norebalance” flag prevent data movement when adding a new OSD to a healthy cluster?
- in which situations is the “norebalance” flag useful?