Akemi

HA-LVM with iSCSI + Multipath Storage

2025/08/25

HA-LVM (High Availability LVM) extends LVM to cluster environments: shared storage fails over through an active/passive configuration. The active node has exclusive access to the storage, and a standby node takes over when the active node fails.

It is a good fit for active/passive applications on a conventional filesystem such as ext4 or XFS. To prevent data corruption, only one node at a time may activate the volume group and mount the filesystem.

Topology

Environment

eve-ng

NIC 1 (path A)   NIC 2 (path B)   Host           OS
10.163.2.125     10.163.3.113     iscsi-target   CentOS 7.9
10.163.2.100     10.163.3.114     pcs1           CentOS 8.0
10.163.2.102     10.163.3.108     pcs2           CentOS 8.0

pcs version 0.11.10

Procedure

  • Configure the iSCSI target (provides the shared storage)
  • Configure dm-multipath for multipath access to it
  • Create and configure the LVM stack
  • Deploy the Pacemaker cluster
  • Add the LVM-activate and Filesystem cluster resources

Configuring the iSCSI target

yum -y install targetcli
firewall-cmd --permanent --add-port=3260/tcp
firewall-cmd --reload

targetcli
/> backstores/block create multipath-iscsi dev=/dev/sdb
/> iscsi/ create iqn.2024-08.com.example:block.target
/> iscsi/iqn.2024-08.com.example:block.target/tpg1/luns/ create /backstores/block/multipath-iscsi
/> iscsi/iqn.2024-08.com.example:block.target/tpg1/acls/ create iqn.2024-08.com.example:client.pcs1
/> iscsi/iqn.2024-08.com.example:block.target/tpg1/acls/ create iqn.2024-08.com.example:client.pcs2
/> iscsi/iqn.2024-08.com.example:block.target/tpg1/portals/ delete 0.0.0.0 ip_port=3260
/> iscsi/iqn.2024-08.com.example:block.target/tpg1/portals/ create 10.163.2.125
/> iscsi/iqn.2024-08.com.example:block.target/tpg1/portals/ create 10.163.3.113
/> ls
o- / ......................................................................................... [...]
  o- backstores .............................................................................. [...]
  | o- block .................................................................. [Storage Objects: 1]
  | | o- multipath-iscsi ................................ [/dev/sdb (500.0GiB) write-thru activated]
  | |   o- alua ................................................................... [ALUA Groups: 1]
  | |     o- default_tg_pt_gp ....................................... [ALUA state: Active/optimized]
  | o- fileio ................................................................. [Storage Objects: 0]
  | o- pscsi .................................................................. [Storage Objects: 0]
  | o- ramdisk ................................................................ [Storage Objects: 0]
  o- iscsi ............................................................................ [Targets: 1]
  | o- iqn.2024-08.com.example:block.target .............................................. [TPGs: 1]
  |   o- tpg1 ............................................................... [no-gen-acls, no-auth]
  |     o- acls .......................................................................... [ACLs: 2]
  |     | o- iqn.2024-08.com.example:client.pcs1 .................................. [Mapped LUNs: 1]
  |     | | o- mapped_lun0 ....................................... [lun0 block/multipath-iscsi (rw)]
  |     | o- iqn.2024-08.com.example:client.pcs2 .................................. [Mapped LUNs: 1]
  |     |   o- mapped_lun0 ....................................... [lun0 block/multipath-iscsi (rw)]
  |     o- luns .......................................................................... [LUNs: 1]
  |     | o- lun0 ............................ [block/multipath-iscsi (/dev/sdb) (default_tg_pt_gp)]
  |     o- portals .................................................................... [Portals: 2]
  |       o- 10.163.2.125:3260 ................................................................ [OK]
  |       o- 10.163.3.113:3260 ................................................................ [OK]
  o- loopback ......................................................................... [Targets: 0]

systemctl enable target --now

ss -tunlp | grep 3260
tcp LISTEN 0 256 10.163.2.125:3260 *:*
tcp LISTEN 0 256 10.163.3.113:3260 *:*
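One note: targetcli persists its configuration to /etc/target/saveconfig.json, which target.service restores at boot. The interactive shell normally saves on exit (the auto_save_on_exit preference), but if you script these commands it is safer to save explicitly:

targetcli saveconfig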

Configuring dm-multipath

# Update the yum repos first (CentOS 8 is EOL; point at the vault mirror)
rm -rf /etc/yum.repos.d/*
curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-vault-8.5.2111.repo
yum makecache

# On both nodes
yum -y install iscsi-initiator-utils device-mapper-multipath

# pcs1
echo "InitiatorName=iqn.2024-08.com.example:client.pcs1" > /etc/iscsi/initiatorname.iscsi
# pcs2
echo "InitiatorName=iqn.2024-08.com.example:client.pcs2" > /etc/iscsi/initiatorname.iscsi

systemctl enable iscsid --now
systemctl restart iscsid

iscsiadm -m discovery -t st -p 10.163.2.125
iscsiadm -m node -l

# The LUN shows up twice, as sda and sdb (one per portal)
ls -l /dev/disk/by-path/
total 0
lrwxrwxrwx. 1 root root 9 Aug 25 11:58 ip-10.163.2.125:3260-iscsi-iqn.2024-08.com.example:block.target-lun-0 -> ../../sdb
lrwxrwxrwx. 1 root root 9 Aug 25 11:58 ip-10.163.3.113:3260-iscsi-iqn.2024-08.com.example:block.target-lun-0 -> ../../sda

# Get the WWID
/lib/udev/scsi_id -g -u /dev/sda
3600140501e5ade793394f42be42c3353
# Check the device vendor/model (referenced in multipath.conf below)
cat /sys/block/sdb/device/vendor
LIO-ORG
cat /sys/block/sdb/device/model
multipath-iscsi

cat > /etc/multipath.conf <<'EOF'
defaults {
    user_friendly_names yes
    find_multipaths yes
}
blacklist {
    devnode "^vd[abcd]"
    devnode "^nvme"
}
blacklist_exceptions {
    wwid "3600140501e5ade793394f42be42c3353"
}
devices {
    device {
        vendor "LIO-ORG"
        product "multipath-iscsi"
        path_grouping_policy "multibus"   # one path group; I/O spread across both paths
        path_checker "tur"
        features "0"
        hardware_handler "0"
        no_path_retry "fail"
        rr_min_io 100
        rr_weight "uniform"
        prio "const"
    }
}
EOF
systemctl enable multipathd --now
systemctl restart multipathd

multipath -ll
mpatha (3600140501e5ade793394f42be42c3353) dm-3 LIO-ORG,multipath-iscsi
size=500G features='0' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
  |- 2:0:0:0 sda 8:0  active ready running
  `- 3:0:0:0 sdb 8:16 active ready running

# The aggregated multipath device:
/dev/mapper/mpatha
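A quick way to sanity-check the failover behavior before involving the cluster: force one path down and watch multipathd route around it. Marking the SCSI device offline via sysfs is a lab-only trick, not part of the original setup:

# Fail one of the two paths
echo offline > /sys/block/sda/device/state
multipath -ll    # sda should now be listed as failed/faulty; I/O continues over sdb

# Restore the path
echo running > /sys/block/sda/device/state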

Configuring LVM

vim /etc/lvm/lvm.conf
...
global {
    system_id_source = "uname"
    ...
}
...

# With system_id_source = "uname", each node derives its system ID from its hostname
lvm systemid
  system ID: pcs1

# Ship the same lvm.conf to the second node
scp /etc/lvm/lvm.conf pcs2:/etc/lvm/

# Run on one node only: create the LVM stack on the multipath device
pvcreate /dev/mapper/mpatha
vgcreate clustervg /dev/mapper/mpatha
lvcreate -L 10G -n clusterlv clustervg
mkfs.xfs /dev/clustervg/clusterlv

# (For reference only, not used here: an active/active cluster with lvmlockd
# would create the VG as shared instead)
# vgcreate --shared clustervg /dev/mapper/mpatha
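Because the VG was created on pcs1 with system_id_source = "uname", it now carries pcs1's system ID, so plain LVM commands on pcs2 should refuse to activate it. A quick way to see the guard (exact messages vary by LVM version):

# On either node: show who owns the VG
vgs -o vg_name,vg_systemid clustervg

# On pcs2: activation is rejected while the system ID says pcs1
vgchange -ay clustervg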

Deploying Pacemaker

We only have two nodes, but that is fine: for a two-node cluster, pcs configures corosync's two-node quorum mode (two_node: 1) automatically.

# Add the HighAvailability repo ('EOF' is quoted so the shell leaves $basearch for yum)
cat > /etc/yum.repos.d/HighAvailability.repo << 'EOF'
[HighAvailability]
name=CentOS-8 - HighAvailability
baseurl=http://vault.centos.org/8.5.2111/HighAvailability/$basearch/os/
gpgcheck=1
enabled=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-centosofficial
EOF

yum -y install pcs pacemaker corosync fence-agents-all
systemctl enable pcsd --now

firewall-cmd --permanent --add-service=high-availability
firewall-cmd --reload

echo 123456 | passwd --stdin hacluster

# Authenticate the nodes (run on one node only)
pcs host auth pcs1 pcs2 -u hacluster -p 123456

pcs cluster setup mycluster pcs1 pcs2
pcs cluster start --all
pcs cluster enable --all

# Verify
pcs cluster status
Cluster Status:
 Cluster Summary:
   * Stack: corosync (Pacemaker is running)
   * Current DC: pcs2 (version 2.1.10-1.el9-5693eaeee) - partition with quorum
   * Last updated: Thu Aug 21 13:54:58 2025 on pcs1
   * Last change:  Thu Aug 21 13:53:15 2025 by hacluster via hacluster on pcs2
   * 2 nodes configured
   * 0 resource instances configured
 Node List:
   * Online: [ pcs1 pcs2 ]

PCSD Status:
  pcs2: Online
  pcs1: Online
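This also lets us confirm the two-node special case mentioned earlier; corosync's quorum state can be inspected directly with standard tooling:

corosync-quorumtool -s    # the Flags line should include 2Node
pcs quorum status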

Creating the cluster resources

Create the resources; the parameters each agent accepts can be listed with pcs resource describe.
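For example, for the two agents used below:

pcs resource describe ocf:heartbeat:LVM-activate
pcs resource describe ocf:heartbeat:Filesystem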

# Create the LVM-activate resource
pcs resource create halvm LVM-activate \
    vgname=clustervg vg_access_mode=system_id \
    --group=halvmfs

# Create the Filesystem resource (note: the agent is named "Filesystem")
pcs resource create xfsfs Filesystem \
    device=/dev/clustervg/clusterlv fstype=xfs \
    directory=/mnt --group=halvmfs

# Refresh the resource state
pcs resource debug-start xfsfs
pcs resource refresh xfsfs

pcs resource status
  * Resource Group: halvmfs:
    * halvm    (ocf:heartbeat:LVM-activate):    Started pcs1
    * xfsfs    (ocf:heartbeat:Filesystem):      Started pcs1

df -Th | grep mnt
/dev/mapper/clustervg-clusterlv xfs   10G  104M  9.9G   2% /mnt
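Before testing failover it helps to leave a marker on the filesystem, so we can later confirm the data followed the group to pcs2 (the file name is just for illustration):

echo "written on pcs1" > /mnt/failover-marker
sync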

Testing resource migration

# Try moving the group to pcs2
pcs resource move halvmfs pcs2

# Nothing happens; check the logs
tail -f /var/log/messages
Aug 25 12:54:06 localhost pacemaker-schedulerd[30285]: error: Resource start-up disabled since no STONITH resources have been defined
Aug 25 12:54:06 localhost pacemaker-schedulerd[30285]: error: Either configure some or disable STONITH with the stonith-enabled option
Aug 25 12:54:06 localhost pacemaker-schedulerd[30285]: error: NOTE: Clusters with shared data need STONITH to ensure data integrity

Pacemaker requires a STONITH (fencing) device to guarantee data integrity.
With no STONITH configured, it refuses to start or move any resource, to prevent split-brain.
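In production you would configure a fence device here rather than turn STONITH off. As a sketch only: an IPMI-based agent, with placeholder BMC addresses and credentials (none of these values come from this lab):

# Hypothetical fencing via each node's BMC
pcs stonith create fence-pcs1 fence_ipmilan \
    ip=<bmc-of-pcs1> username=admin password=secret lanplus=1 \
    pcmk_host_list=pcs1
pcs stonith create fence-pcs2 fence_ipmilan \
    ip=<bmc-of-pcs2> username=admin password=secret lanplus=1 \
    pcmk_host_list=pcs2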

# For this lab, disable the STONITH requirement instead
pcs property set stonith-enabled=false

pcs resource status
  * Resource Group: halvmfs:
    * halvm    (ocf::heartbeat:LVM-activate):   Started pcs2
    * xfsfs    (ocf::heartbeat:Filesystem):     Started pcs2

[root@pcs2 ~]# df -Th | grep mnt
/dev/mapper/clustervg-clusterlv xfs   10G  104M  9.9G   2% /mnt
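The marker file written on pcs1 earlier should have followed the group:

[root@pcs2 ~]# cat /mnt/failover-marker
written on pcs1

Once done, clear the location constraint that pcs resource move left behind, otherwise the group stays pinned to pcs2:

pcs resource clear halvmfs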
