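[ Overview ]
* Configure a VIP with PCS to manage resources across multiple servers (ap01, ap02)
* When clickhouse-client queries arrive through the VIP, HAProxy load-balances them across the clickhouse-server instances
[ PCS Setup ]
1. Install the PCS packages (ap01 / ap02)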
apt install -y corosync pcs pacemaker
2. Start the PCS daemon (ap01 / ap02)
sudo systemctl start pcsd
3. Set the password for hacluster, the default PCS daemon account (ap01 / ap02)
sudo passwd hacluster
4. Authenticate the cluster nodes (ap01)
pcs host auth ap01.test.com ap02.test.com -u hacluster
Password:
ap01.test.com: Authorized
ap02.test.com: Authorized
5. Set up the cluster
pcs cluster setup clickhouse_cluster ap01.test.com ap02.test.com --force
---
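If the following warning appears:
Warning: Unable to read the known-hosts file: No such file or directory: '/var/lib/pcsd/known-hosts'
some nodes are still being removed, so destroy the leftover cluster configuration and then retry the setup:
pcs cluster destroy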
---
6. Start the cluster
pcs cluster start --all
7. Check the cluster status
sudo pcs status
---
Cluster name: clickhouse_cluster
WARNINGS:
No stonith devices and stonith-enabled is not false
Cluster Summary:
* Stack: corosync
* Current DC: ap01.test.com (version 2.0.3-4b1f869f0f) - partition with quorum
* Last updated: Tue Jul 2 16:37:13 2024
* Last change: Tue Jul 2 16:36:33 2024 by hacluster via crmd on ap01.test.com
* 2 nodes configured
* 0 resource instances configured
Node List:
* Online: [ ap01.test.com ap02.test.com ]
Full List of Resources:
* No resources
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
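The status output above can also be checked from a script. The following is a minimal sketch, assuming the status text is piped in on stdin; online_nodes and has_stonith_warning are hypothetical helper names, not part of pcs. For a test environment without fencing devices, the warning is commonly silenced with `sudo pcs property set stonith-enabled=false`; that step is an assumption here and does not appear in the original procedure.

```shell
#!/bin/sh
# Hypothetical helpers (not part of pcs) that read `pcs status` text on stdin.

# Print the node names listed on the "Online: [ ... ]" line.
online_nodes() {
  sed -n 's/.*Online: \[ \(.*\) ].*/\1/p'
}

# Exit 0 if the status text contains the stonith warning, 1 otherwise.
has_stonith_warning() {
  grep -q 'No stonith devices'
}
```

Possible usage: `sudo pcs status | online_nodes` should list both nodes before any resource is created.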
[ VIP Setup ]
1. Create the VIP
* ap01.test.com [192.168.56.62]
* ap02.test.com [192.168.56.63]
sudo pcs resource create VirtualIP ocf:heartbeat:IPaddr2 ip=192.168.56.100 cidr_netmask=24 nic=eth0 op monitor interval=30s
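The nic= option pins the VIP to a specific interface. In the ip a output shown later in this document, the 192.168.56.x node addresses sit on eth1 while the VIP lands on eth0, so it may be worth confirming which interface carries the VIP's subnet before creating the resource. The following is a minimal sketch, assuming /24 networks only and input in the format of `ip -o -4 addr show`; iface_for_ip is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the interface whose /24 network contains the
# given IPv4 address. Reads `ip -o -4 addr show` style lines on stdin.
# Assumption: only /24 prefixes are considered.
iface_for_ip() {
  want="${1%.*}"                      # first three octets of the target
  awk -v want="$want" '
    $3 == "inet" {
      split($4, a, "/")               # a[1] = address, a[2] = prefix length
      if (a[2] != 24) next
      net = a[1]
      sub(/\.[0-9]+$/, "", net)       # drop the last octet
      if (net == want) { print $2; exit }
    }
  '
}
```

Possible usage: `ip -o -4 addr show | iface_for_ip 192.168.56.100`. Whether the printed interface is the right value for nic= depends on the environment.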
2. Confirm the new VIP in the PCS status
sudo pcs status
---
Cluster name: clickhouse_cluster
Cluster Summary:
* Stack: corosync
* Current DC: ap01.test.com (version 2.0.3-4b1f869f0f) - partition with quorum
* Last updated: Thu Jul 4 15:27:01 2024
* Last change: Thu Jul 4 15:26:54 2024 by root via cibadmin on ap01.test.com
* 2 nodes configured
* 1 resource instance configured
Node List:
* Online: [ ap01.test.com ap02.test.com ]
Full List of Resources:
* VirtualIP (ocf::heartbeat:IPaddr2): Started ap01.test.com
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
* The VIP resource is running on ap01.test.com
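To read the owning node from a script rather than by eye, the resource line of the status output can be parsed. This is a minimal sketch, assuming the "Started <node>" layout shown above; resource_node is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the node on which a resource is "Started",
# reading `pcs status` text on stdin. Prints nothing if it is not started.
resource_node() {
  awk -v res="$1" '
    {
      hit = 0
      for (i = 1; i <= NF; i++) if ($i == res) hit = 1
    }
    hit && /Started/ { print $NF; exit }
  '
}
```

Possible usage: `sudo pcs status | resource_node VirtualIP`.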
2-1) If the VIP was created incorrectly, restart it or delete and recreate it
sudo pcs resource restart VirtualIP
* Fix it by restarting the resource
sudo pcs resource delete VirtualIP
* Delete the resource, then create it again
3. Check the IP addresses
ip a
---
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 08:00:27:77:03:29 brd ff:ff:ff:ff:ff:ff
inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic eth0
valid_lft 75732sec preferred_lft 75732sec
inet 192.168.56.100/24 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::a00:27ff:fe77:329/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 08:00:27:6a:eb:c2 brd ff:ff:ff:ff:ff:ff
inet 192.168.56.62/24 brd 192.168.56.255 scope global eth1
valid_lft forever preferred_lft forever
inet6 fe80::a00:27ff:fe6a:ebc2/64 scope link
valid_lft forever preferred_lft forever
* Confirm that the VIP (192.168.56.100) was added to eth0
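The same check can be scripted, which helps when repeating it on each node. This is a minimal sketch, assuming input in the format of `ip -o -4 addr show`; ip_holder is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the interface that holds the given IPv4 address,
# reading `ip -o -4 addr show` lines on stdin. Exit status is 1 if none does.
ip_holder() {
  awk -v ip="$1" '
    $3 == "inet" {
      split($4, a, "/")               # a[1] = address without prefix length
      if (a[1] == ip) { print $2; found = 1 }
    }
    END { exit !found }
  '
}
```

Possible usage: `ip -o -4 addr show | ip_holder 192.168.56.100 || echo "VIP not on this node"`.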
[ FailOver Test ]
1. Deliberately bring down ap01.test.com, the node that currently holds the resource
sudo pcs cluster stop ap01.test.com
sudo pcs status
---
Error: error running crm_mon, is pacemaker running?
Error: cluster is not available on this node
ip a
---
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 08:00:27:77:03:29 brd ff:ff:ff:ff:ff:ff
inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic eth0
valid_lft 73872sec preferred_lft 73872sec
inet6 fe80::a00:27ff:fe77:329/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 08:00:27:6a:eb:c2 brd ff:ff:ff:ff:ff:ff
inet 192.168.56.62/24 brd 192.168.56.255 scope global eth1
valid_lft forever preferred_lft forever
inet6 fe80::a00:27ff:fe6a:ebc2/64 scope link
valid_lft forever preferred_lft forever
* Confirm that the VIP (192.168.56.100) has disappeared from eth0, where it was configured
2. Check the resource status on ap02.test.com
sudo pcs status
---
Cluster name: clickhouse_cluster
Cluster Summary:
* Stack: corosync
* Current DC: ap02.test.com (version 2.0.3-4b1f869f0f) - partition with quorum
* Last updated: Thu Jul 4 15:59:44 2024
* Last change: Thu Jul 4 15:27:12 2024 by root via cibadmin on ap01.test.com
* 2 nodes configured
* 1 resource instance configured
Node List:
* Online: [ ap02.test.com ]
* OFFLINE: [ ap01.test.com ]
Full List of Resources:
* VirtualIP (ocf::heartbeat:IPaddr2): Started ap02.test.com
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
* The VIP resource is now assigned to ap02.test.com
ip a
---
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 08:00:27:77:03:29 brd ff:ff:ff:ff:ff:ff
inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic eth0
valid_lft 73674sec preferred_lft 73674sec
inet 192.168.56.100/24 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::a00:27ff:fe77:329/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 08:00:27:db:21:11 brd ff:ff:ff:ff:ff:ff
inet 192.168.56.63/24 brd 192.168.56.255 scope global eth1
valid_lft forever preferred_lft forever
inet6 fe80::a00:27ff:fedb:2111/64 scope link
valid_lft forever preferred_lft forever
* The IP listing shows that the VIP (192.168.56.100) is now assigned to eth0 on this node
ssh vagrant@192.168.56.100
The authenticity of host '192.168.56.100 (192.168.56.100)' can't be established.
ED25519 key fingerprint is SHA256:vNxcOXmhwLLQIsyN2ZLOOejwhwL4azPKjtxYkjCCxco.
This host key is known by the following other names/addresses:
~/.ssh/known_hosts:4: 192.168.56.60
~/.ssh/known_hosts:6: 192.168.56.61
~/.ssh/known_hosts:7: 192.168.56.62
~/.ssh/known_hosts:8: 192.168.56.63
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '192.168.56.100' (ED25519) to the list of known hosts.
(vagrant@192.168.56.100) Password:
Welcome to Ubuntu 20.04.6 LTS (GNU/Linux 5.4.0-176-generic x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/pro
System information disabled due to load higher than 1.0
This system is built by the Bento project by Chef Software
More information can be found at https://github.com/chef/bento
Last login: Thu Jul 4 15:59:37 2024 from 192.168.56.1
vagrant@ap02:~$
* Connecting to the VIP (192.168.56.100) reaches ap02.test.com
3. Start the cluster on ap01.test.com again
sudo pcs cluster start ap01.test.com
sudo pcs status
---
Cluster name: clickhouse_cluster
Cluster Summary:
* Stack: corosync
* Current DC: ap02.test.com (version 2.0.3-4b1f869f0f) - partition with quorum
* Last updated: Thu Jul 4 16:07:14 2024
* Last change: Thu Jul 4 15:27:12 2024 by root via cibadmin on ap01.test.com
* 2 nodes configured
* 1 resource instance configured
Node List:
* Online: [ ap01.test.com ap02.test.com ]
Full List of Resources:
* VirtualIP (ocf::heartbeat:IPaddr2): Started ap02.test.com
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
* After both nodes are back online, the VIP resource remains on ap02.test.com
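As observed above, the resource did not fail back on its own. If the VIP should return to ap01.test.com, one option is pcs resource move, followed by pcs resource clear once the move has completed, so that the location constraint created by the move does not stay behind. The sketch below wraps those two commands; the run, move_resource, and clear_move helpers and the DRY_RUN variable are hypothetical, and by default the script only prints the commands instead of executing them.

```shell
#!/bin/sh
# DRY_RUN=1 (the default in this sketch) prints commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: echo the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Ask the cluster to move a resource to a node: move_resource RESOURCE NODE
move_resource() {
  run sudo pcs resource move "$1" "$2"
}

# Remove the location constraint left behind by the move: clear_move RESOURCE
# Run this only after pcs status shows the resource on the target node.
clear_move() {
  run sudo pcs resource clear "$1"
}
```

Possible usage: `move_resource VirtualIP ap01.test.com`, check sudo pcs status, then `clear_move VirtualIP`. Set DRY_RUN=0 to actually execute the commands.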