Greenplum HA Implementation Document

1. Introduction to Greenplum HA

1.1 Master Mirroring

When High Availability (HA) is enabled for the Master node, there are two types of master instances: primary and standby. Clients can only connect to the primary master and execute queries there. The standby master maintains data consistency with the primary master through Write-Ahead Logging (WAL) streaming replication.
When the primary master fails, the standby master does not switch to become the primary master automatically. Administrators can use the gpactivatestandby tool to switch the standby master to become the new primary master.
For detailed information, refer to Overview of Master Mirroring.

1.2 Segment Mirroring

The Greenplum database stores data across multiple segment instances, each of which is a PostgreSQL instance of the Greenplum database. Data is distributed across segment nodes according to the distribution policy defined in the table creation statement. If HA is not enabled, when a segment node fails, the Greenplum database cannot be started until manual recovery is performed.
If HA is enabled, each segment has a replica, referred to as a mirror; each segment thus consists of a primary/mirror pair. The mirror segment uses WAL streaming replication to maintain data consistency with the primary segment.
For detailed information, refer to Overview of Segment Mirroring.

2. Enabling HA

Assume that our Hengshi Sense is installed in the directory /opt/hengshi on the host named host1, runs as the user hengshi, has 2 segment instances, and that Greenplum HA is not yet enabled.
Another host acting as a mirror is named host2.

2.1 Install and Initialize the Mirror Host

If the mirror host already exists in the cluster rather than being a new one, the operations below are not required.

  1. Use root to create the hengshi user
# grep hengshi /etc/passwd > /dev/null || sudo useradd -m hengshi
  2. Configure password-less login between host1 and host2
    (Omitted)

  3. Create the installation path /opt/hengshi

$ sudo mkdir -p /opt/hengshi && sudo chown hengshi:hengshi /opt/hengshi
  4. Install Hengshi Sense
$ sudo su - hengshi                      # Switch to the product runtime user
$ cd ~/pkgs/hengshi-sense-[version]      # Change to the directory where the package was extracted
$ ./hs_install -p /opt/hengshi           # Execute installation
  5. Initialize the OS
$ sudo su - hengshi                      # Switch to the product runtime user
$ cd /opt/hengshi                        # Enter the installation target directory
$ bin/hengshi-sense-bin init-os all      # Initialize OS
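
The password-less login step above is left out of this document. As a hedged sketch only, one common way to set it up with OpenSSH (run as the hengshi user on host1; `setup_passwordless_ssh` is a hypothetical helper name, not part of the product) is:

```shell
# Hedged sketch of password-less SSH setup from host1 to host2.
# Assumes OpenSSH is installed; repeat in the other direction
# (host2 -> host1) if both hosts need to reach each other.
setup_passwordless_ssh() {
    local remote=${1:-hengshi@host2}
    # Generate a key pair once, with no passphrase, if none exists yet.
    [ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
    # Install the public key on the remote host (prompts for the password once).
    ssh-copy-id "$remote"
    # Verify: this should now succeed without a password prompt.
    ssh "$remote" true
}
```

Example: `setup_passwordless_ssh hengshi@host2`.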

Note:

It is not necessary to execute bin/hengshi-sense-bin init all here.

2.2 Enabling Segment HA

Mirror segment instances can be distributed in different ways across cluster hosts based on configuration:

  • One way is group (the default), where all mirrors of the primary segments on one host are placed together on a single other host. If that host fails, the number of active primary segments on the host that takes over its mirrors doubles.
  • Another way is spread, which spreads one host's mirrors across several other hosts, ensuring that at most one mirror is promoted to primary segment on each machine. This prevents a sudden load increase on any single host after a host fails. For the spread mode of mirror distribution, the number of cluster hosts must be greater than the number of segments per host.
    For an introduction to these two methods, refer to Overview of Segment Mirroring.
    Here we deploy using the group method.

To enable mirror segments, we use the gpaddmirrors tool.

  1. Create a configuration file
    The configuration file format is as follows:
contentID|address|port|data_dir

|Field|Description|
|-|-|
| contentID | The content ID of the mirror segment, which is the same as that of its primary segment. For more information, see the gp_segment_configuration reference. |
| address | Node hostname or IP |
| port | The listening port of the mirror segment, offset from the port base of the existing instances so that it does not conflict with them.|
| data_dir | The data directory of the mirror segment |
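
Given this format, the file can be generated rather than written by hand. The sketch below (with the hypothetical helper name `gen_mirrors_conf`; the port base 26432 and directory layout are the example values used in this document) emits one line per mirror for the group layout, with all mirrors on a single host:

```shell
# Hedged sketch: generate a mirrors configuration file for the group layout.
# Arguments: number of segments, mirror host, starting port, data dir prefix.
gen_mirrors_conf() {
    local nseg=$1 host=$2 port_base=$3 dir_base=$4
    local id
    for ((id = 0; id < nseg; id++)); do
        # contentID|address|port|data_dir, one mirror per primary segment
        echo "${id}|${host}|$((port_base + id))|${dir_base}/SegDataDir${id}"
    done
}
```

Example: `gen_mirrors_conf 2 host2 26432 /opt/hengshi/engine-cluster/mirror > mirrors.txt`.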

Here, assuming the configuration file is named mirrors.txt, its contents are:

0|host2|26432|/opt/hengshi/engine-cluster/mirror/SegDataDir0
1|host2|26433|/opt/hengshi/engine-cluster/mirror/SegDataDir1
  2. Run gpaddmirrors to enable segment mirroring
source /opt/hengshi/engine-cluster/export-cluster.sh
gpaddmirrors -a -i mirrors.txt
  3. If successful, you will see information like this
20200313:15:52:54:007684 gpaddmirrors:host1:hengshi-[INFO]:-Process results...
20200313:15:52:54:007684 gpaddmirrors:host1:hengshi-[INFO]:-******************************************************************
20200313:15:52:54:007684 gpaddmirrors:host1:hengshi-[INFO]:-Mirror segments have been added; data synchronization is in progress.
20200313:15:52:54:007684 gpaddmirrors:host1:hengshi-[INFO]:-Data synchronization will continue in the background.
20200313:15:52:54:007684 gpaddmirrors:host1:hengshi-[INFO]:-Use  gpstate -s  to check the resynchronization progress.
20200313:15:52:54:007684 gpaddmirrors:host1:hengshi-[INFO]:-******************************************************************
  4. Verify the synchronization status of the mirrors
    Execute the following command
gpstate -m

We can see information like this

20200313:15:54:12:007841 gpstate:host1:hengshi-[INFO]:-Starting gpstate with args: -m
20200313:15:54:12:007841 gpstate:host1:hengshi-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 6.2.1 build dev'
20200313:15:54:12:007841 gpstate:host1:hengshi-[INFO]:-master Greenplum Version: 'PostgreSQL 9.4.24 (Greenplum Database 6.2.1 build dev) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit compiled on Dec 23 2019 17:10:46'
20200313:15:54:12:007841 gpstate:host1:hengshi-[INFO]:-Obtaining Segment details from master...
20200313:15:54:12:007841 gpstate:host1:hengshi-[INFO]:--------------------------------------------------------------
20200313:15:54:12:007841 gpstate:host1:hengshi-[INFO]:--Current GPDB mirror list and status
20200313:15:54:12:007841 gpstate:host1:hengshi-[INFO]:--Type = Group
20200313:15:54:12:007841 gpstate:host1:hengshi-[INFO]:--------------------------------------------------------------
20200313:15:54:12:007841 gpstate:host1:hengshi-[INFO]:-   Mirror   Datadir                                        Port    Status    Data Status
20200313:15:54:12:007841 gpstate:host1:hengshi-[INFO]:-   host2     /opt/hengshi/engine-cluster/mirror/SegDataDir0   26432   Passive   Synchronized
20200313:15:54:12:007841 gpstate:host1:hengshi-[INFO]:-   host2     /opt/hengshi/engine-cluster/mirror/SegDataDir1   26433   Passive   Synchronized
20200313:15:54:12:007841 gpstate:host1:hengshi-[INFO]:--------------------------------------------------------------

Sometimes the status may show as Failed; this is usually because data synchronization is still in progress or the system has only just started.

Note:

If there is a lot of segment data, adding mirrors can put Greenplum under heavy load, so it is recommended to enable HA when there is no business load.
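
Since mirrors may briefly report Failed while synchronization is still running, one practical pattern is to poll gpstate -m until no mirror reports a bad state before putting load back on the cluster. A minimal sketch (the helper name `wait_for_mirrors_synced` is hypothetical, and the exact status strings "Failed" / "Not In Sync" are assumptions that can vary between Greenplum versions; the cluster environment is assumed to be sourced already):

```shell
# Hedged sketch: wait until gpstate -m no longer reports any mirror in a
# failed or out-of-sync state, polling with a fixed retry count and interval.
wait_for_mirrors_synced() {
    local retries=${1:-30} interval=${2:-10} i
    for ((i = 0; i < retries; i++)); do
        # gpstate -m prints one line per mirror; succeed once no line
        # contains an assumed bad-state marker.
        if ! gpstate -m | grep -Eq 'Failed|Not In Sync'; then
            echo "all mirrors synchronized"
            return 0
        fi
        sleep "$interval"
    done
    echo "mirrors still not in sync after $((retries * interval))s" >&2
    return 1
}
```

Example: `wait_for_mirrors_synced 30 10` retries every 10 seconds for up to 5 minutes.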

2.3 Enabling Master HA

Enabling HA for the master is simpler: just use the gpinitstandby command.

  1. Enable Mirroring for the Master

Execute the following command

gpinitstandby -s host2

You would see information like this

20200313:16:09:55:008076 gpinitstandby:host1:hengshi-[INFO]:-Validating environment and parameters for standby initialization...
20200313:16:09:55:008076 gpinitstandby:host1:hengshi-[INFO]:-Checking for data directory /opt/hengshi/engine-cluster/data/SegDataDir-1 on host2
20200313:16:09:56:008076 gpinitstandby:host1:hengshi-[INFO]:------------------------------------------------------
20200313:16:09:56:008076 gpinitstandby:host1:hengshi-[INFO]:-Greenplum standby master initialization parameters
20200313:16:09:56:008076 gpinitstandby:host1:hengshi-[INFO]:------------------------------------------------------
20200313:16:09:56:008076 gpinitstandby:host1:hengshi-[INFO]:-Greenplum master hostname               = bdp2
20200313:16:09:56:008076 gpinitstandby:host1:hengshi-[INFO]:-Greenplum master data directory         = /opt/hengshi/engine-cluster/data/SegDataDir-1
20200313:16:09:56:008076 gpinitstandby:host1:hengshi-[INFO]:-Greenplum master port                   = 15432
20200313:16:09:56:008076 gpinitstandby:host1:hengshi-[INFO]:-Greenplum standby master hostname       = host2
20200313:16:09:56:008076 gpinitstandby:host1:hengshi-[INFO]:-Greenplum standby master port           = 15432
20200313:16:09:56:008076 gpinitstandby:host1:hengshi-[INFO]:-Greenplum standby master data directory = /opt/hengshi/engine-cluster/data/SegDataDir-1
20200313:16:09:56:008076 gpinitstandby:host1:hengshi-[INFO]:-Greenplum update system catalog         = On
Do you want to continue with standby master initialization? Yy|Nn (default=N):
> y
20200313:16:09:57:008076 gpinitstandby:host1:hengshi-[INFO]:-Syncing Greenplum Database extensions to standby
20200313:16:09:57:008076 gpinitstandby:host1:hengshi-[INFO]:-The packages on host2 are consistent.
20200313:16:09:57:008076 gpinitstandby:host1:hengshi-[INFO]:-Adding standby master to catalog...
20200313:16:09:57:008076 gpinitstandby:host1:hengshi-[INFO]:-Database catalog updated successfully.
20200313:16:09:57:008076 gpinitstandby:host1:hengshi-[INFO]:-Updating pg_hba.conf file...
20200313:16:09:58:008076 gpinitstandby:host1:hengshi-[INFO]:-pg_hba.conf files updated successfully.
20200313:16:09:59:008076 gpinitstandby:host1:hengshi-[INFO]:-Starting standby master
20200313:16:09:59:008076 gpinitstandby:host1:hengshi-[INFO]:-Checking if standby master is running on host: host2  in directory: /opt/hengshi/engine-cluster/data/SegDataDir-1
20200313:16:10:00:008076 gpinitstandby:host1:hengshi-[INFO]:-Cleaning up pg_hba.conf backup files...
20200313:16:10:01:008076 gpinitstandby:host1:hengshi-[INFO]:-Backup files of pg_hba.conf cleaned up successfully.
20200313:16:10:01:008076 gpinitstandby:host1:hengshi-[INFO]:-Successfully created standby master on host2
  2. Use gpstate to check the status of the standby master

Execute the following command

gpstate -f

You would see information like this

20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:-Starting gpstate with args: -f
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 6.2.1 build dev'
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:-master Greenplum Version: 'PostgreSQL 9.4.24 (Greenplum Database 6.2.1 build dev) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit compiled on Dec 23 2019 17:10:46'
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:-Obtaining Segment details from master...
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:-Standby master details
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:-----------------------
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:-   Standby address          = host2
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:-   Standby data directory   = /opt/hengshi/engine-cluster/data/SegDataDir-1
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:-   Standby port             = 15432
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:-   Standby PID              = 3050
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:-   Standby status           = Standby host passive
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:--------------------------------------------------------------
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:--pg_stat_replication
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:--------------------------------------------------------------
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:--WAL Sender State: streaming
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:--Sync state: sync
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:--Sent Location: 0/C000000
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:--Flush Location: 0/C000000
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:--Replay Location: 0/C000000
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:--------------------------------------------------------------
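
Besides gpstate -f, the standby's registration can also be read directly from the gp_segment_configuration catalog table, where the standby master is the row with content = -1 and role = 'm'. A sketch (the helper name `standby_master_host` is hypothetical; assumes psql can reach the master with the environment from export-cluster.sh sourced):

```shell
# Hedged sketch: print the standby master's hostname from the catalog.
# Prints nothing if no standby master is configured.
standby_master_host() {
    psql -At -c "SELECT hostname FROM gp_segment_configuration
                 WHERE content = -1 AND role = 'm';"
}
```

Example: `standby_master_host` should print host2 in this document's setup.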

3. FAQ

3.1 Common Issues with Master HA

  • If the primary master goes down, how do I switch to the standby master?

When the primary master goes down for any reason, Greenplum does not automatically fail over to the standby master; a manual switch is required.
On the standby master node host2, execute the following command:

source /opt/hengshi/engine-cluster/export-cluster.sh               # If there is no such file, copy it over from the original primary master
gpactivatestandby -d /opt/hengshi/engine-cluster/data/SegDataDir-1
  • How do I restore the original primary master?

Once the original primary master has gone down, first switch the standby master to become the new primary master (here called the back-up master). Then use the following procedure to make the original primary master the new standby master and manually switch it back to primary.

a. Back up the data directory of the original primary master

mv /opt/hengshi/engine-cluster/data/SegDataDir-1 /opt/hengshi/engine-cluster/data/SegDataDir-1-backup

b. On the back-up master node, execute the following command to make the original primary master the mirror master

gpinitstandby -s host1

c. On the back-up master node, stop the master

gpstop -m

d. On the original primary master node, execute the switch command

gpactivatestandby -d /opt/hengshi/engine-cluster/data/SegDataDir-1

e. On the back-up master node, back up the data directory

mv /opt/hengshi/engine-cluster/data/SegDataDir-1 /opt/hengshi/engine-cluster/data/SegDataDir-1-backup

f. On the original primary master node, create the back-up master as the mirror master

gpinitstandby -s host2
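
Because steps a-f alternate between two hosts, it is easy to run a command on the wrong machine. The sketch below (with the hypothetical name `restore_original_master`) only prints the plan as a dry run, using the host names and data directory from this document's example; each command should still be executed manually on the host shown:

```shell
# Hedged sketch: print the restore sequence (steps a-f) as a dry-run plan,
# labelling which host each command must run on. Nothing is executed.
restore_original_master() {
    local datadir=/opt/hengshi/engine-cluster/data/SegDataDir-1
    step() { echo "[$1] $2: $3"; }   # dry run: print host, step label, command
    step host1 a "mv $datadir ${datadir}-backup"
    step host2 b "gpinitstandby -s host1"
    step host2 c "gpstop -m"
    step host1 d "gpactivatestandby -d $datadir"
    step host2 e "mv $datadir ${datadir}-backup"
    step host1 f "gpinitstandby -s host2"
}
```

Reviewing the printed plan before running each step by hand avoids issuing a command on the wrong host.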