database - Automatic recovery of the failed PostgreSQL master node is not working with pgpool-II
I am new to PostgreSQL and pgpool-II setup. I have configured PostgreSQL HA/load balancing using pgpool-II and repmgr.
The setup consists of 3 nodes. The versions of the applications and the OS are listed below:

- **pgpool node** => 192.168.0.4
- **PostgreSQL nodes** => 192.168.0.6, 192.168.0.7
- **OS version** => CentOS 6.8 (on all 3 nodes)
- **pgpool-II version** => pgpool-II version 3.5.0 (ekieboshi)
- **PostgreSQL version** => PostgreSQL 9.4.8
- **repmgr version** => repmgr 3.1.3 (PostgreSQL 9.4.8)
I followed an online guide for the setup.
When I bring down the master node, failover happens and the slave node takes over as the new master node.
After failover, I have to recover the failed node manually and sync it with the new master node.
I want to automate this recovery process.
The pgpool.conf file on the pgpool node contains the parameter recovery_1st_stage_command. I searched several sources online and found that the parameter "recovery_1st_stage_command" should be set in the configuration file pgpool.conf on the pgpool node.
I have set the parameter recovery_1st_stage_command = 'basebackup.sh'. I have placed the script file 'basebackup.sh' on both PostgreSQL nodes under the data directory '/var/lib/pgsql/9.4/data'.
I have also placed the script 'pgpool_remote_start' on both database nodes under the directory '/var/lib/pgsql/9.4/data'.
I also created the pgpool extensions pgpool_recovery and pgpool_adm on both database nodes.
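As a quick sanity check of the settings described above, the following sketch reads the recovery-related parameters out of a pgpool.conf-style file. The helper names `conf_value` and `check_recovery_conf` and the example path are hypothetical. `recovery_user` is a standard pgpool-II parameter that online recovery also relies on, although the setup above does not mention it, so treat its inclusion as an assumption.

```shell
#!/bin/sh
# conf_value FILE KEY: print the value assigned to KEY in a pgpool.conf-style
# file, with trailing comments and surrounding single quotes stripped.
# Hypothetical helper, not part of pgpool-II. If KEY appears several times,
# the last assignment wins.
conf_value() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" |
        tail -n 1 |
        sed -e 's/[[:space:]]*#.*$//' -e 's/[[:space:]]*$//' \
            -e "s/^'//" -e "s/'\$//"
}

# check_recovery_conf FILE: report whether the parameters used by online
# recovery are set to a non-empty value. Returns non-zero if any is missing.
check_recovery_conf() {
    conf=$1
    status=0
    for key in recovery_user recovery_1st_stage_command; do
        value=$(conf_value "$conf" "$key")
        if [ -z "$value" ]; then
            echo "MISSING $key"
            status=1
        else
            echo "OK $key=$value"
        fi
    done
    return $status
}

# Example call (the path is an assumption; adjust to your installation):
# check_recovery_conf /etc/pgpool-II/pgpool.conf
```

This only confirms that the parameters are present. It does not show whether pgpool-II ever calls the script.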
When the master node is stopped, failover happens, but the recovery script 'basebackup.sh' is not executed.
I have checked the pgpool logs and enabled the debug level as well. I still cannot tell whether the script got executed or not.
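One way to tell whether pgpool-II invoked the script at all is to have the script record each call itself. The sketch below is a hypothetical helper (the name `log_invocation` and the log path are assumptions, not part of pgpool-II) that could be pasted at the top of basebackup.sh.

```shell
#!/bin/sh
# log_invocation LOGFILE NAME [ARGS...]: append one timestamped line per call,
# recording the process id and the arguments the script was started with.
log_invocation() {
    logfile=$1
    shift
    printf '%s pid=%s called: %s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$$" "$*" >> "$logfile"
}

# Example use at the top of basebackup.sh (the log path is an assumption):
# log_invocation /tmp/basebackup.log basebackup.sh "$@"
```

If the log file stays empty after a failover, the script was never invoked, which points at how recovery is triggered rather than at the script's contents.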
Please help me with automatic online recovery of the failed node. The scripts I used are below.
basebackup.sh:

    #!/bin/bash
    # first stage recovery
    # $1 datadir
    # $2 desthost
    # $3 destdir

    # As I'm using repmgr, it's not necessary for me to know datadir (master), $1
    recovery_node=$2
    cluster_path=$3

    # repmgr needs to know the master's IP
    masternode=`/sbin/ifconfig eth0 | grep inet | awk '{print $2}' | sed 's/addr://'`

    # -D passes the data directory of the node being recovered
    cmd1=`ssh postgres@$recovery_node "repmgr -D $cluster_path --force standby clone $masternode"`
    echo $cmd1
pgpool_remote_start script:

    #! /bin/sh

    if [ $# -ne 2 ]
    then
        echo "pgpool_remote_start remote_host remote_datadir"
        exit 1
    fi

    dest=$1
    destdir=$2
    pgctl=/usr/pgsql-9.4/bin/pg_ctl

    # Start PostgreSQL on the recovered node in the background
    ssh -T $dest $pgctl -w -D $destdir start 2>/dev/null 1>/dev/null < /dev/null &

Thanks.
I think this is working as designed. When the master fails, there is a failover and the slave gets promoted. The old master is not automatically recovered as a slave. On the contrary, the failover script should try to shut down the failed master and disable restarting it (if possible; the node may be down and impossible to connect to), in order to avoid split-brain.
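The shutdown part of that fencing step could be sketched as a small helper called from the failover script. This is only an illustration under stated assumptions: the function name `fence_old_master`, the `postgres` SSH user, the 5-second connect timeout and the `DRY_RUN` switch are hypothetical, while the `pg_ctl` path and data directory are the ones from the question. It does not cover disabling automatic restarts.

```shell
#!/bin/sh
PGCTL=/usr/pgsql-9.4/bin/pg_ctl

# fence_old_master HOST DATADIR: try to stop PostgreSQL on the failed master.
# With DRY_RUN=1 the command is printed instead of executed.
fence_old_master() {
    host=$1
    datadir=$2
    cmd="ssh -T -o ConnectTimeout=5 postgres@$host $PGCTL -D $datadir -m fast stop"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$cmd"
        return 0
    fi
    # The node may be unreachable; report that instead of aborting the failover.
    $cmd < /dev/null || echo "could not stop PostgreSQL on $host (node down?)" >&2
}

# Example call from a failover script:
# fence_old_master 192.168.0.6 /var/lib/pgsql/9.4/data
```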
If you want, you can modify the failover script so that it performs a pcp_recovery operation on the old master after the slave is promoted. But in that case you are in fact doing a switchover, which should be a scripted series of steps. Failover is for when there is a real issue with the master (like the machine not responding).
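A minimal sketch of such a recovery step, assuming the option-style PCP client shipped with pgpool-II 3.5: once the slave has been promoted, the script asks pgpool-II to run online recovery on the failed node with `pcp_recovery_node`. The PCP port 9898, the user name `pcpadmin`, the use of `-w` (no password prompt, which relies on a `.pcppass` file) and the `DRY_RUN` switch are assumptions for illustration and should be checked against your installation.

```shell
#!/bin/sh
PCP_HOST=192.168.0.4   # pgpool node from the question
PCP_PORT=9898          # assumed PCP port
PCP_USER=pcpadmin      # hypothetical PCP user

# recover_node NODE_ID: ask pgpool-II to run online recovery on a backend node.
# With DRY_RUN=1 the command is printed instead of executed.
recover_node() {
    cmd="pcp_recovery_node -h $PCP_HOST -p $PCP_PORT -U $PCP_USER -w -n $1"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$cmd"
        return 0
    fi
    $cmd
}

# Possible call once promotion has finished, where the failed node id is the
# value pgpool-II substitutes for %d in failover_command:
# recover_node "$failed_node_id"
```

Whether this can safely run from inside the failover script itself, rather than as a separate step afterwards, is untested here. The old master also has to be reachable again before recovery can succeed.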