2013年2月4日 星期一

PGPool主從流複製(高可用、讀寫分離)測試


1.master資料庫關閉

/usr/local/pgsql9.1/pg_ctl -D /data/masterdata stop -m immediate
現象:pgpool自動連接到備份庫,備份庫並具有讀寫許可權。
2.先關閉pgpool,刪除pgpool status檔,重啟原master資料庫,再啟pgpool
pgpool -f ../etc/pgpool.conf -a ../etc/pool_hba.conf  -F ../etc/pcp.conf stop
rm -rf /usr/local/pgpool/log/pgpool_status
/usr/local/pgsql9.1/pg_ctl -D /data/masterdata start
pgpool -f ../etc/pgpool.conf -a ../etc/pool_hba.conf  -F ../etc/pcp.conf
現象:通過pgpool建立表和對原有表進行insert操作均出現無法提交。原因原slave庫已不處於slave狀態,而此主從又設置的即時主從導致事務無法提交。
3.master庫複製到slave庫份,重啟slave。建立pgpool連接,在連接狀態關閉master資料庫
cd /data/slavedata
 mv postgresql.conf postgresql.conf_bak
 mv pg_hba.conf pg_hba.conf_bak
cd ..
\cp -rp masterdata/* slavedata
cd slavedata/
mv postgresql.conf_bak postgresql.conf
mv pg_hba.conf_bak pg_hba.conf
mv recovery.done recovery.conf
su - postgres
rm -rf /data/slavedata/postmaster.pid
/usr/local/pgsql9.1/bin/pg_ctl -D /data/slavedata/ start
pgpool -f ../etc/pgpool.conf -a ../etc/pool_hba.conf  -F ../etc/pcp.conf -n
psql -p 9998 -U postgres
postgres=# create table test ( id serial8);
NOTICE:  CREATE TABLE will create implicit sequence "test_id_seq" for serial column "test.id"
CREATE TABLE
[root@database1 ~]# psql -p 5433 -U postgres 
psql (9.1.2)
Type "help" for help.
postgres=# select * from test;
 id 
----
(0 rows)
[root@database1 ~]# psql -p 5432 -U postgres 
psql (9.1.2)
Type "help" for help.
postgres=# select * from test;
 id 
----
(0 rows)
su - postgres
-bash-3.2$ /usr/local/pgsql9.1/bin/pg_ctl -D /data/masterdata/ stop -m immediate
waiting for server to shut down... done
server stopped
[root@database1 ~]# psql -p 9998 -U postgres 
psql (9.1.2)
Type "help" for help.
postgres=# select * from test;
 id 
----
(0 rows)
postgres=# select * from test;
server closed the connection unexpectedly
 This probably means the server terminated abnormally
 before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
現象:如果master連接斷開則原用戶端到pgpool端的連接將斷開。連接須重新建立。

3.主從流複製及pgpool工作正常,通過函數向表中插入資料。
psql -p 9998 -U postgres
 create table test ( id serial8,name varchar(50));
CREATE OR REPLACE FUNCTION test_insert(rcount int8)
  RETURNS int8 AS
$BODY$
DECLARE 
 i int8;
BEGIN
  i=1;
         while (i<=rcount) loop
            insert into test(name) values(i||'chen');
            i=i+1;
         end loop;
         return i;
END;
$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100;
select * from test_insert(1);
ERROR:  cannot execute INSERT in a read-only transaction
CONTEXT:  SQL statement "insert into test(name) values(i||'chen')"
PL/pgSQL function "test_insert" line 7 at SQL statement
insert into test (name) values('czg');
INSERT 0 1
select  test_insert(100000) for share;
 test_insert 
-------------
      100001
(1 row)
結論:如函數對資料庫表有updatedeleteinsert操作或對資料庫其它物件修改將不能直接通過select調用,pgpool默認select 為查詢語句在Slave庫執行,如需執行此類函數需添加函數到黑名單。黑名單通過pgpool.confblack_function_list)檔配置。

4.主從流複製及pgpool工作正常,通過函數批量插入500萬條記錄,在函數執行過程中強制關閉Master資料庫
psql -p 9998 -U postgres
select count(*) from test;
  count  
---------
 1200013
(1 row)
select  test_insert(5000000);
此時執行通過其它用戶端關閉主要資料庫
su - postgres
/usr/local/pgsql9.1/bin/pg_ctl -D /data/masterdata/ -m immediate stop

這在執行函數的psql用戶端報以下錯誤
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
CONTEXT:  SQL statement "insert into test(name) values(i||'chen')"
PL/pgSQL function "test_insert" line 7 at SQL statement
The connection to the server was lost. Attempting reset: Failed.

\q
psql -p 9998 -U postgres
select count(*) from test;
  count  
---------
 1200013
(1 row)

結論:如在事務執行過程中master資料庫異常斷開,則事務自動回滾,pgpool並自動斷開用戶端連接。

pgpool 流複製主從安裝與配置(高可用、讀寫分離)


Centos5.4(x64)+postgresql9.12+pgpool 3.1.2
以下配置pgpool與主從均在一台伺服器,主要資料庫埠為5433 ;從資料庫埠為5432
1.下載並安裝postgresql9.1.2資料庫
cd /tmp
./configure --prefix=/usr/local/pgsql9.1
make
make install
2.下載並安裝pgpool
tar -zxvf pgpool-II-3.1.2.tar.gz
cd pgpool-II-3.1.2
./configure --prefix=/usr/local/pgpool --with-pgsql=/usr/local/pgsql9.1/
make
make install
3.資料庫配置主從流即時複製
創建postgres用戶
useradd postgres 
cd /data
建立主要資料庫存放目錄
mkdir masterdata
建立slave資料庫存放目錄
mkdir slavedata
對主要資料庫目錄賦予postgres許可權
chown -R postgres:postgres masterdata
slave資料庫目錄賦予postgres許可權
chown -R postgres:postgres slavedata
su - postgres
初始化主要資料庫
/usr/local/pgsql9.1/bin/initdb -D /data/masterdata
修改主要資料庫參數
vi /data/masterdata/postgresql.conf
listen_addresses = '*'
port = 5433 
logging_collector = on
wal_level = hot_standby
synchronous_commit = on
max_wal_senders = 2
wal_keep_segments = 32
synchronous_standby_names = '*'

vi /data/masterdata/pg_hba.conf
host    all             all             192.168.100.0/24        md5
host     replication     postgres        192.168.100.24/32       trust
啟動主要資料庫
以下操作在postgres用戶下執行
su - postgres
/usr/local/pgsql9.1/bin/pg_ctl -D /data/masterdata/ start
創建測試庫並將主庫設置為備份狀態
/usr/local/pgsql9.1/bin/psql -p 5433 -U postgres
create database test;
select pg_start_backup('/data/masterdata/');
\q
將主庫整體拷貝為slave
cp -rp /data/masterdata/* /data/slavedata/
配置slave資料庫
vi /data/slavedata/postgresql.conf
port = 5432
wal_level = minimal
hot_standby = on 
停止主要資料庫backup狀態
/usr/local/pgsql9.1/bin/psql -p 5433 -U postgres
select pg_stop_backup();
\q
設置slave伺服器流複製
vi  /data/slavedata/ recovery.conf

standby_mode = 'on'
primary_conninfo = 'host=192.168.100.24 port=5432 user=postgres'
trigger_file = '/data/slavedata/trigger_activestb'
啟動重資料庫
/usr/local/pgsql9.1/bin/pg_ctl -D /data/slavedata/ start
4.配置pgpool
配置pgpool.conf
cp /usr/local/pgpool/etc/pgpool.conf.sample-stream  /usr/local/pgpool/etc/pgpool.conf
listen_addresses = '*'
port = 9998
backend_hostname0 = '127.0.0.1'
backend_port0 = 5433
backend_weight0 = 1
backend_data_directory0 = '/data/masterdata
backend_hostname1 = '127.0.0.1'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/data/slavedata'
log_statement = on
log_per_node_statement = on
pid_file_name = '/usr/local/pgpool/pgpool.pid'
logdir = '/usr/local/pgpool/log'
log_connections = on
log_hostname = on
debug_level = 1
failover_command = '/usr/local/pgpool/etc/failedcommand.sh %d %H /data/slavedata/trigger_activestab'
sr_check_user = 'postgres'
delay_threshold = 10000


mkdir /usr/local/pgpool/log/
編輯Master庫失敗強制轉到slave
vi /usr/local/pgpool/etc/failedcommand.sh
#! /bin/sh
# Failover command for streaming replication.
# This script assumes that DB node 0 is primary, and 1 is standby.
#
# If standby goes down, do nothing. If primary goes down, create a
# trigger file so that standby takes over primary node.
#
# Arguments: $1: failed node id. $2: new master hostname. $3: path to
# trigger file.
failed_node=$1
new_master=$2
trigger_file=$3
# Do nothing if standby goes down.
if [ $failed_node = 1 ]; then
        exit 0;
fi
 /usr/bin/ssh -T $new_master /bin/touch $trigger_file

exit 0;
賦予檔可執行許可權
chmod a+x  /usr/local/pgpool/etc/failedcommand.sh 
設置Master伺服器與Slave金鑰登錄認證,請查看http://blog.163.com/czg_e/blog/static/46104561201101922246729/
配置pgpool連接認證
mv /usr/local/pgpool/etc/pool_hba.conf.sample  /usr/local/pgpool/etc/pool_hba.conf
vi  /usr/local/pgpool/etc/pool_hba.conf
host    all         all         192.168.100.0/24      md5
設置pgpool pcp用戶管理許可權
mv /usr/local/pgpool/etc/pcp.conf.sample  /usr/local/pgpool/etc/pcp.conf
通過pg md5對密碼加密
/usr/local/pgpool/bin/pg_md5 -p
vi  /usr/local/pgpool/etc/pcp.conf
postgres:剛才加密後的密碼
 5.啟動pgpool
pgpool -f /usr/local/pgpool/etc/pgpool.conf -a /usr/local/pgpool/etc/pool_hba.conf  -F /usr/local/pgpool/etc/pcp.conf
測試pgpool 是否正常運行
psql -p 9998 -U postgres


異常處理
2012-03-24 12:52:47 ERROR: pid 2424: pool_read_message_length: message length (8) in slot 1 does not match with slot 0(12)
2012-03-24 12:52:47 ERROR: pid 2424: Failed to read the authentication packet length. This is likely caused by the inconsistency of auth method among DB nodes. In this case you can check the previous error messages (hint: length field) from pool_read_message_length and recheck the pg_hba.conf settings.
通過pgsql不能登錄且日誌中輸入如上資訊,經排查發現為master庫中pg_hba.conf設置採用md5 方式認證,而slave pg_hba.con配置的為trust認證方式,後將slavepg_hba.conf檔配置為了md5方式即可通過驗證登錄。

psql: ERROR:  MD5 authentication is unsupported in replication, master-slave and parallel modes.
HINT:  check pg_hba.conf
通過pgsql不能登錄且提示如上資訊,解決上面錯誤需將pgpool.conf enable_pool_hba設置為 onpool_hba.conf中加入md5驗證(host    all         all         10.10.10.0/24         md5);pool_passwd中設置用戶名密碼驗證(/usr/local/pgpool/bin/pg_md5 -m -p -u postgres pool_passwd )。