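I. Deployment Resource Planning
1. Memory: the official recommendation is 16 GB per host and 30 GB per primary.
2. Disk space: about 2 GB for the GP software installation; keep usage of the GP data disks below 70%.
3. Network: the official recommendation is 10 Gigabit Ethernet, with multiple NICs bonded.
4. File system: the official recommendation is XFS.
5. RHEL 7 installation requirements:
OS version: RHEL 7.9
Mount points:
Mount point | Device | File system | Size |
/boot | /sda1 | XFS | 2048 MB |
/ | /sda2 | XFS | all remaining space |
SWAP | /sdb | SWAP | RAM / 2 |
SWAP | /sdd | SWAP | RAM / 2 |
Language: English
Time zone: Shanghai
Software selection: File and Print Server
Add-ons: Development Tools
Initial root password: 123456
Lab environment for this walkthrough:
1. OS version: Red Hat 7.9
2. Hardware: 3 virtual machines, each with 2 cores, 16 GB of RAM, and a 50 GB disk
3. Node plan: one master, 4 primary segments, 4 mirrors, no standby
Host IP | Host | Node plan |
192.168.31.201 | mdw | master |
192.168.31.202 | sdw1 | seg1,seg2,mirror3,mirror4 |
192.168.31.203 | sdw2 | seg3,seg4,mirror1,mirror2 |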
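II. Deployment Parameter Configuration
Dependencies:
## Differences from earlier versions
gp4.x has no installation dependency check step
gp5.x: when installing with rpm, the dependencies must be checked
gp6.2: installing with rpm requires checking the dependencies; yum install pulls them in automatically, provided the host has network access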
apr
apr-util
bash
bzip2
curl
krb5
libcurl
libevent (or libevent2 on RHEL/CentOS 6)
libxml2
libyaml
zlib
openldap
openssh
openssl
openssl-libs (RHEL7/Centos7)
perl
readline
rsync
R
sed (used by gpinitsystem)
tar
zip
mount /dev/cdrom /mnt
mv /etc/yum.repos.d/* /tmp/
echo "[local]" >> /etc/yum.repos.d/local.repo
echo "name = local" >> /etc/yum.repos.d/local.repo
echo "baseurl = file:///mnt/" >> /etc/yum.repos.d/local.repo
echo "enabled = 1" >> /etc/yum.repos.d/local.repo
echo "gpgcheck = 0" >> /etc/yum.repos.d/local.repo
yum clean all
yum repolist all
yum install -y apr apr-util bash bzip2 curl krb5 libcurl libevent libxml2 libyaml zlib openldap openssh openssl openssl-libs perl readline rsync R sed tar zip krb5-devel
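1. Disable the firewall and SELinux
systemctl stop firewalld.service
systemctl disable firewalld.service
systemctl status firewalld.service
# SELinux: switch to permissive mode now and disable it permanently (takes effect after a reboot)
setenforce 0
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
2. Set the hostname (run the matching command on each host)
hostnamectl set-hostname mdw     # on 192.168.31.201
hostnamectl set-hostname sdw1    # on 192.168.31.202
hostnamectl set-hostname sdw2    # on 192.168.31.203
3. Edit /etc/hosts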
vim /etc/hosts
192.168.31.201 mdw
192.168.31.202 sdw1
192.168.31.203 sdw2
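4. Configure the system parameter file sysctl.conf
Adjust the system parameters to your environment (before gp 5.0 the official docs only gave fixed default values; from 5.0 onward they provide formulas for some of them).
Officially recommended configuration; after editing, reload the parameters with sysctl -p: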
# kernel.shmall = _PHYS_PAGES / 2 # See Shared Memory Pages
kernel.shmall = 4000000000
# kernel.shmmax = kernel.shmall * PAGE_SIZE # shared memory
kernel.shmmax = 500000000
kernel.shmmni = 4096
vm.overcommit_memory = 2 # See Segment Host Memory
vm.overcommit_ratio = 95 # See Segment Host Memory
net.ipv4.ip_local_port_range = 10000 65535 # See Port Settings
kernel.sem = 500 2048000 200 40960
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.swappiness = 10
vm.zone_reclaim_mode = 0
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
vm.dirty_background_ratio = 0 # See System Memory
vm.dirty_ratio = 0
vm.dirty_background_bytes = 1610612736
vm.dirty_bytes = 4294967296
-- Shared memory
$ echo $(expr $(getconf _PHYS_PAGES) / 2)
$ echo $(expr $(getconf _PHYS_PAGES) / 2 \* $(getconf PAGE_SIZE))
[root@mdw ~]# echo $(expr $(getconf _PHYS_PAGES) / 2)
2053918
[root@mdw ~]# echo $(expr $(getconf _PHYS_PAGES) / 2 \* $(getconf PAGE_SIZE))
8412848128
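The two expr commands above can be wrapped in a small helper that prints ready-to-paste sysctl lines. This is only a sketch: the function name shm_settings is made up for illustration, and the page count 4107836 is an assumption inferred from the output shown above (2053918 * 2), not a value taken from the host.

```shell
#!/usr/bin/env bash
# shm_settings PHYS_PAGES PAGE_SIZE -> prints the two sysctl lines.
# Hypothetical helper; it mirrors the expr commands from the article.
shm_settings() {
  shmall=$(( $1 / 2 ))        # half of the physical pages
  shmmax=$(( shmall * $2 ))   # the same amount expressed in bytes
  printf 'kernel.shmall = %s\n' "$shmall"
  printf 'kernel.shmmax = %s\n' "$shmmax"
}

# Assumed values matching the lab host (4 KiB pages).
# On a live host: shm_settings "$(getconf _PHYS_PAGES)" "$(getconf PAGE_SIZE)"
shm_settings 4107836 4096
```

With these inputs the helper reproduces the 2053918 and 8412848128 values used later in this lab's sysctl.conf.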
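-- Host memory
vm.overcommit_memory: the system uses this parameter to decide how much memory can be allocated to processes. For the GP database it should be set to 2.
vm.overcommit_ratio: the percentage of RAM that may be allocated to processes; the rest is reserved for the operating system. The default on Red Hat is 50. A value of 95 is recommended.
-- Calculating vm.overcommit_ratio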
vm.overcommit_ratio = (RAM-0.026*gp_vmem) / RAM
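As a worked example of the formula, the sketch below evaluates it with awk. The article does not define gp_vmem; the expression used here, gp_vmem = ((SWAP + RAM) - (7.5GB + 0.05 * RAM)) / 1.7 for hosts with less than 256 GB of RAM, is an assumption taken from my reading of the Greenplum memory sizing guidance and should be checked against the documentation for your version. The function name overcommit_ratio is hypothetical.

```shell
#!/usr/bin/env bash
# overcommit_ratio RAM_GB SWAP_GB -> prints the ratio as a whole percentage.
overcommit_ratio() {
  awk -v ram="$1" -v swap="$2" 'BEGIN {
    # Assumed gp_vmem formula (hosts with less than 256 GB of RAM).
    gp_vmem = ((swap + ram) - (7.5 + 0.05 * ram)) / 1.7
    # Formula from the article; it yields a fraction, so scale to percent.
    ratio = (ram - 0.026 * gp_vmem) / ram
    printf "%d\n", int(ratio * 100)
  }'
}

# Lab host: 16 GB RAM and 16 GB swap in total.
overcommit_ratio 16 16   # prints 97
```

The result lands close to the recommended 95, which is why the fixed value is a reasonable default for small hosts.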
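-- Port settings
To avoid port conflicts between Greenplum and other applications during initialization, note the port range specified by net.ipv4.ip_local_port_range. When initializing Greenplum with gpinitsystem, do not specify Greenplum database ports inside that range.
For example, if net.ipv4.ip_local_port_range = 10000 65535, set the Greenplum base port numbers to values outside that range, such as: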
PORT_BASE = 6000
MIRROR_PORT_BASE = 7000
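A quick way to sanity-check the chosen base ports against the ephemeral range is sketched below. The helper names port_in_range and check_gp_port are made up for this example; the range is passed in explicitly rather than read from the running kernel.

```shell
#!/usr/bin/env bash
# port_in_range PORT LOW HIGH -> exit status 0 when LOW <= PORT <= HIGH.
port_in_range() {
  [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# check_gp_port NAME PORT LOW HIGH -> reports whether PORT collides with the range.
check_gp_port() {
  if port_in_range "$2" "$3" "$4"; then
    echo "$1=$2 conflicts with ip_local_port_range $3-$4"
    return 1
  fi
  echo "$1=$2 is outside ip_local_port_range $3-$4"
}

check_gp_port PORT_BASE 6000 10000 65535
check_gp_port MIRROR_PORT_BASE 7000 10000 65535
```

On a live host the two range values could be read from /proc/sys/net/ipv4/ip_local_port_range instead of being typed in.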
-- System memory
For systems with more than 64 GB of memory, the following settings are recommended
vm.dirty_background_ratio = 0
vm.dirty_ratio = 0
vm.dirty_background_bytes = 1610612736 # 1.5GB
vm.dirty_bytes = 4294967296 # 4GB
For systems with 64 GB of memory or less, remove the vm.dirty_background_bytes setting and use the following parameters
vm.dirty_background_ratio = 3
vm.dirty_ratio = 10
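The two cases can be folded into one small function that emits the matching vm.dirty_* lines for a given memory size. This is an illustrative sketch; dirty_settings is a hypothetical name and the memory size is passed in as whole GB.

```shell
#!/usr/bin/env bash
# dirty_settings MEM_GB -> prints the vm.dirty_* lines for that memory size.
dirty_settings() {
  if [ "$1" -gt 64 ]; then
    # More than 64 GB: use absolute byte limits.
    cat <<'EOF'
vm.dirty_background_ratio = 0
vm.dirty_ratio = 0
vm.dirty_background_bytes = 1610612736
vm.dirty_bytes = 4294967296
EOF
  else
    # 64 GB or less: use ratios and no byte limits.
    cat <<'EOF'
vm.dirty_background_ratio = 3
vm.dirty_ratio = 10
EOF
  fi
}

dirty_settings 16    # the 16 GB lab hosts
```

For the 16 GB lab hosts this prints the two ratio lines, matching the configuration used later in this section.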
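Increase vm.min_free_kbytes so that PF_MEMALLOC requests from network and storage drivers can be satisfied. This matters most on systems with a lot of memory; on ordinary systems the default is usually too low. The following awk command computes vm.min_free_kbytes as the recommended 3% of physical memory:
awk 'BEGIN {OFMT = "%.0f";} /MemTotal/ {print "vm.min_free_kbytes =", $2 * .03;}' /proc/meminfo >> /etc/sysctl.conf
Do not set vm.min_free_kbytes above 5% of system memory; doing so may cause out-of-memory conditions.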
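The 3% rule and the 5% ceiling can be combined in one guarded helper. This is a sketch only: min_free_kbytes is a hypothetical function name, and the MemTotal value of 16000000 kB is a round illustrative number rather than a reading from the lab host.

```shell
#!/usr/bin/env bash
# min_free_kbytes MEMTOTAL_KB [PERCENT] -> prints the value in kB.
# PERCENT defaults to 3; anything above 5 is refused.
min_free_kbytes() {
  awk -v total="$1" -v pct="${2:-3}" 'BEGIN {
    if (pct > 5) {
      print "refusing: more than 5% of memory" > "/dev/stderr"
      exit 1
    }
    printf "%.0f\n", total * pct / 100
  }'
}

# Illustrative MemTotal of 16000000 kB (roughly a 16 GB host).
echo "vm.min_free_kbytes = $(min_free_kbytes 16000000)"
```

On a real host the first argument would come from the MemTotal line of /proc/meminfo, as in the awk command above.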
This lab uses Red Hat 7.9 with 16 GB of RAM, configured as follows:
vim /etc/sysctl.conf
kernel.shmall = 2053918
kernel.shmmax = 8412848128
kernel.shmmni = 4096
vm.overcommit_memory = 2
vm.overcommit_ratio = 95
net.ipv4.ip_local_port_range = 10000 65535
kernel.sem = 500 2048000 200 4096
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.swappiness = 10
vm.zone_reclaim_mode = 0
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
vm.dirty_background_ratio = 3
vm.dirty_ratio = 10
5. Edit /etc/security/limits.conf
vim /etc/security/limits.conf
* soft nofile 524288
* hard nofile 524288
* soft nproc 131072
* hard nproc 131072
On RHEL / CentOS 7, also set nproc to 131072 in /etc/security/limits.d/20-nproc.conf
[root@mdw ~]# cat /etc/security/limits.d/20-nproc.conf
# Default limit for number of user's processes to prevent
# accidental fork bombs.
# See rhbz #432903 for reasoning.
* soft nproc 131072
root soft nproc unlimited
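The Linux pam_limits module sets user limits by reading limits.conf.
The change takes effect after a reboot. The ulimit -u command shows the maximum number of processes available to each user (max user processes); verify that it returns 131072.
6. Mounting XFS
Compared with ext4, XFS has the following advantages:
XFS scales noticeably better than ext4; ext4 performance drops sharply once a single directory holds more than 2 million files
ext4 is a very stable traditional file system, but it is gradually becoming ill-suited to ever-growing storage needs
For historical reasons ext4 inode numbers are limited to 32 bits, so it supports at most a little over 4 billion files, and a single file can be at most 16 TB
XFS manages space with 64-bit addressing, so a file system can reach the EB scale; XFS manages its metadata with B+ trees
GP requires an XFS file system. RHEL/CentOS 7 and Oracle Linux use XFS as the default file system, and SUSE/openSUSE provide long-term support for XFS.
The virtual machines in this lab have only one disk, which is the system disk, so the file system cannot be changed. Mounting XFS is therefore skipped here.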
[root@mdw ~]# cat /etc/fstab
#
# /etc/fstab
# Created by anaconda on Sat Feb 27 08:37:50 2021
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/mapper/rhel-root / xfs defaults 0 0
UUID=8553f10d-0334-4cd5-8e3f-b915b6e0ccaa /boot xfs defaults 0 0
/dev/mapper/rhel-swap swap swap defaults 0 0
/dev/mapper/rhel-swap00 swap swap defaults 0 0
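## Differences from earlier versions
gp6 has no gpcheck tool, so leaving the file system unchanged does not affect cluster installation
Before gp6, if the gpcheck file system check fails, you can comment out the part of the gpcheck script that checks the file system.
The file system is normally chosen when the operating system is installed, or when a newly attached disk is formatted. A non-system disk can also be formatted with the desired file system.
For example, the steps to mount a new XFS file system: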
mkfs.xfs /dev/sda3
mkdir -p /data/master
vi /etc/fstab
/dev/data /data xfs nodev,noatime,nobarrier,inode64 0 0
7. Disk I/O Settings
Set the disk read-ahead value to 16384. Device names differ from system to system; use lsblk to see how the disks are mounted
[root@mdw ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 100G 0 disk
├─sda1 8:1 0 2G 0 part /boot
└─sda2 8:2 0 98G 0 part
├─rhel-root 253:0 0 82G 0 lvm /
├─rhel-swap 253:1 0 8G 0 lvm [SWAP]
└─rhel-swap00 253:2 0 8G 0 lvm [SWAP]
sr0 11:0 1 4.2G 0 rom /mnt
[root@mdw ~]# /sbin/blockdev --setra 16384 /dev/sda
[root@mdw ~]# /sbin/blockdev --getra /dev/sda
16384
-- Make the setting persist across reboots
vi /etc/rc.local
/sbin/blockdev --setra 16384 /dev/sda
chmod +x /etc/rc.d/rc.local
8. Disk I/O scheduler
-- RHEL 7.x and CentOS 7.x use grub2, so the grubby system tool can make the change:
grubby --update-kernel=ALL --args="elevator=deadline"
-- After rebooting, check with the following command
grubby --info=ALL
9. Disable Transparent Huge Pages (THP)
Disable THP, because it degrades Greenplum Database performance.
-- RHEL 7.x and CentOS 7.x use grub2, so the grubby system tool can make the change:
grubby --update-kernel=ALL --args="transparent_hugepage=never"
-- Check after rebooting
$ cat /sys/kernel/mm/*transparent_hugepage/enabled
always [never]
10.IPC Object Removal
Disable IPC object removal for RHEL 7.2 or CentOS 7.2, or Ubuntu. The default systemd setting RemoveIPC=yes removes IPC connections when non-system user accounts log out. This causes the Greenplum Database utility gpinitsystem to fail with semaphore errors. Perform one of the following to avoid this issue.
When you add the gpadmin operating system user account to the master node in Creating the Greenplum Administrative User, create the user as a system account.
Disable RemoveIPC. Set this parameter in /etc/systemd/logind.conf on the Greenplum Database host systems.
vi /etc/systemd/logind.conf
RemoveIPC=no
service systemd-logind restart
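11. SSH Connection Threshold
Greenplum management utilities such as gpexpand, gpinitsystem, and gpaddmirrors use SSH connections to perform their tasks. In larger Greenplum clusters, the number of SSH connections a utility opens may exceed the host's maximum threshold for unauthenticated connections. When that happens, you receive the error: ssh_exchange_identification: Connection closed by remote host.
To avoid this, update the MaxStartups and MaxSessions parameters in /etc/ssh/sshd_config or /etc/sshd_config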
If you specify MaxStartups and MaxSessions using a single integer value, you identify the maximum number of concurrent unauthenticated connections (MaxStartups) and maximum number of open shell, login, or subsystem sessions permitted per network connection (MaxSessions). For example:
MaxStartups 200
MaxSessions 200
If you specify MaxStartups using the "start:rate:full" syntax, you enable random early connection drop by the SSH daemon. start identifies the maximum number of unauthenticated SSH connection attempts allowed. Once start number of unauthenticated connection attempts is reached, the SSH daemon refuses rate percent of subsequent connection attempts. full identifies the maximum number of unauthenticated connection attempts after which all attempts are refused. For example:
MaxStartups 10:30:200
MaxSessions 200
vi /etc/ssh/sshd_config or /etc/sshd_config
MaxStartups 10:30:200
MaxSessions 200
-- Reload sshd so that the parameters take effect
# systemctl reload sshd.service
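12. Synchronizing System Clocks (NTP)
To keep time consistent across the cluster, first edit /etc/ntp.conf on the master server and point it at the data center's NTP server. If there is none, set the master's clock to the correct time first, then edit /etc/ntp.conf on the other nodes so that they follow the master's time.
-- Log in to the master host as root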
vi /etc/ntp.conf
# 10.6.220.20 is the IP of your time server
server 10.6.220.20
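-- Log in to each segment host as root
server mdw prefer # prefer the master node
server smdw # then the standby node; if there is no standby, point this at the data center's time server
service ntpd restart # restart the ntp service after editing
13. Check the character set
-- If it is not en_US.UTF-8, add RC_LANG=en_US.UTF-8 to /etc/sysconfig/language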
[root@mdw greenplum-db]# echo $LANG
en_US.UTF-8
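14. Creating the Greenplum Administrative User
# Differences from earlier versions
gp4.x/gp5.x can create the gpadmin user during gpseginstall with the -U parameter
gp6.2 has no gpseginstall tool, so the gpadmin user must be created before installation
Create the gpadmin user on every node to manage and run the gp cluster, preferably with sudo privileges.
Alternatively, create it on the master first and, once gp is installed there, use gpssh to create it on the other nodes in bulk.
Example: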
groupadd gpadmin
useradd gpadmin -r -m -g gpadmin
passwd gpadmin
echo "gpadmin" |passwd gpadmin --stdin
III. Configuring and Installing GP
1. Upload the installation file and install it
[root@mdw ~]# mkdir /soft
[root@mdw ~]#
[root@mdw ~]# id gpadmin
uid=995(gpadmin) gid=1000(gpadmin) groups=1000(gpadmin)
[root@mdw ~]# chown -R gpadmin:gpadmin /soft/
[root@mdw ~]# chmod 775 /soft/
[root@mdw ~]# cd /soft/
[root@mdw soft]# ls
open-source-greenplum-db-6.14.1-rhel7-x86_64.rpm
-- Install
[root@mdw soft]# rpm -ivh open-source-greenplum-db-6.14.1-rhel7-x86_64.rpm
Preparing... ################################# []
Updating / installing...
1:open-source-greenplum-db-6-6.14.1################################# []
## Installed under /usr/local/ by default; grant ownership of the directory
chown -R gpadmin:gpadmin /usr/local/greenplum*
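2. Configure SSH, cluster trust, and passwordless login (needed for both root and gpadmin)
## Differences from earlier versions
Before gp6.x, the ssh-keygen key generation and ssh-copy-id steps below were unnecessary; running gpssh-exkeys -f all_host directly was enough.
$ su gpadmin
## Create hostfile_exkeys
Create two host files (all_host and seg_host) in the $GPHOME directory; they serve as the host-file argument for later gpssh, gpscp, and similar scripts
all_host: the host names or IPs of every host in the cluster, including master, segments, standby, and so on.
seg_host: the host names or IPs of all segment hosts
If a machine has multiple NICs that are not bonded as bond0, list the IPs or host names of all the NICs.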
[gpadmin@mdw ~]# cd /usr/local/
[gpadmin@mdw local]$ ls
bin etc games greenplum-db greenplum-db-6.14.1 include lib lib64 libexec sbin share src
[gpadmin@mdw local]# cd greenplum-db
[gpadmin@mdw greenplum-db]$ ls
bin docs ext include libexec NOTICE sbin
COPYRIGHT etc greenplum_path.sh lib LICENSE open_source_license_greenplum_database.txt share
[gpadmin@mdw greenplum-db]# vim all_host
[gpadmin@mdw greenplum-db]# vim seg_host
[gpadmin@mdw greenplum-db]# cat all_host
mdw
sdw1
sdw2
[gpadmin@mdw greenplum-db]# cat seg_host
sdw1
sdw2
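Instead of editing the two files by hand in vim, they can be generated from a host list. The sketch below is illustrative: write_host_files is a hypothetical helper, and it writes into a scratch directory created with mktemp rather than into $GPHOME as the article does.

```shell
#!/usr/bin/env bash
# write_host_files DIR MASTER SEGMENT... -> creates DIR/all_host and DIR/seg_host.
write_host_files() {
  dir=$1
  master=$2
  shift 2
  printf '%s\n' "$master" "$@" > "$dir/all_host"   # master plus every segment host
  printf '%s\n' "$@" > "$dir/seg_host"             # segment hosts only
}

out_dir=$(mktemp -d)   # scratch directory for the demo
write_host_files "$out_dir" mdw sdw1 sdw2
cat "$out_dir/all_host"
```

Pointing the first argument at /usr/local/greenplum-db would produce the same two files that the cat commands above display.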
## Generate the key pair
$ ssh-keygen -t rsa -b 4096
Generating public/private rsa key pair.
Enter file in which to save the key (/home/gpadmin/.ssh/id_rsa):
Created directory '/home/gpadmin/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
## Establish trust from the master to the segments
su - gpadmin
ssh-copy-id -i ~/.ssh/id_rsa.pub gpadmin@sdw1
ssh-copy-id -i ~/.ssh/id_rsa.pub gpadmin@sdw2
## Use the gpssh-exkeys tool to set up n-to-n passwordless login
[gpadmin@mdw greenplum-db]$ gpssh-exkeys -f all_host
bash: gpssh-exkeys: command not found
## The environment variables must be loaded first
[gpadmin@mdw greenplum-db]$ source /usr/local/greenplum-db/greenplum_path.sh
[gpadmin@mdw greenplum-db]$
[gpadmin@mdw greenplum-db]$ gpssh-exkeys -f all_host
[STEP 1 of 5] create local ID and authorize on local host
... /home/gpadmin/.ssh/id_rsa file exists ... key generation skipped
[STEP 2 of 5] keyscan all hosts and update known_hosts file
[STEP 3 of 5] retrieving credentials from remote hosts
... send to sdw1
... send to sdw2
[STEP 4 of 5] determine common authentication file content
[STEP 5 of 5] copy authentication files to all remote hosts
... finished key exchange with sdw1
... finished key exchange with sdw2
[INFO] completed successfully
3. Verify gpssh
[gpadmin@mdw greenplum-db]$ gpssh -f /usr/local/greenplum-db/all_host -e 'ls /usr/local/'
[sdw1] ls /usr/local/
[sdw1] bin etc games include lib lib64 libexec sbin share src
[ mdw] ls /usr/local/
[ mdw] bin games greenplum-db-6.14.1 lib libexec share
[ mdw] etc greenplum-db include lib64 sbin src
[sdw2] ls /usr/local/
[sdw2] bin etc games include lib lib64 libexec sbin share src
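4. Set environment variables in bulk
## Set the greenplum environment variables for the gpadmin user in bulk
## Add the gp installation directory and session environment information to the user's environment variables.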
vim /home/gpadmin/.bash_profile
cat >> /home/gpadmin/.bash_profile << EOF
source /usr/local/greenplum-db/greenplum_path.sh
EOF
vim .bashrc
cat >> /home/gpadmin/.bashrc << EOF
source /usr/local/greenplum-db/greenplum_path.sh
EOF
vim /etc/profile
cat >> /etc/profile << EOF
source /usr/local/greenplum-db/greenplum_path.sh
EOF
## Distribute the environment files to the other nodes
su - root
source /usr/local/greenplum-db/greenplum_path.sh
gpscp -f /usr/local/greenplum-db/seg_host /etc/profile =:/etc/profile
su - gpadmin
source /usr/local/greenplum-db/greenplum_path.sh
gpscp -f /usr/local/greenplum-db/seg_host /home/gpadmin/.bash_profile gpadmin@=:/home/gpadmin/.bash_profile
gpscp -f /usr/local/greenplum-db/seg_host /home/gpadmin/.bashrc gpadmin@=:/home/gpadmin/.bashrc
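IV. Installing the Cluster Nodes
## Differences from earlier versions
The official site currently has no documentation for this part.
Before gp6, the gpseginstall tool installed the gp software on every node. Its log shows that gpseginstall performs these main steps:
1. Create the gp user on the nodes (this step can be skipped)
2. Package the installation directory on the master
3. scp the package to each segment server
4. Unpack it and create the symlink
5. Grant ownership to gpadmin
For a gpseginstall installation log, see the gp5 installation notes
1. Emulating the gpseginstall script
# Run as root
# Variable settings
link_name='greenplum-db' # symlink name
binary_dir_location='/usr/local' # installation parent path
binary_dir_name='greenplum-db-6.14.1' # installation directory
binary_path='/usr/local/greenplum-db-6.14.1' # full path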
Package the installation on the master node
chown -R gpadmin:gpadmin $binary_path
rm -f ${binary_path}.tar; rm -f ${binary_path}.tar.gz
cd $binary_dir_location; tar cf ${binary_dir_name}.tar ${binary_dir_name}
gzip ${binary_path}.tar
[root@mdw local]# chown -R gpadmin:gpadmin $binary_path
[root@mdw local]# rm -f ${binary_path}.tar; rm -f ${binary_path}.tar.gz
[root@mdw local]# cd $binary_dir_location; tar cf ${binary_dir_name}.tar ${binary_dir_name}
[root@mdw local]# gzip ${binary_path}.tar
[root@mdw local]# ls
bin games greenplum-db-6.14.1 include lib64 sbin src
etc greenplum-db greenplum-db-6.14.1.tar.gz lib libexec share
Distribute to the segments as root
link_name='greenplum-db'
binary_dir_location='/usr/local'
binary_dir_name='greenplum-db-6.14.1'
binary_path='/usr/local/greenplum-db-6.14.1'
source /usr/local/greenplum-db/greenplum_path.sh
gpssh -f ${binary_path}/seg_host -e "mkdir -p ${binary_dir_location};rm -rf ${binary_path};rm -rf ${binary_path}.tar;rm -rf ${binary_path}.tar.gz"
gpscp -f ${binary_path}/seg_host ${binary_path}.tar.gz root@=:${binary_path}.tar.gz
gpssh -f ${binary_path}/seg_host -e "cd ${binary_dir_location};gzip -f -d ${binary_path}.tar.gz;tar xf ${binary_path}.tar"
gpssh -f ${binary_path}/seg_host -e "rm -rf ${binary_path}.tar;rm -rf ${binary_path}.tar.gz;rm -f ${binary_dir_location}/${link_name}"
gpssh -f ${binary_path}/seg_host -e "ln -fs ${binary_dir_location}/${binary_dir_name} ${binary_dir_location}/${link_name}"
gpssh -f ${binary_path}/seg_host -e "chown -R gpadmin:gpadmin ${binary_dir_location}/${link_name};chown -R gpadmin:gpadmin ${binary_dir_location}/${binary_dir_name}"
gpssh -f ${binary_path}/seg_host -e "source ${binary_path}/greenplum_path.sh"
gpssh -f ${binary_path}/seg_host -e "cd ${binary_dir_location};ls -l"
2. Create the cluster data directories
## Create the master data directory
mkdir -p /opt/greenplum/data/master
chown gpadmin:gpadmin /opt/greenplum/data/master
## Standby data directory (this lab has no standby)
Use gpssh to create the data directory on the standby remotely
# source /usr/local/greenplum-db/greenplum_path.sh
# gpssh -h smdw -e 'mkdir -p /data/master'
# gpssh -h smdw -e 'chown gpadmin:gpadmin /data/master'
## Create the segment data directories
The plan for this lab is two primary segments and two mirrors per host.
source /usr/local/greenplum-db/greenplum_path.sh
gpssh -f /usr/local/greenplum-db/seg_host -e 'mkdir -p /opt/greenplum/data1/primary'
gpssh -f /usr/local/greenplum-db/seg_host -e 'mkdir -p /opt/greenplum/data1/mirror'
gpssh -f /usr/local/greenplum-db/seg_host -e 'mkdir -p /opt/greenplum/data2/primary'
gpssh -f /usr/local/greenplum-db/seg_host -e 'mkdir -p /opt/greenplum/data2/mirror'
gpssh -f /usr/local/greenplum-db/seg_host -e 'chown -R gpadmin /opt/greenplum/data*'
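3. Cluster performance tests
## Differences from earlier versions
gp6 removed the gpcheck tool. What can still be verified is network and disk I/O performance.
The gpcheck tool could validate the system parameters and hardware configuration that gp requires
Rule of thumb from personal experience (for reference only; look up authoritative figures for exact targets):
Disk throughput should generally reach 2000 MB/s
Network throughput should be at least 1000 MB/s
4. Network performance test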
[root@mdw ~]# gpcheckperf -f /usr/local/greenplum-db/seg_host -r N -d /tmp
/usr/local/greenplum-db-6.14.1/bin/gpcheckperf -f /usr/local/greenplum-db/seg_host -r N -d /tmp
-------------------
-- NETPERF TEST
-------------------
NOTICE: -t is deprecated, and has no effect
NOTICE: -f is deprecated, and has no effect
NOTICE: -t is deprecated, and has no effect
NOTICE: -f is deprecated, and has no effect
====================
== RESULT 2021-02-27T11:56:10.502661
====================
Netperf bisection bandwidth test
sdw1 -> sdw2 = 2069.110000
sdw2 -> sdw1 = 2251.890000
Summary:
sum = 4321.00 MB/sec
min = 2069.11 MB/sec
max = 2251.89 MB/sec
avg = 2160.50 MB/sec
median = 2251.89 MB/sec
5. Disk I/O performance test
gpcheckperf -f /usr/local/greenplum-db/seg_host -r ds -D -d /opt/greenplum/data1/primary
6. Cluster clock check (not an official step)
Verify the time across the cluster; if the hosts disagree, fix the NTP configuration
gpssh -f /usr/local/greenplum-db/all_host -e 'date'
V. Cluster Initialization
Official documentation: gpdb.docs.pivotal.io/6-2/install…
1. Create the initialization configuration file
su - gpadmin
mkdir -p /home/gpadmin/gpconfigs
cp $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config /home/gpadmin/gpconfigs/gpinitsystem_config
2. Modify the parameters as needed
Note: To specify PORT_BASE, review the port range specified in the net.ipv4.ip_local_port_range parameter in the /etc/sysctl.conf file.
Main parameters to modify:
ARRAY_NAME="Greenplum Data Platform"
SEG_PREFIX=gpseg
PORT_BASE=6000
declare -a DATA_DIRECTORY=(/opt/greenplum/data1/primary /opt/greenplum/data2/primary)
MASTER_HOSTNAME=mdw
MASTER_DIRECTORY=/opt/greenplum/data/master
MASTER_PORT=5432
TRUSTED_SHELL=ssh
CHECK_POINT_SEGMENTS=8
ENCODING=UNICODE
MIRROR_PORT_BASE=7000
declare -a MIRROR_DATA_DIRECTORY=(/opt/greenplum/data1/mirror /opt/greenplum/data2/mirror)
DATABASE_NAME=gpdw
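To see which ports this configuration ends up using on each segment host, the sketch below lists them. It relies on gpinitsystem assigning the first primary on a host PORT_BASE and adding one for each further primary (and likewise for mirrors from MIRROR_PORT_BASE), which matches the later psql connection to sdw1 on port 6000. The helper name segment_ports is hypothetical.

```shell
#!/usr/bin/env bash
# segment_ports BASE COUNT -> prints COUNT consecutive ports starting at BASE.
segment_ports() {
  i=0
  while [ "$i" -lt "$2" ]; do
    echo $(( $1 + i ))
    i=$(( i + 1 ))
  done
}

PORT_BASE=6000
MIRROR_PORT_BASE=7000
segments_per_host=2   # two entries in DATA_DIRECTORY above

echo "primary ports: $(segment_ports "$PORT_BASE" "$segments_per_host" | tr '\n' ' ')"
echo "mirror ports:  $(segment_ports "$MIRROR_PORT_BASE" "$segments_per_host" | tr '\n' ' ')"
```

All four ports stay below 10000, so they do not overlap the net.ipv4.ip_local_port_range configured earlier.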
3. Cluster initialization command
## Run as root
## Fix for the error: /usr/local/greenplum-db/./bin/gpinitsystem: line 244: /tmp/cluster_tmp_file.8070: Permission denied
gpssh -f /usr/local/greenplum-db/all_host -e 'chmod 777 /tmp'
## Fix for the error: /bin/mv: cannot stat `/tmp/cluster_tmp_file.8070': Permission denied
gpssh -f /usr/local/greenplum-db/all_host -e 'chmod u+s /bin/ping'
su - gpadmin
gpinitsystem -c /home/gpadmin/gpconfigs/gpinitsystem_config -h /usr/local/greenplum-db/seg_host -D
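When initialization completes successfully, it prints: Greenplum Database instance successfully created.
The log is written to the /home/gpadmin/gpAdminLogs/ directory and is named gpinitsystem_${installation date}.log
The tail of the log looks like this: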
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-End Function PARALLEL_SUMMARY_STATUS_REPORT
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-End Function CREATE_SEGMENT
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Start Function FORCE_FTS_PROBE
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-End Function FORCE_FTS_PROBE
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Start Function SCAN_LOG
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Scanning utility log file for any warning messages
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Log file scan check passed
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-End Function SCAN_LOG
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Greenplum Database instance successfully created
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-------------------------------------------------------
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-To complete the environment configuration, please
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-update gpadmin .bashrc file with the following
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-1. Ensure that the greenplum_path.sh file is sourced
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-2. Add "export MASTER_DATA_DIRECTORY=/opt/greenplum/data/master/gpseg-1"
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:- to access the Greenplum scripts for this instance:
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:- or, use -d /opt/greenplum/data/master/gpseg-1 option for the Greenplum scripts
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:- Example gpstate -d /opt/greenplum/data/master/gpseg-1
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Script log file = /home/gpadmin/gpAdminLogs/gpinitsystem_20210227.log
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-To remove instance, run gpdeletesystem utility
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-To initialize a Standby Master Segment for this Greenplum instance
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Review options for gpinitstandby
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-------------------------------------------------------
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-The Master /opt/greenplum/data/master/gpseg-1/pg_hba.conf post gpinitsystem
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-has been configured to allow all hosts within this new
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-array to intercommunicate. Any hosts external to this
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-new array must be explicitly added to this file
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Refer to the Greenplum Admin support guide which is
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-located in the /usr/local/greenplum-db-6.14.1/docs directory
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-------------------------------------------------------
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-End Main
Read the end of the log carefully; a few steps still need to be carried out.
Check the log contents
The log contains the following notice:
Scan of log file indicates that some warnings or errors
were generated during the array creation
Please review contents of log file
/home/gpadmin/gpAdminLogs/gpinitsystem_20210227.log
Scan warnings or errors:
[gpadmin@mdw ~]$ cat /home/gpadmin/gpAdminLogs/gpinitsystem_20210227.log|grep -E -i 'WARN|ERROR]'
WARNING: enabling "trust" authentication for local connections
WARNING: enabling "trust" authentication for local connections
WARNING: enabling "trust" authentication for local connections
WARNING: enabling "trust" authentication for local connections
WARNING: enabling "trust" authentication for local connections
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Scanning utility log file for any warning messages
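The grep above also matches the [INFO] line that merely mentions "warning". A slightly stricter filter is sketched below; scan_gp_log is a hypothetical helper, and the sample log is a temporary file containing three lines copied from the output in this article.

```shell
#!/usr/bin/env bash
# scan_gp_log FILE -> prints WARN/ERROR/FATAL lines, skipping [INFO] entries.
scan_gp_log() {
  grep -E -i 'WARN|ERROR|FATAL' "$1" | grep -v -F -- '-[INFO]:-'
}

# Build a small sample log from lines shown in this article.
log=$(mktemp)
cat > "$log" <<'EOF'
WARNING: enabling "trust" authentication for local connections
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Scanning utility log file for any warning messages
20210227:12:28:46:006522 gpinitsystem:mdw:gpadmin-[INFO]:-Log file scan check passed
EOF

scan_gp_log "$log"
```

Only the WARNING line is printed; against the real log the argument would be /home/gpadmin/gpAdminLogs/gpinitsystem_20210227.log.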
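Adjust the cluster according to the log contents so that it performs at its best.
4. Set environment variables
Edit the gpadmin user's environment variables and add: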
source /usr/local/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/opt/greenplum/data/master/gpseg-1
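In addition, the following are usually added:
export PGPORT=5432 # set according to your environment
export PGUSER=gpadmin # set according to your environment
export PGDATABASE=gpdw # set according to your environment
For details on the environment variables, see: gpdb.docs.pivotal.io/510/install….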
su - gpadmin
cat >> /home/gpadmin/.bash_profile << EOF
export MASTER_DATA_DIRECTORY=/opt/greenplum/data/master/gpseg-1
export PGPORT=5432
export PGUSER=gpadmin
export PGDATABASE=gpdw
EOF
cat >> /home/gpadmin/.bashrc << EOF
export MASTER_DATA_DIRECTORY=/opt/greenplum/data/master/gpseg-1
export PGPORT=5432
export PGUSER=gpadmin
export PGDATABASE=gpdw
EOF
## Distribute the environment files to the other nodes
gpscp -f /usr/local/greenplum-db/seg_host /home/gpadmin/.bash_profile gpadmin@=:/home/gpadmin/.bash_profile
gpscp -f /usr/local/greenplum-db/seg_host /home/gpadmin/.bashrc gpadmin@=:/home/gpadmin/.bashrc
gpssh -f /usr/local/greenplum-db/all_host -e 'source /home/gpadmin/.bash_profile;source /home/gpadmin/.bashrc;'
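5. To delete and reinstall, use gpdeletesystem
If, after installation, the cluster has to be deleted and reinstalled for any reason, use the gpdeletesystem tool
For details, see the official documentation:
gpdb.docs.pivotal.io/6-2/utility…
Command: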
gpdeletesystem -d /opt/greenplum/data/master/gpseg-1 -f
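-d is followed by MASTER_DATA_DIRECTORY (the master's data directory); the command removes all data directories of the master and the segments.
-f forces the deletion and terminates all processes. Example:
gpdeletesystem -d /opt/greenplum/data/master/gpseg-1 -f
After the deletion, adjust the cluster initialization configuration file as needed and initialize again.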
vi /home/gpadmin/gpconfigs/gpinitsystem_config
gpinitsystem -c /home/gpadmin/gpconfigs/gpinitsystem_config -h /usr/local/greenplum-db/seg_host -D
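VI. Post-Installation Configuration
1. Log in to gp with psql and set a password
To log in to gp with psql, the general command format is:
psql -h hostname -p port -d database -U user -W
-h is followed by the host name of the master or segment
-p is followed by the port of the master or segment
-d is followed by the database name
-U is followed by the user name; -W takes no value and forces a password prompt
These parameters can also be set in the user's environment variables. On Linux, logging in as the gpadmin user does not require a password.
Example of logging in with psql and setting the gpadmin user's password: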
[gpadmin@mdw gpseg-1]$ psql -h mdw -p5432 -d gpdw
psql (9.4.24)
Type "help" for help.
gpdw=# alter user gpadmin with password 'gpadmin';
ALTER ROLE
gpdw=# \q
2. Log in to individual nodes
[gpadmin@mdw gpseg-1]$ PGOPTIONS='-c gp_session_role=utility' psql -h mdw -p5432 -d postgres
psql (9.4.24)
Type "help" for help.
postgres=# \q
[gpadmin@mdw gpseg-1]$ PGOPTIONS='-c gp_session_role=utility' psql -h sdw1 -p6000 -d postgres
psql (9.4.24)
Type "help" for help.
postgres=# \q
3. Client login to gp
This requires configuring two files: pg_hba.conf and postgresql.conf.
Configure pg_hba.conf
Reference for the configuration: blog.csdn.net/yaoqiancuo3…
vim /opt/greenplum/data/master/gpseg-1/pg_hba.conf
## Change
host replication gpadmin 192.168.31.201/32 trust
## to
host all gpadmin 192.168.31.201/32 trust
## Add
host all gpadmin 0.0.0.0/0 md5 # new rule: allow password login from any IP
Configure postgresql.conf
Set the listen address in postgresql.conf to:
listen_addresses = '*' # listen on any IP; gp6.0 sets listen_addresses = '*' by default
vim /opt/greenplum/data/master/gpseg-1/postgresql.conf
4. Reload the modified files
gpstop -u
Author: Lucifer三思而后行
链接:https://juejin.cn/post/7036205332812529671