MathJax

MathJax-2

MathJax-3

Google Code Prettify

置頂入手筆記

EnterproseDB Quickstart — 快速入門筆記

由於考慮採用 EnterpriseDB 或是直接用 PostgreSQL 的人,通常需要一些入手的資料。這邊紀錄便提供相關快速上手的簡單筆記 ~ 這篇筆記以 資料庫安裝完畢後的快速使用 為目標,基本紀錄登入使用的範例:

2021年10月13日 星期三

利用 Python3 內建 venv 管理 PGSQL PL/Python3 裡面可以使用的 Python 模組

使用 Python 有很多便利,但也很多困擾。。。其中之一就是 dependency hell。經典的圖就是 xkcd: Python Environment

當然這狀況已經有許多手段可以預先規劃避免與處理,其中之一就是使用 virtualenv 製作隔離的 python runtime 環境。

那麼,當 Postgres 的內嵌 stored procdure PL/PythonU 遇到需要安裝模組要怎辦?

一般優先選項都是先用 OS 提供的套件,不過通常會老舊一些(企業版 Linux 常常都是 Long-Term Support)。裝新一點的模組就得用 Python 套件管理模組 pip 處理。

所幸 PL/PythonU 有辦法透過 virtualenv / venv 這類的環境存取模組,免於自己裝的套件跟 OS 內部環境打架。以下一步步作範例~

由於 PL/Python 所用的 Python 環境會取決於安裝的資料庫套件。這邊在 CentOS 8-Stream (目前對應的 Rocky Linux 是 8.4 版,也就是 RHEL 系列的穩定版本位置)底下安裝 PGDG PostgreSQL 14.0 與 PL/Python3 作示範。

筆記紀錄時搭配的 OS Platform Python 為 3.6.8。不過由於 RHEL 8 系列的 OS 內建 Python 有些調整,因此儘管 OS 裡面有 Python library,仍然需要安裝 Python。

以下簡要紀錄安裝步驟(在 LXD 底下測試)
lab@lxdlab:~$ lxc launch images:centos/8-Stream/amd64 pgsql14
lab@lxdlab:~$ lxc exec pgsql14 -- dnf install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-8-x86_64/pgdg-redhat-repo-latest.noarch.rpm
lab@lxdlab:~$ lxc exec pgsql14 -- dnf -qy module disable postgresql
lab@lxdlab:~$ lxc exec pgsql14 -- dnf group install -y "PostgreSQL Database Server 14 PGDG"
lab@lxdlab:~$ lxc exec pgsql14 -- dnf install -y postgresql14-plpython3
lab@lxdlab:~$ lxc exec pgsql14 -- /usr/pgsql-14/bin/postgresql-14-setup initdb
lab@lxdlab:~$ lxc exec pgsql14 -- dnf install -y python36
lab@lxdlab:~$ lxc shell pgsql14
[root@pgsql14 ~]# ldd /usr/pgsql-14/lib/plpython3.so 
        linux-vdso.so.1 (0x00007ffd4f6dd000)
        libpython3.6m.so.1.0 => /usr/lib64/libpython3.6m.so.1.0 (0x00007f8f29cfa000)
        libpthread.so.0 => /usr/lib64/libpthread.so.0 (0x00007f8f29ada000)
        libc.so.6 => /usr/lib64/libc.so.6 (0x00007f8f29715000)
        libcrypto.so.1.1 => /lib64/libcrypto.so.1.1 (0x00007f8f2922c000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f8f29028000)
        libutil.so.1 => /lib64/libutil.so.1 (0x00007f8f28e24000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f8f28aa2000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f8f2a45d000)
        libz.so.1 => /lib64/libz.so.1 (0x00007f8f2888b000)
[root@pgsql14 ~]# 

接著就是正式的步驟了:

1. 初始化一個 virtualenv 環境由於 自 Python 3.6 之後已經有 virtualenv 內建在 Python 核心模組 venv,這邊會直接用該指令產生。在此命名為 plpython 放在 Postgres 的家目錄 /var/lib/pgsql/
[root@pgsql14 ~]# su - postgres
[postgres@pgsql14 ~]$ pwd
/var/lib/pgsql
[postgres@pgsql14 ~]$ python3.6 -m venv plpython
[postgres@pgsql14 ~]$ ls plpython/bin/
activate      activate.fish  easy_install-3.6  pip3    python   python3.6
activate.csh  easy_install   pip               pip3.6  python3
[postgres@pgsql14 ~]$ 

2. 為了檢驗安裝模組,這邊測試一個 pyyaml 模組作確認。檢驗方式從這邊抄的
[postgres@pgsql14 ~]$ ## OS 環境的套件清單
[postgres@pgsql14 ~]$ python3
Python 3.6.8 (default, Sep 22 2021, 11:16:51) 
[GCC 8.5.0 20210514 (Red Hat 8.5.0-3)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pkg_resources
>>> installed_packages = pkg_resources.working_set
>>> installed_packages_list = sorted(["%s==%s" % (i.key, i.version)
...    for i in installed_packages])
>>> print(installed_packages_list)
['gpg==1.13.1', 'libcomps==0.1.16', 'pip==9.0.3', 'rpm==4.14.3', 'setuptools==39.2.0']
>>> exit()
[postgres@pgsql14 ~]$ 
[postgres@pgsql14 ~]$ source plpython/bin/activate
(plpython) [postgres@pgsql14 ~]$ 
(plpython) [postgres@pgsql14 ~]$ echo $PATH
/var/lib/pgsql/plpython/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin
(plpython) [postgres@pgsql14 ~]$ 
(plpython) [postgres@pgsql14 ~]$ pip install pyyaml
Collecting pyyaml
  Downloading https://files.pythonhosted.org/packages/7a/5b/bc0b5ab38247bba158504a410112b6c03f153c652734ece1849749e5f518/PyYAML-5.4.1-cp36-cp36m-manylinux1_x86_64.whl (640kB)
    100% |████████████████████████████████| 645kB 1.6MB/s 
Installing collected packages: pyyaml
Successfully installed pyyaml-5.4.1
You are using pip version 9.0.3, however version 21.3 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
(plpython) [postgres@pgsql14 ~]$ 
(plpython) [postgres@pgsql14 ~]$ ## venv 環境的套件清單
(plpython) [postgres@pgsql14 ~]$ python3
Python 3.6.8 (default, Sep 22 2021, 11:16:51) 
[GCC 8.5.0 20210514 (Red Hat 8.5.0-3)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pkg_resources
>>> installed_packages = pkg_resources.working_set
>>> installed_packages_list = sorted(["%s==%s" % (i.key, i.version)
...    for i in installed_packages])
>>> print(installed_packages_list)
['pip==9.0.3', 'pyyaml==5.4.1', 'setuptools==39.2.0']
>>> exit()
(plpython) [postgres@pgsql14 ~]$
(plpython) [postgres@pgsql14 ~]$ deactivate 
[postgres@pgsql14 ~]$ 

3. 調整 postgres 資料庫的 SystemD unit file
為了讓 PGSQL 在啟動時可以直接進入 venv,這邊透過 SystemD 的資料庫啟動檔作調整,讓資料庫啟動前可以載入 venv 的 activate Script。
但由於 bash shell 有 subshell 的變數有效範圍問題,因此這邊不是用 ExecStartPre,而是直接在啟動 postmaster 的 ExecStart 補上執行 source Script 處理
[root@pgsql14 ~]# cp /usr/lib/systemd/system/postgresql-14.service /etc/systemd/system/
[root@pgsql14 ~]# vi /etc/systemd/system/postgresql-14.service 
[root@pgsql14 ~]# ## 修改後,黃色套底部份
[root@pgsql14 ~]# cat /etc/systemd/system/postgresql-14.service 
# It's not recommended to modify this file in-place, because it will be
# overwritten during package upgrades.  It is recommended to use systemd
# "dropin" feature;  i.e. create file with suffix .conf under
# /etc/systemd/system/postgresql-14.service.d directory overriding the
# unit's defaults. You can also use "systemctl edit postgresql-14"
# Look at systemd.unit(5) manual page for more info.

# Note: changing PGDATA will typically require adjusting SELinux
# configuration as well.

# Note: do not use a PGDATA pathname containing spaces, or you will
# break postgresql-14-setup.
[Unit]
Description=PostgreSQL 14 database server
Documentation=https://www.postgresql.org/docs/14/static/
After=syslog.target
After=network.target

[Service]
Type=notify

User=postgres
Group=postgres

# Note: avoid inserting whitespace in these Environment= lines, or you may
# break postgresql-setup.

# Location of database directory
Environment=PGDATA=/var/lib/pgsql/14/data/

# Where to send early-startup messages from the server (before the logging
# options of postgresql.conf take effect)
# This is normally controlled by the global default set by systemd
# StandardOutput=syslog

# Disable OOM kill on the postmaster
OOMScoreAdjust=-1000
Environment=PG_OOM_ADJUST_FILE=/proc/self/oom_score_adj
Environment=PG_OOM_ADJUST_VALUE=0

ExecStartPre=/usr/pgsql-14/bin/postgresql-14-check-db-dir ${PGDATA}
#ExecStart=/usr/pgsql-14/bin/postmaster -D ${PGDATA}
ExecStart=/bin/bash -c "source /var/lib/pgsql/plpython/bin/activate && /usr/pgsql-14/bin/postmaster -D ${PGDATA}"
ExecReload=/bin/kill -HUP $MAINPID
KillMode=mixed
KillSignal=SIGINT
 

# Do not set any timeout value, so that systemd will not kill postmaster
# during crash recovery.
TimeoutSec=0

[Install]
WantedBy=multi-user.target
[root@pgsql14 ~]# 
[root@pgsql14 ~]# systemctl daemon-reload
[root@pgsql14 ~]# service postgresql-14 start
[root@pgsql14 ~]# 

4. 進入資料庫裡面檢驗:
這邊用 anonymous block 簡單呈現,可以成功的辨認 venv 的內容。
另外補充,若不是使用 systemd 指令,而是手動先啟用 venv 再以 pg_ctl 啟動,也有一樣的功效~
[postgres@pgsql14 ~]$ psql
psql (14.0)
Type "help" for help.

postgres=# CREATE EXTENSION plpython3u ;
CREATE EXTENSION
postgres=# DO $$
import pkg_resources
installed_packages = pkg_resources.working_set
installed_packages_list = sorted(["%s==%s" % (i.key, i.version)
   for i in installed_packages])
plpy.notice(installed_packages_list)
$$ LANGUAGE plpython3u;
NOTICE:  ['pip==9.0.3', 'pyyaml==5.4.1', 'setuptools==39.2.0']
DO
postgres=# 

另外 Linux 的 /proc 目錄對也會紀錄 Process 啟動時的環境變數到個別 pid 的 environ 檔案內,該檔案內容可以用 binutils 套件裡的 strings -a 指令顯示的好看一點。
[postgres@pgsql14 ~]$ cat /proc/$(head -1 $PGDATA/postmaster.pid)/environ
LANG=en_US.utf8INVOCATION_ID=2e19e893f22e4f83be12d87401debd00VIRTUAL_ENV=/var/lib/pgsql/plpythonNOTIFY_SOCKET=/run/systemd/notifyUSER=postgresPWD=/HOME=/var/lib/pgsqlJOURNAL_STREAM=8:69224PG_OOM_ADJUST_VALUE=0PGDATA=/var/lib/pgsql/14/data/SHELL=/bin/bashSHLVL=1LOGNAME=postgresPG_OOM_ADJUST_FILE=/proc/self/oom_score_adjPATH=/var/lib/pgsql/plpython/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/binPS1=(plpython) _=/usr/pgsql-14/bin/postmaster[postgres@pgsql14 ~]$ 
[postgres@pgsql14 ~]$ 
[postgres@pgsql14 ~]$ strings -a /proc/$(head -1 $PGDATA/postmaster.pid)/environ
LANG=en_US.utf8
INVOCATION_ID=2e19e893f22e4f83be12d87401debd00
VIRTUAL_ENV=/var/lib/pgsql/plpython
NOTIFY_SOCKET=/run/systemd/notify
USER=postgres
PWD=/
HOME=/var/lib/pgsql
JOURNAL_STREAM=8:69224
PG_OOM_ADJUST_VALUE=0
PGDATA=/var/lib/pgsql/14/data/
SHELL=/bin/bash
SHLVL=1
LOGNAME=postgres
PG_OOM_ADJUST_FILE=/proc/self/oom_score_adj
PATH=/var/lib/pgsql/plpython/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
PS1=(plpython) 
_=/usr/pgsql-14/bin/postmaster
[postgres@pgsql14 ~]$ 


基本上這樣就可以成功在 PL/Python 裡面使用 venv 了~~
不過測試後,不能放到 $PGDATA 底下,需要放到其他不是權限 700 的位置才行。


參考資料
PostgreSQL PL/Python: call stored procedure in virtualenv - Stack Overflow
Using virtualenv under PostgreSQL PL/Python · GitHub dmckeone/gist
Support both Python2 and Python3 using virtualenv for plpython
pg使用python编写存储过程_PostgreSQL PL / Python:在virtualenv中调用存储过程_weixin_39707851的博客
Can python venv be used with plpython3u for postgresql? - Stack Overflow
Using Python 3 in virtualenv - Stack Overflow
python - How to use virtualenv with python3.6 on ubuntu 16.04? - Stack Overflow
How to set environmental variable in systemd service - Unix & Linux Stack Exchange
fedora - How can systemd run a command as root before launching a service as a different user? - Server Fault
linux - Systemd with multiple execStart - Stack Overflow
bash - creating OS environment variables using ExecStartPre on Systemd - Unix & Linux Stack Exchange
systemd: permission issue with mkdir & ExecStartPre - Unix & Linux Stack Exchange
How to List Installed Python Packages - ActiveState
Using environment variables in systemd units | Kinvolk
Systemd bash builtins - Unix & Linux Stack Exchange
linux - systemd script - environment file updated by ExecStartPre - Stack Overflow
ExecStart - systemd.service - freedesktop.org
python - How to enable a virtualenv in a systemd service unit? - Stack Overflow
python - How to run a command inside a virtualenv using systemd - Unix & Linux Stack Exchange
How to read environment variables of a process - Unix & Linux Stack Exchange

沒有留言:

張貼留言