當然這狀況已經有許多手段可以預先規劃避免與處理,其中之一就是使用 virtualenv 製作隔離的 python runtime 環境。
那麼,當 Postgres 的內嵌 stored procdure PL/PythonU 遇到需要安裝模組要怎辦?
一般優先選項都是先用 OS 提供的套件,不過通常會老舊一些(企業版 Linux 常常都是 Long-Term Support)。裝新一點的模組就得用 Python 套件管理模組 pip 處理。
所幸 PL/PythonU 有辦法透過 virtualenv / venv 這類的環境存取模組,免於自己裝的套件跟 OS 內部環境打架。以下一步步作範例~
由於 PL/Python 所用的 Python 環境會取決於安裝的資料庫套件。這邊在 CentOS 8-Stream (目前對應的 Rocky Linux 是 8.4 版,也就是 RHEL 系列的穩定版本位置)底下安裝 PGDG PostgreSQL 14.0 與 PL/Python3 作示範。
筆記紀錄時搭配的 OS Platform Python 為 3.6.8。不過由於 RHEL 8 系列的 OS 內建 Python 有些調整,因此儘管 OS 裡面有 Python library,仍然需要安裝 Python。
以下簡要紀錄安裝步驟(在 LXD 底下測試)
lab@lxdlab:~$ lxc launch images:centos/8-Stream/amd64 pgsql14 lab@lxdlab:~$ lxc exec pgsql14 -- dnf install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-8-x86_64/pgdg-redhat-repo-latest.noarch.rpm lab@lxdlab:~$ lxc exec pgsql14 -- dnf -qy module disable postgresql lab@lxdlab:~$ lxc exec pgsql14 -- dnf group install -y "PostgreSQL Database Server 14 PGDG" lab@lxdlab:~$ lxc exec pgsql14 -- dnf install -y postgresql14-plpython3 lab@lxdlab:~$ lxc exec pgsql14 -- /usr/pgsql-14/bin/postgresql-14-setup initdb lab@lxdlab:~$ lxc exec pgsql14 -- dnf install -y python36
lab@lxdlab:~$ lxc shell pgsql14 [root@pgsql14 ~]# ldd /usr/pgsql-14/lib/plpython3.so linux-vdso.so.1 (0x00007ffd4f6dd000) libpython3.6m.so.1.0 => /usr/lib64/libpython3.6m.so.1.0 (0x00007f8f29cfa000) libpthread.so.0 => /usr/lib64/libpthread.so.0 (0x00007f8f29ada000) libc.so.6 => /usr/lib64/libc.so.6 (0x00007f8f29715000) libcrypto.so.1.1 => /lib64/libcrypto.so.1.1 (0x00007f8f2922c000) libdl.so.2 => /lib64/libdl.so.2 (0x00007f8f29028000) libutil.so.1 => /lib64/libutil.so.1 (0x00007f8f28e24000) libm.so.6 => /lib64/libm.so.6 (0x00007f8f28aa2000) /lib64/ld-linux-x86-64.so.2 (0x00007f8f2a45d000) libz.so.1 => /lib64/libz.so.1 (0x00007f8f2888b000) [root@pgsql14 ~]#
接著就是正式的步驟了:
1. 初始化一個 virtualenv 環境由於 自 Python 3.6 之後已經有 virtualenv 內建在 Python 核心模組 venv,這邊會直接用該指令產生。在此命名為 plpython 放在 Postgres 的家目錄 /var/lib/pgsql/
[root@pgsql14 ~]# su - postgres [postgres@pgsql14 ~]$ pwd /var/lib/pgsql [postgres@pgsql14 ~]$ python3.6 -m venv plpython [postgres@pgsql14 ~]$ ls plpython/bin/ activate activate.fish easy_install-3.6 pip3 python python3.6 activate.csh easy_install pip pip3.6 python3 [postgres@pgsql14 ~]$
2. 為了檢驗安裝模組,這邊測試一個 pyyaml 模組作確認。檢驗方式從這邊抄的:
[postgres@pgsql14 ~]$ ## OS 環境的套件清單 [postgres@pgsql14 ~]$ python3 Python 3.6.8 (default, Sep 22 2021, 11:16:51) [GCC 8.5.0 20210514 (Red Hat 8.5.0-3)] on linux Type "help", "copyright", "credits" or "license" for more information. >>> import pkg_resources >>> installed_packages = pkg_resources.working_set >>> installed_packages_list = sorted(["%s==%s" % (i.key, i.version) ... for i in installed_packages]) >>> print(installed_packages_list) ['gpg==1.13.1', 'libcomps==0.1.16', 'pip==9.0.3', 'rpm==4.14.3', 'setuptools==39.2.0'] >>> exit() [postgres@pgsql14 ~]$
[postgres@pgsql14 ~]$ source plpython/bin/activate (plpython) [postgres@pgsql14 ~]$ (plpython) [postgres@pgsql14 ~]$ echo $PATH /var/lib/pgsql/plpython/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin (plpython) [postgres@pgsql14 ~]$ (plpython) [postgres@pgsql14 ~]$ pip install pyyaml Collecting pyyaml Downloading https://files.pythonhosted.org/packages/7a/5b/bc0b5ab38247bba158504a410112b6c03f153c652734ece1849749e5f518/PyYAML-5.4.1-cp36-cp36m-manylinux1_x86_64.whl (640kB) 100% |████████████████████████████████| 645kB 1.6MB/s Installing collected packages: pyyaml Successfully installed pyyaml-5.4.1 You are using pip version 9.0.3, however version 21.3 is available. You should consider upgrading via the 'pip install --upgrade pip' command. (plpython) [postgres@pgsql14 ~]$ (plpython) [postgres@pgsql14 ~]$ ## venv 環境的套件清單 (plpython) [postgres@pgsql14 ~]$ python3 Python 3.6.8 (default, Sep 22 2021, 11:16:51) [GCC 8.5.0 20210514 (Red Hat 8.5.0-3)] on linux Type "help", "copyright", "credits" or "license" for more information. >>> import pkg_resources >>> installed_packages = pkg_resources.working_set >>> installed_packages_list = sorted(["%s==%s" % (i.key, i.version) ... for i in installed_packages]) >>> print(installed_packages_list) ['pip==9.0.3', 'pyyaml==5.4.1', 'setuptools==39.2.0'] >>> exit() (plpython) [postgres@pgsql14 ~]$ (plpython) [postgres@pgsql14 ~]$ deactivate [postgres@pgsql14 ~]$
3. 調整 postgres 資料庫的 SystemD unit file
為了讓 PGSQL 在啟動時可以直接進入 venv,這邊透過 SystemD 的資料庫啟動檔作調整,讓資料庫啟動前可以載入 venv 的 activate Script。
但由於 bash shell 有 subshell 的變數有效範圍問題,因此這邊不是用 ExecStartPre,而是直接在啟動 postmaster 的 ExecStart 補上執行 source Script 處理
[root@pgsql14 ~]# cp /usr/lib/systemd/system/postgresql-14.service /etc/systemd/system/ [root@pgsql14 ~]# vi /etc/systemd/system/postgresql-14.service [root@pgsql14 ~]# ## 修改後,黃色套底部份 [root@pgsql14 ~]# cat /etc/systemd/system/postgresql-14.service # It's not recommended to modify this file in-place, because it will be # overwritten during package upgrades. It is recommended to use systemd # "dropin" feature; i.e. create file with suffix .conf under # /etc/systemd/system/postgresql-14.service.d directory overriding the # unit's defaults. You can also use "systemctl edit postgresql-14" # Look at systemd.unit(5) manual page for more info. # Note: changing PGDATA will typically require adjusting SELinux # configuration as well. # Note: do not use a PGDATA pathname containing spaces, or you will # break postgresql-14-setup. [Unit] Description=PostgreSQL 14 database server Documentation=https://www.postgresql.org/docs/14/static/ After=syslog.target After=network.target [Service] Type=notify User=postgres Group=postgres # Note: avoid inserting whitespace in these Environment= lines, or you may # break postgresql-setup. # Location of database directory Environment=PGDATA=/var/lib/pgsql/14/data/ # Where to send early-startup messages from the server (before the logging # options of postgresql.conf take effect) # This is normally controlled by the global default set by systemd # StandardOutput=syslog # Disable OOM kill on the postmaster OOMScoreAdjust=-1000 Environment=PG_OOM_ADJUST_FILE=/proc/self/oom_score_adj Environment=PG_OOM_ADJUST_VALUE=0 ExecStartPre=/usr/pgsql-14/bin/postgresql-14-check-db-dir ${PGDATA} #ExecStart=/usr/pgsql-14/bin/postmaster -D ${PGDATA} ExecStart=/bin/bash -c "source /var/lib/pgsql/plpython/bin/activate && /usr/pgsql-14/bin/postmaster -D ${PGDATA}" ExecReload=/bin/kill -HUP $MAINPID KillMode=mixed KillSignal=SIGINT # Do not set any timeout value, so that systemd will not kill postmaster # during crash recovery. TimeoutSec=0 [Install] WantedBy=multi-user.target [root@pgsql14 ~]# [root@pgsql14 ~]# systemctl daemon-reload [root@pgsql14 ~]# service postgresql-14 start [root@pgsql14 ~]#
4. 進入資料庫裡面檢驗:
這邊用 anonymous block 簡單呈現,可以成功的辨認 venv 的內容。
另外補充,若不是使用 systemd 指令,而是手動先啟用 venv 再以 pg_ctl 啟動,也有一樣的功效~
[postgres@pgsql14 ~]$ psql psql (14.0) Type "help" for help. postgres=# CREATE EXTENSION plpython3u ; CREATE EXTENSION postgres=# DO $$ import pkg_resources installed_packages = pkg_resources.working_set installed_packages_list = sorted(["%s==%s" % (i.key, i.version) for i in installed_packages]) plpy.notice(installed_packages_list) $$ LANGUAGE plpython3u; NOTICE: ['pip==9.0.3', 'pyyaml==5.4.1', 'setuptools==39.2.0'] DO postgres=#
另外 Linux 的 /proc 目錄對也會紀錄 Process 啟動時的環境變數到個別 pid 的 environ 檔案內,該檔案內容可以用 binutils 套件裡的 strings -a 指令顯示的好看一點。
[postgres@pgsql14 ~]$ cat /proc/$(head -1 $PGDATA/postmaster.pid)/environ LANG=en_US.utf8INVOCATION_ID=2e19e893f22e4f83be12d87401debd00VIRTUAL_ENV=/var/lib/pgsql/plpythonNOTIFY_SOCKET=/run/systemd/notifyUSER=postgresPWD=/HOME=/var/lib/pgsqlJOURNAL_STREAM=8:69224PG_OOM_ADJUST_VALUE=0PGDATA=/var/lib/pgsql/14/data/SHELL=/bin/bashSHLVL=1LOGNAME=postgresPG_OOM_ADJUST_FILE=/proc/self/oom_score_adjPATH=/var/lib/pgsql/plpython/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/binPS1=(plpython) _=/usr/pgsql-14/bin/postmaster[postgres@pgsql14 ~]$ [postgres@pgsql14 ~]$ [postgres@pgsql14 ~]$ strings -a /proc/$(head -1 $PGDATA/postmaster.pid)/environ LANG=en_US.utf8 INVOCATION_ID=2e19e893f22e4f83be12d87401debd00 VIRTUAL_ENV=/var/lib/pgsql/plpython NOTIFY_SOCKET=/run/systemd/notify USER=postgres PWD=/ HOME=/var/lib/pgsql JOURNAL_STREAM=8:69224 PG_OOM_ADJUST_VALUE=0 PGDATA=/var/lib/pgsql/14/data/ SHELL=/bin/bash SHLVL=1 LOGNAME=postgres PG_OOM_ADJUST_FILE=/proc/self/oom_score_adj PATH=/var/lib/pgsql/plpython/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin PS1=(plpython) _=/usr/pgsql-14/bin/postmaster [postgres@pgsql14 ~]$
不過測試後,不能放到 $PGDATA 底下,需要放到其他不是權限 700 的位置才行。
參考資料
PostgreSQL PL/Python: call stored procedure in virtualenv - Stack Overflow
Using virtualenv under PostgreSQL PL/Python · GitHub dmckeone/gist
Support both Python2 and Python3 using virtualenv for plpython
pg使用python编写存储过程_PostgreSQL PL / Python:在virtualenv中调用存储过程_weixin_39707851的博客
Can python venv be used with plpython3u for postgresql? - Stack Overflow
Using Python 3 in virtualenv - Stack Overflow
python - How to use virtualenv with python3.6 on ubuntu 16.04? - Stack Overflow
How to set environmental variable in systemd service - Unix & Linux Stack Exchange
fedora - How can systemd run a command as root before launching a service as a different user? - Server Fault
linux - Systemd with multiple execStart - Stack Overflow
bash - creating OS environment variables using ExecStartPre on Systemd - Unix & Linux Stack Exchange
systemd: permission issue with mkdir & ExecStartPre - Unix & Linux Stack Exchange
How to List Installed Python Packages - ActiveState
Using environment variables in systemd units | Kinvolk
Systemd bash builtins - Unix & Linux Stack Exchange
linux - systemd script - environment file updated by ExecStartPre - Stack Overflow
ExecStart - systemd.service - freedesktop.org
python - How to enable a virtualenv in a systemd service unit? - Stack Overflow
python - How to run a command inside a virtualenv using systemd - Unix & Linux Stack Exchange
How to read environment variables of a process - Unix & Linux Stack Exchange
沒有留言:
張貼留言