Wednesday, March 5, 2025

A Little Python Debugging Trip Through Barman 3.10.0

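I have recently been testing Barman backups against an EDB database. Something looked off, and I assumed I had dug up another bug.

So I put some effort into tracing one of Barman's steps with the Python debugger.

After carefully comparing everything, it turned out my eyes were just playing tricks on me~

This note records that little journey.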

While setting up Barman, you create a non-superuser backup account.

The exact setup is described in the manual, under Barman Manual: Preliminary steps - PostgreSQL connection.

Here are a few ways to check the privileges:

[enterprisedb@edb16p ~]$ psql -U bkpuser
Null display is "(NULL)".
Timing is on.
psql (16.2.0)
Type "help" for help.

[[local]] edb=> \du bkpuser
        List of roles
 Role name |   Attributes
-----------+-----------------
 bkpuser   | Replication    +
           | Profile default

[[local]] edb=>
[[local]] edb=> \drg
                      List of role grants
 Role name |      Member of       |   Options    |   Grantor
-----------+----------------------+--------------+--------------
 bkpuser   | pg_checkpoint        | INHERIT, SET | enterprisedb
 bkpuser   | pg_monitor           | INHERIT, SET | enterprisedb
 bkpuser   | pg_read_all_settings | INHERIT, SET | enterprisedb
 bkpuser   | pg_read_all_stats    | INHERIT, SET | enterprisedb
 dbzuser   | pg_read_all_data     | INHERIT, SET | enterprisedb
 efm       | pg_read_all_settings | INHERIT, SET | enterprisedb
 efm       | pg_read_all_stats    | INHERIT, SET | enterprisedb
(7 rows)

[[local]] edb=> select proname,proacl from pg_proc where 'bkpuser=X' = any(proacl);
WARNING:  defaulting grantor to user ID 10
LINE 1: select proname,proacl from pg_proc where 'bkpuser=X' = any(p...
                                                 ^
         proname         |                        proacl
-------------------------+------------------------------------------------------
 pg_backup_start         | {enterprisedb=X/enterprisedb,bkpuser=X/enterprisedb}
 pg_backup_stop          | {enterprisedb=X/enterprisedb,bkpuser=X/enterprisedb}
 pg_switch_wal           | {enterprisedb=X/enterprisedb,bkpuser=X/enterprisedb}
 pg_create_restore_point | {enterprisedb=X/enterprisedb,bkpuser=X/enterprisedb}
(4 rows)

Time: 2.915 ms
[[local]] edb=>
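
The proacl values above use PostgreSQL's grantee=privileges/grantor ACL text format, where X stands for EXECUTE. As a minimal sketch of reading that format (the helper names parse_acl and can_execute are my own, and the parsing assumes plain role names with no quoting), the same check can be done in Python:

```python
def parse_acl(acl_text):
    """Split a text ACL such as '{a=X/b,c=X/b}' into (grantee, privs, grantor) tuples.

    Simplified assumption: role names contain no commas, quotes, '=' or '/'.
    """
    parsed = []
    for item in acl_text.strip("{}").split(","):
        if not item:
            continue
        grantee, rest = item.split("=", 1)
        privs, grantor = rest.split("/", 1)
        parsed.append((grantee, privs, grantor))
    return parsed


def can_execute(acl_text, role):
    """Return True if `role` is granted EXECUTE ('X') directly in this ACL."""
    return any(g == role and "X" in p for g, p, _ in parse_acl(acl_text))


acl = "{enterprisedb=X/enterprisedb,bkpuser=X/enterprisedb}"
print(can_execute(acl, "bkpuser"))  # True
print(can_execute(acl, "efm"))      # False
```

This only looks at direct grants in one ACL string; it does not account for privileges inherited through role membership or granted to PUBLIC.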

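With the configuration file in place, the next step is to run the check.

After fixing the various things I had forgotten (pg_hba.conf, credentials, parameters, and Barman's own configuration file), one check still refused to pass.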

[barman@edb-pem-server ~]$ barman check bkp_testenv
Server bkp_testenv:
        PostgreSQL: OK
        no access to backup functions: FAILED (privileges for PostgreSQL backup functions are required (see documentation))
        PostgreSQL streaming: OK
        wal_level: OK
        PostgreSQL server is standby: OK
        Primary server is not a standby: OK
        Primary and standby have same system ID: OK
        has monitoring privileges (WAL streaming): OK
        PostgreSQL streaming (WAL streaming): OK
        wal_level (WAL streaming): OK
        systemid coherence (WAL streaming): OK (no system Id stored on disk)
        replication slot (WAL streaming): OK
        directories: OK
        retention policy settings: OK
        backup maximum age: OK (no last_backup_maximum_age provided)
        backup minimum size: OK (0 B)
        wal maximum age: OK (no last_wal_maximum_age provided)
        wal size: OK (0 B)
        compression settings: OK
        failed backups: OK (there are 0 failed backups)
        minimum redundancy requirements: OK (have 0 backups, expected at least 0)
        pg_basebackup: OK
        pg_basebackup compatible: OK
        pg_basebackup supports tablespaces mapping: OK
        systemid coherence: OK (no system Id stored on disk)
        pg_receivexlog: OK
        pg_receivexlog compatible: OK
        receive-wal running: OK
        archive_mode: OK
        archive_command: OK
        continuous archiving: OK
        archiver errors: OK
[barman@edb-pem-server ~]$

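Of course, making the Barman account a superuser would make this error go away, but that is not how it would be used in practice.

I had to find the cause.

After repeated checks, and even searching the Barman source for keywords, I pinned down a few code locations (I am testing Barman 3.10.0, so everything here refers to that version):

  • Searching directly for the error message "no access to backup functions" leads to barman/server.py#L766, where a check reads the "has_backup_privileges" key from a dict

  • Running grep -rl over the source for files containing the key "has_backup_privileges" shows that the value is set at barman/postgres.py#L976; the method that makes the decision is in the same file, at barman/postgres.py#L563, PostgreSQLConnection.has_backup_privileges(self)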


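Looking closely, you can see that it assembles a SQL statement that checks execute privileges on those objects.

Since I had seen "with my own eyes" in psql that my privileges were set correctly, I naively assumed the problem might be in how the value is read from fetchone() in psycopg2, the Python driver (the fetchone() method of a psycopg2 cursor returns a tuple, while the check here is a Python true/false test).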
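
Here is the pitfall I suspected, sketched in plain Python with no database involved. The variable row is a hypothetical stand-in for what fetchone() hands back for a one-column boolean query, and first_column is a helper name I made up for illustration:

```python
# Stand-in for a fetched row from a query like "SELECT false":
# a one-element tuple, not a bare boolean.
row = (False,)

print(bool(row))     # True: a non-empty tuple is always truthy
print(bool(row[0]))  # False: the actual column value


def first_column(row):
    """Return the first column of a fetched row, or None if no row came back."""
    return None if row is None else row[0]


print(first_column(row))   # False
print(first_column(None))  # None
```

So a truthiness test on the whole row would always pass, and only indexing the first element gives the real answer.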


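So I dug around a bit and used python -m pdb to go straight into debug mode.

Below, I first set a breakpoint at the first location, where the dict is checked, to look at its true/false result. It really was False.

(Note: Barman's check timeout cuts the session off, which is why "check timeout: FAILED" lines appear partway through the transcript below. When that happens, the command has to be run again.)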

[barman@edb-pem-server ~]$ python3 -m pdb /usr/bin/barman  check bkp_testenv
> /usr/bin/barman(3)<module>()
-> __requires__ = 'barman==3.10.0'
(Pdb) break barman/server.py:766
Breakpoint 1 at /usr/lib/python3.6/site-packages/barman/server.py:766
(Pdb) where
  /usr/lib64/python3.6/bdb.py(434)run()
-> exec(cmd, globals, locals)
  <string>(1)<module>()
> /usr/bin/barman(3)<module>()
-> __requires__ = 'barman==3.10.0'
(Pdb) continue
Server bkp_testenv:
        PostgreSQL: OK
> /usr/lib/python3.6/site-packages/barman/server.py(766)check_postgres()
-> if remote_status.get("has_backup_privileges"):
(Pdb)   check timeout: FAILED (barman check command timed out)
The program exited via sys.exit(). Exit status: 1
> /usr/bin/barman(3)<module>()
-> __requires__ = 'barman==3.10.0'
(Pdb) c
Server bkp_testenv:
        PostgreSQL: OK
> /usr/lib/python3.6/site-packages/barman/server.py(766)check_postgres()
-> if remote_status.get("has_backup_privileges"):
(Pdb) where
  /usr/lib64/python3.6/bdb.py(434)run()
-> exec(cmd, globals, locals)
  <string>(1)<module>()
  /usr/bin/barman(11)<module>()
-> load_entry_point('barman==3.10.0', 'console_scripts', 'barman')()
  /usr/lib/python3.6/site-packages/barman/cli.py(2390)main()
-> args.func(args)
  /usr/lib/python3.6/site-packages/barman/cli.py(1225)check()
-> server.check()
  /usr/lib/python3.6/site-packages/barman/server.py(600)check()
-> self.check_postgres(check_strategy)
> /usr/lib/python3.6/site-packages/barman/server.py(766)check_postgres()
-> if remote_status.get("has_backup_privileges"):
(Pdb) p remote_status.get("has_backup_privileges")
False
(Pdb)   check timeout: FAILED (barman check command timed out)
The program exited via sys.exit(). Exit status: 1
> /usr/bin/barman(3)<module>()
-> __requires__ = 'barman==3.10.0'
(Pdb)  where
  /usr/lib64/python3.6/bdb.py(434)run()
-> exec(cmd, globals, locals)
  <string>(1)<module>()
> /usr/bin/barman(3)<module>()
-> __requires__ = 'barman==3.10.0'
(Pdb) exit()
[barman@edb-pem-server ~]$
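
The same break, print, continue routine can also be scripted, which helps when a timeout keeps kicking you out of an interactive session. This is only a sketch on a toy function: check_postgres here is my own stand-in with the same shape as the check above, not Barman's code, and run_scripted is a hypothetical helper built on the standard pdb.Pdb class.

```python
import io
import pdb


def check_postgres(remote_status):
    # Same shape as the check above: a truthiness test on a dict lookup.
    if remote_status.get("has_backup_privileges"):
        return "OK"
    return "FAILED"


def run_scripted(commands, func, *args):
    """Run func under pdb, reading debugger commands from a string."""
    out = io.StringIO()
    debugger = pdb.Pdb(stdin=io.StringIO(commands), stdout=out, readrc=False)
    # runcall stops on the first line of func and then processes the commands.
    result = debugger.runcall(func, *args)
    return result, out.getvalue()


result, session = run_scripted(
    'p remote_status.get("has_backup_privileges")\ncontinue\n',
    check_postgres,
    {"has_backup_privileges": False},
)
print(session)  # the recorded debugger session, including the printed value
print(result)   # FAILED
```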

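Just when I thought I needed to swallow a few more lutein pills, I refused to give up and kept tracing back to where the source function returns its value.

Besides looking at the return value, I also pulled out the SQL so I could paste it into psql, because the query is generated differently depending on the database version.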

[barman@edb-pem-server ~]$ python3 -m pdb /usr/bin/barman check bkp_testenv
> /usr/bin/barman(3)<module>()
-> __requires__ = 'barman==3.10.0'
(Pdb) break barman/postgres.py:633
Breakpoint 1 at /usr/lib/python3.6/site-packages/barman/postgres.py:633
(Pdb) continue
Server bkp_testenv:
> /usr/lib/python3.6/site-packages/barman/postgres.py(633)has_backup_privileges()
-> return cur.fetchone()[0]
(Pdb) p cur.fetchone()[0]
False
(Pdb) p backup_check_query
"\n        SELECT\n          usesuper\n          OR\n          (\n            (\n              pg_has_role(CURRENT_USER, 'pg_monitor', 'MEMBER')\n              OR\n              (\n                pg_has_role(CURRENT_USER, 'pg_read_all_settings', 'MEMBER')\n                AND pg_has_role(CURRENT_USER, 'pg_read_all_stats', 'MEMBER')\n              )\n            )\n            AND\n            (\n                has_function_privilege(CURRENT_USER, 'pg_backup_start(text,bool)', 'EXECUTE')\n            )\n            AND\n            (\n                has_function_privilege(CURRENT_USER, 'pg_backup_stop(bool)', 'EXECUTE')\n            )\n            AND has_function_privilege(\n              CURRENT_USER, 'pg_switch_wal()', 'EXECUTE')\n            AND has_function_privilege(\n              CURRENT_USER, 'pg_create_restore_point(text)', 'EXECUTE')\n          )\n        FROM\n          pg_user\n        WHERE\n          usename = CURRENT_USER\n        "
(Pdb)   check timeout: FAILED (barman check command timed out)
The program exited via sys.exit(). Exit status: 1
> /usr/bin/barman(3)<module>()
-> __requires__ = 'barman==3.10.0'
(Pdb) 
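
The pdb p command prints the repr of the string, which is why the query shows up as one long line full of \n escapes. Calling print(backup_check_query) at the (Pdb) prompt would show real line breaks instead. As a small sketch of tidying such a string for pasting (the query below is a shortened, hypothetical stand-in, and as_pastable_sql is a name I made up):

```python
import textwrap

# Shortened stand-in for the string pdb printed; not the full Barman query.
backup_check_query = (
    "\n        SELECT\n          usesuper\n        FROM\n          pg_user\n"
    "        WHERE\n          usename = CURRENT_USER\n        "
)


def as_pastable_sql(query):
    """Drop the common indentation and surrounding blank lines, then add a semicolon."""
    return textwrap.dedent(query).strip() + ";"


print(as_pastable_sql(backup_check_query))
```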

In psql, I stripped the \n escapes out of that SQL and ran it...

But... how does it come back true...?????

[enterprisedb@edb16p ~]$ psql -U bkpuser
Null display is "(NULL)".
Timing is on.
psql (16.2.0)
Type "help" for help.

[[local]] edb=>         SELECT
[[local]] edb->           usesuper
[[local]] edb->           OR
[[local]] edb->           (
[[local]] edb(>             (
[[local]] edb(>               pg_has_role(CURRENT_USER, 'pg_monitor', 'MEMBER')
[[local]] edb(>               OR
[[local]] edb(>               (
[[local]] edb(>                 pg_has_role(CURRENT_USER, 'pg_read_all_settings', 'MEMBER')
[[local]] edb(>                 AND pg_has_role(CURRENT_USER, 'pg_read_all_stats', 'MEMBER')
[[local]] edb(>               )
[[local]] edb(>             )
[[local]] edb(>             AND
[[local]] edb(>             (
[[local]] edb(>                 has_function_privilege(CURRENT_USER, 'pg_backup_start(text,bool)', 'EXECUTE')
[[local]] edb(>             )
[[local]] edb(>             AND
[[local]] edb(>             (
[[local]] edb(>                 has_function_privilege(CURRENT_USER, 'pg_backup_stop(bool)', 'EXECUTE')
[[local]] edb(>             )
[[local]] edb(>             AND has_function_privilege(
[[local]] edb(>               CURRENT_USER, 'pg_switch_wal()', 'EXECUTE')
[[local]] edb(>             AND has_function_privilege(
[[local]] edb(>               CURRENT_USER, 'pg_create_restore_point(text)', 'EXECUTE')
[[local]] edb(>           )
[[local]] edb->         FROM
[[local]] edb->           pg_user
[[local]] edb->         WHERE
[[local]] edb->           usename = CURRENT_USER
[[local]] edb-> ;
 ?column?
----------
 t
(1 row)

Time: 3.404 ms
[[local]] edb=>

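In the end... all I could do was check where my connection was actually going.

My test setup has a primary/standby pair, and I was testing it together with Barman's model feature.

I was also worried that it might be connecting to the other machine.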

[barman@edb-pem-server ~]$ python3 -m pdb /usr/bin/barman check bkp_testenv
> /usr/bin/barman(3)<module>()
-> __requires__ = 'barman==3.10.0'
(Pdb) break barman/postgres.py:633
Breakpoint 1 at /usr/lib/python3.6/site-packages/barman/postgres.py:633
(Pdb) continue
Server bkp_testenv:
> /usr/lib/python3.6/site-packages/barman/postgres.py(633)has_backup_privileges()
-> return cur.fetchone()[0]
(Pdb) self._conn.get_dsn_parameters()
{'user': 'bkpuser', 'passfile': '/var/lib/barman/.pgpass', 'channel_binding': 'prefer', 'dbname': 'postgres', 'host': 'edb16s', 'port': '5444', 'options': '', 'sslmode': 'prefer', 'sslcompression': '0', 'sslcertmode': 'allow', 'sslsni': '1', 'ssl_min_protocol_version': 'TLSv1.2', 'gssencmode': 'prefer', 'krbsrvname': 'postgres', 'gssdelegation': '0', 'target_session_attrs': 'any', 'load_balance_hosts': 'disable'}
(Pdb) 

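And... looking at the output above... the dbname for my Barman connection (set in conninfo; see the barman(5) man page for the full list of parameters) appears to be postgres.

But I... think I ran the GRANTs in edb...

A quick check... yep, that's exactly it~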

[enterprisedb@edb16s ~]$ psql -U bkpuser
Null display is "(NULL)".
Timing is on.
psql (16.2.0)
Type "help" for help.

[[local]] edb=>         SELECT
          usesuper
          OR
          (
            (
              pg_has_role(CURRENT_USER, 'pg_monitor', 'MEMBER')
              OR
              (
                pg_has_role(CURRENT_USER, 'pg_read_all_settings', 'MEMBER')
                AND pg_has_role(CURRENT_USER, 'pg_read_all_stats', 'MEMBER')
              )
            )
            AND
            (
                has_function_privilege(CURRENT_USER, 'pg_backup_start(text,bool)', 'EXECUTE')
            )
            AND
            (
                has_function_privilege(CURRENT_USER, 'pg_backup_stop(bool)', 'EXECUTE')
            )
            AND has_function_privilege(
              CURRENT_USER, 'pg_switch_wal()', 'EXECUTE')
            AND has_function_privilege(
              CURRENT_USER, 'pg_create_restore_point(text)', 'EXECUTE')
          )
        FROM
          pg_user
        WHERE
          usename = CURRENT_USER;
 ?column?
----------
 t
(1 row)

Time: 3.125 ms
[[local]] edb=> \c postgres
You are now connected to database "postgres" as user "bkpuser".
[[local]] postgres=>         SELECT
          usesuper
          OR
          (
            (
              pg_has_role(CURRENT_USER, 'pg_monitor', 'MEMBER')
              OR
              (
                pg_has_role(CURRENT_USER, 'pg_read_all_settings', 'MEMBER')
                AND pg_has_role(CURRENT_USER, 'pg_read_all_stats', 'MEMBER')
              )
            )
            AND
            (
                has_function_privilege(CURRENT_USER, 'pg_backup_start(text,bool)', 'EXECUTE')
            )
            AND
            (
                has_function_privilege(CURRENT_USER, 'pg_backup_stop(bool)', 'EXECUTE')
            )
            AND has_function_privilege(
              CURRENT_USER, 'pg_switch_wal()', 'EXECUTE')
            AND has_function_privilege(
              CURRENT_USER, 'pg_create_restore_point(text)', 'EXECUTE')
          )
        FROM
          pg_user
        WHERE
          usename = CURRENT_USER;
 ?column?
----------
 f
(1 row)

Time: 2.960 ms
[[local]] postgres=> 

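So the cause is clear: the function GRANTs were made in a different database from the one named in the connection string in Barman's configuration file, so Barman never saw them.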
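
A check like this would have saved me the trip. It is only a sketch: the conninfo string below is a hypothetical one resembling my setup, the helper names are my own, and the parser is deliberately simplified (keyword=value form only, no URI-style strings).

```python
import shlex


def conninfo_params(conninfo):
    """Parse a libpq keyword/value connection string into a dict.

    Simplified: relies on shlex for quoting and does not handle
    postgresql:// URIs or spaces around '='.
    """
    params = {}
    for token in shlex.split(conninfo):
        key, _, value = token.partition("=")
        params[key] = value
    return params


def targets_grant_database(conninfo, granted_in_db):
    """True if the connection goes to the database where the GRANTs were run."""
    return conninfo_params(conninfo).get("dbname") == granted_in_db


# Hypothetical conninfo; the real one lives in the Barman server configuration.
conninfo = "host=edb16s port=5444 user=bkpuser dbname=postgres"
print(targets_grant_database(conninfo, "edb"))       # False
print(targets_grant_database(conninfo, "postgres"))  # True
```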

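As a side note, role (group) membership applies across all databases, so that part made no difference. Only the function privileges depend on which database you are connected to (a little excuse to console myself).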
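
To make that concrete, here is a toy Python model of the boolean structure of the query shown earlier. It is not Barman's code: has_backup_privileges below is my own re-creation, with role memberships held once for the whole cluster and EXECUTE grants held per database.

```python
BACKUP_FUNCTIONS = (
    "pg_backup_start",
    "pg_backup_stop",
    "pg_switch_wal",
    "pg_create_restore_point",
)


def has_backup_privileges(memberships, function_grants, dbname, superuser=False):
    """Evaluate the same AND/OR structure as the query, for one database.

    memberships: set of role names, shared by every database in the cluster.
    function_grants: {dbname: set of function names with EXECUTE granted}.
    """
    monitoring = "pg_monitor" in memberships or (
        {"pg_read_all_settings", "pg_read_all_stats"} <= memberships
    )
    granted = function_grants.get(dbname, set())
    functions = all(name in granted for name in BACKUP_FUNCTIONS)
    return superuser or (monitoring and functions)


memberships = {"pg_checkpoint", "pg_monitor", "pg_read_all_settings", "pg_read_all_stats"}
function_grants = {"edb": set(BACKUP_FUNCTIONS)}  # GRANTs were only run in edb

print(has_backup_privileges(memberships, function_grants, "edb"))       # True
print(has_backup_privileges(memberships, function_grants, "postgres"))  # False
```

The memberships are identical in both calls; only the per-database grants differ, which matches the t and f results from psql above.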


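What this taught me:

  1. Even with the manual right there, you still have to test thoroughly and be prepared

  2. Barman's non-superuser setup works as intended

  3. python -m pdb can be used for debugging, with no need to hand-insert the awkward line import pdb; pdb.set_trace() into the code

  4. Is it time to start taking lutein or beta-carotene?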



