MathJax

MathJax-2

MathJax-3

Google Code Prettify

置頂入手筆記

EnterproseDB Quickstart — 快速入門筆記

由於考慮採用 EnterpriseDB 或是直接用 PostgreSQL 的人,通常需要一些入手的資料。這邊紀錄便提供相關快速上手的簡單筆記 ~ 這篇筆記以 資料庫安裝完畢後的快速使用 為目標,基本紀錄登入使用的範例:

2020年9月13日 星期日

PGSQL 外掛 DIY 筆記—數值 array 排序練習

開源軟體的特點,就是讓你可以跳下去寫程式。PGSQL 也不例外~~然而,除了核心專案本身的開發之外,熱門軟體常常有提供外掛開發 API,讓使用者擴充功能(例如,貼近生活的瀏覽器 Chrome 外掛Firefox 外掛,文書編輯軟體 Libreoffice,或是工作上會使用的新興商業軟體,如 Tableau 等~)
在 Postgres 裡面,外掛的開發蠻貼近原始碼的,這邊就認識一下最簡易的開發,自製外掛~
雖然已經有幾篇不錯的參考教學了(見這篇筆記後的參考資料),但這邊還是紀錄一篇用自己的話來述說的筆記。

因為 PostgreSQL 是老牌的開源軟體專案,它被開發時所使用的語言稍微有一點歷史(C 與 C++)~不過裡面的 API 也相對穩定,很多開源專案也都是用 C/C++ 開發的;另外,雖然 PostgreSQL 是用 C 寫成的產物,但內部有它自己的語言結構,以反映時代的趨勢~

雖然開發 PGSQL 外掛還不需要觸碰到內部原始碼,甚至打包外掛也不一定要使用 C 才行,這邊還是選擇從 C 作開發的練習。

開發一個 PGSQL 外掛,有幾個最基本的東西要準備
 - Control File:紀錄版本資訊
 - 程式碼:分成
   a) C/C++ 的程式碼:基本上是擴增功能的本體
   b) SQL 的內容:通常是在 SQL 指令內被呼叫的「外殼」
 - Makefile:編譯都需要~

其中程式碼的部份有以下的彈性:
 1. 標準的外掛:同時有 C/C++ 真正內容的程式碼,以及 SQL 函數的參數定義
 2. 可以有「純 SQL」寫成的外掛:就如同一般寫 Stored Function/Procedure 然後打包起來。例如這篇教學範例
 3. 可以不需要 SQL 檔案:出現在直接生效,不提供 Function 呼叫的時候。例如內部的 auth_delay 外掛
 4. 用新穎一些的程式語言混合開發:例如 MADLibMulticorn Foreign Data Wrapper。一個應用的範例 envfdw

為了擴充新功能,常常需要「僭越」資料庫預設行為(例如,SQL 解析、產生執行計畫的流程中的每個步驟),因此有一群稱作 Hook 的界面,讓外掛們可以「狸貓換太子」,不需要重編譯資料庫就變更預設行為。一些跟 Hook 相關資訊可以參考底下的參考資料。

這邊先從外掛 DIY 入手。



這邊的外掛,要做到的事情是陣列成員的排序。這邊會採用 C 的標準函式庫 stdlib 裡面的 qsort() 函數,把單一個 array 內容排序後回傳。
這邊儘先簡單處理基本數值型態的 array 內容,其他的都先不處理~至於 NUMERIC 型態牽涉的構造比較深,這邊也沒有處理~~

首先,先處理共通的檔案,Makefile 跟 Control File。這兩個檔案最基本內容,是可以用複製貼上的~
首先,Control FIle 在敘述這個外掛的資訊(檔名:num_array_sort/num_array_sort.control)
comment = 'Numeric type array sorting'
default_version = '0.0.1'
relocatable = true
module_pathname = '$libdir/num_array_sort'

然後 Makefile 就是編譯指令的匯總步驟。這邊重點不在教學 Makefile,而是在寫出 PGSQL 外掛的基本骨幹。(檔名:num_array_sort/Makefile)
MODULES = num_array_sort
EXTENSION = num_array_sort         # Extension Name
DATA = num_array_sort--0.0.1.sql   # Installation SQL

PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)

接著是 C 的程式碼。這部份有一些基本起手式在的~(檔名:num_array_sort/num_array_sort.c)
#include <stdlib.h>
#include "postgres.h"
#include "fmgr.h"
#if PG_VERSION_NUM >= 110004
#include "catalog/pg_type_d.h"
#else
#include "catalog/pg_type.h"
#endif
#include "utils/array.h"
#include "utils/arrayaccess.h"
#include "utils/lsyscache.h"

PG_MODULE_MAGIC;


int compare_int2(const void *a, const void *b);
int compare_int4(const void *a, const void *b);
int compare_int8(const void *a, const void *b);
int compare_float4(const void *a,const void *b);
int compare_float8(const void *a,const void *b);


PG_FUNCTION_INFO_V1(num_array_sort);

Datum num_array_sort(PG_FUNCTION_ARGS) {
  ArrayType *currentArray = PG_GETARG_ARRAYTYPE_P(0);
  void* arrayptr;
  int arrayLength;
  Oid elemTypeId;
  int i;

  // We only handle one-dimensional arrays
  if (ARR_NDIM(currentArray) != 1) {
    ereport(ERROR, (errmsg("We only handle one-dimensional array")));
  }

  // Collect necessary array information
  elemTypeId = ARR_ELEMTYPE(currentArray);
  arrayLength = (ARR_DIMS(currentArray))[0];
  arrayptr = ARR_DATA_PTR(currentArray);
  ereport(INFO, (errmsg("Array length is %d", arrayLength)));

  // Check datatype, then call GCC qsort() with each comparison function
  switch (elemTypeId) {
 // Datatype INT16
 case INT2OID:
      for (i=0; i < arrayLength; i++)
        ereport(INFO, (errmsg("No. %d-th element is %d", i, ((int16 *) arrayptr)[i])));
   qsort (arrayptr, arrayLength, sizeof (int16), compare_int2);
      break;
 // Datatype INT32
    case INT4OID:
      for (i=0; i < arrayLength; i++)
        ereport(INFO, (errmsg("No. %d-th element is %d", i, ((int32 *) arrayptr)[i])));
      qsort (arrayptr, arrayLength, sizeof (int32), compare_int4);
      break;
 // Datatype INT64
    case INT8OID:
      for (i=0; i < arrayLength; i++)
        ereport(INFO, (errmsg("No. %d-th element is %ld", i, ((int64 *) arrayptr)[i])));
      qsort (arrayptr, arrayLength, sizeof (int64), compare_int8);
      break;
    // Datatype FLOAT4
    case FLOAT4OID:
      for (i=0; i < arrayLength; i++)
        ereport(INFO, (errmsg("No. %d-th element is %f", i, ((float4 *) arrayptr)[i])));
      qsort (arrayptr, arrayLength, sizeof (float4), compare_float4);
      break;
    // Datatype FLOAT8
    case FLOAT8OID:
      for (i=0; i < arrayLength; i++)
        ereport(INFO, (errmsg("No. %d-th element is %f", i, ((float8 *) arrayptr)[i])));
      qsort (arrayptr, arrayLength, sizeof (float8), compare_float8);
      break;
    default:
      elog(ERROR, "Unsupported type of array");
  }

  PG_RETURN_ARRAYTYPE_P(currentArray);
}


// We don't need to use DatumGetxxx() macro for below compare functions

int compare_int2(const void *a, const void *b) {
  int16 *x = (int16 *) a;
  int16 *y = (int16 *) b;
  ereport(INFO, (errmsg("Comparing int16 x=%d and y=%d", *x, *y)));
  return *x - *y;
}

int compare_int4(const void *a, const void *b) {
  int32 *x = (int32 *) a;
  int32 *y = (int32 *) b;
  ereport(INFO, (errmsg("Comparing int32 x=%d and y=%d", *x, *y)));
  return *x - *y;
}

int compare_int8(const void *a, const void *b) {
  int64 *x = (int64 *) a;
  int64 *y = (int64 *) b;
  ereport(INFO, (errmsg("Comparing int64 x=%ld and y=%ld", *x, *y)));
  return *x - *y;
}

int compare_float4(const void *a,const void *b) {
  //ereport(INFO, (errmsg("a=%f and b=%f", *(float4 *) a, *(float4 *) b)));
  float4 *x = (float4 *) a;
  float4 *y = (float4 *) b;
  ereport(INFO, (errmsg("Comparing float4 x=%f and y=%f", *x, *y)));
  // Don't return (*x - *y;) for float
  if (*x < *y) 
    return -1;
  else if (*x > *y) 
    return 1;
  return 0;
}

int compare_float8(const void *a,const void *b) {
  float8 *x = (float8 *) a;
  float8 *y = (float8 *) b;
  ereport(INFO, (errmsg("Comparing float8 x=%f and y=%f", *x, *y)));
  // Don't return (*x - *y;) for float
  if (*x < *y) 
    return -1;
  else if (*x > *y) 
    return 1;
  return 0;
}

上面的內容有一些部份是必然需要的
 - 必要的 Header:

  1. postgres.h 看名字也知道一定要~
  2. fmgr.h 則是 Function Manager 的縮寫,提供參數/回傳值的 API。這個要搭配 PG_MODULE_MAGIC 這個 Macro 宣告,說明這程式使用 PGSQL 外掛界面才行

 - PG_FUNCTION_INFO_V1 界面的 Macro 函數:這個可以看成向資料庫註冊的意思。因為歷史因素,這邊叫第一代(以前有第零代),目前也只有這個~
傳入的參數 PG_FUNCTION_ARGS 與 PG_GETARG_xxx() 是成對的輸入值擷取語法
 - 回傳型態 Datum 與 PG_RETURN_xxx() 是成對的輸出值回傳語法

然後,開發程式,最重要的就是弄清楚衍伸的資料型態跟相關參數與函數,以及結果的回傳方式三項最基本。這邊對三項作一點筆記
 1) 資料型態
最簡單的就是 Datum,相當於 C 裡面的 void * 這個 Pointer 型態,通常表示一個資料的 Pointer。但這個指出現在 function returning type 而已。
實際上在 C 裡面操作的 Datatype,需要對應到 DB 內部的 Datatype,也就是手冊的這張表~基本的 Datatype 都宣告在 "postgres.h" 裡面。

 2) PG_FUNCTION_ARGS 作為函數的參數,就自動由資料庫依照 SQL 指定的餵進來
在 function body 裡面要接收傳入的參數,需要依照 DB 內的 Datatype,使用對應的函數 PG_GETARG_xxx(),依序取得相關的值(Pass by Value 的方式)

 3) 函數宣告回傳值 Datum,會搭配函數內部回傳採用的函數,完成結果的回傳(不是單純用普通的 return 而已~)
  • 在 function body 的最後,會使用相對應的 PG_RETURN_xxx() 回傳單一值;
  • 如果是回傳一筆欄位的值,則需要多引用一個 Header,"funcapi.h",才能用相關的函數,定義相關的欄位形式;
  • 若是一個 Set-Returning function,就比較複雜了,會需要使用 FuncCallContext 這個 struct 了。

上面的 2) 與 3) 主要都宣告在 "fmgr.h" 裡面

此外,在 C 裡面通常使用 malloc() 與 free(),宣告/釋放動態記憶體空間。但是在這邊,需要統籌由 Postgres 控制,因此需要使用 Postgres 提供的 palloc() 與 pfree() 才行(這兩組內部是一樣的)。

目前感到需要花力氣的地方,就是努力在 pg_config --includedir 裡面(或是 src/include/ 裡面)所提供的 Header files 裡面尋找需要的宣告。。。這一點只能釐清要處理的問題,從其他相關外掛或是相關段落原始碼反覆查看怎麼使用了。目前看到比較常用的是 src/include/utils/ 裡面的內容~如果有需要撈系統表,則是  src/include/catalog/ 。

Postgres 專案對於程式碼排版有一些規約,可以參考這邊,這邊的練習也沒有遵守這個~(例如,註解的形狀)。不過建議實際的專案盡量追隨規範比較好~~

最後,這個 DIY 做的是可以被呼叫的東西,因此需要給一個 Stored Function。(檔名:num_array_sort/num_array_sort--0.0.1.sql)
CREATE OR REPLACE FUNCTION num_array_sort(anyarray) 
RETURNS anyarray 
AS 'MODULE_PATHNAME', 'num_array_sort' 
LANGUAGE C IMMUTABLE STRICT 
 PARALLEL SAFE;

其中的 PARALLEL SAFE 並非必要的內容:若確認函數可以拆分多個 Worker 一起跑(通常是逐筆資料處理的函數)就可以增加。

上面這樣就完成一個 DIY 的外掛專案了。



接著是要安裝:原則上只要 make 跟 make install 就會安裝到軟體目錄裡面了。

這邊示範的 PGSQL 版本,使用的是 EnterpriseDB 提供的企業版資料庫 v12。由於 PGSQL 11 開始導入 LLVM 實作出 JIT 功能,EDB 12 企業版資料庫在編譯上有相關的開發環境需要設置(llvm 5.0 的 header file,以及特定的 clang 版本的套件)。這個狀況主要是編譯的 Try-and-error 過程中,觀察 pg_config 指令列舉的  Include / Link 參數得知的。
[root@edbdev ~]# /usr/edb/as12/bin/pg_config --configure
'--prefix=/usr/edb/as12' '--includedir=/usr/edb/as12/include' '--mandir=/usr/edb/as12/share/man' '--datadir=/usr/edb/as12/share' '--libdir=/usr/edb/as12/lib' 'LLVM_CONFIG=/usr/lib64/llvm5.0/bin/llvm-config' 'CLANG=/opt/rh/llvm-toolset-7/root/usr/bin/clang' '--with-llvm' '--with-perl' '--with-python' '--with-tcl' '--with-tclconfig=/usr/lib64' '--with-openssl' '--with-pam' '--with-krb5' '--with-gssapi' '--with-includes=/usr/include' '--with-libraries=/usr/lib64' '--enable-nls' '--with-ossp-uuid' '--with-libxml' '--with-libxslt' '--with-ldap' '--with-systemd' '--with-libcurl' '--with-icu' '--with-oci=/usr/include/oracle/10.2.0.5/client64' '--with-system-tzdata=/usr/share/zoneinfo' '--sysconfdir=/etc/sysconfig/edb/as12' '--docdir=/usr/share/doc' '--with-pgport=5444' 'CFLAGS=-I/usr/libexec/edb-as12-icu53.1/include/ -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -I/usr/include/et' 'LDFLAGS=-L/usr/libexec/edb-as12-icu53.1/lib/ ' 'CPPFLAGS= -I/usr/include/et' 'CXXFLAGS= -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches   -m64 -mtune=generic' 'PKG_CONFIG_PATH=:/usr/lib64/pkgconfig:/usr/share/pkgconfig' 'ICU_CFLAGS=-I/usr/libexec/edb-as12-icu53.1/include/' 'ICU_LIBS=-Wl,-rpath,/usr/libexec/edb-as12-icu53.1/lib -L/usr/libexec/edb-as12-icu53.1/lib -licui18n -licuuc -licudata' 'PYTHON=/usr/bin/python2'
[root@edbdev ~]# 

以上黃色標記的段落,是 Software Collenction 提供的 CLang 的檔案路徑。該套件可以從 Clang and LLVM Toolset 7.0 — Software Collections 取得。
這一步驟,通常取決於所採用的 PGSQL Distribution,每個版本的官方執行檔,可能也有不同的組合。因此在查詢上面的 pg_config 之後,就要作一點點功課,尋找編譯時對應的 Library。

以下便針對相關環境作準備:這邊獨立準備一個編譯環境,不包含運作中的資料庫,純粹用來編譯(Note:這邊的環境用 LXC Container 準備的~很快就「長」一個 OS 出來了)
[root@edbdev ~]# yum group install -y "Development Tools"
[root@edbdev ~]# ## 裝 EDB 12 header
[root@edbdev ~]# yum install -y edb-as12-server-devel
[root@edbdev ~]# ## 裝 LLVM5.0 header
[root@edbdev ~]# yum install -y epel-release
[root@edbdev ~]# yum install -y llvm5.0-devel
[root@edbdev ~]# ## 裝 CLang
[root@edbdev ~]# yum install -y centos-release-scl
[root@edbdev ~]# ## 也可以直接編輯 repo 檔
[root@edbdev ~]# yum-config-manager --enable rhel-server-rhscl-7-rpms
[root@edbdev ~]# yum install -y llvm-toolset-7.0
[root@edbdev ~]# ##yum install -y devtoolset-7

使用 Software Collection 的軟體,通常都需要切換到相關的「環境」:可以想成類似 Python 的 VirtualEnv 功能。
啟用方式參考
https://stackoverflow.com/questions/18338045/enabling-software-collections-redhat-developer-toolset
/opt/rh/llvm-toolset-7/root/usr/bin/clang: Command not found_一名数据库爱好者的专栏-CSDN博客_make: /opt/rh/llvm-toolset-7/root/usr/bin/clang: c
https://github.com/theory/pg-semver/issues/35
https://github.com/timescale/prometheus-postgresql-adapter/issues/58
https://developers.redhat.com/blog/2018/07/07/yum-install-gcc7-clang/
[root@edbdev ~]# scl enable llvm-toolset-7.0 bash
[root@edbdev ~]# source /opt/rh/llvm-toolset-7.0/enable

接著可以編譯了。編譯其實很單純,就是一個 make 指令,不過要記得先把環境變數 PATH 加入 pg_config 所在位置
[root@edbdev ~]# export PATH=$PATH:/usr/edb/as12/bin/
[root@edbdev ~]# pg_config --pgxs
/usr/edb/as12/lib/pgxs/src/makefiles/pgxs.mk
[root@edbdev ~]# 
[root@edbdev ~]# cd ~/num_array_sort
[root@edbdev num_array_sort]# make
gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -I/usr/libexec/edb-as12-icu53.1/include/ -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -I/usr/include/et -fPIC -I. -I./ -I/usr/edb/as12/include/server -I/usr/edb/as12/include/internal -I/usr/libexec/edb-as12-icu53.1/include/ -I/usr/include/et -D_GNU_SOURCE -I/usr/include/libxml2  -I/usr/include  -c -o num_array_sort.o num_array_sort.c
gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -I/usr/libexec/edb-as12-icu53.1/include/ -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -I/usr/include/et -fPIC num_array_sort.o -L/usr/edb/as12/lib -L/usr/libexec/edb-as12-icu53.1/lib/  -L/usr/lib64/llvm5.0/lib  -L/usr/lib64 -Wl,--as-needed -Wl,-rpath,'/usr/edb/as12/lib',--enable-new-dtags  -shared -o num_array_sort.so
/opt/rh/llvm-toolset-7/root/usr/bin/clang -Wno-ignored-attributes -fno-strict-aliasing -fwrapv -O2  -I. -I./ -I/usr/edb/as12/include/server -I/usr/edb/as12/include/internal -I/usr/libexec/edb-as12-icu53.1/include/ -I/usr/include/et -D_GNU_SOURCE -I/usr/include/libxml2  -I/usr/include -flto=thin -emit-llvm -c -o num_array_sort.bc num_array_sort.c
make: /opt/rh/llvm-toolset-7/root/usr/bin/clang: Command not found
make: *** [num_array_sort.bc] Error 127
[root@edbdev num_array_sort]# 
啊。。上面失敗了一下。。。檢視一下錯誤訊息,原因是編譯的 make 拘泥在 Toolset 的版本號 7 跟 7.0 的小差異。。。幫他補上後再跑一次:記得跑失敗要先執行 make clean
[root@edbdev num_array_sort]# ln -s /opt/rh/llvm-toolset-7.0/ /opt/rh/llvm-toolset-7
[root@edbdev num_array_sort]# make clean
rm -f num_array_sort.so num_array_sort.o  \
    num_array_sort.bc
[root@edbdev num_array_sort]#
[root@edbdev num_array_sort]# make
gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -I/usr/libexec/edb-as12-icu53.1/include/ -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -I/usr/include/et -fPIC -I. -I./ -I/usr/edb/as12/include/server -I/usr/edb/as12/include/internal -I/usr/libexec/edb-as12-icu53.1/include/ -I/usr/include/et -D_GNU_SOURCE -I/usr/include/libxml2  -I/usr/include  -c -o num_array_sort.o num_array_sort.c
gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -I/usr/libexec/edb-as12-icu53.1/include/ -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -I/usr/include/et -fPIC num_array_sort.o -L/usr/edb/as12/lib -L/usr/libexec/edb-as12-icu53.1/lib/  -L/usr/lib64/llvm5.0/lib  -L/usr/lib64 -Wl,--as-needed -Wl,-rpath,'/usr/edb/as12/lib',--enable-new-dtags  -shared -o num_array_sort.so
/opt/rh/llvm-toolset-7/root/usr/bin/clang -Wno-ignored-attributes -fno-strict-aliasing -fwrapv -O2  -I. -I./ -I/usr/edb/as12/include/server -I/usr/edb/as12/include/internal -I/usr/libexec/edb-as12-icu53.1/include/ -I/usr/include/et -D_GNU_SOURCE -I/usr/include/libxml2  -I/usr/include -flto=thin -emit-llvm -c -o num_array_sort.bc num_array_sort.c
[root@edb12 num_array_sort]#
接著是安裝。直接執行 make install 就會直接安裝;但若有需要帶走,放到其他環境安裝的話,則可以增加一個環境變數 DESTDIR 指定「安裝」起始位置。這樣一來就可以打包起來,拿到別處使用。這邊進行後者操作並打包起來,再手動搬動到資料庫的環境
[root@edbdev num_array_sort]# mkdir -p ~/edb_extensions/num_array_sort
[root@edbdev num_array_sort]# 
[root@edbdev num_array_sort]# make install DESTDIR=~/edb_extensions/num_array_sort
/bin/mkdir -p '/root/edb_extensions/num_array_sort/usr/edb/as12/share/extension'
/bin/mkdir -p '/root/edb_extensions/num_array_sort/usr/edb/as12/share/extension'
/bin/mkdir -p '/root/edb_extensions/num_array_sort/usr/edb/as12/lib'
/bin/install -c -m 644 .//num_array_sort.control '/root/edb_extensions/num_array_sort/usr/edb/as12/share/extension/'
/bin/install -c -m 644 .//num_array_sort--0.0.1.sql  '/root/edb_extensions/num_array_sort/usr/edb/as12/share/extension/'
/bin/install -c -m 755  num_array_sort.so '/root/edb_extensions/num_array_sort/usr/edb/as12/lib/'
/bin/mkdir -p '/root/edb_extensions/num_array_sort/usr/edb/as12/lib/bitcode/num_array_sort'
/bin/mkdir -p '/root/edb_extensions/num_array_sort/usr/edb/as12/lib/bitcode'/num_array_sort/
/bin/install -c -m 644 num_array_sort.bc '/root/edb_extensions/num_array_sort/usr/edb/as12/lib/bitcode'/num_array_sort/./
cd '/root/edb_extensions/num_array_sort/usr/edb/as12/lib/bitcode' && /usr/lib64/llvm5.0/bin/llvm-lto -thinlto -thinlto-action=thinlink -o num_array_sort.index.bc num_array_sort/num_array_sort.bc
error: can't create module summary index for buffer: Invalid summary version 4, 1, 2 or 3 expected
LLVM ERROR: ThinLink didn't create an index
make: *** [install] Error 1
[root@edbdev num_array_sort]# 
[root@edbdev num_array_sort]# cd ~/edb_extensions/num_array_sort/
[root@edbdev num_array_sort]# tar -czvf num_array_sort.tgz ./*
./usr/
./usr/edb/
./usr/edb/as12/
./usr/edb/as12/share/
./usr/edb/as12/share/extension/
./usr/edb/as12/share/extension/num_array_sort.control
./usr/edb/as12/share/extension/num_array_sort--0.0.1.sql
./usr/edb/as12/lib/
./usr/edb/as12/lib/bitcode/
./usr/edb/as12/lib/bitcode/num_array_sort/
./usr/edb/as12/lib/bitcode/num_array_sort/num_array_sort.bc
./usr/edb/as12/lib/num_array_sort.so
[root@edbdev num_array_sort]# 

啊啊。。上面又出了點小錯。。。少了 _.index.bc 檔。。。不過不會影響我們的操作~未來有機會多認識一點 LLVM 之後再來排解~

然後這包 tarball 檔案就可以帶到真正的資料庫主機進行安裝。注意,把 binary 放置完畢之後,資料庫需要重新啟動,才有辦法載入新增的 Shared Object(就是上面編譯完的 library)
[root@edb12 ~]# tar -xvf ~/num_array_sort.tgz -C /
./usr/
./usr/edb/
./usr/edb/as12/
./usr/edb/as12/share/
./usr/edb/as12/share/extension/
./usr/edb/as12/share/extension/num_array_sort.control
./usr/edb/as12/share/extension/num_array_sort--0.0.1.sql
./usr/edb/as12/lib/
./usr/edb/as12/lib/bitcode/
./usr/edb/as12/lib/bitcode/num_array_sort/
./usr/edb/as12/lib/bitcode/num_array_sort/num_array_sort.bc
./usr/edb/as12/lib/num_array_sort.so
[root@edb12 ~]# 
[root@edb12 ~]# service edb-as-12 restart
Redirecting to /bin/systemctl restart edb-as-12.service
[root@edb12 ~]# 

然後就是測試了~我們會在系統表裡面看到新增的外掛,把他載入之後就可以呼叫了
edb=# select name from pg_available_extensions where name like 'n%' order by 1;
      name      
----------------
 num_array_sort
(1 row)

edb=# 
edb=# create extension num_array_sort ;
CREATE EXTENSION
edb=# 
edb=# 
edb=# select num_array_sort('{1,72,19,3,37,56,1000,77}'::int2[]);
INFO:  Array length is 8
INFO:  No. 0-th element is 1
INFO:  No. 1-th element is 72
INFO:  No. 2-th element is 19
INFO:  No. 3-th element is 3
INFO:  No. 4-th element is 37
INFO:  No. 5-th element is 56
INFO:  No. 6-th element is 1000
INFO:  No. 7-th element is 77
INFO:  Comparing int16 x=1 and y=72
INFO:  Comparing int16 x=72 and y=19
INFO:  Comparing int16 x=1 and y=37
INFO:  Comparing int16 x=37 and y=77
INFO:  Comparing int16 x=72 and y=37
INFO:  Comparing int16 x=77 and y=37
INFO:  Comparing int16 x=1000 and y=37
INFO:  Comparing int16 x=56 and y=37
INFO:  Comparing int16 x=1 and y=37
INFO:  Comparing int16 x=19 and y=37
INFO:  Comparing int16 x=3 and y=37
INFO:  Comparing int16 x=3 and y=1
INFO:  Comparing int16 x=3 and y=19
INFO:  Comparing int16 x=72 and y=56
INFO:  Comparing int16 x=72 and y=1000
INFO:  Comparing int16 x=1000 and y=77
INFO:  Comparing int16 x=72 and y=77
      num_array_sort       
---------------------------
 {1,3,19,37,56,72,77,1000}
(1 row)

edb=# 
edb=# select num_array_sort('{1,72,19,3,37,56,1000000,77}'::int4[]);
INFO:  Array length is 8
INFO:  No. 0-th element is 1
INFO:  No. 1-th element is 72
INFO:  No. 2-th element is 19
INFO:  No. 3-th element is 3
INFO:  No. 4-th element is 37
INFO:  No. 5-th element is 56
INFO:  No. 6-th element is 1000000
INFO:  No. 7-th element is 77
INFO:  Comparing int32 x=1 and y=72
INFO:  Comparing int32 x=72 and y=19
INFO:  Comparing int32 x=1 and y=37
INFO:  Comparing int32 x=37 and y=77
INFO:  Comparing int32 x=72 and y=37
INFO:  Comparing int32 x=77 and y=37
INFO:  Comparing int32 x=1000000 and y=37
INFO:  Comparing int32 x=56 and y=37
INFO:  Comparing int32 x=1 and y=37
INFO:  Comparing int32 x=19 and y=37
INFO:  Comparing int32 x=3 and y=37
INFO:  Comparing int32 x=3 and y=1
INFO:  Comparing int32 x=3 and y=19
INFO:  Comparing int32 x=72 and y=56
INFO:  Comparing int32 x=72 and y=1000000
INFO:  Comparing int32 x=1000000 and y=77
INFO:  Comparing int32 x=72 and y=77
        num_array_sort        
------------------------------
 {1,3,19,37,56,72,77,1000000}
(1 row)

edb=# 
edb=# select num_array_sort('{1,72,19.4,19.0,3,37,56,100000.3,77}'::float4[]);
INFO:  Array length is 9
INFO:  No. 0-th element is 1.000000
INFO:  No. 1-th element is 72.000000
INFO:  No. 2-th element is 19.400000
INFO:  No. 3-th element is 19.000000
INFO:  No. 4-th element is 3.000000
INFO:  No. 5-th element is 37.000000
INFO:  No. 6-th element is 56.000000
INFO:  No. 7-th element is 100000.296875
INFO:  No. 8-th element is 77.000000
INFO:  Comparing float4 x=1.000000 and y=72.000000
INFO:  Comparing float4 x=72.000000 and y=19.400000
INFO:  Comparing float4 x=1.000000 and y=3.000000
INFO:  Comparing float4 x=3.000000 and y=77.000000
INFO:  Comparing float4 x=72.000000 and y=3.000000
INFO:  Comparing float4 x=77.000000 and y=3.000000
INFO:  Comparing float4 x=100000.296875 and y=3.000000
INFO:  Comparing float4 x=56.000000 and y=3.000000
INFO:  Comparing float4 x=37.000000 and y=3.000000
INFO:  Comparing float4 x=1.000000 and y=3.000000
INFO:  Comparing float4 x=19.400000 and y=3.000000
INFO:  Comparing float4 x=19.000000 and y=3.000000
INFO:  Comparing float4 x=19.400000 and y=3.000000
INFO:  Comparing float4 x=19.400000 and y=19.000000
INFO:  Comparing float4 x=19.000000 and y=37.000000
INFO:  Comparing float4 x=72.000000 and y=37.000000
INFO:  Comparing float4 x=77.000000 and y=37.000000
INFO:  Comparing float4 x=100000.296875 and y=37.000000
INFO:  Comparing float4 x=56.000000 and y=37.000000
INFO:  Comparing float4 x=19.400000 and y=37.000000
INFO:  Comparing float4 x=19.400000 and y=19.000000
INFO:  Comparing float4 x=72.000000 and y=56.000000
INFO:  Comparing float4 x=72.000000 and y=100000.296875
INFO:  Comparing float4 x=100000.296875 and y=77.000000
INFO:  Comparing float4 x=72.000000 and y=77.000000
          num_array_sort          
----------------------------------
 {1,3,19,19.4,37,56,72,77,100000}
(1 row)

edb=# 
edb=# select num_array_sort('{1,72,19.4,19.0,3,37,56,100000.3,77}'::float8[]);
INFO:  Array length is 9
INFO:  No. 0-th element is 1.000000
INFO:  No. 1-th element is 72.000000
INFO:  No. 2-th element is 19.400000
INFO:  No. 3-th element is 19.000000
INFO:  No. 4-th element is 3.000000
INFO:  No. 5-th element is 37.000000
INFO:  No. 6-th element is 56.000000
INFO:  No. 7-th element is 100000.300000
INFO:  No. 8-th element is 77.000000
INFO:  Comparing float8 x=1.000000 and y=72.000000
INFO:  Comparing float8 x=72.000000 and y=19.400000
INFO:  Comparing float8 x=1.000000 and y=3.000000
INFO:  Comparing float8 x=3.000000 and y=77.000000
INFO:  Comparing float8 x=72.000000 and y=3.000000
INFO:  Comparing float8 x=77.000000 and y=3.000000
INFO:  Comparing float8 x=100000.300000 and y=3.000000
INFO:  Comparing float8 x=56.000000 and y=3.000000
INFO:  Comparing float8 x=37.000000 and y=3.000000
INFO:  Comparing float8 x=1.000000 and y=3.000000
INFO:  Comparing float8 x=19.400000 and y=3.000000
INFO:  Comparing float8 x=19.000000 and y=3.000000
INFO:  Comparing float8 x=19.400000 and y=3.000000
INFO:  Comparing float8 x=19.400000 and y=19.000000
INFO:  Comparing float8 x=19.000000 and y=37.000000
INFO:  Comparing float8 x=72.000000 and y=37.000000
INFO:  Comparing float8 x=77.000000 and y=37.000000
INFO:  Comparing float8 x=100000.300000 and y=37.000000
INFO:  Comparing float8 x=56.000000 and y=37.000000
INFO:  Comparing float8 x=19.400000 and y=37.000000
INFO:  Comparing float8 x=19.400000 and y=19.000000
INFO:  Comparing float8 x=72.000000 and y=56.000000
INFO:  Comparing float8 x=72.000000 and y=100000.300000
INFO:  Comparing float8 x=100000.300000 and y=77.000000
INFO:  Comparing float8 x=72.000000 and y=77.000000
           num_array_sort           
------------------------------------
 {1,3,19,19.4,37,56,72,77,100000.3}
(1 row)

edb=# 
edb=# select num_array_sort('{1,72,19.4,19.0,3,37,56,100000.3,77}'::real[]);
INFO:  Array length is 9
INFO:  No. 0-th element is 1.000000
INFO:  No. 1-th element is 72.000000
INFO:  No. 2-th element is 19.400000
INFO:  No. 3-th element is 19.000000
INFO:  No. 4-th element is 3.000000
INFO:  No. 5-th element is 37.000000
INFO:  No. 6-th element is 56.000000
INFO:  No. 7-th element is 100000.296875
INFO:  No. 8-th element is 77.000000
INFO:  Comparing float4 x=1.000000 and y=72.000000
INFO:  Comparing float4 x=72.000000 and y=19.400000
INFO:  Comparing float4 x=1.000000 and y=3.000000
INFO:  Comparing float4 x=3.000000 and y=77.000000
INFO:  Comparing float4 x=72.000000 and y=3.000000
INFO:  Comparing float4 x=77.000000 and y=3.000000
INFO:  Comparing float4 x=100000.296875 and y=3.000000
INFO:  Comparing float4 x=56.000000 and y=3.000000
INFO:  Comparing float4 x=37.000000 and y=3.000000
INFO:  Comparing float4 x=1.000000 and y=3.000000
INFO:  Comparing float4 x=19.400000 and y=3.000000
INFO:  Comparing float4 x=19.000000 and y=3.000000
INFO:  Comparing float4 x=19.400000 and y=3.000000
INFO:  Comparing float4 x=19.400000 and y=19.000000
INFO:  Comparing float4 x=19.000000 and y=37.000000
INFO:  Comparing float4 x=72.000000 and y=37.000000
INFO:  Comparing float4 x=77.000000 and y=37.000000
INFO:  Comparing float4 x=100000.296875 and y=37.000000
INFO:  Comparing float4 x=56.000000 and y=37.000000
INFO:  Comparing float4 x=19.400000 and y=37.000000
INFO:  Comparing float4 x=19.400000 and y=19.000000
INFO:  Comparing float4 x=72.000000 and y=56.000000
INFO:  Comparing float4 x=72.000000 and y=100000.296875
INFO:  Comparing float4 x=100000.296875 and y=77.000000
INFO:  Comparing float4 x=72.000000 and y=77.000000
          num_array_sort          
----------------------------------
 {1,3,19,19.4,37,56,72,77,100000}
(1 row)

edb=# 

上面的示範中,放了一些旗子作標示,所以輸出內容多了一點~只要註解掉重編譯一份就可以了~


這篇筆記就紀錄到這邊,這份筆記也有在 EDB 11 測試過,幾乎是通用的。
這一篇筆記,可以說是作為 PGSQL 使用者的最後一哩路~接著就。。。往原始碼貢獻者邁進惹~

雖然開源專案的原始碼都放在那邊,讓所有路人都可以打開來看一看(尤其像 PostgreSQL 這樣一個不只是 Open Source SOftware,更是 Free and Open Source Software)。但其實要有辦法切入程式碼,還是需要對軟體的整體運作有一定程度的了解才行。
例如,在 PostgreSQL 這邊,除了作為一般用戶要了解的 RDBMS 操作(CRUD、Transaction、Isolation Level、Locking、Driver)之外,還得要有架構面(Query Processing、Session/Memory Resource Management、Storage Internal、系統表運作、Write-Ahead Log 機制)、維運功能面向(備份機制的設計、Replication)以及 Stored Function/Procedure 的操作與認識(包含 Aggregation Function、 Trigger / Event Trigger 與 Datatype / Operator 之類)。還得去看一點 The Internals of PostgreSQL、聽一點 Youtube 上面 “Hacking PostgreSQL” 相關的演講(與投影片),才比較有概念了解怎麼切入程式碼跟手冊的相關頁面閱讀。。。不會僅僅是「Backend Developer」或是「Database Administrator」這樣「對立」角色分野之下,可以單方面看到的;兩種視角只能綜合在一起,才可能看清楚程式碼的運作方式。

當然,這篇筆記,雖然是最後一哩路,其實也只是一個開始~PGSQL 不僅僅是 OSS 軟體,更作為一個 FOSS 軟體,只要還有人想要使用,就會一直有改進。了解了這些,並用正體中文作一個紀錄,希望可以幫助到有那麼樣的興趣的人,可以找到一點資源,切入這個世界的角落。


參考資料


用 C 寫 Stored Procedure
PostgreSQL 扩展开发基础教程 - ≈正念≈Markdown 版本~
PostgreSQL: Documentation: Current: 38.10. C-Language Functions
Writing PostgreSQL Functions in C LG #139
Creating, building and using PostgreSQL extensions in C/C++ – Sebastian Mader
Creating New Modules using PostgreSQL Create Extension | Severalnines
PostgreSQL: Simple C extension Development for a Novice User (and Performance Advantages) - Percona Database Performance Blog
Writing Postgres Extensions - The Basics - Big elephants
Writing Postgres Extensions - Types and Operators - Big elephants
Writing Postgres Extensions - Debugging - Big elephants
Writing Postgres Extensions - Testing - Big elephants
Writing Postgres Extensions Code Organization and Versioning - Big elephants
Writing PostgreSQL Extensions is Fun – C Language - Percona Database Performance Blog
Overview of Server-Side Programming in PostgreSQL - pgDash
A Faster, Lightweight Trigger Function in C for PostgreSQL - Percona Database Performance Blog
Jignesh Shah's Blog: PostgreSQL Extensions - A Deeper Look
A Guide to Create User-Defined Extension Modules to Postgres – Highgo Software Inc.
How to create, test and debug an extension written in C for PostgreSQL – Highgo Software Inc.
Creating a PostgreSQL procedural language - Part 1 - Setup - 2ndQuadrant | PostgreSQL
Creating a PostgreSQL procedural language – Part 2 – Embedding Julia - 2ndQuadrant | PostgreSQL
Creating a PostgreSQL procedural language - Part 3 - Executing User Code - 2ndQuadrant | PostgreSQL
Creating a PostgreSQL procedural language - Part 4 - Handling Input Parameters - 2ndQuadrant | PostgreSQL
Creating a PostgreSQL procedural language - Part 5 - Returning Results - 2ndQuadrant | PostgreSQL
Postgresql 编写自定义 C 函数 | 学习笔记
https://gpdb.docs.pivotal.io/6-0Beta/ref_guide/sql_commands/CREATE_FUNCTION.html

程式碼裡面的 Hook
Hooks in postgresql-Guillaume LelargePGCon 2012 錄影
Using PostgreSQL Hooks | End Point
Hooks in Postgres - super-superuser restrictions | Michael Paquier - PostgreSQL committer
AmatanHead/psql-hooks: Unofficial documentation for PostgreSQL hooks.
フックとコールバックによる拡張 - PostgreSQL 雑記 - postgresqlグループ
基于PG数据库插件的SQL规范审核工具 - 云+社区 - 腾讯云
-基于PG数据库插件的SQL规范审核工具介绍-陈刚 | MySlide - 专注PPT分享,追随SlideShare和SpeakerDeck的脚步
blog/20180517_01.md at master · digoal/blog PostgreSQL HooK 介绍
PostgreSQL的hook机制初步学习 - 健哥的数据花园 - 博客园
PostgreSQL插件hook机制-yanzongshuai的专栏-51CTO博客PostgreSQL外掛hook機制 - IT閱讀
Hooks in PostgreSQL - 狮子歌歌的博客 | HD Blog

參與PGSQL專案的核心程式碼開發
https://www.citusdata.com/blog/2019/01/15/contributing-to-postgres/
FOSDEM 2018 - De-mystifying contributing to PostgreSQL
Introduction to Hacking PostgreSQL - Neil Conway原網站
The Internals of PostgreSQL : Introduction
[1901.01973] Looking Back at Postgres
A Study of the Contributors of PostgreSQL - Daniel M. German
Tips & Tricks for Navigating the PostgreSQL Community | Severalnines
How to contribute PostgreSQL - Hari Babu kommi
Hacking PostgreSQL - Stephen Frost
Contributing to Postgres - DZone Database
How people and companies can contribute to PostgreSQL - FUJITSU Enterprise Postgres
So, you want to be a developer? - PostgreSQL wiki
I am Developer! (And You Can Too!) - 2ndQuadrant | PostgreSQL
Postgres is the coolest database - Reason #4: It is extendable - 2ndQuadrant | PostgreSQL
PostgreSQL 源码解读(150)- PG Tools#2(BaseBackup函数)_ITPUB博客
NotifyMyFrontEnd 函数背后的数据缓冲区(三) - 健哥的数据花园 - 博客园
Postgres Execution Plans — Field Glossary - pgMustard - Medium
Trace Query Processing Internals with Debugger – Highgo Software Inc.
PgSQL · 最佳实践 · 回归测试探寻 | 数据库内核月报 - 2019 / 09
How to build and debug PostgreSQL 12 using latest Eclipse IDE on Ubuntu 18.04 – Highgo Software Inc.

ictlyh/SourceCodeAnalysis: Source code analysis of Impala, PostgreSQL, Citus and Postgres-XL
How to learn Postgresql and its internals? - Stack Overflow

A simple way found a bug born in 1997 - Highgo Software Inc.

pjungwir/aggs_for_vecs: Postgres C extension with aggregate functions for array inputs
scottgs/libpgam: PostgreSQL extensions for array math and other operations on array data
https://github.com/apastor/entropy-postgresql
sorting - C library function to do sort - Stack Overflow
Comparator function of qsort() in C - GeeksforGeeks
C library function - qsort()
Array Sort Function (The GNU C Library)
https://www.google.com/search?q=c+sorting+function
PostgreSQL: Documentation: Current: F.18. intarray
c++ - Void* array casting to float, int32, int16, etc - Stack Overflow
postgresql - Postgres C Function - Passing & Returning Numerics - Stack Overflow

https://github.com/postgres/postgres/blob/master/src/backend/executor/execMain.c
https://github.com/postgres/postgres/blob/master/contrib/earthdistance/earthdistance.c
https://github.com/postgres/postgres/tree/master/contrib/auth_delay
https://github.com/postgres/postgres/blob/master/src/backend/tcop/postgres.c
https://github.com/postgres/postgres/blob/master/src/backend/libpq/pqformat.c
https://github.com/postgres/postgres/blob/master/src/backend/tcop/postgres.c#POSTGRES main processing loop begins here
https://github.com/postgres/postgres/blob/master/src/backend/tcop/postgres.c#whereToSendOutput
https://github.com/postgres/postgres/blob/master/src/backend/libpq/pqcomm.c#pq_flush


幾個演講
Hacking PostgreSQL: Data Access Methods | UrFUx on edX - YouTube
Hacking PostgreSQL by Stephen Frost (FOSDEM’ 19) - YouTube
Stephen Frost: Hacking on PostgreSQL -- PGCon 2018 - YouTube
Hacking on PostgreSQL (PGCONF’ 13) - YouTube
How does PostgreSQL actually work? By Postgres Global Development Core-Team Member - YouTube
Hacking PostgreSQL Part 1 - YouTube
FOSDEM - video recordings
DB Hacking - PostgreSQL - YouTube

沒有留言:

張貼留言