绑定完请刷新页面
取消
刷新

分享好友

×
取消 复制
在PostgreSQL中构建基础类型
2020-06-17 17:05:50

近在做一些功能兼容的工作,做了一些数据类型的工作。其中一部分是添加新的基础类型,很多细节值得记录一下,在此进行步骤介绍以及代码介绍。

此次要添加一种新的数据类型,这种类型使用的算法和PostgreSQL使用相同的input和output函数。那么接下来就将所有步骤进行演示。

类型信息

  1. 类型名,newtype;
  2. 类型长度,pg中显示-1,定长,和bigint相同,实际占用8个字节;
  3. 使用bigint(int8或int64)的input、output、send、receive函数;
  4. 能够进行显示、排序、简单比较、建立索引;

实现方法

  1. 定义基本函数:
  • 使用unused_oids,确定oid
  • newtypein,newtypeout,newtyperecv,new*end[1]
  • 在系统表pg_proc中注册相关以上4个函数

2. 定义基础类型:

  • 在系统表pg_type中注册类型newtype
  • 在bootstrap注册基础类型[2]

3. 实验效果:

shawn@B-D53RLVDL-1650 bin % ./psql postgres
psql (13devel)
Type "help" for help.

postgres=# create table testnewtype(t1 newtype);
CREATE TABLE
postgres=# insert into testnewtype values ('1'),('2');
INSERT  2
postgres=# checkpoint ;
CHECKPOINT
postgres=# select * from testnewtype ;
 t1 
----
 1
 2
(2 rows)

postgres=# 

4. 排序问题

按照以上步骤建立基础类型后,是无法直接进行排序操作的。

postgres=# select * from testnewtype order by t1;
2020-04-12 21:43:21.710 CST [60897] ERROR:  could not identify an ordering operator for type newtype at character 36
2020-04-12 21:43:21.710 CST [60897] HINT:  Use an explicit ordering operator or modify the query.
2020-04-12 21:43:21.710 CST [60897] STATEMENT:  select * from testnewtype order by t1;
ERROR:  could not identify an ordering operator for type newtype
LINE 1: select * from testnewtype order by t1;
                                           ^
HINT:  Use an explicit ordering operator or modify the query.
postgres=#

那么接下来就需要支持排序。

4.1 根据以上报错信息可以找到以下函数堆:

(gdb) bt
#0  get_sort_group_operators (argtype=argtype@entry=562, needLT=needLT@entry=true, 
    needEQ=needEQ@entry=true, needGT=needGT@entry=false, ltOpr=ltOpr@entry=0x7ffe6113a108, 
    eqOpr=eqOpr@entry=0x7ffe6113a10c, gtOpr=gtOpr@entry=0x0, 
    isHashable=isHashable@entry=0x7ffe6113a106) at parse_oper.c:191
#1  0x0000000000579b75 in addTargetToSortList (pstate=pstate@entry=0x216adf8, tle=0x216b4a0, 
    sortlist=sortlist@entry=0x0, targetlist=0x216b5b0, sortby=0x216acd8)
    at parse_clause.c:3299
#2  0x0000000000579d4e in transformSortClause (pstate=pstate@entry=0x216adf8, 
    orderlist=0x216ad28, targetlist=targetlist@entry=0x216af58, 
    exprKind=exprKind@entry=EXPR_KIND_ORDER_BY, useSQL99=useSQL99@entry=false)
    at parse_clause.c:2607
#3  0x000000000055b444 in transformSelectStmt (stmt=0x216ab10, pstate=0x216adf8)
    at analyze.c:1244
#4  transformStmt (pstate=pstate@entry=0x216adf8, parseTree=0x216ab10) at analyze.c:301
#5  0x000000000055cdc8 in transformOptionalSelectInto (pstate=pstate@entry=0x216adf8, 
    parseTree=<optimized out>) at analyze.c:246
#6  0x000000000055cece in transformTopLevelStmt (parseTree=0x216ad78, pstate=0x216adf8)
    at analyze.c:196
#7  parse_analyze (parseTree=0x216ad78, 
    sourceText=0x2169ef8 "select * from ridt order by r1;", param*=0x0, numParams=0, 
    queryEnv=0x0) at analyze.c:116
#8  0x000000000077f77c in pg_analyze_and_rewrite (parsetree=parsetree@entry=0x216ad78, 
    query_string=query_string@entry=0x2169ef8 "select * from ridt order by r1;", 
    param*=param*@entry=0x0, numParams=numParams@entry=0, queryEnv=queryEnv@entry=0x0)
    at postgres.c:691
#9  0x000000000077fc4c in exec_simple_query (
    query_string=0x2169ef8 "select * from ridt order by r1;") at postgres.c:1155
#10 0x00000000007810be in PostgresMain (argc=<optimized out>, argv=argv@entry=0x2194f90, 
    dbname=0x2194eb8 "postgres", username=<optimized out>) at postgres.c:4298
#11 0x0000000000480ec6 in BackendRun (port=<optimized out>, port=<optimized out>)
    at postmaster.c:4510
#12 BackendStartup (port=0x218ce90) at postmaster.c:4202
#13 ServerLoop () at postmaster.c:1727
#14 0x000000000070b92e in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x2164ac0)
    at postmaster.c:1400
#15 0x0000000000481bff in main (argc=3, argv=0x2164ac0) at main.c:210
(gdb) 

PostgreSQL会在函数get_sort_group_operators内进行操作符查找:

/* get_sort_group_operators代码片段 */	
typentry = lookup_type_cache(argtype, cache_flags);

首先,函数经过lookup_type_cache,这个函数会查找pg_opclass,而这个表依赖于pg_opfamily的信息,还会查找pg_amop,pg_amop依赖于pg_operator的定义。而pg_operator又依赖于pg_proc的函数定义。

/* get_sort_group_operators代码片段 */
lt_opr = typentry->lt_opr;
eq_opr = typentry->eq_opr;
gt_opr = typentry->gt_opr;

以上表示需要三种操作符,lt,eq,gt,以及操作符对应的三种函数。

这就需要在newtype.c中添加函数体,以及在pg_proc中添加函数定义,在pg_operator添加操作符定义。

那么添加完以上代码后,会发现:

postgres=# select * from testnewtype where t1 < '2' order by t1;
2020-04-13 22:36:04.644 CST [31149] ERROR:  XX000: missing support function 1(564,564) in opfamily 567
2020-04-13 22:36:04.644 CST [31149] LOCATION:  FinishSortSupportFunction, sortsupport.c:119
2020-04-13 22:36:04.644 CST [31149] STATEMENT:  select * from testnewtype where t1 < '2' order by t1;
ERROR:  missing support function 1(564,564) in opfamily 567
postgres=#

首先根据定义找到函数栈:

(gdb) bt
#0  FinishSortSupportFunction (opfamily=567, opcintype=564, ssup=0x206a9c8) at sortsupport.c:95
#1  0x000000000089dc9d in PrepareSortSupportFromOrderingOp (orderingOp=570, ssup=0x206a9c8) at sortsupport.c:149
#2  0x00000000008a35e9 in tuplesort_begin_heap (tupDesc=<optimized out>, nkeys=1, attNums=0x20b3a40, 
    sortOperators=0x20b3a58, sortCollations=0x20b3a70, nullsFirstFlags=0x20b3a88, workMem=4096, 
    coordinate=coordinate@entry=0x0, randomAccess=false) at tuplesort.c:862
#3  0x00000000006429b4 in ExecSort (pstate=0x20447a0) at nodeSort.c:89
#4  0x0000000000618322 in ExecProcNode (node=0x20447a0) at ../../../src/include/executor/executor.h:245
#5  ExecutePlan (execute_once=<optimized out>, dest=0x20b3f40, direction=<optimized out>, numberTuples=0, 
    sendTuples=true, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x20447a0, estate=0x2044578)
    at execMain.c:1646
#6  standard_ExecutorRun (queryDesc=0x200ef28, direction=<optimized out>, count=0, execute_once=<optimized out>)
    at execMain.c:364
#7  0x000000000076838b in PortalRunSelect (portal=portal@entry=0x1fbbec8, forward=forward@entry=true, count=0, 
    count@entry=9223372036854775807, dest=dest@entry=0x20b3f40) at pquery.c:912
#8  0x00000000007695f8 in PortalRun (portal=portal@entry=0x1fbbec8, count=count@entry=9223372036854775807, 
    isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x20b3f40, 
    altdest=altdest@entry=0x20b3f40, qc=qc@entry=0x7ffd12185180) at pquery.c:756
#9  0x000000000076532c in exec_simple_query (query_string=0x1f54ef8 "select * from newt order by t1;")
    at postgres.c:1239
#10 0x00000000007666b7 in PostgresMain (argc=<optimized out>, argv=argv@entry=0x1f7ffa0, 
    dbname=0x1f7fec8 "postgres", username=<optimized out>) at postgres.c:4315
---Type <return> to continue, or q <return> to quit---
#11 0x00000000004815bb in BackendRun (port=<optimized out>, port=<optimized out>) at postmaster.c:4510
#12 BackendStartup (port=0x1f77ea0) at postmaster.c:4202
#13 ServerLoop () at postmaster.c:1727
#14 0x00000000006f4533 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x1f4fac0) at postmaster.c:1400
#15 0x00000000004821fe in main (argc=3, argv=0x1f4fac0) at main.c:210
(gdb)

在函数FinishSortSupportFunction进行报错的。

而引起问题的部分代码为:

static void
FinishSortSupportFunction(Oid opfamily, Oid opcintype, SortSupport ssup)
{
	Oid			sortSupportFunction;

	/* Look for a sort support function */
	sortSupportFunction = get_opfamily_proc(opfamily, opcintype, opcintype,
}											BTSORTSUPPORT_PROC);

get_opfamily_proc的代码是:

Oid
get_opfamily_proc(Oid opfamily, Oid lefttype, Oid righttype, int16 procnum)
{
	HeapTuple	tp;
	Form_pg_amproc amproc_tup;
	RegProcedure result;

	tp = SearchSysCache4(AMPROCNUM,
						 ObjectIdGetDatum(opfamily),
						 ObjectIdGetDatum(lefttype),
						 ObjectIdGetDatum(righttype),
						 Int16GetDatum(procnum));
	if (!HeapTupleIsValid(tp))
		return InvalidOid;
	amproc_tup = (Form_pg_amproc) GETSTRUCT(tp);
	result = amproc_tup->amproc;
	ReleaseSysCache(tp);
	return result;
}											BTSORTSUPPORT_PROC);

这里能看到会在系统表pg_amproc查找数据,在ctid排序时能看到查找的是bttidcmp,那么就需要增加相关函数以及定义。

5. 完成以上步骤后的演示结果:

postgres=# create table testnewtype(t1 newtype);
CREATE TABLE
postgres=# insert into testnewtype values ('1'),('2'),('3'),('4'),('5'),('5'),('6');
INSERT 0 7
postgres=# select * from testnewtype where t1 < '6' order by t1;
 t1 
----
 1
 2
 3
 4
 5
 5
(6 rows)

postgres=#

6. 总结[3]

  1. 在这里一共添加了一个文件,src/backend/utils/adt/newtype.c;
  2. 修改系统表pg_proc,增加基础类型函数定义;
  3. 修改系统表pg_type,增加基础类型定义;
  4. 增加比较函数体;
  5. 增加比较函数定义;
  6. 在系统表pg_opfamily增加操作符族;
  7. 在pg_opcalss增加操作符类;
  8. 在pg_operator增加操作符,便于使用where查询;
  9. 在pg_amop增加与访问方法操作符族相关的操作符信息;
  10. 增加cmp函数体;
  11. 在pg_amproc增加关于访问方法操作符族相关的支持函数btnewtype定义。

参考

  1. ^复制的int8类型的相关函数
  2. ^routines to support running postgres in 'bootstrap' mode bootstrap mode is used to create the initial template database
  3. ^具体代码可以参看我的github https://github.com/wangguoke/postgres/commit/cb5ae3ca38965cb55ed780b46043d30927cebfbe
戳我,来吐槽~