JSON创建索引概述 
对于JSON数据来说并没有什么特殊的SQL数据类型,所以创建索引的方式也并无特别之处。可以对json_value函数返回值创建B-tree索引或者位图索引,也可以对is json、js not json、json_exists等函数创建位图索引。
一般来说,位图索引通常是创建在那些唯一值很少的字段上,比如性别之类。
函数索引只会在对应函数的查询sql上才会生效,这在具体使用中无可避免存在一些局限性,所以在创建这些函数索引之前需要充分了解到业务经常使用的场景。
对于那些查询场景不确定的情况也有一种对应方式,那就是JSON search index,这是一种常规索引,不局限于特定的JSON路径表达式。它非常适合结构式的查询,比如查找JSON字段中的具体值,并且通过json_textcontains来进行全文查询,在各种字符串值中寻找特定的单词。
对于同一个JSON列来说,可以同时创建函数索引和全文索引,这里要注意的是JSON search index是专门用来对应JSON数据的全文索引。
Oracle建议使用AL32UTF8字符集,否则创建和使用索引时会自动的进行字符集的转换,而这种转换是有损的,也就表示某些数据查询的时候会无法返回
 
创建一张测试JSON表
1 2 3 4 5 6 7 drop table json_test purge; create table json_test (     id raw(16) not null,     document clob     CONSTRAINT ensure_json CHECK (document IS JSON) ); 
插入测试数据
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 INSERT INTO json_test   VALUES (SYS_GUID(),           '{ "PONumber"             : 1600,           "Reference"            : "ABULL-20140421",           "Requestor"            : "Alexis Bull",           "User"                 : "ABULL",           "CostCenter"           : "A50",           "ShippingInstructions" : { "name"   : "Alexis Bull",                                      "Address": { "street"  : "200 Sporting Green",                                                   "city"    : "South San Francisco",                                                   "state"   : "CA",                                                   "zipCode" : 99236,                                                   "country" : "United States of America" },                                      "Phone" : [ { "type" : "Office", "number" : "909-555-7307" },                                                  { "type" : "Mobile", "number" : "415-555-1234" } ] },           "Special Instructions" : null,           "AllowPartialShipment" : false,           "LineItems"            : [ { "ItemNumber" : 1,                                        "Part"       : { "Description" : "One Magic Christmas",                                                         "UnitPrice"   : 19.95,                                                         "UPCCode"     : 13131092899 },                                        "Quantity"   : 9.0 },                                      { "ItemNumber" : 2,                                        "Part"       : { "Description" : "Lethal Weapon",                                                         "UnitPrice"   : 19.95,                                                         "UPCCode"     : 85391628927 },                                        "Quantity"   : 5.0 } ] }'); 
创建位图索引 
创建基于json_exists返回数据的位图索引,这种索引就非常适合json_exists,因为返回的结果只有true或者false
1 2 3 4 create bitmap index idx_name 	on json_test (json_exists(document, '$.ShippingInstructions.name')); Index created. 
创建基于json_value函数返回数据的位图索引,这种就适合字段里只有少量可能的数据的情况
1 2 3 4 drop index idx_name; create bitmap index idx_name 	on json_test (json_value(document, '$.ShippingInstructions')); 
创建JSON_VALUE函数索引 
可以创建基于JSON_VALUE的函数索引,标准语法即可,显式的指定函数json_value,或者直接简单的用点表示法的语法。这些方法创建的索引对基于点表示法查询和json_value查询都可应用。
1 CREATE UNIQUE INDEX po_num_idx ON json_test po (po.document.PONumber); 
这里基于点表示法创建了一个函数索引,另外也可以通过json_value函数来创建基于POnumber字段的索引
1 2 create unique index po_num_idx2  on json_test (json_value(document, '$.PONumber' returning number error on error null on empty)); 
注意这里的ERROR ON ERROR表示如果数据包含不存在的POnumber的记录、有多个PONumber、或者有个PONumber不是数字型则索引就会创建失败,如果这个索引已经存在,则不符合要求的数据会插入失败。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 set autotrace traceonly select a.document."User",a.document."CostCenter" from json_test a where json_value(document, '$.PONumber' returning number error on error null on empty) = 1600; Execution Plan ---------------------------------------------------------- Plan hash value: 1738154326 -------------------------------------------------------------------------------------------- | Id  | Operation                    | Name        | Rows  | Bytes | Cost (%CPU)| Time     | -------------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT             |             |  8168 |    27M|    30   (0)| 00:00:01 | |   1 |  NESTED LOOPS                |             |  8168 |    27M|    30   (0)| 00:00:01 | |   2 |   TABLE ACCESS BY INDEX ROWID| JSON_TEST   |     1 |  3489 |     1   (0)| 00:00:01 | |*  3 |    INDEX UNIQUE SCAN         | PO_NUM_IDX2 |     1 |       |     1   (0)| 00:00:01 | |   4 |   JSONTABLE EVALUATION       |             |       |       |            |          | -------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): ---------------------------------------------------    3 - access(JSON_VALUE("DOCUMENT" /*+ LOB_BY_VALUE */  FORMAT JSON , '$.POnumber'               RETURNING NUMBER ERROR ON ERROR NULL ON EMPTY)=1600) 
如果使用点表示法来查询,则同样可以使用这个索引
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 select a.document."User",a.document."CostCenter" from json_test a where a.document."PONumber" = 1600; Execution Plan ---------------------------------------------------------- Plan hash value: 1738154326 -------------------------------------------------------------------------------------------- | Id  | Operation                    | Name        | Rows  | Bytes | Cost (%CPU)| Time     | -------------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT             |             |  8168 |    27M|    30   (0)| 00:00:01 | |   1 |  NESTED LOOPS                |             |  8168 |    27M|    30   (0)| 00:00:01 | |   2 |   TABLE ACCESS BY INDEX ROWID| JSON_TEST   |     1 |  3489 |     1   (0)| 00:00:01 | |*  3 |    INDEX UNIQUE SCAN         | PO_NUM_IDX2 |     1 |       |     1   (0)| 00:00:01 | |   4 |   JSONTABLE EVALUATION       |             |       |       |            |          | -------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): ---------------------------------------------------    3 - access(JSON_VALUE("DOCUMENT" /*+ LOB_BY_VALUE */  FORMAT JSON , '$.PONumber'               RETURNING NUMBER ERROR ON ERROR NULL ON EMPTY)=1600) 
这里可以看到,尽管使用点表示法时,但是谓词仍将其看做JSON_VALUE格式的数据而使用json_value的索引,并没有看见其进行隐式转换成JSON_QUERY,这里主要是优化器进行了更优解而选择了json_value表达式的索引。
而当我们不使用这个索引时,就会看到预期的json_query转换
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 drop index po_num_idx2; select a.document."User",a.document."CostCenter" from json_test a where a.document."PONumber" = '1600'; Execution Plan ---------------------------------------------------------- Plan hash value: 4223338378 ------------------------------------------------------------------------------------------- | Id  | Operation                    | Name       | Rows  | Bytes | Cost (%CPU)| Time     | ------------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT             |            |  8168 |    27M|    30   (0)| 00:00:01 | |   1 |  NESTED LOOPS                |            |  8168 |    27M|    30   (0)| 00:00:01 | |   2 |   TABLE ACCESS BY INDEX ROWID| JSON_TEST  |     1 |  3496 |     1   (0)| 00:00:01 | |*  3 |    INDEX UNIQUE SCAN         | PO_NUM_IDX |     1 |       |     1   (0)| 00:00:01 | |   4 |   JSONTABLE EVALUATION       |            |       |       |            |          | ------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): ---------------------------------------------------    3 - access(JSON_QUERY("DOCUMENT" /*+ LOB_BY_VALUE */  FORMAT JSON ,               '$.PONumber' RETURNING VARCHAR2(4000) ASIS  WITHOUT ARRAY WRAPPER NULL ON               ERROR)='1600') 
Oracle建议使用下面的方式对json_value创建函数索引
点表示法
索引值可以灵活对应点表示法的查询,无论何时都会返回JSON数据。包含非标量的JSON数据,可以匹配除了json_value查询的点表示法查询。
 
json_value表达式,指定RETURNING数据类型,使用ERROR ON ERROR(可选NULL ON EMPTY)
索引值只有指定数据类型的标量数据,但是仍可以适用于那些返回标量数据的点表示法查询。
 
 
JSON_VALUE索引和查询的数据类型注意事项 
默认情况下,json_value函数返回的是varchar2类型数据,当创建一个基于json_value的函数索引时,除非指定RETURNING语句返回一个不同类型的数据,否则当谓词条件是一个非varchar2值的时候,就并不会适用索引。
返回值为varchar2类型,谓词为number型
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 select a.document."User",a.document."CostCenter" from json_test a where json_value(document, '$.PONumber' error on error null on empty) = 1600; Execution Plan ---------------------------------------------------------- Plan hash value: 2422004586 ------------------------------------------------------------------------------------ | Id  | Operation              | Name      | Rows  | Bytes | Cost (%CPU)| Time     | ------------------------------------------------------------------------------------ |   0 | SELECT STATEMENT       |           |    82 |   278K|    32   (0)| 00:00:01 | |   1 |  NESTED LOOPS          |           |    82 |   278K|    32   (0)| 00:00:01 | |   2 |   TABLE ACCESS FULL    | JSON_TEST |     1 |  3476 |     3   (0)| 00:00:01 | |*  3 |   JSONTABLE EVALUATION |           |       |       |            |          | ------------------------------------------------------------------------------------ Predicate Information (identified by operation id): ---------------------------------------------------    3 - filter(TO_NUMBER("P"."C_01$")=1600) 
多JSON字段创建复合索引 
当对JSON对象的多个字段建复合索引时,首选需要对其创建虚拟列,然后在虚拟列上创建B-tree索引。
1 2 3 4 5 6 7 alter table json_test add (userid varchar2(50) generated always as (json_value(document, '$.User' returning varchar2(50)))); alter table json_test add (cost varchar2(50) generated always as (json_value(document, '$.CostCenter' returning varchar2(50)))); create index idx_json on json_test(userid,cost); 
两种方式查询可以用到复合索引
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 select document from json_test where userid='ABULL' and cost='A50'; Execution Plan ---------------------------------------------------------- Plan hash value: 521514918 ------------------------------------------------------------------------------------------------- | Id  | Operation                           | Name      | Rows  | Bytes | Cost (%CPU)| Time     | ------------------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT                    |           |     1 |  3496 |     1   (0)| 00:00:01 | |   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| JSON_TEST |     1 |  3496 |     1   (0)| 00:00:01 | |*  2 |   INDEX RANGE SCAN                  | IDX_JSON  |     1 |       |     1   (0)| 00:00:01 | ------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): ---------------------------------------------------    2 - access("USERID"='ABULL' AND "COST"='A50') 
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 select document from json_test where json_value(document,'$.User' returning varchar2(50))='ABULL' and json_value(document,'$.CostCenter' returning varchar2(50))='A50'; Execution Plan ---------------------------------------------------------- Plan hash value: 521514918 ------------------------------------------------------------------------------------------------- | Id  | Operation                           | Name      | Rows  | Bytes | Cost (%CPU)| Time     | ------------------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT                    |           |     1 |  3496 |     1   (0)| 00:00:01 | |   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| JSON_TEST |     1 |  3496 |     1   (0)| 00:00:01 | |*  2 |   INDEX RANGE SCAN                  | IDX_JSON  |     1 |       |     1   (0)| 00:00:01 | ------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): ---------------------------------------------------    2 - access("JSON_TEST"."USERID"='ABULL' AND "JSON_TEST"."COST"='A50') 
JSON全文索引 
JSON search索引是一个常规索引,它可以提高临时结构查询和全文搜索的性能。 这是一个Oracle Text索引,专门设计用于JSON数据。
如果你使用12cR1版本创建过JSON search索引,则在新版本之后建议drop后重建
 
JSON search index 
通过关键字FOR JSON来执行CREATE SEARCH INDEX语句进行索引的创建,JSON全文索引的维护是异步的,这样可以避免在业务高峰期的索引维护成本,而选择在业务低谷的时候进行这些消耗资源的操作。异步维护也就意味着在索引同步之前,那些被修改的数据或新插入的数据并不会在索引里存在记录。
创建JSON全文索引
1 create search index idx_search on json_test(document) for json; 
当JSON全文索引创建完毕以后,就可以使用json_textcontains来进行查询
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 select document from json_test where json_textcontains(document, '$.LineItems.Part.Description', 'Magic'); Execution Plan ---------------------------------------------------------- Plan hash value: 1340174430 ------------------------------------------------------------------------------------------ | Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     | ------------------------------------------------------------------------------------------ |   0 | SELECT STATEMENT            |            |     1 |  3488 |     1   (0)| 00:00:01 | |   1 |  TABLE ACCESS BY INDEX ROWID| JSON_TEST  |     1 |  3488 |     1   (0)| 00:00:01 | |*  2 |   DOMAIN INDEX              | IDX_SEARCH |       |       |     1   (0)| 00:00:01 | ------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): ---------------------------------------------------    2 - access("CTXSYS"."CONTAINS"("JSON_TEST"."DOCUMENT",'Magic INPATH               (/LineItems/Part/Description)')>0) 
执行计划中出现了DOMAIN INDEX表示sql使用了全文索引。
上面的例子仅查询了描述字段里出现Magic的情况,而比如想查询Description同时包含Magic和Christmas的情况则需要使用操作符&或and
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 select document from json_test where json_textcontains(document, '$.LineItems.Part.Description', 'Magic \& Christmas'); { "PONumber"             : 1600,           "Reference"            : "ABULL-20140421",           "Requestor"            : "Alexis Bull",           "User"                 : "ABULL",           "CostCenter"           : "A50",           "ShippingInstructions" : { "name"   : "Alexis Bull",                                      "Address": { "street"  : "200 Sporting Green",                                                   "city"    : "South San Francisco",                                                   "state"   : "CA",                                                   "zipCode" : 99236,                                                   "country" : "United States of America" },                                      "Phone" : [ { "type" : "Office", "number" : "909-555-7307" },                                                  { "type" : "Mobile", "number" : "415-555-1234" } ] },           "Special Instructions" : null,           "AllowPartialShipment" : false,           "LineItems"            : [ { "ItemNumber" : 1,                                        "Part"       : { "Description" : "One Magic Christmas",                                                         "UnitPrice"   : 19.95,                                                         "UPCCode"     : 13131092899 },                                        "Quantity"   : 9.0 },                                      { "ItemNumber" : 2,                                        "Part"       : { "Description" : "Lethal Weapon",                                                         "UnitPrice"   : 19.95,                                                         "UPCCode"     : 85391628927 },                                        "Quantity"   : 5.0 } ] } 
这里使用了反斜杠转义&。
也可以用and,但是注意这里查询包含保留字and,所以需要加上花括号{}
1 2 select document from json_test where json_textcontains(document, '$.LineItems.Part.Description', '{Magic and Christmas}'); 
JSON全文索引同样支持那些不包含json_textcontains的查询
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 SELECT COUNT(*) FROM   json_test WHERE  JSON_EXISTS(document, '$.Reference'); Execution Plan ---------------------------------------------------------- Plan hash value: 2409884336 ------------------------------------------------------------------------------- | Id  | Operation        | Name       | Rows  | Bytes | Cost (%CPU)| Time     | ------------------------------------------------------------------------------- |   0 | SELECT STATEMENT |            |     1 |  3488 |     1   (0)| 00:00:01 | |   1 |  SORT AGGREGATE  |            |     1 |  3488 |            |          | |*  2 |   DOMAIN INDEX   | IDX_SEARCH |       |       |     1   (0)| 00:00:01 | ------------------------------------------------------------------------------- Predicate Information (identified by operation id): ---------------------------------------------------    2 - access("CTXSYS"."CONTAINS"("JSON_TEST"."DOCUMENT",'HASPATH(/Refer               ence)')>0) 
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 SELECT COUNT(*) FROM   json_test where json_value(document, '$.ShippingInstructions.name') = 'Alexis Bull'; Execution Plan ---------------------------------------------------------- Plan hash value: 3056167535 ------------------------------------------------------------------------------------------- | Id  | Operation                    | Name       | Rows  | Bytes | Cost (%CPU)| Time     | ------------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT             |            |     1 |  3488 |     1   (0)| 00:00:01 | |   1 |  SORT AGGREGATE              |            |     1 |  3488 |            |          | |*  2 |   TABLE ACCESS BY INDEX ROWID| JSON_TEST  |     1 |  3488 |     1   (0)| 00:00:01 | |*  3 |    DOMAIN INDEX              | IDX_SEARCH |       |       |     1   (0)| 00:00:01 | ------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): ---------------------------------------------------    2 - filter(JSON_VALUE("DOCUMENT" FORMAT JSON , '$.ShippingInstructions.name'               RETURNING VARCHAR2(4000) NULL ON ERROR)='Alexis Bull')    3 - access("CTXSYS"."CONTAINS"("JSON_TEST"."DOCUMENT",'{Alexis Bull} INPATH               (/ShippingInstructions/name)')>0)