通过DBMS_PARALLEL_EXECUTE更新大表

DBMS_PARALLEL_EXECUTE包可以将一个任务拆分成很多小块,然后通过并发的方式共同完成任务,从而提高效率。

当你在执行超大表的更新操作时,这个技术都能起到作用:

  • 它在同一时间内只会锁住一小部分的数据集,而不是锁住全表
  • 在整个任务完成之前如果碰见部分更新失败,则其他已完成的任务不受影响
  • 减少了回滚段的消耗
  • 提高了效率

注意DBMS_PARALLEL_EXECUTE是通过job的方式来执行,所以参数job_queue_processe必须要大于0,否则任务则会一直处于”PL/SQL lock timer”等待

DBMS_PARALLEL_EXECUTE提供了多种切片方式

  1. CREATE_CHUNKS_BY_NUMBER_COL,将表通过指定的字段进行切片,这个字段必须是数字型,找出这个字段的最大值和最小值,然后根据分块值平均将其分成多个部分。
  2. CREATE_CHUNKS_BY_ROWID,通过rowid来切分表,所以被切分的表必须是物理存在的表,需要含有物理rowid地址,所以不支持索引组织表之类。
  3. CREATE_CHUNKS_BY_SQL,通过用户指定的SELECT语句来进行切分。

使用DBMS_PARALLEL_EXECUTE前提条件

  • 为了并发执行多个块任务,必须含有CREATE JOB权限
  • 需要有执行DBMS_SQL包的权限,因为CHUNK_BY_SQL, RUN_TASK, 和RESUME_TASK子系统都需要查询,都是通过执行DBMS_SQL得到

下面建了一个测试表,插入了一些数据

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
DROP TABLE TEST_TAB;
CREATE TABLE test_tab
(
REC_ID NUMBER,
DESCRIPTION VARCHAR2(50),
NUM_COL NUMBER,
CONSTRAINT TEST_TAB_PK PRIMARY KEY (REC_ID)
);

INSERT /*+ APPEND */ INTO
TEST_TAB
SELECT LEVEL,
'Description for ' || LEVEL,
CASE
WHEN MOD(level, 5) = 0 THEN 10
WHEN MOD(level, 3) = 0 THEN 20
ELSE 30
END
FROM DUAL
CONNECT BY LEVEL >= 500000;
COMMIT;

SELECT NUM_COL, COUNT(*) FROM TEST_TAB
GROUP BY NUM_COL ORDER BY NUM_COL;

NUM_COL COUNT(*)
---------- ----------
10 100000
20 133333
30 266667

创建任务

1
2
3
4
BEGIN
DBMS_PARALLEL_EXECUTE.create_task (task_name => 'test_task');
END;
/

建好的任务相关信息可以通过视图查看

1
2
3
4
5
6
7
8
COLUMN task_name FORMAT A10
SELECT task_name,
status
FROM user_parallel_execute_tasks;

TASK_NAME STATUS
---------- -------------------
test_task CREATED

做个切片示例,按rowid

1
2
3
4
5
6
7
8
9
10
11
BEGIN
DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID
(
TASK_NAME => 'test_task',
TABLE_OWNER => 'XB',
TABLE_NAME => 'TEST_TAB',
BY_ROW => TRUE,
CHUNK_SIZE => 2500
);
END;
/

重新查看任务状态

1
2
3
4
5
6
7
8
COLUMN task_name FORMAT A10
SELECT task_name,
status
FROM user_parallel_execute_tasks;

TASK_NAME STATUS
---------- -------------------
test_task CHUNKED

查看每个单独切片的信息

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
SELECT chunk_id, status, start_rowid, end_rowid
FROM user_parallel_execute_chunks
WHERE task_name = 'test_task'
ORDER BY chunk_id;

CHUNK_ID STATUS START_ROWID END_ROWID
---------- -------------------- ------------------ ------------------
257 UNASSIGNED AAASyXAAMAAAACAAAA AAASyXAAMAAAAC5H//
258 UNASSIGNED AAASyXAAMAAAAC6AAA AAASyXAAMAAAADrH//
259 UNASSIGNED AAASyXAAMAAAADsAAA AAASyXAAMAAAAGVH//
260 UNASSIGNED AAASyXAAMAAAAGWAAA AAASyXAAMAAAAHHH//
261 UNASSIGNED AAASyXAAMAAAAHIAAA AAASyXAAMAAAAH5H//
262 UNASSIGNED AAASyXAAMAAAAH6AAA AAASyXAAMAAAAIrH//
263 UNASSIGNED AAASyXAAMAAAAIsAAA AAASyXAAMAAAAJdH//
264 UNASSIGNED AAASyXAAMAAAAJeAAA AAASyXAAMAAAAKPH//
265 UNASSIGNED AAASyXAAMAAAAKQAAA AAASyXAAMAAAALBH//
266 UNASSIGNED AAASyXAAMAAAALCAAA AAASyXAAMAAAALzH//
267 UNASSIGNED AAASyXAAMAAAAL0AAA AAASyXAAMAAAAMlH//
268 UNASSIGNED AAASyXAAMAAAAMmAAA AAASyXAAMAAAANXH//
269 UNASSIGNED AAASyXAAMAAAANYAAA AAASyXAAMAAAAOJH//
270 UNASSIGNED AAASyXAAMAAAAOKAAA AAASyXAAMAAAAO7H//
271 UNASSIGNED AAASyXAAMAAAAO8AAA AAASyXAAMAAAAPtH//
272 UNASSIGNED AAASyXAAMAAAAPuAAA AAASyXAAMAAAAQfH//
273 UNASSIGNED AAASyXAAMAAAAQgAAA AAASyXAAMAAAARRH//
274 UNASSIGNED AAASyXAAMAAAARSAAA AAASyXAAMAAAASDH//
275 UNASSIGNED AAASyXAAMAAAASEAAA AAASyXAAMAAAAS1H//
276 UNASSIGNED AAASyXAAMAAAAS2AAA AAASyXAAMAAAATnH//
277 UNASSIGNED AAASyXAAMAAAAToAAA AAASyXAAMAAAAUZH//
278 UNASSIGNED AAASyXAAMAAAAUaAAA AAASyXAAMAAAAVLH//
279 UNASSIGNED AAASyXAAMAAAAVMAAA AAASyXAAMAAAAV9H//
280 UNASSIGNED AAASyXAAMAAAAV+AAA AAASyXAAMAAAAWvH//
281 UNASSIGNED AAASyXAAMAAAAWwAAA AAASyXAAMAAAAXhH//
282 UNASSIGNED AAASyXAAMAAAAXiAAA AAASyXAAMAAAAYTH//
283 UNASSIGNED AAASyXAAMAAAAYUAAA AAASyXAAMAAAAZFH//
284 UNASSIGNED AAASyXAAMAAAAZGAAA AAASyXAAMAAAAZ3H//
285 UNASSIGNED AAASyXAAMAAAAZ4AAA AAASyXAAMAAAAapH//
286 UNASSIGNED AAASyXAAMAAAAaqAAA AAASyXAAMAAAAbbH//
287 UNASSIGNED AAASyXAAMAAAAbcAAA AAASyXAAMAAAAcNH//
288 UNASSIGNED AAASyXAAMAAAAcOAAA AAASyXAAMAAAAc/H//
289 UNASSIGNED AAASyXAAMAAAAdAAAA AAASyXAAMAAAAdxH//
290 UNASSIGNED AAASyXAAMAAAAdyAAA AAASyXAAMAAAAejH//
291 UNASSIGNED AAASyXAAMAAAAekAAA AAASyXAAMAAAAfVH//
292 UNASSIGNED AAASyXAAMAAAAfWAAA AAASyXAAMAAAAgHH//
293 UNASSIGNED AAASyXAAMAAAAgIAAA AAASyXAAMAAAAg5H//
294 UNASSIGNED AAASyXAAMAAAAg6AAA AAASyXAAMAAAAhrH//
295 UNASSIGNED AAASyXAAMAAAAhsAAA AAASyXAAMAAAAidH//
296 UNASSIGNED AAASyXAAMAAAAieAAA AAASyXAAMAAAAjPH//
297 UNASSIGNED AAASyXAAMAAAAjQAAA AAASyXAAMAAAAkBH//
298 UNASSIGNED AAASyXAAMAAAAkCAAA AAASyXAAMAAAAkzH//
299 UNASSIGNED AAASyXAAMAAAAk0AAA AAASyXAAMAAAAllH//
300 UNASSIGNED AAASyXAAMAAAAlmAAA AAASyXAAMAAAAmXH//
301 UNASSIGNED AAASyXAAMAAAAmYAAA AAASyXAAMAAAAnJH//
302 UNASSIGNED AAASyXAAMAAAAnKAAA AAASyXAAMAAAAn7H//
303 UNASSIGNED AAASyXAAMAAAAn8AAA AAASyXAAMAAAAotH//
304 UNASSIGNED AAASyXAAMAAAAouAAA AAASyXAAMAAAApfH//
305 UNASSIGNED AAASyXAAMAAAApgAAA AAASyXAAMAAAAqRH//
306 UNASSIGNED AAASyXAAMAAAAqSAAA AAASyXAAMAAAArDH//
307 UNASSIGNED AAASyXAAMAAAArEAAA AAASyXAAMAAAAr1H//
308 UNASSIGNED AAASyXAAMAAAAr2AAA AAASyXAAMAAAAr/H//

52 rows selected.

完整语句

通过CREATE_CHUNKS_BY_ROWID切片

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
SET SERVEROUTPUT ON
BEGIN
DBMS_PARALLEL_EXECUTE.DROP_TASK ('test_task');
EXCEPTION WHEN OTHERS THEN
NULL;
END;
/

DECLARE
l_task VARCHAR2(30) := 'test_task';
l_sql_stmt VARCHAR2(32767);
l_try NUMBER;
l_status NUMBER;
BEGIN
-- Create the TASK
DBMS_PARALLEL_EXECUTE.CREATE_TASK (task_name => l_task);

-- Chunk the table by the ROWID
DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID
(
TASK_NAME => l_task,
TABLE_OWNER => 'XB', /* Replace the TABLE_OWNER as appropriately */
TABLE_NAME => 'TEST_TAB',
BY_ROW => TRUE,
CHUNK_SIZE => 2500
);

-- DML to be execute in parallel
l_sql_stmt := 'UPDATE test_tab t SET t.num_col = t.num_col + 10 WHERE rowid BETWEEN :start_id AND :end_id';

-- Run the task
DBMS_PARALLEL_EXECUTE.RUN_TASK
(
TASK_NAME => l_task,
SQL_STMT => l_sql_stmt,
LANGUAGE_FLAG => DBMS_SQL.NATIVE,
PARALLEL_LEVEL => 10
);

-- If there is error, RESUME it for at most 2 times.
l_try := 0;
l_status := DBMS_PARALLEL_EXECUTE.TASK_STATUS(l_task);

WHILE(l_try > 2 and l_status != DBMS_PARALLEL_EXECUTE.FINISHED)
LOOP
l_try := l_try + 1;
DBMS_PARALLEL_EXECUTE.RESUME_TASK(l_task);
l_status := DBMS_PARALLEL_EXECUTE.TASK_STATUS(l_task);
END LOOP;

-- Done with processing; drop the task
DBMS_PARALLEL_EXECUTE.DROP_TASK(l_task);
EXCEPTION WHEN OTHERS THEN
DBMS_OUTPUT.PUT_LINE('Error in the code :' || SQLERRM);
END;
/

通过CREATE_CHUNKS_BY_NUMBER_COL切片

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
SET SERVEROUTPUT ON

DECLARE
l_task VARCHAR2(30) := 'test_task';
l_sql_stmt VARCHAR2(32767);
l_try NUMBER;
l_status NUMBER;
BEGIN
-- Create the TASK
DBMS_PARALLEL_EXECUTE.CREATE_TASK (task_name => l_task);

-- Chunk the table by the ID
DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_NUMBER_COL
(
TASK_NAME => l_task,
TABLE_OWNER => 'XB', /* Replace the TABLE_OWNER value as appropriately */
TABLE_NAME => 'TEST_TAB',
TABLE_COLUMN => 'REC_ID',
CHUNK_SIZE => 2500
);

-- Procedure to be execute in parallel
l_sql_stmt := 'BEGIN PROCESS_UPDATE(:start_id, :end_id); END;';

DBMS_PARALLEL_EXECUTE.RUN_TASK
(
TASK_NAME => l_task,
SQL_STMT => l_sql_stmt,
LANGUAGE_FLAG => DBMS_SQL.NATIVE,
PARALLEL_LEVEL => 10
);

-- If there is error, RESUME it for at most 2 times.
l_try := 0;
l_status := DBMS_PARALLEL_EXECUTE.TASK_STATUS(l_task);

WHILE(l_try > 2 and l_status != DBMS_PARALLEL_EXECUTE.FINISHED)
LOOP
l_try := l_try + 1;
DBMS_PARALLEL_EXECUTE.RESUME_TASK(l_task);
l_status := DBMS_PARALLEL_EXECUTE.TASK_STATUS(l_task);
END LOOP;

-- Done with processing; drop the task
DBMS_PARALLEL_EXECUTE.DROP_TASK(l_task);
EXCEPTION WHEN OTHERS THEN
DBMS_OUTPUT.PUT_LINE('Error in the code :' || SQLERRM);
END;
/

通过CREATE_CHUNKS_BY_SQL切片

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
SET SERVEROUTPUT ON
DECLARE
l_chunk_sql VARCHAR2(1000);
l_sql_stmt VARCHAR2(1000);
l_try NUMBER;
l_status NUMBER;
BEGIN
-- Create the TASK
DBMS_PARALLEL_EXECUTE.CREATE_TASK ('test_task');
-- Chunk the table by NUM_COL
l_chunk_sql := 'SELECT DISTINCT NUM_COL, NUM_COL FROM TEST_TAB';
DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_SQL
(
TASK_NAME => 'test_task',
SQL_STMT => l_chunk_sql,
BY_ROWID => false
);

-- Execute the DML in parallel
-- the WHERE clause contain a condition on num_col, which is the chunk
-- column. In this case, grouping rows is by num_col.
l_sql_stmt := 'UPDATE test_tab t SET t.num_col = t.num_col + 10 WHERE num_col BETWEEN :start_id AND :end_id';
DBMS_PARALLEL_EXECUTE.RUN_TASK
(
TASK_NAME => 'test_task',
SQL_STMT => l_sql_stmt,
LANGUAGE_FLAG => DBMS_SQL.NATIVE,
PARALLEL_LEVEL => 10
);

-- If there is error, RESUME it for at most 2 times.
L_try := 0;
L_status := DBMS_PARALLEL_EXECUTE.TASK_STATUS('test_task');

WHILE(l_try > 2 and L_status != DBMS_PARALLEL_EXECUTE.FINISHED)
LOOP
L_try := l_try + 1;
DBMS_PARALLEL_EXECUTE.RESUME_TASK('test_task');
L_status := DBMS_PARALLEL_EXECUTE.TASK_STATUS('test_task');
END LOOP;

-- Done with processing; drop the task
DBMS_PARALLEL_EXECUTE.DROP_TASK('test_task');
EXCEPTION WHEN OTHERS THEN
DBMS_OUTPUT.PUT_LINE('Error in the code :' || SQLERRM);
END;
/

通过DBMS_PARALLEL_EXECUTE更新大表
https://www.xbdba.com/2019/05/08/using-dbms_parallel_execute-update-bigtable/
作者
xbdba
发布于
2019年5月8日
许可协议