Tuesday, April 20, 2010

prerequsite of using sql loader direct path loading

Be careful with using sql*loader, if the target table :

is big and having indexes during data loading  with logging = YES.

Reason being during data loading, index becomes UNUSABLE in dba_indexes (remains VALID in dba_objects). After data loading, then start rebuild indexes. When logging=YES, image FULL TABLE SCAN on a huge table, how many archived log we shall we ?  multiply by number of indexes ...  It is a big shock.  and how much more time it will take , assuming the job loads multiple files , each invoke of sql*loader loads one file only .

The common understanding of sql*load 

Use Direct Path Loads - The conventional path loader essentially loads the data by using standard insert statements.  The direct path loader (direct=true) loads directly into the Oracle data files and creates blocks in Oracle database block format.  The fact that SQL is not being issued makes the entire process much less taxing on the database.  There are certain cases, however, in which direct path loads cannot be used (clustered tables).  To prepare the database for direct path loads, the script $ORACLE_HOME/rdbms/admin/catldr.sql.sql must be executed.



In my case, apps team is not happy with 5 minutes performance, hence added "direct=ture", in the end caused archive log disk space full, job can't finish after running for 3 hours, generated 40+Gb archived log until archiver log hang.

During indexes rebuild after "direct=ture", see tremendous I/O incurred.

Report of conventional loading
Top 5 Timed Events

Event    Waits    Time(s)    Avg Wait(ms)    % Total Call Time    Wait Class
db file parallel write    6,358    240    38    97.5    System I/O
db file sequential read    44,238    128    3    52.0    User I/O
CPU time         115         46.7   
log file parallel write    11,538    79    7    32.1    System I/O
log file sync    11,094    76    7    30.9    Commit

Tablespace    Reads    Av Reads/s    Av Rd(ms)    Av Blks/Rd    Writes    Av Writes/s    Buffer Waits    Av Buf Wt(ms)
IDX     40,216     11     2.84     1.00     99,853     28     0     0.00

report of direct path loading
Top 5 Timed Events

Event    Waits    Time(s)    Avg Wait(ms)    % Total Call Time    Wait Class
db file sequential read    2,901,803    1,610    1    42.4    User I/O
log file parallel write    22,845    1,410    62    37.1    System I/O
CPU time         729         19.2    
Log archive I/O    46,028    455    10    12.0    System I/O
log file sequential read    22,795    87    4    2.3    System I/O


Tablespace IO Stats

    * ordered by IOs (Reads + Writes) desc

Tablespace    Reads    Av Reads/s    Av Rd(ms)    Av Blks/Rd    Writes    Av Writes/s    Buffer Waits    Av Buf Wt(ms)
IDX     2,866,984     794     0.52     1.00     427,225     118     0     0.00  
  

rebuild indexes  takes more time than gained from data loading part. Below is the figure I just used to rebuild unusable indexes for surepay04 tables, took near 10 minutes for nonvoice04 for each invoke of sql*loader, event with rebuild parallel.


    For the remedy, after I remove those backed up archived log, set relevant tables and indexes to NOLOGGING mode. Ask application team to remove "deirct=true".

> select index_name, status from dba_indexes where index_name like 'M1_SUREPAY_%04%_IDX%' ;

INDEX_NAME                     STATUS
------------------------------ --------
NONVOICE04_IDX1     UNUSABLE
NONVOICE04_IDX2     UNUSABLE
NONVOICE04_IDX4     UNUSABLE
VOICE04_IDX1        VALID
VOICE04_IDX2        VALID
VOICE04_IDX3        VALID
VOICE04_IDX4        VALID
OTHER04_IDX2        VALID
OTHER04_IDX3        VALID
OTHER04_IDX1        VALID
NONVOICE04_IDX3     UNUSABLE

11 rows selected.

> alter index NONVOICE04_IDX1 rebuild parallel 3 ;

Index altered.

Elapsed: 00:02:43.39
> alter index NONVOICE04_IDX2 rebuild parallel 3 ;

Index altered.

Elapsed: 00:02:40.96
> alter index NONVOICE04_IDX3 rebuild parallel 3 ;

Index altered.

Elapsed: 00:02:23.70
> alter index NONVOICE04_IDX4 rebuild parallel 3 ;

Index altered.

alter index NONVOICE04_IDX1 noparallel  ;
alter index NONVOICE04_IDX2 noparallel  ;
alter index NONVOICE04_IDX3 noparallel  ;
alter index NONVOICE04_IDX4 noparallel  ;