Be careful with using sql*loader, if the target table :
is big and having indexes during data loading with logging = YES.
Reason being during data loading, index becomes UNUSABLE in dba_indexes (remains VALID in dba_objects). After data loading, then start rebuild indexes. When logging=YES, image FULL TABLE SCAN on a huge table, how many archived log we shall we ? multiply by number of indexes ... It is a big shock. and how much more time it will take , assuming the job loads multiple files , each invoke of sql*loader loads one file only .
The common understanding of sql*load
Use Direct Path Loads - The conventional path loader essentially loads the data by using standard insert statements. The direct path loader (direct=true) loads directly into the Oracle data files and creates blocks in Oracle database block format. The fact that SQL is not being issued makes the entire process much less taxing on the database. There are certain cases, however, in which direct path loads cannot be used (clustered tables). To prepare the database for direct path loads, the script $ORACLE_HOME/rdbms/admin/catldr.sql.sql must be executed.
In my case, apps team is not happy with 5 minutes performance, hence added "direct=ture", in the end caused archive log disk space full, job can't finish after running for 3 hours, generated 40+Gb archived log until archiver log hang.
During indexes rebuild after "direct=ture", see tremendous I/O incurred.
Report of conventional loading
Top 5 Timed Events
Event Waits Time(s) Avg Wait(ms) % Total Call Time Wait Class
db file parallel write 6,358 240 38 97.5 System I/O
db file sequential read 44,238 128 3 52.0 User I/O
CPU time 115 46.7
log file parallel write 11,538 79 7 32.1 System I/O
log file sync 11,094 76 7 30.9 Commit
Tablespace Reads Av Reads/s Av Rd(ms) Av Blks/Rd Writes Av Writes/s Buffer Waits Av Buf Wt(ms)
IDX 40,216 11 2.84 1.00 99,853 28 0 0.00
report of direct path loading
Top 5 Timed Events
Event Waits Time(s) Avg Wait(ms) % Total Call Time Wait Class
db file sequential read 2,901,803 1,610 1 42.4 User I/O
log file parallel write 22,845 1,410 62 37.1 System I/O
CPU time 729 19.2
Log archive I/O 46,028 455 10 12.0 System I/O
log file sequential read 22,795 87 4 2.3 System I/O
Tablespace IO Stats
* ordered by IOs (Reads + Writes) desc
Tablespace Reads Av Reads/s Av Rd(ms) Av Blks/Rd Writes Av Writes/s Buffer Waits Av Buf Wt(ms)
IDX 2,866,984 794 0.52 1.00 427,225 118 0 0.00
rebuild indexes takes more time than gained from data loading part. Below is the figure I just used to rebuild unusable indexes for surepay04 tables, took near 10 minutes for nonvoice04 for each invoke of sql*loader, event with rebuild parallel.
For the remedy, after I remove those backed up archived log, set relevant tables and indexes to NOLOGGING mode. Ask application team to remove "deirct=true".
> select index_name, status from dba_indexes where index_name like 'M1_SUREPAY_%04%_IDX%' ;
INDEX_NAME STATUS
------------------------------ --------
NONVOICE04_IDX1 UNUSABLE
NONVOICE04_IDX2 UNUSABLE
NONVOICE04_IDX4 UNUSABLE
VOICE04_IDX1 VALID
VOICE04_IDX2 VALID
VOICE04_IDX3 VALID
VOICE04_IDX4 VALID
OTHER04_IDX2 VALID
OTHER04_IDX3 VALID
OTHER04_IDX1 VALID
NONVOICE04_IDX3 UNUSABLE
11 rows selected.
> alter index NONVOICE04_IDX1 rebuild parallel 3 ;
Index altered.
Elapsed: 00:02:43.39
> alter index NONVOICE04_IDX2 rebuild parallel 3 ;
Index altered.
Elapsed: 00:02:40.96
> alter index NONVOICE04_IDX3 rebuild parallel 3 ;
Index altered.
Elapsed: 00:02:23.70
> alter index NONVOICE04_IDX4 rebuild parallel 3 ;
Index altered.
alter index NONVOICE04_IDX1 noparallel ;
alter index NONVOICE04_IDX2 noparallel ;
alter index NONVOICE04_IDX3 noparallel ;
alter index NONVOICE04_IDX4 noparallel ;