Friday, December 10, 2010

Upgrade 1st Production database to 11.2.0.2

Finally made it: the data warehouse database is upgraded from 10.2.0.4 to 11.2.0.2.

Only one minor performance issue came up, and it is under control.

-- Provided a workaround to the application team before logging a TAR with Metalink.

Cheers! One more to upgrade in January.

Tuesday, November 30, 2010

How to control Ubuntu’s Services easily?

How to control Ubuntu’s Services easily? | Ubuntu Tweak Blog

How To Start, Stop Services In Ubuntu 10

How To Start, Stop Services In Ubuntu 10.04 Lucid Lynx Automatically | Liberian Geek

Wednesday, November 17, 2010

vmware player failed to compile module vmmon after upgrade to Ubuntu 10.10

Installing VMware Workstation 7.1.1 64 bit on Ubuntu 10.10 » Pario TechnoBlob

wget http://www.sputnick-area.net/scripts/vmware7.1.1-patch-kernel-2.6.35.bash

VirtualBox can’t operate in VMX root mode. Please disable the KVM kernel extension, recompile your kernel and reboot (VERR_VMX_IN_VMX_ROOT_MODE).

After I changed the CPU from AMD to Intel, I encountered the following error:
VirtualBox can’t operate in VMX root mode. Please disable the KVM kernel extension, recompile your kernel and reboot (VERR_VMX_IN_VMX_ROOT_MODE).
The root cause is that either the kvm-intel or the kvm-amd module was loaded at boot time. To remove it:
root@localhost:~# lsmod |grep kvm
kvm_intel              32832 0
kvm                   182683 1 kvm_intel
root@localhost:~# modprobe -r kvm_intel
After that, the error message above no longer appears.

CentOS: Unable to resume swap device SWAP-sda2

Last night I replaced my AMD CPU with an Intel CPU. Everything works fine on Ubuntu, but CentOS would not boot and showed the error message above.

I googled and got the idea from http://fedoraforum.org/forum/printthread.php?t=120868

vladak's message seems to match my case.
"I hit this problem while swapping motherboard+CPU (Athlon X2 instead of Sempron) for Fedora 10 box. After the hardware change (could be caused because the old motherboard had 2 IDE channels and the new one has only 1 ? the boot disk is connected via IDE.) the system reported the dreaded "Unable to access resume device" message. Booting from Fedora 10 install DVD into rescue mode, chroot to /mnt/sysimage and mkinitrd fixed the problem."

I fixed the problem with the steps below.
1. Find the CentOS 5.5 install DVD and boot into rescue mode.
2. chroot /mnt/sysimage
3. cd /boot and move the existing initrd*.img files into ./initrd_bak
[root@localhost boot]# ll initrd_bak
total 15648
-rw------- 1 root root 2663536 Aug 21 12:46 initrd-2.6.18-194.11.1.el5.img
-rw------- 1 root root 2663856 Aug 21 12:45 initrd-2.6.18-194.11.1.el5xen.img
-rw------- 1 root root 2663556 Sep 12 22:19 initrd-2.6.18-194.11.3.el5.img
-rw------- 1 root root 2663874 Sep 12 22:19 initrd-2.6.18-194.11.3.el5xen.img
-rw------- 1 root root 2663599 Sep 29 22:56 initrd-2.6.18-194.11.4.el5.img
-rw------- 1 root root 2663935 Sep 29 22:55 initrd-2.6.18-194.11.4.el5xen.img

4. Run the command below to rebuild the initial RAM disk (the second argument is the kernel version)

mkinitrd initrd-2.6.18-194.11.4.el5.img 2.6.18-194.11.4.el5

5. The new file is shown below; note that its size differs from the backup copy.
-rw------- 1 root root 2672372 Nov 17 2010 initrd-2.6.18-194.11.4.el5.img

6. Exit twice to reboot. Cheers!

Friday, November 12, 2010

Taking advantage of partition elimination

[The problem]

The application team asked for help:

The daily financial report had been delayed for nearly one month because the job runs very slowly, while the DB has to be shut down every day for a cold backup.
I estimated that the job needs about 30 hours to complete.
The vendor could not provide a solution, even after changing the code a few times.

[Diagnostic]
    The range-partitioned table is about 140 GB, with 500+ partitions; each partition stores 5 values of job_id.
    The execution plan shows that partition pruning is performed, but there is no parallelism even though parallel servers are enabled. No full table scan is performed, yet the job generates a lot of I/O: consistent gets, physical reads, etc.
    Total consistent gets are about 5 times what scanning the needed partitions once would take.
The statement looks like this:
     select ... from p_table where part_key_col in (select distinct job_id from job_table...);

    The execution plan shows a NESTED LOOPS operation that runs once for each value returned by the sub-query, causing redundant access to the partitions, i.e.
     for each job_id returned from sub-query (110 distinct job_id)
       do
         full partition scan ( one of total 22 partitions )
       done.
Hence, each partition is scanned 5 times in the worst case, for a total of 110 partition scans.
Meanwhile, a parallel scan cannot happen while only a single partition is accessed at a time; it only occurs when multiple partitions are accessed simultaneously.

[Solution]
Rewrite the code to make partition elimination happen in a single pass.
    select ... from p_table where part_key_col between (select min(distinct job_id) from job_table ...) and (select max(distinct job_id) from job_table ...);
Note that the logic is slightly changed, but it is applicable in this case.

     After this change was made as I suggested, the job finished within 30 minutes, and I observed 12 parallel processes happily scanning the 22 partitions once only.
    Cheers!
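
To make the arithmetic concrete, here is a small Python sketch that only counts partition scans under the two access patterns. It is a toy model, not Oracle's optimizer: the job_id values are made up, and the mapping of 5 consecutive job_ids to one partition is an assumption taken from the numbers in the diagnostic.

```python
# Toy model of the case above: 110 job_ids, 5 job_ids per range partition.
# The job_id values and the partition mapping are hypothetical.
JOB_IDS_PER_PARTITION = 5


def partition_of(job_id: int) -> int:
    """Map a job_id to its range partition (assumed contiguous ranges of 5)."""
    return job_id // JOB_IDS_PER_PARTITION


def scans_nested_loop(job_ids) -> int:
    """IN (sub-query) driven by a nested loop: one partition scan per job_id."""
    return len(job_ids)


def scans_range_predicate(job_ids) -> int:
    """BETWEEN min AND max: every partition in the range is scanned once."""
    first = partition_of(min(job_ids))
    last = partition_of(max(job_ids))
    return last - first + 1


job_ids = list(range(1000, 1110))  # 110 consecutive, made-up job_ids
print(scans_nested_loop(job_ids))      # 110
print(scans_range_predicate(job_ids))  # 22
```

The BETWEEN form also touches any partition that lies inside the range but holds no wanted job_id, which is the "slightly changed logic" mentioned above.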

[Update on 17-Nov]

One more thing: the SQL logic should not be changed. After studying partition pruning in the Data Warehousing Guide, I found that the USE_HASH hint achieves the same effect without rewriting the SQL.

Monday, October 25, 2010

changes/bugs may fail your 11g upgrade

For OLAP-type databases, or databases running complicated batch jobs:
1. Two bugs to watch for: 8477973 and 4926618

8477973 Multiple open DB links / ORA-2020 / distributed deadlock possible

Impact: your SQL session hangs.

4926618 Excessive CPU on HASH UNIQUE when repeated

Impact: UPDATE statements become many times slower, especially when updating more than 1 million rows; the more rows, the greater the slowdown. In my case, 21 minutes became 8 hours.

Good news: both of them are fixed in 11.2.0.2.

2. The new parameter db_ultra_safe: the DATA_AND_INDEX setting comes with significant overhead. In my INSERT case, it was 7x slower.

It took weeks of effort to reach the findings above. I hope this helps prevent the same from happening to you. Cheers!

Friday, October 01, 2010

editing multiple files with vi

Googled and learned some useful commands that I can still recall today:
:n #go to next file
:rew # rewind to first file
:e! # abandon all changes to the current file

I think these basic commands are more than enough.

Saturday, July 24, 2010

NTLM proxy

Configuring ISA Proxy in Ubuntu « Ridvan Baluyos

cool

Thursday, July 22, 2010

oracle password file

(Repost) Oracle password file orapwd - Notes - ITPUB Personal Space - powered by X-Space

openssh missing cygcrypto-0.9.8.dll

cygcrypto-0.9.8.dll ?

My solution was to go to http://www.cygwin.com/packages/ and find the related packages, libopenssl and libopenssl.

After reinstalling them, it works now.

backup related views

Sunday, July 18, 2010

Thursday, July 15, 2010

Linux logging

A hands-on guide to Linux log analysis - Notes - ITPUB Personal Space - powered by X-Space

Sunday, July 11, 2010

ctime, atime, mtime

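Changing file timestamps and creating new files: touch - Notes

Shrink datafile

An example of shrinking a tablespace - On the Road...... - ITPUB Personal Space

buffer and cache

The difference between buffer and cache in Linux memory management - 尛样儿 - ITPUB Personal Space

Oracle 11g's defense against brute-force cracking of database user passwords: delaying the response to failed attempts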

virtualbox vs vmware

VirtualBox 3 vs Vmware Workstation 6.5.2

5 Reasons Why You Should Use VirtualBox Over VMware Server

VirtualBox vs. VMware vs. Parallels

[My key opinion]

VMware: Better support for USB devices, while VirtualBox relies on shared folders.

VirtualBox: Supports sound from both host and guest.

linux proc/cpuinfo

processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 6
model name : Intel(R) Pentium(R) 4 CPU 3.00GHz
stepping : 5
cpu MHz : 3143.295
cache size : 0 KB
physical id : 0
siblings : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 6
wp : yes
flags : fpu vme pse tsc msr pae cx8 sep pge cmov acpi mmx fxsr sse sse2
bogomips : 1674.44

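Differences between i386, i586 and i686 rpm packages

Some rpm packages come in separate i386, i586 and i686 builds, for example:
  abc-1.2.3-4.i386.rpm
  abc-1.2.3-4.i586.rpm
  abc-1.2.3-4.i686.rpm
  What is the difference between them?

  Here i386, i586 and i686 refer to microprocessors compatible with the Intel i386, i586 and i686 instruction sets. Generally, a higher-class machine can accept rpm files built for a lower class.
i386: works on almost every x86 platform, whether an old Pentium or a newer Pentium 4 or K7-series CPU. The "i" stands for Intel-compatible CPU, and 386 is the CPU class.
i586: 586-class machines, including the first-generation Pentium and Pentium MMX CPUs and AMD's K5 and K6 series CPUs (Socket 7).
i686: Intel CPUs from the Pentium II onward and CPUs of the K7 class and later belong to the 686 class.
You can check your CPU class through the /proc/cpuinfo file.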

/proc/cpuinfo

This virtual file identifies the type of processor used by your system. The following is an example of the output typical of /proc/cpuinfo:

         processor : 0
         vendor_id : GenuineIntel
         cpu family : 15
         model : 2
         model name : Intel(R) Xeon(TM) CPU 2.40GHz
         stepping : 7
         cpu MHz : 2392.371
         cache size : 512 KB
         physical id : 0
         siblings : 2
         runqueue : 0
         fdiv_bug : no
         hlt_bug : no
         f00f_bug : no
         coma_bug : no
         fpu : yes
         fpu_exception : yes
         cpuid level : 2
         wp : yes
         flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
         bogomips : 4771.02

· processor — Provides each processor with an identifying number. On systems that have one processor, only a 0 is present.

· cpu family — Authoritatively identifies the type of processor in the system. For an Intel-based system, place the number in front of "86" to determine the value. This is particularly helpful for those attempting to identify the architecture of an older system such as a 586, 486, or 386. Because some RPM packages are compiled for each of these particular architectures, this value also helps users determine which packages to install.

· model name — Displays the common name of the processor, including its project name.

· cpu MHz — Shows the precise speed in megahertz for the processor to the thousandths decimal place.

· cache size — Displays the amount of level 2 memory cache available to the processor.

· siblings — Displays the number of sibling CPUs on the same physical CPU for architectures which use hyper-threading.

· flags — Defines a number of different qualities about the processor, such as the presence of a floating point unit (FPU) and the ability to process MMX instructions.

Understanding /proc/cpuinfo

Example:

$ uname -r
2.6.18-8.el5

How many physical processors are there?

$ grep 'physical id' /proc/cpuinfo | sort | uniq | wc -l
2

How many virtual processors are there?

$ grep ^processor /proc/cpuinfo | wc -l
4

Are the processors dual-core (or multi-core)?

$ grep 'cpu cores' /proc/cpuinfo
cpu cores       : 2
cpu cores       : 2
cpu cores       : 2
cpu cores       : 2

"2" indicates the two physical processors are dual-core, resulting in 4 virtual processors.

If "1" was returned, the two physical processors are single-core. If the processors are single-core, and the number of virtual processors is greater than the number of physical processors, the CPUs are using hyper-threading. Hyper-threading is supported if ht is present in the CPU flags and you are using an SMP kernel.
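
The same counting can be done in one pass with a short Python sketch. This is only an illustration under assumptions: the SAMPLE text is made up (two dual-core packages, no hyper-threading), and the function names are mine. On a real machine you would pass in the contents of /proc/cpuinfo instead.

```python
def parse_cpuinfo(text):
    """Split /proc/cpuinfo-style text into one dict per logical processor."""
    processors = []
    for block in text.strip().split("\n\n"):
        entry = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                entry[key.strip()] = value.strip()
        if entry:
            processors.append(entry)
    return processors


def summarize(processors):
    """Answer the questions above: physical, logical, cores, HT, 64-bit."""
    physical = {p.get("physical id") for p in processors}
    cores_per_package = int(processors[0].get("cpu cores", 1))
    logical = len(processors)
    return {
        "physical": len(physical),
        "logical": logical,
        "cores_per_package": cores_per_package,
        # More logical CPUs than physical cores means hyper-threading is in use.
        "hyper_threading": logical > len(physical) * cores_per_package,
        "64bit": "lm" in processors[0].get("flags", "").split(),
    }


# Made-up sample: 4 logical processors on 2 dual-core packages.
SAMPLE = "\n\n".join(
    f"processor\t: {n}\nphysical id\t: {n // 2}\ncpu cores\t: 2\nflags\t\t: fpu sse2 lm"
    for n in range(4)
)
print(summarize(parse_cpuinfo(SAMPLE)))
```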

Are the processors 64-bit?

A 64-bit processor will have lm ("long mode") in the flags section of cpuinfo. A 32-bit processor will not.

e.g.,

flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni cx16 lahf_lm cmp_legacy svm cr8legacy ts fid vid ttp tm stc

What do the CPU flags mean?

The CPU flags are briefly described in the kernel header file cpufeature.h.

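Using proc/cpuinfo to determine the system's CPU layout

The latest versions of some operating systems have updated the /proc/cpuinfo file to support multi-socket platforms. If the /proc/cpuinfo file on your system correctly reflects the processor information, the steps above are not needed. Otherwise, the information here can be used to interpret it.
The /proc/cpuinfo file contains one paragraph of data for each processor on the system. Six entries in /proc/cpuinfo are relevant to multi-core and Hyper-Threading (HT) checks: processor, vendor id, physical id, siblings, core id and cpu cores.
The processor entry holds the unique identifier of this logical processor.
The physical id entry holds the unique identifier of each physical package.
The core id entry holds the unique identifier of each core.
The siblings entry lists the number of logical processors in the same physical package.
The cpu cores entry holds the number of cores in the same physical package.
If the processor is an Intel processor, the string in the vendor id entry is GenuineIntel.
All logical processors with the same physical id share one physical socket. Each physical id represents one unique physical package. Siblings is the number of logical processors on that physical package; they may or may not support HT. Each core id represents one unique processor core, and all logical processors with the same core id sit on the same core. If more than one logical processor has the same core id and physical id, the system supports HT. If two or more logical processors have the same physical id but different core ids, this is a multi-core processor. The cpu cores entry can also indicate multi-core support.
For example, if a system contains two physical packages, each with two processor cores that support HT, the /proc/cpuinfo file will contain the data below. (Note: in the real file the data is not laid out as a table.)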

processor	0	1	2	3	4	5	6	7
physical id	0	1	0	1	0	1	0	1
core id	0	2	1	3	0	2	1	3
siblings	4	4	4	4	4	4	4	4
cpu cores	2	2	2	2	2	2	2	2

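This example shows that logical processors 0 and 4 reside on core 0 of physical package 0, which means logical processors 0 and 4 support Hyper-Threading (HT). The same holds for logical processors 2 and 6 on core 1 of package 0, logical processors 1 and 5 on core 2 of package 1, and logical processors 3 and 7 on core 3 of package 1. The system supports HT because two logical processors share one core. There are two ways to determine multi-core support. Since cores 0 and 1 exist on package 0 and cores 2 and 3 exist on package 1, this is a multi-core system. In addition, the cpu cores entry is 2, which also shows that two cores reside in each physical package. This is a multi-socket system because there are two packages.
Note that physical id and core id numbers may or may not be contiguous. It is not unusual for a system to have two physical packages whose physical ids are 0 and 3.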
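
The reasoning on the table can be checked with a small Python sketch that groups logical processors by their (physical id, core id) pair. The two lists are copied from the table; the helper name is my own.

```python
from collections import defaultdict

# Values copied from the table, indexed by logical processor number.
PHYSICAL_ID = [0, 1, 0, 1, 0, 1, 0, 1]
CORE_ID = [0, 2, 1, 3, 0, 2, 1, 3]


def group_by_core(physical_ids, core_ids):
    """Map each (physical id, core id) pair to its logical processors."""
    cores = defaultdict(list)
    for cpu, key in enumerate(zip(physical_ids, core_ids)):
        cores[key].append(cpu)
    return dict(cores)


cores = group_by_core(PHYSICAL_ID, CORE_ID)
packages = {package for package, _ in cores}

# HT: some core hosts more than one logical processor.
hyper_threading = any(len(cpus) > 1 for cpus in cores.values())
# Multi-core: some package hosts more than one core.
multi_core = any(
    sum(1 for package, _ in cores if package == p) > 1 for p in packages
)

for key in sorted(cores):
    print(key, cores[key])
print(len(packages), hyper_threading, multi_core)
```

Because the grouping uses the ids only as dictionary keys, it does not matter whether they are contiguous.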
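CPU ID
The CPU ID is a unique code that CPU manufacturers assign in order to identify different types of CPU. Each manufacturer defines its CPU IDs differently, e.g. "0F24" (Intel processor) or "681H" (AMD processor). From these codes you can tell which type a CPU belongs to; this is the CPU ID in the general sense.
Because computers use hexadecimal, the CPU ID is also written in hexadecimal. An Intel CPU ID consists of four digits, e.g. "0F24", which from left to right are Type, Family, Model and Stepping. Starting with processors whose CPU ID is "068X", Intel added a Brand ID to help applications identify the CPU type, so a "068X" CPU ID alone cannot correctly distinguish Pentium from Celeron processors; the Brand ID is needed to tell them apart. An AMD CPU ID generally has three digits, e.g. "681", which from left to right are Family, Model and Stepping.
Type
The type field distinguishes whether an Intel microprocessor is meant to be installed by the end user or by a professional PC system integrator, service company or manufacturer. "1" means the tested microprocessor is meant for installation by the user; "0" means it is meant for installation by a professional PC system integrator, service company or manufacturer. The Intel processors we normally use have type "0"; the "0F24" CPU ID is of this type.
Family
The family field identifies which generation a processor belongs to. Intel family 6 includes the Pentium Pro, Pentium II, Pentium II Xeon, Pentium III and Pentium III Xeon processors. Family 5 (fifth generation) includes the Pentium and the Pentium with MMX technology. AMD's family 6 actually refers to the K7 series CPUs, in two lines, Duron and Athlon. The latest generation of Intel Pentium 4 processors (including Celeron processors with the same core) has the family value "F".
Model
The model field identifies the manufacturing technology of a processor and which design generation (or core) of the family it belongs to. Model and family are normally used together to determine which specific type of a family the installed processor is, for example whether a Celeron has a Coppermine or a Tualatin core, or whether an Athlon XP has a Palomino or a Thoroughbred core.
Stepping
The stepping identifies the design or manufacturing revision of a processor. It helps control and track processor changes, and lets end users identify more precisely which processor revision their system has and determine the internal design or manufacturing characteristics of the microprocessor. The stepping is like a minor version number: CPU IDs "686" and "686A" relate to each other like WinZip 8.0 and 8.1. Stepping and core stepping are closely linked; a Pentium III with CPU ID "686" has the cC0 core, while "686A" denotes the newer cD0 core.
Brand ID
Intel introduced the Brand ID as an auxiliary means of identifying the CPU, starting with Coppermine-core processors. Through the Brand ID we can tell, for example, whether a processor is a Celeron or a Pentium 4.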
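
As a worked example of the four-digit layout described in this section, a minimal Python sketch can split an Intel CPU ID into its fields. It follows only the simplified Type/Family/Model/Stepping description given here, not the full encoding returned by the CPUID instruction, and the function name is made up.

```python
def decode_intel_cpuid(cpuid: str) -> dict:
    """Split a four-digit hex CPU ID such as "0F24" into its fields."""
    if len(cpuid) != 4:
        raise ValueError("expected four hexadecimal digits")
    # One hexadecimal digit per field, read left to right.
    type_, family, model, stepping = (int(digit, 16) for digit in cpuid)
    return {"type": type_, "family": family, "model": model, "stepping": stepping}


print(decode_intel_cpuid("0F24"))
# {'type': 0, 'family': 15, 'model': 2, 'stepping': 4}
```

Family 15 and model 2 line up with the "cpu family : 15" and "model : 2" entries in the Xeon /proc/cpuinfo sample quoted earlier.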

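Linux parameters related to Oracle installation

Shared memory
Shared memory lets processes access common structures and data by placing them in a shared memory segment. It is the fastest form of inter-process communication (IPC) available, mainly because no kernel operation is involved when data is passed between processes, and no data needs to be copied between them.
Oracle uses shared memory for its System Global Area (SGA), a memory region shared by all Oracle background and foreground processes. Sizing the SGA adequately is very important for Oracle performance, because it holds the database buffer cache, shared SQL, access paths and more.
To see all shared memory limits, use the following command: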
# ipcs -lm
------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 32768
max total shared memory (kbytes) = 8388608
min seg size (bytes) = 1
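SHMMAX
The SHMMAX parameter defines the maximum size (in bytes) of a shared memory segment.
The Oracle SGA is made up of shared memory, and an incorrect SHMMAX setting can limit the size of the SGA. When setting SHMMAX, keep in mind that the SGA should fit within one shared memory segment. An inadequate SHMMAX setting can cause the following error:
ORA-27123: unable to attach to shared memory segment
You can determine the value of SHMMAX with the following command:
# cat /proc/sys/kernel/shmmax
33554432
The default value of SHMMAX is 32 MB, which is usually too small for configuring the Oracle SGA. I usually set SHMMAX to 2 GB using one of the following methods:
* By changing the /proc file system directly, you can alter the default SHMMAX setting without rebooting. The following command sets SHMMAX dynamically; placing it in the /etc/rc.local startup file makes it permanent:
# echo "2147483648" > /proc/sys/kernel/shmmax
* You can also use the sysctl command to change the value of SHMMAX:
# sysctl -w kernel.shmmax=2147483648
* Finally, you can make the change permanent by inserting the kernel parameter into the /etc/sysctl.conf startup file:
# echo "kernel.shmmax=2147483648" >> /etc/sysctl.conf
SHMMNI
Now let us look at the SHMMNI parameter. This kernel parameter sets the maximum number of shared memory segments system-wide. Its default value is 4096, which is sufficient and normally does not need to be changed.
You can determine the value of SHMMNI with the following command:
# cat /proc/sys/kernel/shmmni
4096
SHMALL
This parameter controls the total amount of shared memory (in pages) that the system can use at one time.
In short, the value of this parameter should always be at least:
ceil(SHMMAX/PAGE_SIZE)
The default size of SHMALL is 2097152, which can be queried with the following command:
# cat /proc/sys/kernel/shmall
2097152
The default SHMALL setting is sufficient for an Oracle RAC 10g installation.
(Note: the page size in Red Hat Linux on the i386 platform is 4,096 bytes. You can, however, use bigpages, which supports configuring larger memory page sizes.)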
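
To see why the default SHMALL is enough here, the formula can be evaluated with the numbers from this section. This is a small Python sketch; the 4096-byte page size and the 2 GB SHMMAX are the values quoted above, and the function name is my own.

```python
PAGE_SIZE = 4096       # bytes, the i386 page size quoted above
SHMMAX = 2147483648    # bytes, the 2 GB value used in this section
DEFAULT_SHMALL = 2097152  # pages, the default shown by /proc/sys/kernel/shmall


def min_shmall(shmmax_bytes: int, page_size: int = PAGE_SIZE) -> int:
    """ceil(SHMMAX / PAGE_SIZE) using integer arithmetic only."""
    return -(-shmmax_bytes // page_size)


print(min_shmall(SHMMAX))                  # 524288 pages
print(DEFAULT_SHMALL * PAGE_SIZE // 1024)  # 8388608 kbytes
```

The second number matches the "max total shared memory (kbytes) = 8388608" line in the ipcs -lm output, and 524288 pages is well below the default of 2097152.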
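Setting semaphores
A semaphore is best described as a counter used to provide synchronization between processes (or threads within a process) that share a resource such as shared memory. Unix System V supports semaphore sets, in which each semaphore is a counting semaphore. When an application requests semaphores, it does so using "sets".
To see all semaphore limits, use the following command: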
# ipcs -ls
------ Semaphore Limits --------
max number of arrays = 128
max semaphores per array = 250
max semaphores system wide = 32000
max ops per semop call = 32
semaphore max value = 32767
You can also use the following command:
# cat /proc/sys/kernel/sem
250 32000 32 128
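SEMMSL
The SEMMSL kernel parameter controls the maximum number of semaphores per semaphore set.
Oracle recommends setting SEMMSL to the largest PROCESSES instance parameter setting in the init.ora files (across all databases on the Linux system) plus 10. In addition, Oracle recommends setting SEMMSL to no less than 100.
SEMMNI
The SEMMNI kernel parameter controls the maximum number of semaphore sets in the entire Linux system.
Oracle recommends setting SEMMNI to no less than 100.
SEMMNS
The SEMMNS kernel parameter controls the maximum number of semaphores (not semaphore sets) in the entire Linux system.
Oracle recommends setting SEMMNS to the sum of the PROCESSES instance parameter settings of every database on the system, plus twice the largest PROCESSES value, plus 10 for each Oracle database on the system.
Use the following calculation to determine the maximum number of semaphores that can be allocated on a Linux system. It is the lesser of:
SEMMNS -or- (SEMMSL * SEMMNI).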
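
Both calculations are easy to get wrong by hand, so here is a short Python sketch of them. The kernel.sem string is the one shown by /proc/sys/kernel/sem above; the PROCESSES values of 150 and 300 are hypothetical, and the function names are mine.

```python
def max_allocatable_semaphores(semmsl: int, semmns: int, semmni: int) -> int:
    """The lesser of SEMMNS and SEMMSL * SEMMNI."""
    return min(semmns, semmsl * semmni)


def recommended_semmns(processes_per_db) -> int:
    """Sum of PROCESSES, plus twice the largest, plus 10 per database."""
    return (
        sum(processes_per_db)
        + 2 * max(processes_per_db)
        + 10 * len(processes_per_db)
    )


# Field order in kernel.sem: SEMMSL SEMMNS SEMOPM SEMMNI
semmsl, semmns, semopm, semmni = (int(v) for v in "250 32000 32 128".split())
print(max_allocatable_semaphores(semmsl, semmns, semmni))  # 32000

# Two hypothetical databases with PROCESSES=150 and PROCESSES=300.
print(recommended_semmns([150, 300]))  # 1070
```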
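SEMOPM
The SEMOPM kernel parameter controls the number of semaphore operations that can be performed per semop system call.
The semop system call (function) can operate on multiple semaphores with a single call. A semaphore set can hold at most SEMMSL semaphores, so it is recommended to set SEMOPM equal to SEMMSL.
Oracle recommends setting SEMOPM to no less than 100.
Setting semaphore kernel parameters
The only parameter I want to change (increase) is SEMOPM. All other defaults are fully adequate for our example installation.
* You can change all semaphore settings without rebooting by writing to the /proc file system directly. To do so, place the following in the /etc/rc.local startup file:
# echo "250 32000 100 128" > /proc/sys/kernel/sem
* You can also use the sysctl command to change all semaphore settings:
# sysctl -w kernel.sem="250 32000 100 128"
* Finally, you can make the change permanent by inserting the kernel parameter into the /etc/sysctl.conf startup file:
# echo "kernel.sem=250 32000 100 128" >> /etc/sysctl.conf
Setting file handles
When configuring a Red Hat Linux server, make sure the maximum number of file handles is large enough. The file handle setting determines the number of files that can be open on the Linux system.
Use the following command to determine the maximum number of file handles for the whole system: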
# cat /proc/sys/fs/file-max
32768
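Oracle recommends setting the system-wide file handle limit to at least 65536.
* By changing the /proc file system directly, you can alter the default maximum number of file handles without rebooting. To do so, place the following in the /etc/rc.local startup file:
# echo "65536" > /proc/sys/fs/file-max
* You can also use the sysctl command to change the value of file-max:
# sysctl -w fs.file-max=65536
* Finally, you can make the change permanent by inserting the kernel parameter into the /etc/sysctl.conf startup file:
# echo "fs.file-max=65536" >> /etc/sysctl.conf
You can query the current usage of file handles with the following command:
# cat /proc/sys/fs/file-nr
613 95 32768
The file-nr file shows three values: the total number of allocated file handles, the number of file handles currently in use, and the maximum number of file handles that can be allocated.
Note: if you need to increase the value in /proc/sys/fs/file-max, make sure ulimit is set correctly. For 2.4.20, it is usually set to unlimited. Use the ulimit command to verify the ulimit setting: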
# ulimit
unlimited

How to set them

1. Configure them in /etc/sysctl.conf
kernel.shmmax = 536870912
kernel.shmmni = 4096
kernel.shmall = 2097152
kernel.sem = 250 32000 100 128
fs.file-max = 65536
net.ipv4.ip_local_port_range = 1024 65000
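After the change, run sysctl -p to make the kernel changes take effect immediately. The machine will also apply these settings on every later reboot.
2. Write directly to /proc from /etc/rc.d/rc.local
At the end of the /etc/rc.d/rc.local file, add:
echo 8192 > /proc/sys/fs/file-max
echo 32768 > /proc/sys/fs/inode-max
echo 536870912 > /proc/sys/kernel/shmmax
echo 4096 > /proc/sys/kernel/shmmni
You also need to add the following to the /etc/security/limits.conf file: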
oracle soft nofile 65536
oracle hard nofile 65536
oracle soft nproc 16384
oracle hard nproc 16384

Performance of compress tools : 7z and gzip

hmc@hmc-desktop:/mnt/vm$ ls -l xp
total 5358756
-rw------- 1 hmc hmc 5481996800 2010-07-10 22:52 xp.vdi

Compressing this 5 GB virtual disk image takes nearly 1 hour with 7z, while gzip takes about 8 minutes. However, 7z's compression ratio is about 16% better than gzip's.

hmc@hmc-desktop:/mnt/vm$ time tar cvfz xp.gz ./xp
./xp/
./xp/xp.vdi

real    7m49.668s
user    6m3.430s
sys    0m28.660s

[compressed file size]

-rw-r--r-- 1 hmc hmc 2587547513 2010-07-10 23:41 xp.7z
-rw-r--r-- 1 hmc hmc 3332232784 2010-07-10 23:52 xp.gz
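
For a quick comparison, the sizes listed above can be turned into ratios with a few lines of Python. This is only arithmetic on the byte counts from the ls output; it ignores the small overhead tar adds for the directory entry.

```python
ORIGINAL = 5481996800  # bytes, xp.vdi before compression
SIZES = {"7z": 2587547513, "gz": 3332232784}  # bytes, from the listing above


def ratio(compressed: int, original: int = ORIGINAL) -> float:
    """Compressed size as a fraction of the original size."""
    return compressed / original


for name, size in SIZES.items():
    print(f"{name}: {ratio(size):.1%} of original")

# How much smaller the 7z archive is than the gz archive.
print(f"7z vs gz: {1 - SIZES['7z'] / SIZES['gz']:.1%} smaller")
```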

Here is a good article about compression tools' performance

Compression Tools Compared

Added a test result with bz2. It looks like the gz format will be my first choice.

hmc@hmc-desktop:/mnt/vm$ time tar cvfj xp.bz2 ./xp
./xp/
./xp/xp.vdi

real    36m11.689s
user    29m39.640s
sys    0m36.570s
hmc@hmc-desktop:/mnt/vm$ ls -l xp
total 5371056
-rw------- 1 hmc hmc 5494579712 2010-07-11 13:53 xp.vdi
hmc@hmc-desktop:/mnt/vm$ ls -l xp.bz2
-rw-r--r-- 1 hmc hmc 3318611695 2010-07-11 15:15 xp.bz2

Saturday, July 10, 2010

SQL Coding Conventions, Best Practices, and Programming Guidelines

SQL Server TSQL Coding Conventions, Best Practices, and Programming Guidelines

Database Coding Standard and Guideline

SQL Programming Standards

Thursday, July 08, 2010

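The contents of /etc/fstab explained

Each line of /etc/fstab has six fields.

For example, two lines taken from the middle of the file:
LABEL=/    /    ext3   defaults     1     1
/dev/sda2   /mnt/D/     vfat    defaults    0   0

Field 1: the device name or volume label (/dev/sda10 or LABEL=/)

Field 2: the mount point (e.g. "/" or "/mnt/D/" above)

Field 3: the file system type (e.g. "ext3" or "vfat" above)

Field 4: mount options (see man mount)
For a device that is already mounted, such as /dev/sda2 above, you can change its mount options without unmounting it by using the command below (remount has no effect on a device that is not mounted):
#mount /mnt/D/ -o remount,ro (changes defaults to ro)
For safety, you can specify other mount options, for example:
noexec (do not allow executables on it to run; but never mount the root partition noexec, because the system becomes unusable and even the mount command cannot run, leaving reinstallation as the only way out)
nodev (do not interpret device files on the file system)
nosuid (do not honor the suid and sgid bits)
nouser (do not allow ordinary users to mount)

Field 5: whether to back up (dump) the file system (0 = no, 1 = yes; the root partition is usually backed up)

Field 6: the fsck check order (0 = do not check, 1 or 2 = check; the root partition should be 1, other partitions can only be 2)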
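
The six fields can be read mechanically. Here is a minimal Python sketch that parses one fstab line into named fields; the class and function names are my own, and it assumes a well-formed line with exactly six whitespace-separated fields (no comments, no omitted trailing fields).

```python
from typing import List, NamedTuple


class FstabEntry(NamedTuple):
    device: str        # field 1: device name or LABEL=
    mount_point: str   # field 2
    fs_type: str       # field 3
    options: List[str] # field 4, split on commas
    dump: int          # field 5
    fsck_order: int    # field 6


def parse_fstab_line(line: str) -> FstabEntry:
    """Parse one well-formed /etc/fstab line into its six fields."""
    device, mount_point, fs_type, options, dump, fsck_order = line.split()
    return FstabEntry(
        device, mount_point, fs_type, options.split(","), int(dump), int(fsck_order)
    )


root = parse_fstab_line("LABEL=/    /    ext3   defaults     1     1")
data = parse_fstab_line("/dev/sda2   /mnt/D/     vfat    defaults    0   0")
print(root)
print(data.mount_point, data.options, data.fsck_order)
```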

Wednesday, July 07, 2010

Linux tips

noclobber

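The noclobber variable protects existing files from being accidentally overwritten by output redirection. In the example below, the user enables noclobber and then tries to overwrite the existing file myfile through redirection; the shell returns an error message.

[Example]

$ set -o noclobber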

$ cat preface>myfile

bash: myfile: cannot overwrite existing file

http://space.itpub.net/batch.viewlink.php?itemid=662916

http://linux.vbird.org/linux_basic/0160startlinux.php
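When you run man on a command, the first line shows COMMAND(1), or some other number.
For example:
man date DATE(1)
man fstab FSTAB(5)

Section	Content
1	Commands or executables that users can run in the shell
2	Functions and facilities callable from the kernel (system calls)
3	Commonly used functions and libraries, mostly the C library (libc)
4	Device file descriptions, usually files under /dev
5	Configuration files or the formats of certain files
6	Games
7	Conventions and protocols, e.g. descriptions of Linux file systems, network protocols, ASCII code
8	Administration commands available to the system administrator
9	Kernel-related documentation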

add Linux service to auto-start Oracle instances

Key steps:

Detailed procedure:

[root@localhost ~]#cd /etc/rc.d/init.d

[root@localhost init.d]#touch dbauto

Create the file with the command above (or create it directly in /etc/rc.d/init.d as root), then add the startup script content to the file and save it.

[root@localhost ~]# chmod 755 /etc/rc.d/init.d/dbauto    // set the file permissions

[root@localhost ~]# ls -l /etc/rc.d/init.d/dbauto

-rwxr-xr-x  1 oracle oinstall 785 Oct 23 08:27 /etc/rc.d/init.d/dbauto

[root@localhost ~]# chkconfig --add dbauto    // add the service to the service list

[root@localhost ~]# chkconfig --level 345 dbauto on    // start the dbauto service at the specified run levels

Monday, July 05, 2010

centos 5.5 kernel failed to reboot during start udev

[Symptom] After upgrading to CentOS 5.5, the new kernel 2.6.18-194.el5 keeps rebooting while starting udev, whereas kernel 2.6.18-164 boots successfully.

Someone suggested disabling /sbin/start_udev inside /etc/rc.sysinit. I tried it and it "worked", but the X server and the network failed to start.

Someone else suggested incompatible hardware. This reminded me that I have a TV tuner plugged in that doesn't work.

After removing it, the system boots perfectly.

Monday, June 28, 2010

config cygwin cron

$ which cron-config
/usr/bin/cron-config
$ cron-config
Do you want to install the cron daemon as a service? (yes/no) yes
Enter the value of CYGWIN for the daemon: [ ] Cygwin Cron
ERROR: Only "[no]ntsec" "[no]smbntsec" "[no]traverse" allowed.
Enter the value of CYGWIN for the daemon: [ ] ntsec

You must decide under what account the cron daemon will run.
If you are the only user on this machine, the daemon can run as yourself.
   This gives access to all network drives but only allows you as user.
Otherwise cron should run under the local system account.
It will be capable of changing to other users without requiring a
password, using one of the three methods detailed in
http://cygwin.com/cygwin-ug-net/ntsec.html#ntsec-nopasswd1
Do you want the cron daemon to run as yourself? (yes/no) no

Running cron_diagnose ...
... no problem found.

INFO: A cron daemon is already running.

In case of problem, examine the log file for cron,
/var/log/cron.log, and the Windows event log (using /usr/bin/cronevents)
for information about the problem cron is having.

Examine also any cron.log file in the HOME directory
(or the file specified in MAILTO) and cron related files in /tmp.

If you cannot fix the problem, then report it to cygwin@cygwin.com.
Please run the script /usr/bin/cronbug and ATTACH its output
(the file cronbug.txt) to your e-mail.

WARNING: PATH may be set differently under cron than in interactive shells.
         Names such as "find" and "date" may refer to Windows programs.

$ cygrunsrv -R "Cron daemon"
cygrunsrv: Error removing a service: OpenService: Win32 error 1060:
The specified service does not exist as an installed service.

$ cygrunsrv -R cron
$ cron-config
Do you want to install the cron daemon as a service? (yes/no) yes
Enter the value of CYGWIN for the daemon: [ ] ntsec

You must decide under what account the cron daemon will run.
If you are the only user on this machine, the daemon can run as yourself.
   This gives access to all network drives but only allows you as user.
Otherwise cron should run under the local system account.
It will be capable of changing to other users without requiring a
password, using one of the three methods detailed in
http://cygwin.com/cygwin-ug-net/ntsec.html#ntsec-nopasswd1
Do you want the cron daemon to run as yourself? (yes/no) yes

Please enter the password for user 'liqy':
Reenter:
Running cron_diagnose ...
... no problem found.

INFO: A cron daemon is already running.

In case of problem, examine the log file for cron,
/var/log/cron.log, and the Windows event log (using /usr/bin/cronevents)
for information about the problem cron is having.

Examine also any cron.log file in the HOME directory
(or the file specified in MAILTO) and cron related files in /tmp.

If you cannot fix the problem, then report it to cygwin@cygwin.com.
Please run the script /usr/bin/cronbug and ATTACH its output
(the file cronbug.txt) to your e-mail.

WARNING: PATH may be set differently under cron than in interactive shells.
         Names such as "find" and "date" may refer to Windows programs.

$ cat /var/log/cron.log
/usr/sbin/cron: can't lock /var/run/cron.pid, otherpid may be 4428: Resource temporarily unavailable
/usr/sbin/cron: can't lock /var/run/cron.pid, otherpid may be 4428: Resource temporarily unavailable
$ ps -ef |grep 4428
SYSTEM    4428       1   ? 13:46:40 /usr/sbin/cron
$ > /var/log/cron.log
$ cat /var/log/cron.log
$ crontab -l
# DO NOT EDIT THIS FILE - edit the master and reinstall.
# (t.cron installed on Mon Jun 21 13:41:56 2010)
# (Cron version V5.0 -- $Id: crontab.c,v 1.12 2004/01/23 18:56:42 vixie Exp $)
* * * * * date>>t.log
$ pwd
/home/liqy
$ cat t.log
Mon Jun 21 13:47:02 MPST 2010
Mon Jun 21 13:48:02 MPST 2010
Mon Jun 21 13:49:02 MPST 2010
Mon Jun 21 13:50:03 MPST 2010
Mon Jun 21 13:51:03 MPST 2010
Mon Jun 21 13:52:02 MPST 2010
Mon Jun 21 13:53:02 MPST 2010
Mon Jun 21 13:54:02 MPST 2010
Mon Jun 21 13:55:02 MPST 2010

Update on 25-Jan-2016

A similar error came up on Windows 10.

I found this easier way to install cron as a service.

I also changed the service to start under my own account.

sample code for load top AWR sql into sql plan baseline

-- In this example we load the top 30 SQL statements from AWR snapshots 710 to 714 into a SQL tuning set, then load them from that set as SQL plan baselines.

EXEC DBMS_SQLTUNE.DROP_SQLSET('tset1');
EXEC DBMS_SQLTUNE.CREATE_SQLSET('tset1');

DECLARE
baseline_cursor DBMS_SQLTUNE.SQLSET_CURSOR;
my_plans PLS_INTEGER;
BEGIN
OPEN baseline_cursor FOR
    SELECT VALUE(p)
    FROM TABLE (DBMS_SQLTUNE.SELECT_WORKLOAD_REPOSITORY(
                  710,714,
                   NULL, NULL,
                   'elapsed_time',
                   NULL, NULL, NULL,
                   30)) p;

    DBMS_SQLTUNE.LOAD_SQLSET(
             sqlset_name     => 'tset1',
             populate_cursor => baseline_cursor);
    my_plans := DBMS_SPM.LOAD_PLANS_FROM_SQLSET( sqlset_name => 'tset1');
END;
/

Quick NFS HOWTO for Centos

On the server

vi /etc/exports
add lines like:
/data1/sessions 192.168.0.0/255.255.0.0(rw) 10.0.0.0/255.0.0.0(rw)
vi /etc/hosts.allow
add lines like:
portmap: 192.168.0.0/255.255.0.0, 10.0.0.0/255.0.0.0
/etc/init.d/nfs start

On the Client

vi /etc/fstab, adding the following line:
nfshostname:/data1/sessions /mnt/sessions nfs rw,hard,intr 0 0
Make sure to mkdir /mnt/sessions, or it won’t work. To mount it manually, just:
mount nfshostname:/data1/sessions /mnt/sessions

What is the system privilege "export/import full database"

SYS@ODST> select * from dba_sys_privs where grantee='LIQY';

GRANTEE                        PRIVILEGE                                ADM
------------------------------ ---------------------------------------- ---
LIQY                           CREATE VIEW                              NO
LIQY                           CREATE TABLE                             NO
LIQY                           ALTER SESSION                            NO
LIQY                           CREATE SESSION                           NO

SYS@ODST> grant export full database to liqy;

Grant succeeded.

odsdev01:ODST:/software/oraods/temp> exp liqy/liqyliqy@ODST file=icc.dmp log=icc.log tables=DBAM1.M1_icc_call

Export: Release 10.2.0.2.0 - Production on Fri Jun 25 10:45:50 2010

Copyright (c) 1982, 2005, Oracle. All rights reserved.

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.2.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
Export done in US7ASCII character set and AL16UTF16 NCHAR character set
server uses UTF8 character set (possible charset conversion)

About to export specified tables via Conventional Path ...
EXP-00009: no privilege to export DBAM's table M_ICC_CALL
Export terminated successfully with warnings.
odsdev01:ODST:/software/oraods/temp> oerr exp 9
00009, 00000, "no privilege to export %s's table %s"
// *Cause: An attempt was made to export another user's table. Only a
//          database administrator can export another user's tables.
// *Action: Ask your database administrator to do the export.

SYS@ODST> select * from dba_sys_privs where grantee='LIQY';

GRANTEE                        PRIVILEGE                                ADM
------------------------------ ---------------------------------------- ---
LIQY                           CREATE VIEW                              NO
LIQY                           CREATE TABLE                             NO
LIQY                           ALTER SESSION                            NO
LIQY                           CREATE SESSION                           NO
LIQY                           EXPORT FULL DATABASE                     NO

SYS@ODST> select * from dba_role_privs where grantee='LIQY';

GRANTEE                        GRANTED_ROLE                   ADM DEF
------------------------------ ------------------------------ --- ---
LIQY                           T_ROLE                         NO YES
LIQY                           EXP_FULL_DATABASE              NO YES

odsdev01:ODST:/software/oraods/temp> exp liqy/liqyliqy@ODST file=icc.dmp log=icc.log tables=DBAM1.M1_icc_call

Export: Release 10.2.0.2.0 - Production on Fri Jun 25 10:51:41 2010

Copyright (c) 1982, 2005, Oracle. All rights reserved.

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.2.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
Export done in US7ASCII character set and AL16UTF16 NCHAR character set
server uses UTF8 character set (possible charset conversion)

About to export specified tables via Conventional Path ...
Current user changed to DBAM. . exporting table                    M_ICC_CALL

SYS@ODST> select * from dba_sys_privs where grantee='EXP_FULL_DATABASE';

GRANTEE                        PRIVILEGE                                ADM
------------------------------ ---------------------------------------- ---
EXP_FULL_DATABASE              RESUMABLE                                NO
EXP_FULL_DATABASE              BACKUP ANY TABLE                         NO
EXP_FULL_DATABASE              EXECUTE ANY TYPE                         NO
EXP_FULL_DATABASE              SELECT ANY TABLE                         NO
EXP_FULL_DATABASE              READ ANY FILE GROUP                      NO
EXP_FULL_DATABASE              SELECT ANY SEQUENCE                      NO
EXP_FULL_DATABASE              EXECUTE ANY PROCEDURE                    NO
EXP_FULL_DATABASE              ADMINISTER RESOURCE MANAGER              NO

8 rows selected.

Finally, I got the answer from Metalink:

The system privileges EXPORT/IMPORT FULL DATABASE, introduced with 10gR1, are currently not used. These will be implemented in future releases with new functionality but in 10/11g these are not operational.

So confusing ...

Simple log miner

本身这个步骤很多高手都已经贴过了，只是我在使用中发现大体上大家写的都有些复杂，于是，我总结了个超级简化版的，方便大家使用：

1.安装LOGMNR包，需要本步骤没什么可多说的，只是需要注意在连接数据库的时候默认最好使用本地验证方式
C:\>sqlplus /nolog
SQL> conn / as sysdba
SQL> @D:\oracle\product\10.2.0\db_2\RDBMS\ADMIN\dbmslm.sql
SQL> @D:\oracle\product\10.2.0\db_2\RDBMS\ADMIN\dbmslmd.sql
SQL> @D:\oracle\product\10.2.0\db_2\RDBMS\ADMIN\dbmslms.sql
SQL> show parameter utl;

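2. Create the data dictionary
(utl_file_dir is a static parameter, so set it in the spfile and restart the instance before building the dictionary)
SQL> alter system set utl_file_dir='d:\oracle\logmnr' scope=spfile;
SQL> shutdown immediate
SQL> startup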
SQL> EXECUTE dbms_logmnr_d.build('dictionary.ora','d:\oracle\logmnr');

3. Add log files (dbms_logmnr.new for the first file, dbms_logmnr.addfile for each additional, different file)
SQL> EXECUTE dbms_logmnr.add_logfile(LogFileName=>'D:\1_15969.dbf',Options=>dbms_logmnr.new);
SQL> EXECUTE dbms_logmnr.add_logfile(LogFileName=>'D:\1_15969.dbf',Options=>dbms_logmnr.addfile);
or
SQL> begin
sys.dbms_logmnr.add_logfile(LogFileName=>'D:\1_15969.dbf',options =>dbms_logmnr.addfile);
end;
/

4. Analyze the log files using the dictionary
SQL> execute dbms_logmnr.start_logmnr(dictfilename=>'d:\oracle\logmnr\dictionary.ora');

5. Query the results
SQL> select scn,sql_redo from v$logmnr_contents;

6. Exit LogMiner
SQL> execute dbms_logmnr.end_logmnr;

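PS: The most important part is step 5. If the result set is large, I recommend working in a tool such as PL/SQL Developer, which makes later editing easier. Compared with the SQL*Plus formatting commands, PL/SQL Developer really is much more convenient.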
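
When several log files have to be mined, the add_logfile calls in step 3 become repetitive. A minimal sketch of a shell helper that prints the SQL*Plus script for steps 3 to 6 follows. The function name gen_logmnr_sql is hypothetical, and it assumes the dictionary file from step 2 already exists. The first file is added with dbms_logmnr.new and every later file with dbms_logmnr.addfile.

```shell
#!/bin/sh
# Sketch: print a SQL*Plus script that mines the given log files.
# gen_logmnr_sql is a hypothetical helper name, not an Oracle tool.
# Usage: gen_logmnr_sql <dictionary file> <log file> [<log file> ...]
gen_logmnr_sql() {
    dict_file=$1
    shift
    first=1
    for log_file in "$@"; do
        if [ "$first" -eq 1 ]; then
            opt=dbms_logmnr.new        # first file starts a new list
            first=0
        else
            opt=dbms_logmnr.addfile    # later files are appended to the list
        fi
        printf "EXECUTE dbms_logmnr.add_logfile(LogFileName=>'%s',Options=>%s);\n" "$log_file" "$opt"
    done
    printf "EXECUTE dbms_logmnr.start_logmnr(dictfilename=>'%s');\n" "$dict_file"
    printf 'select scn,sql_redo from v$logmnr_contents;\n'
    printf 'EXECUTE dbms_logmnr.end_logmnr;\n'
}
```

For example, `gen_logmnr_sql 'd:\oracle\logmnr\dictionary.ora' 'D:\1_15969.dbf' > mine.sql` writes a script that can then be run from SQL*Plus as sysdba. Additional log file names are simply appended as further arguments.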

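Safe shutdown abort

The following is what I consider a relatively safe shutdown abort procedure. It is unofficial and for reference only.
1. kill all dedicated server processes (the PID is the second column of ps -ef)
$kill -9 `ps -ef|grep LOCAL=NO|grep -v grep|awk '{print $2}'`
or
$ps -ef|grep LOCAL=NO|grep -v grep|awk '{print $2}'|xargs kill -9
You can wait a while here for the transactions to roll back.
2.switch log file
sqlplus '/as sysdba'
SQL> alter system switch logfile;
3.checkpoint and suspend IO
SQL> alter system checkpoint;
SQL> alter system suspend;
After running this, do not terminate this session and do not run any other statement, otherwise sqlplus "/as sysdba" may no longer be able to connect.
4.shutdown db
SQL> shutdown abort
5.restart db
sqlplus '/as sysdba'
SQL> startup
6.review transaction rollback
SQL> select * from V$FAST_START_TRANSACTIONS;
Normally shutdown abort does not damage the data files. Even if it does, it is probably only some block corruption, and a recover fixes it. The worse case is a damaged data file, which may need a restore and recovery, and that is more troublesome.
Still, I think this approach is somewhat safer.
I have not tried this method for the case where shutdown immediate does not succeed, though.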
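
Since kill -9 is unforgiving, it may help to review the process list before step 1. A minimal sketch, assuming the usual ps -ef layout where the PID is the second column: the function name local_no_pids is hypothetical, and it only prints PIDs, it does not kill anything.

```shell
#!/bin/sh
# Sketch: read `ps -ef` output on stdin and print the PID column (field 2)
# of every dedicated server process (command line contains LOCAL=NO).
# local_no_pids is a hypothetical helper name.
local_no_pids() {
    awk '/LOCAL=NO/ && !/grep/ { print $2 }'
}

# Dry run: review the list first.
#   ps -ef | local_no_pids
# Only when the list looks right:
#   ps -ef | local_no_pids | xargs kill -9
```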

Saturday, May 22, 2010

redo log size of shrink table

The table size was 3.x GB. During my alter table shrink space compact, about 13 GB of redo was generated.
Be careful if the archived log space is too small.

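What Kind of IT Talent Does the Fortune 500 Financial Industry Need?

                                    ---- An interview with Sun Jun, OCM master at CPIC Group
  Sun Jun holds Oracle's top-level global certification, held by fewer than 200 people in China. He currently works in database administration at China Pacific Insurance (Group) Co., keeping Pacific Insurance's core business systems in China running normally. Key projects he has managed or taken part in include the Pacific Insurance Group e-commerce system, the Group P05 ODS data warehouse system, the life insurance P10 core business system, the life insurance agent system, and the Pacific Insurance Group Hundsun valuation system. Drawing on his excellent Oracle skills, he safeguards the normal running of Pacific Insurance Group's business in China and has created great value for the company. Through years of professional experience, Sun Jun has become a master-level certified expert in Oracle technology and an indispensable top technical expert in that field at Pacific Insurance Group.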

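1. Mr. Sun, you have been with CPIC for quite a long time. On the one hand, you could look back on your earlier experience with us; on the other hand, you have brought a lot of value to the company's Oracle technology management. Before we start, could you describe the general Oracle situation at CPIC?
CPIC is gradually becoming international. The databases the company mainly uses today are Oracle, DB2, Informix, SQL Server, and MySQL. Because Informix is not very strong at handling massive data volumes, CPIC's centralized systems are being converted from Informix to Oracle. In addition, Oracle has always followed an open approach and supports Unix and Linux well, and it keeps pace with the times in its version updates, so CPIC's centralized systems all adopt the Oracle architecture.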

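2. Many large group companies in China, including those in monopoly industries, now use Oracle products, and Oracle can be called the mainstream in this industry. Could you give us a rough idea of the future industry trend for Oracle technology?
Oracle has several parts. The first is the traditional Oracle technology, RAC. Second, Oracle's grid computing in middleware will also be strengthened. Since Oracle acquired BEA, it has progressed rapidly in middleware integration, and with the recent acquisition of Sun it has essentially completed the integration of software (operating system, middleware, database) and hardware products. As a next step, Oracle should consolidate these acquired products and then launch a more complete product or solution. You can already see this in the Exadata storage server that Oracle is currently promoting: it combines Oracle's storage software with Sun hardware and improves database storage performance.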

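3. The work of a senior technical manager like you must be complex, and your schedule is usually full. Could you describe what your work involves, and how, with such a heavy workload, you avoid risk and handle all kinds of emergencies and problems?
For an enterprise-level system and platform like Pacific Insurance, the focus of the work is leading the team in designing and planning the overall technical architecture. It is not limited to the purely technical level: we need to examine, from the perspective of the insurance business, what a reasonable and scalable technical framework is, including how hardware, database, and application layer should work together efficiently to support CPIC's fast-growing business. With the Oracle database as the core and starting point, I lead the whole team in defining daily operating procedures, checking database logs, and reviewing monitoring reports. At the start of each project, we arrange for a senior team to get involved early, draw up targeted capacity planning and physical design, and help the project team establish database development standards and project acceptance requirements, so that the system design is of good quality from birth. In daily work, I lead the team in summarizing past experience, finding the balance between risk control and speed of handling, and drawing up practical rules. Rules and processes keep database operations stable. They include the project acceptance policy, the risk assessment process, the change approval policy, and so on. Through this work we manage operational problems rather than operational crises.
All of the above rests on two important foundations: a strong technical team plus rigorous working standards. We lead the team to keep refining existing technology and to keep trying new technology, while summarizing past experience and problems to draw up practical working standards. Only then can our work enter a virtuous cycle. Our job is not crisis management and not producing "surprises"; the safe, efficient, and stable running of the application systems is our goal.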

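4. In your many years of working with Oracle technology, for databases for example, do you think development-oriented skills or management-oriented skills are more core?
First of all, I personally think the management direction is more core than development, but the two are inseparable, like a gun and a target. Management is the target and development is the gun: first draw the target, then fire the gun. Once you have both the gun and the target, it becomes a matter of execution. The key to management is not knowing how to do it but doing it. Zhang Ruimin, CEO of Haier Group, once said that the key to management lies not in knowing but in doing. In essence, this also emphasizes execution.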

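5. In your many years of working with Oracle technology, could you share what insights and techniques you have for learning and improving technical skills?
The only technique is diligence. Keep studying new technology and study materials every day. While others are relaxing, we have to be patient and dig into the technology. Diligence and persistence are the important points. I have always believed that opportunity favors the prepared.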

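6. Why Oracle acquired Sun has been discussed for a long time. As a customer, what impact do you think this acquisition will have on you?
The Sun and Oracle deal took a fairly long time. Oracle's acquisition of Sun is really a strategy of "combining software and hardware, complementing each other, walking on two legs". A database vendor and a server vendor coming together was hard to imagine before, yet it happened almost miraculously at the end of last year. Besides a strong hardware platform, Sun also has a rich open-source software product line, while Oracle is a powerful application database platform. The combination will bring us more complete solutions covering software, hardware, and even operating systems. That is what I feel most strongly.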

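7. As the company's senior Oracle expert, could you share what level of technical skill best matches the company's needs for Oracle talent when recruiting new staff?
When the company recruits, especially fresh graduates, the positions are generally pitched at assistant administrator level and the technical threshold is fairly low. The key requirement is that candidates are willing to learn and willing to work. How do we gauge these two qualities in an interview? We look at what they have studied in their spare time. Whether they hold relevant certificates is a fairly important reference for us, because it shows whether they have put effort into technology.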

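8. You have attended Oracle training many times yourself. How much do you think training has helped your personal IT career planning?
Ever since I first came into contact with Oracle, I have attended fairly formal training. There is an old Chinese saying: "The master leads you through the door; the practice is up to you." Systematic training gives us a systematic understanding of the database. What a training institution gives us is time and space: when studying on our own, it is hard to devote ourselves entirely to learning for that period. A training institution lets us concentrate fully on learning for a few days and provides a very good lab environment, which is very helpful to us.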

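9. How do you view the value of Oracle certification? Are certificates a criterion for you when hiring new staff?
They are a criterion, not a decisive one, but having one is better than not. At the least it shows a person's attitude toward learning technology. A certificate shows that the person really has put in effort to improve their skills, so even if their ability does not quite match the position we are hiring for, we will consider hiring such a young person, because we believe their ability to learn must be fairly strong.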

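10. Oracle now has more than ten thousand customers and partners in China. What will the future job market's demand for Oracle talent look like? Could you make a prediction?
After Oracle's acquisitions of BEA, Sun, PeopleSoft, and other vendors, demand for Oracle talent will certainly increase, and the growth will be fairly large. This is related to the speed of Oracle's own development, and more high-end, highly specialized talent will be needed. Oracle middleware and ERP will also need many people. Once demand grows, the requirements for talent will lean toward the higher end.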

Saturday, April 24, 2010

Learning PL/SQL

This was driven by troubleshooting an application's PL/SQL: the Pro*C program does not work, yet the OPEN CURSOR for a very complex SQL statement succeeds in less than 1 second (run manually, the statement takes a few minutes). The processing part is handled by Pro*C.

From http://docstore.mik.ua/orelly/oracle/langpkt/ch01_09.htm, it says,
"You must open an explicit cursor before you can fetch rows from that cursor. When the cursor is opened, the processing includes the PARSE, BIND, OPEN, and EXECUTE statements. This OPEN processing includes: determining an execution plan, associating host variables and cursor parameters with the placeholders in the SQL statement, determining the result set, and, finally, setting the current row pointer to the first row in the result set. "

However, I doubt that "the processing" includes EXECUTE. The reason: comparing my 10046 traces without FETCH and with FETCH, no execution plan is shown for the run without FETCH.

alter session set events '10046 trace name context forever, level 12' ;

     OPEN emp_cv FOR SELECT first_name, salary FROM employees where employee_id < emp_id;

--WITHOUT fetch

SELECT FIRST_NAME, SALARY
FROM
EMPLOYEES WHERE EMPLOYEE_ID < :B1

call     count       cpu    elapsed       disk      query    current        rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        0      0.00       0.00          0          0          0           0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total        2      0.00       0.00          0          0          0           0

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 33     (recursive depth: 1)
********************************************************************************

alter session set events '10046 trace name context off';

--corresponding 10046 trace file of WITHOUT FETCH processing

=====================
PARSING IN CURSOR #1 len=65 dep=1 uid=33 oct=3 lid=33 tim=1242222958636520 hv=3716011877 ad='4cd98690'
SELECT FIRST_NAME, SALARY FROM EMPLOYEES WHERE EMPLOYEE_ID < :B1
END OF STMT
PARSE #1:c=0,e=71,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,tim=1242222958636514
BINDS #1:
kkscoacd
Bind#0
oacdty=02 mxl=22(21) mxlc=00 mal=00 scl=00 pre=00
oacflg=03 fl2=1206001 frm=00 csi=00 siz=24 off=0
kxsbbbfp=f6fe60ac bln=22 avl=03 flg=05
value=106
EXEC #1:c=0,e=144,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,tim=1242222958636745
WAIT #3: nam='SQL*Net message to client' ela= 5 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1242222958636793
EXEC #3:c=10000,e=3030,p=0,cr=0,cu=0,mis=1,r=1,dep=0,og=1,tim=1242222958636870
WAIT #3: nam='SQL*Net message from client' ela= 236 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1242222958637166

-- with FETCH processing

    OPEN emp_cv FOR SELECT first_name, salary FROM employees where employee_id < emp_id;
     loop
      fetch emp_cv into emp_dataPkg.er;
     exit when emp_cv%notfound;
      dbms_output.put_line(emp_dataPkg.er.name || ' - ' || emp_dataPkg.er.sal);
    end loop;
   CLOSE emp_cv;

********************************************************************************

SELECT FIRST_NAME, SALARY
FROM
EMPLOYEES WHERE EMPLOYEE_ID < :B1

call     count       cpu    elapsed       disk      query    current        rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        7      0.00       0.00          0         13          0           6
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total        9      0.00       0.00          0         13          0           6

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 33     (recursive depth: 1)

Rows     Row Source Operation
------- ---------------------------------------------------
      6 TABLE ACCESS BY INDEX ROWID EMPLOYEES (cr=13 pr=0 pw=0 time=104 us)
      6   INDEX RANGE SCAN EMP_EMP_ID_PK (cr=7 pr=0 pw=0 time=120 us)(object id 12082)

********************************************************************************

select condition
from
cdef$ where rowid=:1

call     count       cpu    elapsed       disk      query    current        rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      0.00       0.00          0          2          0           1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total        3      0.00       0.00          0          2          0           1

Misses in library cache during parse: 0
Optimizer mode: CHOOSE
Parsing user id: SYS   (recursive depth: 2)

Rows     Row Source Operation
------- ---------------------------------------------------
      1 TABLE ACCESS BY USER ROWID CDEF$ (cr=1 pr=0 pw=0 time=67 us)

********************************************************************************

alter session set events '10046 trace name context off';

--corresponding 10046 trace file of WITH FETCH processing

=====================
PARSING IN CURSOR #1 len=65 dep=1 uid=33 oct=3 lid=33 tim=1242222802758295 hv=3716011877 ad='4cd98690'
SELECT FIRST_NAME, SALARY FROM EMPLOYEES WHERE EMPLOYEE_ID < :B1
END OF STMT
PARSE #1:c=0,e=548,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=1,tim=1242222802758286
=====================
PARSING IN CURSOR #4 len=42 dep=2 uid=0 oct=3 lid=0 tim=1242222802759562 hv=844002283 ad='4cf6f080'
select condition from cdef$ where rowid=:1
END OF STMT
PARSE #4:c=0,e=123,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,tim=1242222802759554
BINDS #4:
kkscoacd
Bind#0
oacdty=11 mxl=16(16) mxlc=00 mal=00 scl=00 pre=00
oacflg=18 fl2=0001 frm=00 csi=00 siz=16 off=0
kxsbbbfp=f6f62148 bln=16 avl=16 flg=05
value=00006F44.001D.0001
EXEC #4:c=0,e=362,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,tim=1242222802760102
FETCH #4:c=0,e=99,p=0,cr=2,cu=0,mis=0,r=1,dep=2,og=4,tim=1242222802760255
STAT #4 id=1 cnt=1 pid=0 pos=1 obj=31 op='TABLE ACCESS BY USER ROWID CDEF$ (cr=1 pr=0 pw=0 time=67 us)'
BINDS #1:
kkscoacd
Bind#0
oacdty=02 mxl=22(21) mxlc=00 mal=00 scl=00 pre=00
oacflg=03 fl2=1206001 frm=00 csi=00 siz=24 off=0
kxsbbbfp=f6f6254c bln=22 avl=03 flg=05
value=106
EXEC #1:c=0,e=3997,p=0,cr=2,cu=0,mis=1,r=0,dep=1,og=1,tim=1242222802762431
FETCH #1:c=0,e=108,p=0,cr=2,cu=0,mis=0,r=1,dep=1,og=1,tim=1242222802762631
FETCH #1:c=0,e=24,p=0,cr=2,cu=0,mis=0,r=1,dep=1,og=1,tim=1242222802762862
FETCH #1:c=0,e=18,p=0,cr=2,cu=0,mis=0,r=1,dep=1,og=1,tim=1242222802762950
FETCH #1:c=0,e=15,p=0,cr=2,cu=0,mis=0,r=1,dep=1,og=1,tim=1242222802763030
FETCH #1:c=0,e=14,p=0,cr=2,cu=0,mis=0,r=1,dep=1,og=1,tim=1242222802763108
FETCH #1:c=0,e=15,p=0,cr=2,cu=0,mis=0,r=1,dep=1,og=1,tim=1242222802763185
FETCH #1:c=0,e=9,p=0,cr=1,cu=0,mis=0,r=0,dep=1,og=1,tim=1242222802763257
STAT #1 id=1 cnt=6 pid=0 pos=1 obj=12080 op='TABLE ACCESS BY INDEX ROWID EMPLOYEES (cr=13 pr=0 pw=0 time=104 us)'
STAT #1 id=2 cnt=6 pid=1 pos=1 obj=12082 op='INDEX RANGE SCAN EMP_EMP_ID_PK (cr=7 pr=0 pw=0 time=120 us)'
WAIT #3: nam='SQL*Net message to client' ela= 10 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1242222802763502
EXEC #3:c=10000,e=6638,p=0,cr=15,cu=0,mis=0,r=1,dep=0,og=1,tim=1242222802763589
WAIT #3: nam='SQL*Net message from client' ela= 359 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1242222802764056
=====================
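
To compare the two raw trace files without running tkprof, the PARSE/EXEC/FETCH lines can be tallied per cursor. The following is a minimal sketch, assuming the 10.2 raw trace line format shown above. The function name trace_call_summary is hypothetical. For each cursor and call type it prints the number of calls and the sum of the r= (rows) values.

```shell
#!/bin/sh
# Sketch: count PARSE/EXEC/FETCH calls and sum the r= (rows) value per cursor
# in a raw 10046 trace file. trace_call_summary is a hypothetical helper name.
# Output columns: cursor, call type, number of calls, rows.
trace_call_summary() {
    awk '
        /^(PARSE|EXEC|FETCH) #[0-9]+:/ {
            split($2, part, ":")            # part[1] = "#1", part[2] = "c=0,e=...,r=1,..."
            key = part[1] " " $1
            calls[key]++
            rows[key] += 0
            n = split(part[2], stat, ",")
            for (i = 1; i <= n; i++)
                if (stat[i] ~ /^r=/)        # "cr=" does not match because of the anchor
                    rows[key] += substr(stat[i], 3)
        }
        END {
            for (key in calls)
                print key, calls[key], rows[key]
        }
    ' "$1" | sort
}
```

Run against the WITH FETCH trace above, the line for cursor #1 reads `#1 FETCH 7 6`, matching the tkprof Fetch row (7 calls, 6 rows). The WITHOUT FETCH trace has no FETCH line for cursor #1 at all.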

Tuesday, April 20, 2010

SSH public key authentication

I finally configured SSH public key authentication on my Cygwin, so it is now very convenient to log on to the various Unix servers. :-)

-- on my local PC
$ pwd
/home/liqy/.ssh
$ ls -l
total 8
-rw------- 1 liqy Domain Users 898 2010-04-19 11:09 id_rsa
-rw-r--r-- 1 liqy Domain Users 1196 2010-04-19 10:56 known_hosts
$ ssh rpt02
Last   successful login for liqy: Mon Apr 19 11:10:48 SST-8 2010
Last unsuccessful login for liqy: Tue Mar 16 01:59:06 SST-8 2010
Last login: Mon Apr 19 11:10:48 2010 from 146-105-22.int.m1.com.sg
rpt02@/home/liqy> cd .ssh
-- on remote server
rpt02@/home/liqy/.ssh> ls -lrt
total 64
-rw-------   1 liqy       users         1024 Nov 27 2007 prng_seed
-rw-r--r--   1 liqy       users         9061 Dec 14 09:57 known_hosts
-rw-r--r--   1 liqy       users          220 Apr 1 13:57 authorized_keys
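
The listing shows the end state only. A minimal sketch of the server-side step follows, assuming the key pair was created on the PC with `ssh-keygen -t rsa` and the public half was copied to the server. The function name install_pubkey is hypothetical, and the 700/600 permissions are a conservative assumption rather than what the listing above shows.

```shell
#!/bin/sh
# Sketch: append a public key to an authorized_keys file and tighten
# permissions. install_pubkey is a hypothetical helper name.
# Usage: install_pubkey <public key file> <ssh directory>
install_pubkey() {
    pub_key_file=$1
    ssh_dir=$2                      # e.g. "$HOME/.ssh" on the remote server
    auth_file=$ssh_dir/authorized_keys

    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
    touch "$auth_file" && chmod 600 "$auth_file" || return 1

    # Append only if this exact key line is not already present.
    key_line=$(cat "$pub_key_file") || return 1
    grep -qxF -- "$key_line" "$auth_file" || printf '%s\n' "$key_line" >> "$auth_file"
}
```

Running it twice with the same key leaves a single entry in authorized_keys, so it is safe to repeat.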

Prerequisite of using SQL*Loader direct path loading

Be careful with SQL*Loader direct path loading if the target table is big and has indexes, and logging = YES during data loading.

The reason: during data loading, the index becomes UNUSABLE in dba_indexes (it remains VALID in dba_objects). After data loading, the index rebuild starts. When logging=YES, imagine a FULL TABLE SCAN on a huge table: how much archived log will that generate? Multiply it by the number of indexes ... It is a big shock. And how much more time will it take, given that the job loads multiple files and each invocation of SQL*Loader loads only one file?

The common understanding of SQL*Loader

Use Direct Path Loads - The conventional path loader essentially loads the data by using standard insert statements. The direct path loader (direct=true) loads directly into the Oracle data files and creates blocks in Oracle database block format. The fact that SQL is not being issued makes the entire process much less taxing on the database. There are certain cases, however, in which direct path loads cannot be used (clustered tables). To prepare the database for direct path loads, the script $ORACLE_HOME/rdbms/admin/catldr.sql must be executed.

In my case, the apps team was not happy with the 5-minute performance and added "direct=true". In the end this filled the archive log disk: the job could not finish after running for 3 hours and generated 40+ GB of archived log until the archiver hung.

During the index rebuilds after "direct=true", I saw tremendous I/O.

Report of conventional loading
Top 5 Timed Events

Event                        Waits  Time(s)  Avg Wait(ms)  % Total Call Time  Wait Class
db file parallel write       6,358      240            38               97.5  System I/O
db file sequential read     44,238      128             3               52.0  User I/O
CPU time                                115                             46.7
log file parallel write     11,538       79             7               32.1  System I/O
log file sync               11,094       76             7               30.9  Commit

Tablespace     Reads  Av Reads/s  Av Rd(ms)  Av Blks/Rd   Writes  Av Writes/s  Buffer Waits  Av Buf Wt(ms)
IDX           40,216          11       2.84        1.00   99,853           28             0           0.00

Report of direct path loading
Top 5 Timed Events

Event                          Waits  Time(s)  Avg Wait(ms)  % Total Call Time  Wait Class
db file sequential read    2,901,803    1,610             1               42.4  User I/O
log file parallel write       22,845    1,410            62               37.1  System I/O
CPU time                                  729                             19.2
Log archive I/O               46,028      455            10               12.0  System I/O
log file sequential read      22,795       87             4                2.3  System I/O

Tablespace IO Stats

    * ordered by IOs (Reads + Writes) desc

Tablespace       Reads  Av Reads/s  Av Rd(ms)  Av Blks/Rd    Writes  Av Writes/s  Buffer Waits  Av Buf Wt(ms)
IDX          2,866,984         794       0.52        1.00   427,225          118             0           0.00


Rebuilding the indexes takes more time than is gained in the data loading part. Below are the timings from my rebuild of the unusable indexes on the surepay04 tables: it took nearly 10 minutes for nonvoice04 for each invocation of SQL*Loader, even with rebuild parallel.

As a remedy, after removing the archived logs that had already been backed up, I set the relevant tables and indexes to NOLOGGING mode and asked the application team to remove "direct=true".

> select index_name, status from dba_indexes where index_name like 'M1_SUREPAY_%04%_IDX%' ;

INDEX_NAME                     STATUS
------------------------------ --------
NONVOICE04_IDX1     UNUSABLE
NONVOICE04_IDX2     UNUSABLE
NONVOICE04_IDX4     UNUSABLE
VOICE04_IDX1        VALID
VOICE04_IDX2        VALID
VOICE04_IDX3        VALID
VOICE04_IDX4        VALID
OTHER04_IDX2        VALID
OTHER04_IDX3        VALID
OTHER04_IDX1        VALID
NONVOICE04_IDX3     UNUSABLE

11 rows selected.

> alter index NONVOICE04_IDX1 rebuild parallel 3 ;

Index altered.

Elapsed: 00:02:43.39
> alter index NONVOICE04_IDX2 rebuild parallel 3 ;

Index altered.

Elapsed: 00:02:40.96
> alter index NONVOICE04_IDX3 rebuild parallel 3 ;

Index altered.

Elapsed: 00:02:23.70
> alter index NONVOICE04_IDX4 rebuild parallel 3 ;

Index altered.

alter index NONVOICE04_IDX1 noparallel ;
alter index NONVOICE04_IDX2 noparallel ;
alter index NONVOICE04_IDX3 noparallel ;
alter index NONVOICE04_IDX4 noparallel ;
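
Typing the rebuild and noparallel statements by hand for every unusable index gets tedious. Here is a minimal sketch of a shell helper that prints them for a list of index names. The function name gen_rebuild_sql is hypothetical, and the nologging clause on the rebuild is an assumption based on the NOLOGGING remedy described above, not something taken from the session shown. Changes made under NOLOGGING cannot be recovered from the archived logs, so a backup afterwards is advisable.

```shell
#!/bin/sh
# Sketch: print rebuild statements for a list of UNUSABLE indexes.
# gen_rebuild_sql is a hypothetical helper name; the nologging clause
# is an assumption based on the remedy described in the post.
# Usage: gen_rebuild_sql <parallel degree> <index name> [<index name> ...]
gen_rebuild_sql() {
    degree=$1
    shift
    for idx in "$@"; do
        # Rebuild in parallel, then reset the degree so later queries
        # against the index do not run in parallel by default.
        printf 'alter index %s rebuild parallel %s nologging;\n' "$idx" "$degree"
        printf 'alter index %s noparallel;\n' "$idx"
    done
}
```

For example, `gen_rebuild_sql 3 NONVOICE04_IDX1 NONVOICE04_IDX2 > rebuild.sql` produces a script to review and then run from SQL*Plus.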