Wednesday, November 17, 2010

VMware Player failed to compile module vmmon after upgrade to Ubuntu 10.10

Installing VMware Workstation 7.1.1 64 bit on Ubuntu 10.10 » Pario TechnoBlob
wget http://www.sputnick-area.net/scripts/vmware7.1.1-patch-kernel-2.6.35.bash

VirtualBox can’t operate in VMX root mode. Please disable the KVM kernel extension, recompile your kernel and reboot (VERR_VMX_IN_VMX_ROOT_MODE).

After I changed the CPU from AMD to Intel, I encountered the following error:
VirtualBox can’t operate in VMX root mode. Please disable the KVM kernel extension, recompile your kernel and reboot (VERR_VMX_IN_VMX_ROOT_MODE).
The root cause is that either the kvm-intel or the kvm-amd module was loaded at boot time. To unload it:
root@localhost:~# lsmod |grep kvm
kvm_intel              32832  0
kvm                   182683  1 kvm_intel
root@localhost:~# modprobe -r kvm_intel
After that, the error message above is no longer displayed.
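
The check-and-unload step above can be sketched as a pair of small shell functions. This is only a sketch: the function names kvm_vendor_modules and unload_kvm are made up for illustration, and it assumes lsmod prints the module name in the first column, as in the listing above.

```shell
#!/bin/sh
# Hypothetical helper: read `lsmod` output on stdin and print the
# vendor-specific KVM modules (kvm_intel / kvm_amd) that are loaded.
kvm_vendor_modules() {
    awk '$1 == "kvm_intel" || $1 == "kvm_amd" { print $1 }'
}

# Hypothetical helper: print the unload command for each loaded vendor module.
# Remove the `echo` to actually unload the module (requires root).
unload_kvm() {
    kvm_vendor_modules | while read -r mod; do
        echo modprobe -r "$mod"
    done
}

# Usage: lsmod | unload_kvm
```

Keep in mind that modprobe -r only unloads the module for the current boot, so the step has to be repeated if the module is loaded again at the next boot.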

CentOS: Unable to resume swap device SWAP-sda2

Last night I replaced my AMD CPU with an Intel CPU. Everything works fine on Ubuntu, but CentOS fails to boot with the error message above.

I googled it and got the idea from http://fedoraforum.org/forum/printthread.php?t=120868

vladak's message there seems to match my case:
"I hit this problem while swapping motherboard+CPU (Athlon X2 instead of Sempron) for Fedora 10 box. After the hardware change (could be caused because the old motherboard had 2 IDE channels and the new one has only 1 ? the boot disk is connected via IDE.) the system reported the dreaded "Unable to access resume device" message. Booting from Fedora 10 install DVD into rescue mode, chroot to /mnt/sysimage and mkinitrd fixed the problem."

I fixed the problem with the steps below.
1. Find the CentOS 5.5 install DVD and boot into rescue mode.
2. chroot /mnt/sysimage
3. cd /boot and move the existing initrd*.img files into ./initrd_bak
[root@localhost boot]# ll initrd_bak
total 15648
-rw------- 1 root root 2663536 Aug 21 12:46 initrd-2.6.18-194.11.1.el5.img
-rw------- 1 root root 2663856 Aug 21 12:45 initrd-2.6.18-194.11.1.el5xen.img
-rw------- 1 root root 2663556 Sep 12 22:19 initrd-2.6.18-194.11.3.el5.img
-rw------- 1 root root 2663874 Sep 12 22:19 initrd-2.6.18-194.11.3.el5xen.img
-rw------- 1 root root 2663599 Sep 29 22:56 initrd-2.6.18-194.11.4.el5.img
-rw------- 1 root root 2663935 Sep 29 22:55 initrd-2.6.18-194.11.4.el5xen.img

4. Run the command below to rebuild the initial RAM disk:

mkinitrd initrd-2.6.18-194.11.4.el5.img 2.6.18-194.11.4.el5

5. The new file is shown below; note that its size differs from the backup copy.
-rw------- 1 root root 2672372 Nov 17  2010 initrd-2.6.18-194.11.4.el5.img

6. Exit twice (out of the chroot, then out of the rescue shell) to reboot. Cheers!
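
Steps 3 to 5 can be sketched as a small script. It is a sketch under assumptions, not the exact commands I typed: the helper names kver_from_initrd and rebuild_initrd are made up, and it assumes the images are named initrd-<kernel-version>.img as in the listing above. mkinitrd takes the image file name followed by the kernel version, which has to match a directory under /lib/modules.

```shell
#!/bin/sh
# Hypothetical helper: derive the kernel version from an initrd file name,
# e.g. initrd-2.6.18-194.11.4.el5.img -> 2.6.18-194.11.4.el5
kver_from_initrd() {
    name=${1##*/}                 # strip any leading directory
    name=${name#initrd-}          # strip the "initrd-" prefix
    printf '%s\n' "${name%.img}"  # strip the ".img" suffix
}

# Hypothetical helper: print the backup and rebuild commands for one image.
# Remove the `echo`s to run them for real (inside the chroot, as root,
# from /boot).
rebuild_initrd() {
    img=$1
    echo mkdir -p ./initrd_bak
    echo mv "$img" ./initrd_bak/
    echo mkinitrd "$img" "$(kver_from_initrd "$img")"
}

# Usage: rebuild_initrd initrd-2.6.18-194.11.4.el5.img
```

Printing the commands first makes it easy to double-check the kernel version before anything in /boot is touched.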

Friday, November 12, 2010

Taking advantage of partition elimination

[The problem]

    The application team asked for help:
  • The daily financial report is nearly one month behind, because the job runs very slowly and the DB has to shut down every day for cold backup.
  • My estimate is that the job needs about 30 hours to complete.
  • The vendor was not able to provide a solution, even after changing the code a few times.



[Diagnostic]
    This range-partitioned table is about 140 GB, with 500+ partitions; each partition stores 5 values of job_id.
    According to the execution plan, partition pruning is performed, but there is no parallelism even though PARALLEL is enabled on the server. No full table scan is performed; however, the statement generates lots of I/O (consistent gets, physical reads, etc.).
    Total consistent gets are about 5 times what scanning the needed partitions once would require.
    The statement is like this:
     select ... from p_table where part_key_col in (select distinct job_id from job_table...);

    The execution plan shows a NESTED LOOPS step that runs once for each value returned from the sub-query, which causes redundant access to the partitions,
    i.e.
     for each job_id returned from sub-query (110 distinct job_id)
       do
         full partition scan  ( one of total 22 partitions )
       done.
  Hence, each partition is scanned 5 times in the worst case. Total scans: 110.
  Meanwhile, a parallel scan can't happen within a single partition; it only occurs when multiple partitions are accessed simultaneously.
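
  The scan counts above can be checked with a few lines of shell arithmetic. The variable names are made up for illustration; the numbers (110 distinct job_id values, 5 job_id values per partition) come from the diagnosis above.

```shell
#!/bin/sh
job_ids=110          # distinct job_id values returned by the sub-query
ids_per_partition=5  # job_id values stored in each partition

# Partitions that hold the requested job_ids (integer division, rounded up).
partitions=$(( (job_ids + ids_per_partition - 1) / ids_per_partition ))

# Nested loop: one partition scan per job_id.
nested_loop_scans=$job_ids

# If each needed partition is read only once: one scan per partition.
single_pass_scans=$partitions

echo "partitions needed: $partitions"
echo "nested-loop scans: $nested_loop_scans"
echo "single-pass scans: $single_pass_scans"
echo "redundancy factor: $(( nested_loop_scans / single_pass_scans ))"
```

  The redundancy factor lines up with the observation that consistent gets were about 5 times higher than a single pass over the needed partitions would require.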

   
[Solution]
    Rewrite the code to make partition elimination happen:
    select  ... from p_table where part_key_col between (select min(distinct job_id) from job_table ...) and (select max(distinct job_id) from job_table ...);
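    Note that the logic changes slightly (BETWEEN covers every job_id in the min-max range, not only the values the sub-query returns), but that is acceptable in this case.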

     After this change was made according to my suggestion, the job finished within 30 minutes, and I observed 12 parallel processes running happily, scanning the 22 partitions once only.
    Cheers!

[Update on 17-Nov]

One more thing: the SQL logic should not be changed. After studying more about partition pruning in the Data Warehousing Guide, I found that the USE_HASH hint achieves the same effect without rewriting the SQL.