Home » RDBMS Server » Performance Tuning » Oracle 9i redo logging performance poor in relation to DB2 on Solaris. Please help.
Oracle 9i redo logging performance poor in relation to DB2 on Solaris. Please help. [message #188432] Fri, 18 August 2006 09:52 Go to next message
adamb
Messages: 6
Registered: April 2006
Location: London, UK
Junior Member
Solaris 8, Sun V440, Oracle 9i, DB2 v8.2, RAID 5.

The application writes out 25MB of data as 3 BLOBs, committing after each BLOB. Redo logging is turned on. The application can use Oracle and DB2 for persistence.
Interaction with the db instances is achieved using Oracle’s OCI and DB2’s CLI. The BLOBs are fully prepared in memory and written out using techniques set out in OCI/CLI manuals.

The performance when using Oracle is significantly worse than when using DB2. Timing BLOB write and commit stages produces the following results:
Oracle 25 secs
DB2 3 secs

Switching redo logging off in Oracle drops the time to about 3 secs. Hence the increase must account for logging. There are 3 log files in 3 log groups. Each log file is 100MB.

Iostat trace shows that both DB2 and Oracle write a similar amount to disk (2 and a bit times the 25MB of data). Logging is activated for both DBMS. Importantly, iostat shows that Oracle performs 10 times more disk transfers than DB2.

Here are the things we have tried to improve Oracle performance:

• log_buffer: altering the value between 512K and 16Mb leads to no significant improvements, 2 secs variance at most. Unsurprisingly, larger log_buffer size decreases the write time and increases the commit time. The sum of these two times stays approximately the same.
• The logs were moved off the RAID onto a separate OS managed disk. This has not significantly improved the write time. Moving logs to /tmp cancels the log write cost – as if logging is off.

Things we can try to do but haven’t:

• Put redo log files on to raw partitions on separate disks with separate controllers. Optimise those disks for sequential write activity. This is bound to improve performance but fails to explain why DB2 is performing so much better without special hardware.
• Experiment with LOB buffering. The LOB performance guidelines suggest that writing small parts of a LOB can be improved with buffering. However, our app builds the entire LOB in RAM before writing out to the db in one go. Moreover, writing the LOB writing part appears to be good enough in Oracle.

What else can we do to improve Oracle’s LGWR performance? I am surprised that DB2 is so much better in this relatively common task.

Adam
Re: Oracle 9i redo logging performance poor in relation to DB2 on Solaris. Please help. [message #188510 is a reply to message #188432] Fri, 18 August 2006 21:17 Go to previous messageGo to next message
oracleminded
Messages: 1
Registered: August 2006
Junior Member
Hello,

there could be many reasons for this....

First, I assume you are referring to a logging vs. nologging operation and not archive mode. Is this correct? Is archiving turned on as well...for both?

Are DB2 and Oracle on the same machine or different machines?

Have you reviewed statspack? What are you top wait events and wait times during this period. You must start with statspack to get a reference point for your question as there are many many places to look. Please post.

10046 trace event for the session...look on Metalink for some good info on how to do this as well. Really important to see what that session is doing. Unlike statspack, 10046 events are surgical to the session as opposed to the database. If you don't need bind variable analysis, use the levels associated with wait events. Note - the trace files can be large. Oracle also has a trace analyzer zip file you can download for this(trca.zip...i think). Very useful to get a nice output. Make you practice this on development first, so you get the hang of this. If your unfamiliar with this, open a tar with oracle.


I/O spread
What is this distribution of data files, indexes files, archive logs, redo logs, undo, temp, and the number of disks (including the raid array)? How about DB2? Any relationship and overlap to the Oracle side? Is OS Logical volume striping involved on top of the raid 5 using the "luns" in a VG ? How many disks are in each luns array group on the raid array (I'm assuming a clarion or emc here...but it doesn't have to be).

Note - Raid5 is not the best place for redo sequential...but, some of today's raid 5 can handle it....check with the vendor as some raid arrays have special optimizations for parity storage to reduce writes from the write cache to the underlying disks.

OS
What do your Vmstats look like? Paging, swapping, large amounts of pages being scanned but few freed? How's CPU distribution? Is system CPU much higher than User cpu? What's your distribution of Free, buffer cache, SGA, and System memory. Is there an automation involved where the OS manages it all (like in AIX 5L)

In place Traces:
I know this sounds strange but do you have any traces turned on in init.ora, sqlnet.ora, or listener.ora? They can absolutely wreck i/o if you are not careful and are not using them surgically for specifically targeted situations.


Bottom line...lets start with Statspack.

Re: Oracle 9i redo logging performance poor in relation to DB2 on Solaris. Please help. [message #190359 is a reply to message #188510] Wed, 30 August 2006 06:32 Go to previous message
adamb
Messages: 6
Registered: April 2006
Location: London, UK
Junior Member
oracleminded wrote on Fri, 18 August 2006 21:17

Hello,

there could be many reasons for this....

First, I assume you are referring to a logging vs. nologging operation and not archive mode. Is this correct? Is archiving turned on as well...for both?

Are DB2 and Oracle on the same machine or different machines?

Have you reviewed statspack? What are you top wait events and wait times during this period. You must start with statspack to get a reference point for your question as there are many many places to look. Please post.

10046 trace event for the session...look on Metalink for some good info on how to do this as well. Really important to see what that session is doing. Unlike statspack, 10046 events are surgical to the session as opposed to the database. If you don't need bind variable analysis, use the levels associated with wait events. Note - the trace files can be large. Oracle also has a trace analyzer zip file you can download for this(trca.zip...i think). Very useful to get a nice output. Make you practice this on development first, so you get the hang of this. If your unfamiliar with this, open a tar with oracle.


I/O spread
What is this distribution of data files, indexes files, archive logs, redo logs, undo, temp, and the number of disks (including the raid array)? How about DB2? Any relationship and overlap to the Oracle side? Is OS Logical volume striping involved on top of the raid 5 using the "luns" in a VG ? How many disks are in each luns array group on the raid array (I'm assuming a clarion or emc here...but it doesn't have to be).

Note - Raid5 is not the best place for redo sequential...but, some of today's raid 5 can handle it....check with the vendor as some raid arrays have special optimizations for parity storage to reduce writes from the write cache to the underlying disks.

OS
What do your Vmstats look like? Paging, swapping, large amounts of pages being scanned but few freed? How's CPU distribution? Is system CPU much higher than User cpu? What's your distribution of Free, buffer cache, SGA, and System memory. Is there an automation involved where the OS manages it all (like in AIX 5L)

In place Traces:
I know this sounds strange but do you have any traces turned on in init.ora, sqlnet.ora, or listener.ora? They can absolutely wreck i/o if you are not careful and are not using them surgically for specifically targeted situations.


Bottom line...lets start with Statspack.




I took my while replying as I was away last week. Yes I am referring to [NO]LOGGING. Archiving is off. DB2 and Oracle are on the same machine and the same disks.

To wait events are copied from the statspace output:

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~
Event Waits Time (s) Ela Time
--------------- ------------ ----------- --------
log buffer space 60 24 54.40
log file parallel write 92 7 15.99
log file sync 23 6 13.66
CPU time 6 13.15
ctl file par write 60 1 1.86

vmstat shows nothing unusual. There CPU is not overloaded. There's a lot of disk I/O and as I've written before it's related with LOGGING. NOLOGGING reduces I/O a lot.

Here's the mem allocation for the above statspace snapshots:
SGA regions Size in Bytes
------------------------------ ----------------
Database Buffers 33,554,432
Fixed Size 730,032
Redo Buffers 1,179,648
Variable Size 167,772,160
----------------
sum 203,236,272

See my original post about experimentation with Redo Buffers.

There are no traces turned on. This is a development environment with no load other than the app I describe.

If you can glean anything from above then please let me know.

Adam
Previous Topic: Temp Data files Size
Next Topic: what to do about this one !!!
Goto Forum:
  


Current Time: Fri May 03 06:51:18 CDT 2024