Tuesday, May 26, 2009

Oracle Streams Performance Tests

Oracle Streams Performance Tests

Have used the following parameters for capture and apply processes for Streams stress tests. This POC was done using only one-way streams test setup.

Capture Parameters
===============

PARALLELISM =
_CHECKPOINT_FORCE = Y
_SGA_SIZE = 40M
_CHECKPOINT_FREQUENCY = 495


* Capture checkpoint retention is set to 1 day.
* Propagation Latency is set to 1 sec.


Apply Parameters
=============

PARALLELISM =
DISABLE_ON_ERROR = N
TXN_LCR_SPILL_THRESHOLD = 100,000
_TXN_BUFFER_SIZE = 2
_DYNAMIC_STMTS = Y
_CMPKEY_ONLY = Y
_RESTRICT_ALL_REF_CONS = N
_HASH_TABLE_SIZE = 10000000


Server Parameters on both databases source and destination:
===========================================

streams_pool_size
= 4G
_job_queue_interval = 1

Test 1 - Load 1m rows with 500 row commit:

Have loaded 1m rows into a test table that is being replicated by Streams with commit for every 500 rows. Have used 8 degree parallelism on apply process and the table and primary key index on destination defined with over 16 initrans on destination to avoid contention.

Findings:

  • Oracle Streams took 9 minutes to replicate all changes. An average throughput of 111k messages per minute was achieved during this test.
  • Capture process on source waited for 1.8 mins on 'Paused for Flow control' event due to apply server bottleneck.
  • No messages were spilled to disk during this test.

Test 2 - Update 100k rows in one transaction:

Bulk-Updated 100k rows in a test table in one transaction. This test was performed using apply parameter TXN_LCR_SPILL_THRESHOLD set to little over 100k.

Findings:

  • Oracle Streams took 4 minutes to replicate all changes.
  • Capture process on source waited for 0.17 mins on 'Paused for Flow control' event due to apply server bottleneck.
  • No message were spilled to disk during this test.

Test 3 - Update 300k rows in one transaction:

Bulk-Updated 300k rows in a test table in one transaction. This test was performed using apply parameter TXN_LCR_SPILL_THRESHOLD set to little over 100k.

Findings:

  • Oracle Streams took 21 minutes to replicate all changes.
  • Capture process on source waited for 8 mins on 'Paused for Flow control' event due to apply server bottleneck.
  • No message were spilled to disk during this test.
  • Streams used upto 50% of allocated 4GB streams pool on source and destination.

Test 4 - Update 300k rows in one transaction:

Bulk-Updated 300k rows in a test table in one transaction. This test was performed using apply parameter TXN_LCR_SPILL_THRESHOLD set to little over 500k.

Findings:

  • Oracle Streams took 9 minutes to replicate all changes.
  • Capture process on source waited for 1.05 mins on 'Paused for Flow control' event due to apply server bottleneck.
  • Spilled 104K message to disk on source and 80k messages on destination during this test. This spill occurred due to fixed 5 minute LCR lifespan time in buffered queue restriction.
  • Streams used upto 50% of allocated 4GB streams pool on source and destination.

Findings

  • As you have noticed from above tests, Oracle Streams performed quite well when replicate small transactions.
  • But, with medium to large transactions, Oracle Streams performance was average to fair. As you have seen from test 3, Oracle streams took 21 mins to replicate 300k changes. Since the TXN_LCR_SPILL_THRESHOLD apply parameter was set to 100k and the Streams was propagating messages in 100k batches and storing them in a staging table sys.streams$_apply_spill_msgs_part at destination. And, then it started applying those messages from staging table after all the messages are propagated. This process was affecting Streams performance and as result it took 21 minutes to replicate changes in test 3. The similar test from test 4, but with txn_lcr_spill_threshold apply parameter set to little over 500k took only 9 minutes to replicate the same transaction. That is over 50% improvement. But, the downside from test 4 is that it generated significant amount of redo on source and destination due to disk spills and that was occurred due to fixed short LCR lifespan time in buffered queue.
  • The TXN_LCR_SPILL_THRESHOLD apply parameter is basically a safeguard that prevents messages being spilled to disk especially for very large txns that takes over 5 minute to replicate. Setting this value to infinite may have significant downside such as excessive redo on source and destination and that may even affect Streams performance when there is short of space in streams_pool. One has to test it properly and set this parameter value to optimal number to prevent disk spills and excessive redo and at the same time to get acceptable performance.
  • Set the apply parameter "_DYNAMIC_STMTS=Y" to improve performance of frequent updates. One can enable apply process(s) to trace (10046 is sufficient) to see the difference between the sqls being issued by apply processes with and without this parameter setting.

No comments:

Post a Comment