exadata IO etiketine sahip kayıtlar gösteriliyor. Tüm kayıtları göster
exadata IO etiketine sahip kayıtlar gösteriliyor. Tüm kayıtları göster

9 Mayıs 2018 Çarşamba

Exadata Disk Scrubbing Problem

Problem:
--------

We hit a problem previous weeks related Exadata storage. In fact not related to Exadata but the impact is big.

We are using Attunity Replicate for a CDC tool and Attunity Replicate CDC method is different from goldengate.

As you know Goldengate uses the stream for reading redo logs and archives. But Attunity Replicate is different.

(I will use shortcut AR for Attunity replicate)

AR is reading online redo logs and archive logs with Oracle DBMS package so it may be affected by disk IO or anything else.


Solution:
-----------
Another story that;

Disk scrubbing is a new feature that introduced in Oracle 11.2.0.4 and Exadata 11.2.3.3.0 storage.

Disk scrubbing checks the disks for the corruption risks.

In default configuration,checks every 2 weeks.(hardDiskScrubInterval=biweekly)

Let me show;

It is the default configuration

CellCLI> list cell attributes name,hardDiskScrubInterval
         x6celadm05   biweekly

Till this step, everything is ok, Oracle protects our disks and data. But every cells start scrubbing same time, you may have IO problem.

Our database has a lot of transactions, but we don't see any impact of this scrubbing, till AR went slow done(latency is growing) while scrubbing is continues.

After we found the scrubbing information from cell alert logs, we decide to start scrubbing different times. Then we set a different start time
for every cell.

For example;

One cell disk scrubbing takes 44 hours and we have 7 cells. So I didn't change the interval(biweekly). After Cell 1 disk scrubbing is finished,
cell 2 will start, after cell 2 is finished cell 3 will start etc. So in this method, only one cell will do scrubbing in a meantime.

dcli -g cell_group  -l root cellcli -e "list cell attributes name,hardDiskScrubInterval,hardDiskScrubStartTime"
x6celadm07: x6celadm07     biweekly        2018-04-22T05:00:00+03:00
x6celadm06: x6celadm06     biweekly        2018-04-20T08:00:00+03:00
x6celadm05: x6celadm05     biweekly        2018-04-18T11:00:00+03:00
x6celadm04: x6celadm04     biweekly        2018-04-16T14:00:00+03:00
x6celadm03: x6celadm03     biweekly        2018-04-14T17:00:00+03:00
x6celadm02: x6celadm02     biweekly        2018-04-12T20:00:00+03:00
x6celadm01: x6celadm01     biweekly        2018-04-10T23:00:00+03:00

After this implementation, AR(attunnity replicate) latencies was going down.