9 Mayıs 2018 Çarşamba

Exadata Disk Scrubbing Problem

Problem:
--------

We hit a problem previous weeks related Exadata storage. In fact not related to Exadata but the impact is big.

We are using Attunity Replicate for a CDC tool and Attunity Replicate CDC method is different from goldengate.

As you know Goldengate uses the stream for reading redo logs and archives. But Attunity Replicate is different.

(I will use shortcut AR for Attunity replicate)

AR is reading online redo logs and archive logs with Oracle DBMS package so it may be affected by disk IO or anything else.


Solution:
-----------
Another story that;

Disk scrubbing is a new feature that introduced in Oracle 11.2.0.4 and Exadata 11.2.3.3.0 storage.

Disk scrubbing checks the disks for the corruption risks.

In default configuration,checks every 2 weeks.(hardDiskScrubInterval=biweekly)

Let me show;

It is the default configuration

CellCLI> list cell attributes name,hardDiskScrubInterval
         x6celadm05   biweekly

Till this step, everything is ok, Oracle protects our disks and data. But every cells start scrubbing same time, you may have IO problem.

Our database has a lot of transactions, but we don't see any impact of this scrubbing, till AR went slow done(latency is growing) while scrubbing is continues.

After we found the scrubbing information from cell alert logs, we decide to start scrubbing different times. Then we set a different start time
for every cell.

For example;

One cell disk scrubbing takes 44 hours and we have 7 cells. So I didn't change the interval(biweekly). After Cell 1 disk scrubbing is finished,
cell 2 will start, after cell 2 is finished cell 3 will start etc. So in this method, only one cell will do scrubbing in a meantime.

dcli -g cell_group  -l root cellcli -e "list cell attributes name,hardDiskScrubInterval,hardDiskScrubStartTime"
x6celadm07: x6celadm07     biweekly        2018-04-22T05:00:00+03:00
x6celadm06: x6celadm06     biweekly        2018-04-20T08:00:00+03:00
x6celadm05: x6celadm05     biweekly        2018-04-18T11:00:00+03:00
x6celadm04: x6celadm04     biweekly        2018-04-16T14:00:00+03:00
x6celadm03: x6celadm03     biweekly        2018-04-14T17:00:00+03:00
x6celadm02: x6celadm02     biweekly        2018-04-12T20:00:00+03:00
x6celadm01: x6celadm01     biweekly        2018-04-10T23:00:00+03:00

After this implementation, AR(attunnity replicate) latencies was going down.


ORA-00979: not a GROUP BY expression

Problem:
-----------
ORA-00979: not a GROUP BY expression


We have upgraded our database from 11.2.0.4 to 12.2.0.1.

After upgrade one sql got "ORA-00979: not a GROUP BY expression" error.

Solution:
-----------

It was similar to Bug 18749211 ORA-979 FROM SELECT WITH COLUMN MASKING VPD AND VIEW MERGING VPD, but we don't use VPN or anything else, only we upgraded the database.

Also, /*+ materialize */ didn't work advice as a workaround in bug document.

Then I found another bug 27170305: ORA-00979 WHEN A CASE STATEMENT IS THERE IN THE "GROUP BY" EXPRESSION 



Workaround:
---------------------

/*+ optimizer_features_enable('11.1.0.6') */


But these hints caused performance problems, we need to apply the patch if it could be done :)

So, if this patch is not available for your version like us, you need to ask to oracle support.

Now, this patch is ready for our system in a one week:)