3510 performance (again)
Hi,
I read some other posts about 3510 performance, but most of them focused on MB/s rather than IOPS. I have a very strange issue with one of our databases on a 3510: DBWR (the Oracle process responsible for flushing the buffer cache) checkpoints take ages. After some investigation I discovered that on average one 8 KB write takes 17 ms with no ongoing reads, and this can grow to 20 ms with ongoing reads. I am not an expert in this area, but I would expect no more than 10 ms in the worst case (100 IOPS/disk).
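The rule of thumb used above (10 ms per I/O corresponds to 100 IOPS per disk) is just the reciprocal of the service time. A minimal Python sketch, assuming strictly one I/O in flight per disk and no cache in the path; the function name is illustrative only:

```python
def iops_from_service_time(service_time_ms: float) -> float:
    """IOPS one disk can sustain when it handles one I/O at a time."""
    if service_time_ms <= 0:
        raise ValueError("service time must be positive")
    return 1000.0 / service_time_ms


# Expected worst case (10 ms) and the two write times observed on the array.
for ms in (10, 17, 20):
    print(f"{ms} ms per write -> {iops_from_service_time(ms):.0f} IOPS")
```

For the 17 ms measured here this works out to roughly 59 IOPS per outstanding write, well below the 100 IOPS I expected.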
1. Firmware version is 413C (probably bad):
sccli>show inquiry-data
Vendor: SUN
Product: StorEdge 3510
Revision: 413C
Peripheral Device Type: 0x0
NVRAM Defaults: 413C 3510 S442F
2. There is an ongoing media check on the disks (is it automatic?):
sccli> show media-check
Ch ID Iteration Status
2 0 89 13% complete
2 1 89 34% complete
2 2 89 20% complete
...
3. One PLD is at a different version than the others:
sccli> show ses-devices
Ch Id Chassis Vendor/Product IDRev PLD WWNN WWPN
-
2 12 003663 SUN StorEdge 3510F A1080 1000 204000C0FF003663 214000C0FF003663
...
3 28 084B7E SUN StorEdge 3510F D1080 A000* 205000C0FF084B7E 225000C0FF084B7E
...
Topology: loop(b)
3 44 00375A SUN StorEdge 3510F D1080 1000 205000C0FF00375A 225000C0FF00375A
4. There are some battery charging events in show events, and show battery-status reports errors:
sccli> show battery-status
sccli: Upper Battery: error: in service date not set in the battery
sccli: Lower Battery: error: in service date not set in the battery
5. The database uses 4 logical drives in RAID 1 (+0) with 10, 10, 14 and 2 disks
and a stripe size of 256 KB.
6. Cache is set up in write-back mode:
sccli>show cache-parameters
mode: write-back
optimization: sequential
sync-period: disabled
current-global-write-policy: write-through
7. Solaris 9, VxFS/VxVM with DMP, Oracle 9.2
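For item 5, a rough upper bound on uncached random-write throughput can be estimated from the disk counts. This is only a sketch under assumptions that were not measured here: each disk sustains about 100 IOPS (the figure used earlier in this post), and every write in RAID 1+0 lands on both disks of a mirror pair, so a logical drive with n disks has n/2 independent pairs.

```python
DISK_IOPS = 100  # assumed per-disk figure, not a measurement


def raid10_write_iops(disks: int, disk_iops: int = DISK_IOPS) -> int:
    """Upper bound on random-write IOPS for a RAID 1+0 logical drive."""
    if disks < 2 or disks % 2 != 0:
        raise ValueError("RAID 1+0 needs an even number of disks (at least 2)")
    # Each mirror pair completes one logical write per pair of physical writes.
    return (disks // 2) * disk_iops


logical_drives = [10, 10, 14, 2]  # disk counts from item 5
for n in logical_drives:
    print(f"{n:2d} disks -> about {raid10_write_iops(n)} write IOPS")

total = sum(raid10_write_iops(n) for n in logical_drives)
print(f"all logical drives -> about {total} write IOPS")
```

The estimate ignores controller cache, stripe alignment and concurrent reads, so real numbers can differ in either direction.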
Can you help me with these questions:
* Is this configuration correct?
* Can reconfiguring the storage/disks really help?
* What are your real-life IOPS and service times?
Thanks
Maniek
# 1
Hi,
I switched the cache policy at the logical-drive level from default to write-back, and now the average write takes less than 4 ms.
I found somewhere in the docs that current-global-write-policy can be set via
"set cache-parameters write-back". I ran this command, but I still get the same result:
sccli> show cache-parameters
mode: write-back
optimization: sequential
sync-period: disabled
current-global-write-policy: write-through
Could this be a problem with sccli version 2.1 or with firmware 413C, or should I do this a different way?
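The mismatch in that output is easy to miss by eye, so here is a small Python sketch that checks for it. It only parses the "key: value" text pasted above; it does not talk to sccli. Reading "mode" as the configured setting and "current-global-write-policy" as what the controller currently applies is my interpretation, not something confirmed by the documentation.

```python
def parse_key_values(text: str) -> dict:
    """Turn 'key: value' lines into a dict, skipping lines without a colon."""
    params = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            params[key.strip()] = value.strip()
    return params


# Output of "show cache-parameters" as pasted in this thread.
SAMPLE = """\
mode: write-back
optimization: sequential
sync-period: disabled
current-global-write-policy: write-through
"""

params = parse_key_values(SAMPLE)
configured = params.get("mode")
effective = params.get("current-global-write-policy")
mismatch = configured != effective

if mismatch:
    print(f"configured {configured}, but the array reports {effective}")
else:
    print(f"cache policy is consistently {configured}")
```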
BTW: How is mirroring done in the 3510? Is it done serially, I mean a write to the first disk followed by a write to the second disk? The 16 ms with write-through would suggest that it is serial rather than parallel.
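The reasoning behind that guess can be written down as a simple latency model. This is only a sketch: the 8 ms per-disk write time is an assumed value, chosen because two of them add up to the observed 16 ms, and the model says nothing about how the 3510 firmware actually schedules mirror writes.

```python
def mirrored_write_ms(disk_a_ms: float, disk_b_ms: float, serial: bool) -> float:
    """Time until a mirrored write is acknowledged in write-through mode."""
    if serial:
        # Second copy starts only after the first one has finished.
        return disk_a_ms + disk_b_ms
    # Both copies are issued together; the slower disk decides.
    return max(disk_a_ms, disk_b_ms)


PER_DISK_MS = 8.0  # assumed, not measured

print("serial:  ", mirrored_write_ms(PER_DISK_MS, PER_DISK_MS, serial=True), "ms")
print("parallel:", mirrored_write_ms(PER_DISK_MS, PER_DISK_MS, serial=False), "ms")
```

If the mirror writes were parallel, 16 ms would instead imply that a single disk write already takes about 16 ms, so the measurement alone cannot tell the two cases apart.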
Regards,
Maniek
# 2
I won't pretend to be a performance guy, but I did notice a couple of things in the cli outputs you provided.
With regard to the media checks...
sccli> show media-check
Ch ID Iteration Status
2 0 89 13% complete
2 1 89 34% complete
2 2 89 20% complete
You may want to check the priority level. You may see an improvement if you set the priority to "Low".
413C is old. The current code is 415G. I suggest you review the release notes, as I believe there have been improvements in cache handling. Also, CLI 2.1 is limited compared to 2.3; I've found additional commands in 2.3.
sccli> show battery-status
sccli: Upper Battery: error: in service date not set in the battery
sccli: Lower Battery: error: in service date not set in the battery
Not that this would impact performance, but it appears that the battery install date was not set. Sun info doc # 83023 outlines a procedure to correct this.
Sun info doc # 71182 does a nice job explaining how the physical disks are mirrored in the array.
You didn't mention whether this is a single- or dual-controller array. By default, write cache is turned off (write-through) in single-controller configurations. If you do have a single-controller configuration, there is a workaround, but understand that it comes with some risk. Take a look at the auto write-through triggers (seen below). These parameters are trigger events that protect data by turning off write cache should certain events occur. One of these events is a controller failure. In a single-controller config, the "controller-failure" trigger prohibits global cache from being turned on. You can change this to "disabled", but understand that there will be no cache mirroring and the risk of data loss increases.
* auto-write-through-trigger
controller-failure: enabled
battery-backup-failure: enabled
ups-ac-power-loss: disabled
power-supply-failure: enabled
fan-failure: enabled
temperature-exceeded-delay: 30min
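To see at a glance which of these triggers are armed, the listing above can be filtered with a few lines of Python. This is just a sketch over the pasted text, not an sccli feature; the helper name is made up for illustration.

```python
# Trigger settings as listed above.
TRIGGERS = """\
controller-failure: enabled
battery-backup-failure: enabled
ups-ac-power-loss: disabled
power-supply-failure: enabled
fan-failure: enabled
temperature-exceeded-delay: 30min
"""


def enabled_triggers(text: str) -> list:
    """Return the names of triggers whose value is exactly 'enabled'."""
    enabled = []
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip() == "enabled":
            enabled.append(name.strip())
    return enabled


active = enabled_triggers(TRIGGERS)
print("armed triggers:", ", ".join(active))
if "controller-failure" in active:
    print("controller-failure is armed; relevant on a single-controller array")
```

The temperature delay is not an on/off value, so the filter leaves it out on purpose.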
Hope some of this helps......
flogia at 2007-7-12 21:32:18 >
