Performance to expect from a StorEdge 3510

Hi all,

Really need some advice regarding a 3510FC array. We currently have a dual-controller setup with a RAID array + JBOD expansion. Each array has 10 drives, which we've set up as two RAID5 LDs (8 data, 1 parity + 1 hot spare). The array firmware is 4.15.

Performance isn't good. We can't seem to get anything better than 65MB/s (based upon "mkfile 2g testfile"). Whilst I appreciate this is far from a scientific test, our HDS AMS200 performs this little task in 18 secs, as opposed to 32 secs on the 3510.
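
For reference, the throughput implied by those two timings can be worked out directly. This is only a minimal sketch: it assumes that mkfile's "2g" means 2 GiB (2 x 1024 MiB), and the function name is purely illustrative.

```python
def throughput_mib_s(size_gib: float, seconds: float) -> float:
    """Average throughput in MiB/s for writing size_gib GiB in `seconds`."""
    return size_gib * 1024 / seconds


# Timings quoted above for "mkfile 2g testfile"
for name, secs in [("3510", 32), ("AMS200", 18)]:
    print(f"{name}: {throughput_mib_s(2, secs):.1f} MiB/s")
```

On those numbers the 3510 is writing at about 64 MiB/s and the AMS200 at roughly 114 MiB/s.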

In the real world, Oracle is running like a dog on it :-(

I'm hoping that it's just our RAID5 setup that is not good, but before I consider the pain of -- backup data, reconfigure LDs, restore data -- I want to know whether the 3510 really should be performing better.

If I re-jig the array with RAID1 Luns, + perhaps sequential for redo logs, should I be able to get some decent throughput from this array?

All help appreciated!

[984 byte] By [TomSimpsona] at [2007-11-26 18:12:54]
# 1

Tom, the first thing I would look into here is the 3510 array controller configuration. Before we get into the array's performance, and the poor results you get from testing any storage system with a single-threaded application like mkfile, I'd like to point out that there is a known issue with the 4.1x firmware which, under certain conditions, can trigger a change in controller configuration that ultimately reduces I/O performance. Your system should be examined by a Sun field service engineer, who will in turn recommend that you upgrade the firmware to the latest revision. Start a service ticket with Sun, upgrade your firmware and then post back. After that I can recommend some performance analysis tools for your storage systems.

Kind regards

M Lennon

m-lennona at 2007-7-9 5:45:51 > top of Java-index,Storage Forums,Storage General Discussion...
# 2

But what's my justification for opening a ticket? The question I'm asking is "is my performance slow?", or is this simply as good as it gets?

I'm hoping for confirmation that 2g in 30 secs is indeed slow performance, and something can be done about it!

That said, already been on to our supplier about getting somebody to come in and look at the config. BTW - I thought 4.15 was the latest firmware - can you give me any guidance about what the issue is, or how we could spot it. I've got a dump of the array config if that helps.

Thx. Tom

TomSimpsona at 2007-7-9 5:45:51
# 3

The justification for opening the ticket is that there are at least two known issues with the 3510 that can reduce I/O performance. Only a field service engineer has the details of the two issues, and it's probably a good idea to eliminate them before making any configuration changes.

The first issue causes the cache to change to write-through while the battery is charging, which in turn reduces I/O performance. This issue was corrected in patch 113723-15 (415F; I think I missed that while reading your initial post).

Of course, RAID 5 does not have good write performance, and the only systems I know of that can be configured with RAID 5 and still perform well are storage systems with a large SP cache. The AMS 200 can have 1 - 4GB of mirrored cache; the 3510 only supports 1GB on each controller.

This brings me to the second issue, which is caused by the array controller configuration. While configured with dual redundant controllers, sequential write performance is reduced; the reduction can be as much as 1/3 of the performance of a system configured with a single controller. A simple way to see whether this issue is affecting your system is to reconfigure the storage system to single controller and run your test again. If there is a significant increase in performance, you will have to look at an alternative configuration. One such configuration is a single-controller 3510 RAID unit mirrored with another 3510 RAID unit. This allows the highest performance available from the 3510s and still provides redundancy and high availability.
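
To put a rough number on the remark that RAID 5 does not write well, here is a minimal sketch of the conventional "write penalty" rule of thumb: a small random write costs about 4 back-end disk I/Os on RAID 5 (read data, read parity, write data, write parity) and 2 on RAID 1 (one write per mirror side). The per-disk IOPS figure and the function name are assumptions for illustration only, not measurements from a 3510, and the model ignores controller cache and full-stripe writes, both of which can hide much of the penalty.

```python
# Conventional back-end I/O cost of one small random host write.
WRITE_PENALTY = {"RAID1": 2, "RAID5": 4}

# Assumed figure for a single FC drive; purely illustrative.
ASSUMED_DISK_IOPS = 150


def random_write_iops(disks: int, iops_per_disk: float, level: str) -> float:
    """Rough upper bound on host random-write IOPS, ignoring cache effects."""
    return disks * iops_per_disk / WRITE_PENALTY[level]


# Same number of spindles, different RAID level.
for level in ("RAID5", "RAID1"):
    print(level, random_write_iops(8, ASSUMED_DISK_IOPS, level))
```

With the same spindle count, the rule of thumb gives RAID 1 about twice the small random-write rate of RAID 5, which is why a large write-back cache matters so much for RAID 5.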

m-lennona at 2007-7-9 5:45:51
# 4
Cheers. We are at 415F firmware revision. Given that we have a dual-controller RAID+JBOD config, could we test the dual controller theory by simply turning off the cache synchronisation? or would that force it into write-through?
TomSimpsona at 2007-7-9 5:45:51
# 5

I'll have to do some research to come up with a reasonable configuration to improve performance. The dual array configuration will not perform as well on a production system as I first thought: in a clustered single-controller array configuration you need to disable write-back cache to avoid the risk of data corruption. I used this configuration to test the outright performance of the storage system, but that was only a lab system. Can you post your current cache configuration?

m-lennona at 2007-7-9 5:45:51
# 6

Here's our config ...

Sun StorEdge 3000 Family CLI

Copyright 2002-2005 Dot Hill Systems Corporation.

All rights reserved. Use is subject to license terms.

sccli version 2.3.0

built 2006.03.15.09.49

build 12 for solaris-sparc

* inquiry-data

Vendor: SUN

Product: StorEdge 3510

Revision: 415F

Peripheral Device Type: 0x0

NVRAM Defaults: 415F 3510 S470F

Bootrecord version: 1.31H

Serial Number: 0A7805

Page 80 Serial Number: 0A78055F4B554305

Page 83 Logical Unit Device ID: 600C0FF0000000000A78055F4B554305

Page 83 Target Device ID: 206000C0FF0A7805

IP Address:

Page D0 Fibre Channel Address: 05 (id 255)

Page D0 Node Name: 206000C0FF0A7805

Page D0 Port Name: 256000C0FFCA7805

Ethernet Address: 00:C0:FF:0A:78:05

Device Type: Primary

unique-identifier: A7805

controller-name: "R2 3510"

* network-parameters

ip-address:

netmask: 255.255.255.224

gateway:

mode: static

* host-parameters

max-luns-per-id: 32

queue-depth: 1024

fibre-connection-mode: point-to-point

inband-management: enabled

* drive-parameters

spin-up: disabled

disk-access-delay: 15s

scsi-io-timeout: 30s

queue-depth: 32

polling-interval: 30s

enclosure-polling-interval: 30s

auto-detect-swap-interval: disabled

smart: detect-clone-replace

auto-global-spare: disabled

* redundant-controller-configuration

Redundant Controller Configuration: primary

Cache Synchronization: enabled

Host Channel Failover Mode:shared

Local/Remote Redundant Mode:local

Write-Through Data Synchronization: disabled

Secondary RS-232 Port Status:disabled

Communication Channel Type:Fibre

* redundancy-mode

Primary controller serial number: 8104488

Primary controller location: Lower

Redundancy mode: Active-Active

Redundancy status: Enabled

Secondary controller serial number: 8103724

* cache-parameters

mode: write-back

optimization: random

sync-period: disabled

current-global-write-policy: write-back

* RS232-configuration

COM1 speed: 9600bps

* channels

Ch  Type     Media  Speed  Width   PID / SID
--------------------------------------------
0   Host     FC(P)  2G     Serial  40 / N/A
1   Host     FC(P)  N/A    Serial  N/A / 42
2   DRV+RCC  FC(L)  2G     Serial  14 / 15
3   DRV+RCC  FC(L)  2G     Serial  14 / 15
4   Host     FC(P)  2G     Serial  44 / N/A
5   Host     FC(P)  N/A    Serial  N/A / 46
6   Host     LAN    N/A    Serial  N/A / N/A

* disks

Ch    Id  Size      Speed  LD      Status    IDs                         Rev
----------------------------------------------------------------------------
2(3)  0   279.40GB  200MB  ld1     ONLINE    SEAGATE ST330000FSUN300G    055A
          S/N 38528TGW
          WWNN 2000001862363F9C
2(3)  1   279.40GB  200MB  ld0     ONLINE    SEAGATE ST330000FSUN300G    055A
          S/N 0751KPTT
          WWNN 20000014C38511B6
2(3)  2   279.40GB  200MB  GLOBAL  STAND-BY  SEAGATE ST330000FSUN300G    055A
          S/N 38528FH8
          WWNN 2000001862364997
2(3)  3   279.40GB  200MB  ld0     ONLINE    SEAGATE ST330000FSUN300G    055A
          S/N 385263B5
          WWNN 20000018623649A3
2(3)  4   279.40GB  200MB  ld0     ONLINE    SEAGATE ST330000FSUN300G    055A
          S/N 38528TS7
          WWNN 20000018623647DD
2(3)  5   279.40GB  200MB  ld0     ONLINE    SEAGATE ST330000FSUN300G    055A
          S/N 3852GAAM
          WWNN 20000018623644A8
2(3)  6   279.40GB  200MB  ld0     ONLINE    SEAGATE ST330000FSUN300G    055A
          S/N 38525B67
          WWNN 200000186236463E
2(3)  7   279.40GB  200MB  ld0     ONLINE    SEAGATE ST330000FSUN300G    055A
          S/N 385293LK
          WWNN 20000018623647D4
2(3)  8   279.40GB  200MB  ld0     ONLINE    SEAGATE ST330000FSUN300G    055A
          S/N 38528E2E
          WWNN 2000001862364A4A
2(3)  9   279.40GB  200MB  ld0     ONLINE    SEAGATE ST330000FSUN300G    055A
          S/N 3852FPF2
          WWNN 2000001862364137
2(3)  10  279.40GB  200MB  ld0     ONLINE    SEAGATE ST330000FSUN300G    055A
          S/N 38525BZE
          WWNN 200000186236438E
2(3)  11  279.40GB  200MB  ld1     ONLINE    SEAGATE ST330000FSUN300G    055A
          S/N 3852F9C1
          WWNN 2000001862364519
2(3)  12  279.40GB  200MB  ld1     ONLINE    SEAGATE ST330000FSUN300G    055A
          S/N 22526XFP
          WWNN 20000014C3D8B5CD
2(3)  13  279.40GB  200MB  ld1     ONLINE    SEAGATE ST330000FSUN300G    055A
          S/N 225280ZR
          WWNN 20000014C3D8B9F4
2(3)  14  279.40GB  200MB  ld1     ONLINE    SEAGATE ST330000FSUN300G    055A
          S/N 38527GP1
          WWNN 2000001862364001
2(3)  15  279.40GB  200MB  ld1     ONLINE    FUJITSU MAW3300FCSUN300G    1303
          S/N 000629D01NGJ
          WWNN 500000E0126502D0
2(3)  16  279.40GB  200MB  NONE    FRMT      FUJITSU MAW3300FCSUN300G    1303
          S/N 000629D01NG9
          WWNN 500000E0126501A0
2(3)  17  279.40GB  200MB  ld1     ONLINE    FUJITSU MAW3300FCSUN300G    1303
          S/N 000629D01NG0
          WWNN 500000E0126500F0
2(3)  18  279.40GB  200MB  ld1     ONLINE    FUJITSU MAW3300FCSUN300G    1303
          S/N 000629D01NGH
          WWNN 500000E0126502B0
2(3)  19  279.40GB  200MB  ld1     ONLINE    FUJITSU MAW3300FCSUN300G    1303
          S/N 000629D01NGE
          WWNN 500000E012650210

* logical-drives

LD   LD-ID     Size    Assigned   Type   Disks  Spare  Failed  Status
---------------------------------------------------------------------
ld0  5F4B5543  2.18TB  Primary    RAID5  9      1      0       Good
     Write-Policy Default  StripeSize 32KB
ld1  07190DF2  2.18TB  Secondary  RAID5  9      1      0       Good
     Write-Policy Default  StripeSize 32KB

* logical-volumes

* partitions

LD/LV   ID-Partition  Size
----------------------------
ld0-00  5F4B5543-00   5.00GB
ld0-01  5F4B5543-01   5.00GB
ld0-02  5F4B5543-02   5.00GB
ld0-03  5F4B5543-03   45.00GB
ld0-04  5F4B5543-04   45.00GB
ld0-05  5F4B5543-05   300.00GB
ld0-06  5F4B5543-06   1.79TB
ld1-00  07190DF2-00   5.00GB
ld1-01  07190DF2-01   5.00GB
ld1-02  07190DF2-02   5.00GB
ld1-03  07190DF2-03   515.00GB
ld1-04  07190DF2-04   500.00GB
ld1-05  07190DF2-05   180.00GB
ld1-06  07190DF2-06   1023.17GB

* lun-maps

Ch  Tgt  LUN  ld/lv  ID-Partition  Assigned   Filter Map
---------------------------------------------------------------------------
0   40   0    ld0    5F4B5543-00   Primary    210000E08B929600 {tvlp-node-n01p01}
0   40   0    ld0    5F4B5543-00   Primary    210000E08B92B831 {tvlp-node-n02p01}
0   40   1    ld0    5F4B5543-01   Primary    210000E08B927C31 {tvlp-node-n03p01}
0   40   1    ld0    5F4B5543-01   Primary    210000E08B925A02 {tvlp-node-n04p01}
0   40   2    ld0    5F4B5543-02   Primary    210000E08B927C31 {tvlp-node-n03p01}
0   40   2    ld0    5F4B5543-02   Primary    210000E08B925A02 {tvlp-node-n04p01}
0   40   3    ld0    5F4B5543-03   Primary    210000E08B927C31 {tvlp-node-n03p01}
0   40   3    ld0    5F4B5543-03   Primary    210000E08B925A02 {tvlp-node-n04p01}
0   40   4    ld0    5F4B5543-04   Primary    210000E08B927C31 {tvlp-node-n03p01}
0   40   4    ld0    5F4B5543-04   Primary    210000E08B925A02 {tvlp-node-n04p01}
0   40   5    ld0    5F4B5543-05   Primary    210000E08B927C31 {tvlp-node-n03p01}
0   40   5    ld0    5F4B5543-05   Primary    210000E08B925A02 {tvlp-node-n04p01}
1   42   0    ld1    07190DF2-00   Secondary  210000E08B920100 {tvlp-node-n05p01}
1   42   1    ld1    07190DF2-01   Secondary  210000E08B920100 {tvlp-node-n05p01}
1   42   2    ld1    07190DF2-02   Secondary  210000E08B920100 {tvlp-node-n05p01}
1   42   3    ld1    07190DF2-03   Secondary  210000E08B920100 {tvlp-node-n05p01}
1   42   4    ld1    07190DF2-04   Secondary  210000E08B920100 {tvlp-node-n05p01}
1   42   5    ld1    07190DF2-05   Secondary  210000E08B920100 {tvlp-node-n05p01}
4   44   0    ld0    5F4B5543-00   Primary    210000E08B92BC01 {tvlp-node-n01p02}
4   44   0    ld0    5F4B5543-00   Primary    210000E08B92DB02 {tvlp-node-n02p02}
4   44   1    ld0    5F4B5543-01   Primary    210000E08B91FDFF {tvlp-node-n04p02}
4   44   1    ld0    5F4B5543-01   Primary    210000E08B924A03 {tvlp-node-n03p02}
4   44   2    ld0    5F4B5543-02   Primary    210000E08B924A03 {tvlp-node-n03p02}
4   44   2    ld0    5F4B5543-02   Primary    210000E08B91FDFF {tvlp-node-n04p02}
4   44   3    ld0    5F4B5543-03   Primary    210000E08B924A03 {tvlp-node-n03p02}
4   44   3    ld0    5F4B5543-03   Primary    210000E08B91FDFF {tvlp-node-n04p02}
4   44   4    ld0    5F4B5543-04   Primary    210000E08B924A03 {tvlp-node-n03p02}
4   44   4    ld0    5F4B5543-04   Primary    210000E08B91FDFF {tvlp-node-n04p02}
4   44   5    ld0    5F4B5543-05   Primary    210000E08B924A03 {tvlp-node-n03p02}
4   44   5    ld0    5F4B5543-05   Primary    210000E08B91FDFF {tvlp-node-n04p02}
5   46   0    ld1    07190DF2-00   Secondary  210000E08B91EDFF {tvlp-node-n05p02}
5   46   1    ld1    07190DF2-01   Secondary  210000E08B91EDFF {tvlp-node-n05p02}
5   46   2    ld1    07190DF2-02   Secondary  210000E08B91EDFF {tvlp-node-n05p02}
5   46   3    ld1    07190DF2-03   Secondary  210000E08B91EDFF {tvlp-node-n05p02}
5   46   4    ld1    07190DF2-04   Secondary  210000E08B91EDFF {tvlp-node-n05p02}
5   46   5    ld1    07190DF2-05   Secondary  210000E08B91EDFF {tvlp-node-n05p02}

* protocol

Identifier  Status    Port   Parameters
--------------------------------------------------------
telnet      enabled   23     inactivity-timeout=disabled
http        enabled   80     n/a
https       disabled  n/a    n/a
ftp         enabled   21     n/a
ssh         enabled   22     n/a
priagent    enabled   58632  n/a
snmp        disabled  n/a    n/a
dhcp        enabled   68     n/a
ping        enabled   n/a    n/a

* auto-write-through-trigger

controller-failure: enabled

battery-backup-failure: enabled

ups-ac-power-loss: disabled

power-supply-failure: enabled

fan-failure: enabled

temperature-exceeded-delay: 30min

* peripheral-device-status

Item                               Value    status
--------------------------------------------------------------
CPU Temp Sensor(primary)           52.50C   within safety range
Board1 Temp Sensor(primary)        55.50C   within safety range
Board2 Temp Sensor(primary)        64.00C   within safety range
+3.3V Value(primary)               3.352V   within safety range
+5V Value(primary)                 5.019V   within safety range
+12V Value(primary)                12.199V  within safety range
Battery-Backup Battery(primary)    00       Hardware:OK
CPU Temp Sensor(secondary)         52.50C   within safety range
Board1 Temp Sensor(secondary)      57.50C   within safety range
Board2 Temp Sensor(secondary)      62.00C   within safety range
+3.3V Value(secondary)             3.384V   within safety range
+5V Value(secondary)               5.099V   within safety range
+12V Value(secondary)              12.381V  within safety range
Battery-Backup Battery(secondary)  00       Hardware:OK

* enclosure-status

Ch Id Chassis Vendor/Product IDRev PLD WWNN WWPN

-

2 124 0A7805 SUN StorEdge 3510F A 1080 1000 204000C0FF0A7805 214000C0FF0A7805

Topology: loop(a) Status:OK

3 124 0A7805 SUN StorEdge 3510F A 1080 1000 204000C0FF0A7805 224000C0FF0A7805

Topology: loop(b) Status:OK

Enclosure Component Status:

Type      Unit  Status  FRU P/N   FRU S/N  Add'l Data
------------------------------------------------------------
Fan       0     OK      371-0108  GK0XC2   --
Fan       1     OK      371-0108  GK0XC2   --
Fan       2     OK      371-0108  GK0XC5   --
Fan       3     OK      371-0108  GK0XC5   --
PS        0     OK      371-0108  GK0XC2   --
PS        1     OK      371-0108  GK0XC5   --
Temp      0     OK      371-0531  0A7805   temp=30
Temp      1     OK      371-0531  0A7805   temp=28
Temp      2     OK      371-0531  0A7805   temp=31
Temp      3     OK      371-0531  0A7805   temp=30
Temp      4     OK      371-0531  0A7805   temp=31
Temp      5     OK      371-0531  0A7805   temp=30
Temp      6     OK      371-0532  HL12LM   temp=37
Temp      7     OK      371-0532  HL12LM   temp=41
Temp      8     OK      371-0532  HL12QD   temp=37
Temp      9     OK      371-0532  HL12QD   temp=38
Temp      10    OK      371-0108  GK0XC2   temp=30
Temp      11    OK      371-0108  GK0XC5   temp=25
Voltage   0     OK      371-0108  GK0XC2   voltage=5.110V
Voltage   1     OK      371-0108  GK0XC2   voltage=11.750V
Voltage   2     OK      371-0108  GK0XC5   voltage=5.020V
Voltage   3     OK      371-0108  GK0XC5   voltage=11.520V
Voltage   4     OK      371-0532  HL12LM   voltage=2.480V
Voltage   5     OK      371-0532  HL12LM   voltage=3.250V
Voltage   6     OK      371-0532  HL12LM   voltage=5.000V
Voltage   7     OK      371-0532  HL12LM   voltage=12.120V
Voltage   8     OK      371-0532  HL12QD   voltage=2.500V
Voltage   9     OK      371-0532  HL12QD   voltage=3.300V
Voltage   10    OK      371-0532  HL12QD   voltage=5.050V
Voltage   11    OK      371-0532  HL12QD   voltage=12.240V
DiskSlot  0     OK      371-0531  0A7805   addr=0,led=off
DiskSlot  1     Absent  371-0531  0A7805   addr=1,led=off
DiskSlot  2     Absent  371-0531  0A7805   addr=2,led=off
DiskSlot  3     OK      371-0531  0A7805   addr=3,led=off
DiskSlot  4     OK      371-0531  0A7805   addr=4,led=off
DiskSlot  5     OK      371-0531  0A7805   addr=5,led=off
DiskSlot  6     OK      371-0531  0A7805   addr=6,led=off
DiskSlot  7     OK      371-0531  0A7805   addr=7,led=off
DiskSlot  8     OK      371-0531  0A7805   addr=8,led=off
DiskSlot  9     OK      371-0531  0A7805   addr=9,led=off
DiskSlot  10    OK      371-0531  0A7805   addr=10,led=off
DiskSlot  11    OK      371-0531  0A7805   addr=11,led=off

* SES

Ch Id Chassis Vendor/Product IDRev PLD WWNN WWPN

-

2 124 0A7805 SUN StorEdge 3510F A1080 1000 204000C0FF0A7805 214000C0FF0A7805

Topology: loop(a)

3 124 0A7805 SUN StorEdge 3510F A1080 1000 204000C0FF0A7805 224000C0FF0A7805

Topology: loop(b)

* port-WWNs

Ch  Id  WWPN
--------------------------
0   40  216000C0FF8A7805
1   42  226000C0FFAA7805
4   44  256000C0FFCA7805
5   46  266000C0FFEA7805

* inter-controller-link

inter-controller-link upper channel 0: connected

inter-controller-link lower channel 0: connected

inter-controller-link upper channel 1: connected

inter-controller-link lower channel 1: connected

inter-controller-link upper channel 4: connected

inter-controller-link lower channel 4: connected

inter-controller-link upper channel 5: connected

inter-controller-link lower channel 5: connected

* battery-status

Upper Battery Type: 1

Upper Battery Manufacturing Date: Fri Jun 16 00:00:00 2006

Upper Battery Placed In Service: Tue Aug 8 09:27:53 2006

Upper Battery Expiration Date:Thu Aug 7 09:27:53 2008

Upper Battery Expiration Status: OK

Lower Battery Type: 1

Lower Battery Manufacturing Date: Fri Jun 16 00:00:00 2006

Lower Battery Placed In Service: Tue Aug 8 09:27:52 2006

Lower Battery Expiration Date:Thu Aug 7 09:27:52 2008

Lower Battery Expiration Status: OK

Upper Battery Hardware Status:OK

Lower Battery Hardware Status:OK

* sata-router

no sata routers found

* sata-mux

0 mux boards found

* host-wwn-names

Host-ID/WWN      Name

--

210000E08B91EDFF tvlp-node-n05p02

210000E08B91FDFF tvlp-node-n04p02

210000E08B925A02 tvlp-node-n04p01

210000E08B927C31 tvlp-node-n03p01

210000E08B924A03 tvlp-node-n03p02

210000E08B92DB02 tvlp-node-n02p02

210000E08B92BC01 tvlp-node-n01p02

210000E08B92B831 tvlp-node-n02p01

210000E08B920100 tvlp-node-n05p01

210000E08B929600 tvlp-node-n01p01

* FRUs

7 FRUs found in chassis SN#0A7805 at ch 2 id 124

Name: FC_CHASSIS_BKPLN

Description: SE3510 FC Chassis/backplane

Part Number: 371-0531

Serial Number: 0A7805

Revision: 01

Initial Hardware Dash Level: 01

FRU Shortname:

Manufacturing Date: Sun Jul 16 05:25:36 2006

Manufacturing Location: Suzhou,China

Manufacturer JEDEC ID: 0x0301

FRU Location: FC MIDPLANE SLOT

Chassis Serial Number: 0A7805

FRU Status: OK

Name: FC_RAID_IOM

Description: SE3510 I/O w/SES RAID FC 2U

Part Number: 371-0532

Serial Number: HL12LM

Revision: 01

Initial Hardware Dash Level: 01

FRU Shortname:

Manufacturing Date: Wed Jul 5 12:11:49 2006

Manufacturing Location: Suzhou,China

Manufacturer JEDEC ID: 0x0301

FRU Location: UPPER FC RAID IOM SLOT

Chassis Serial Number: 0A7805

FRU Status: OK

Name: BATTERY_BOARD

Description: SE351X Hot Swap Battery Module

Part Number: 371-0539

Serial Number: GP15BJ

Revision: 01

Initial Hardware Dash Level: 01

FRU Shortname:

Manufacturing Date: Thu Jul 6 03:25:30 2006

Manufacturing Location: Suzhou,China

Manufacturer JEDEC ID: 0x0301

FRU Location: UPPER BATTERY BOARD SLOT

Chassis Serial Number: 0A7805

FRU Status: OK

Name: AC_POWER_SUPPLY

Description: SE3XXX AC PWR SUPPLY/FAN, 2U

Part Number: 371-0108

Serial Number: GK0XC5

Revision: 01

Initial Hardware Dash Level: 01

FRU Shortname:

Manufacturing Date: Mon May 22 08:45:16 2006

Manufacturing Location: Irvine California, USA

Manufacturer JEDEC ID: 0x048F

FRU Location: RIGHT AC PSU SLOT #1 (RIGHT)

Chassis Serial Number: 0A7805

FRU Status: OK

Name: AC_POWER_SUPPLY

Description: SE3XXX AC PWR SUPPLY/FAN, 2U

Part Number: 371-0108

Serial Number: GK0XC2

Revision: 01

Initial Hardware Dash Level: 01

FRU Shortname:

Manufacturing Date: Mon May 22 08:53:42 2006

Manufacturing Location: Irvine California, USA

Manufacturer JEDEC ID: 0x048F

FRU Location: AC PSU SLOT #0 (LEFT)

Chassis Serial Number: 0A7805

FRU Status: OK

Name: FC_RAID_IOM

Description: SE3510 I/O w/SES RAID FC 2U

Part Number: 371-0532

Serial Number: HL12QD

Revision: 01

Initial Hardware Dash Level: 01

FRU Shortname:

Manufacturing Date: Wed Jul 5 15:16:42 2006

Manufacturing Location: Suzhou,China

Manufacturer JEDEC ID: 0x0301

FRU Location: LOWER FC RAID IOM SLOT

Chassis Serial Number: 0A7805

FRU Status: OK

Name: BATTERY_BOARD

Description: SE351X Hot Swap Battery Module

Part Number: 371-0539

Serial Number: GP15GW

Revision: 01

Initial Hardware Dash Level: 01

FRU Shortname:

Manufacturing Date: Thu Jul 6 05:26:03 2006

Manufacturing Location: Suzhou,China

Manufacturer JEDEC ID: 0x0301

FRU Location: LOWER BATTERY BOARD SLOT

Chassis Serial Number: 0A7805

FRU Status: OK

* access-mode

access-mode: inband

* controller-date

Boot time: Fri Jan 5 10:20:15 2007

Current time : Thu Feb 8 10:25:36 2007

Time Zone: GMT

* disk-array

init-verify: disabled

rebuild-verify: disabled

normal-verify: disabled

rebuild-priority: normal

TomSimpsona at 2007-7-9 5:45:51
# 7
Is the system currently in production?
m-lennona at 2007-7-9 5:45:51
# 8
It's on the cusp of going into production, but initial performance on data loads is too restrictive. Basically, it's a cluster housing 3 oracle 10g DB's. We need to get it performing substantially better ASAP. Or I suspect I may be getting my CV up to date! :-(
TomSimpsona at 2007-7-9 5:45:51
# 9

I will post some general performance parameters for Oracle 9i ( I haven't worked on a 10G solution yet ) later this evening. You could also convert the system to single controller configuration using the following guide:

http://www.sun.com/products-n-solutions/hardware/docs/html/817-3337-16/07_mainstorage.html#pgfId-1000062

If you perform your tests with write back cache enabled you can prove that there is an inherent performance trade off while the system is configured with dual controllers. With entry level systems there can be severe performance overheads while configured for maximum availability.

Are you responsible for choosing the 3510 for your current application? I advise anyone choosing, upgrading or implementing a storage system to first generate an I/O profile for the application. Armed with this data you can more easily select a storage system suited to the application. For example, an Oracle database that performs a high amount of write activity would perform very slowly on a RAID 5 entry-level storage system. There is also an issue with placing redo logs on a storage subsystem; as far as I remember, Oracle recommends that the server's internal storage is used. I'll do some more research and post back.

m-lennona at 2007-7-9 5:45:51
# 10

oh yes, I chose the array whilst under pressure about costs, resilience etc. On top of that, the storage requirements rose drastically post purchase, hence the R5 setup.

To make matters more fun, we had no test data on which to base the decision, it was a total "finger in the air" exercise.

Hey ho, these things are sent to test us - really appreciate all your help BTW.

Tom

TomSimpsona at 2007-7-9 5:45:51
# 11

One other parameter worth looking at while you are in the testing stage is the "optimization: random" cache setting; it may be helpful to change this parameter to sequential. I think under the circumstances you will have to come up with a different RAID configuration. I only see RAID 5 as a decent solution when the storage system is high end with a large data cache (e.g. 9900 series).

m-lennona at 2007-7-9 5:45:51
# 12
Sorry - I haven't had time to update this thread, it'll have to be next week.
m-lennona at 2007-7-9 5:45:51
# 13

Hi - out of curiosity, we tried just switching off one of the controllers to avoid the redundant controller performance hit, and sure enough, as you predicted, the performance increased by 33%. We were all happiness and light.

Then we re-enabled the controller but turned off cache synchronisation, assuming that we would see the same performance. However, performance died again!

I assumed that "no cache sync" sort of equalled "non redundant controller", but obviously not! - any idea what gives?

TIA. Tom

TomSimpsona at 2007-7-9 5:45:51
# 14

We have a similar setup with a 3510, dual RAID controllers, RAID5 logical drive. Firmware on the array is at 413C but it sounds like it would be wise to update now. We are running Oracle 9i and the performance is not great and I'm looking to improve it.

I'm just wondering if the 1/3 reduction problem is also an issue with a RAID 0 + 1 type of configuration? Is it specific to RAID 5?

I'm going to start another thread on a related 3510 topic to separate things a bit.

Thanks,

Dennis

dennis.jonesa at 2007-7-9 5:45:51
# 15

Hey Tom,

In your scenario, is this set up as a raw device or is there a file system involved? Just curious.

Thanks,

Dennis

dennis.jonesa at 2007-7-21 17:18:24
# 16

Dennis,

It would appear that the 1/3 speed loss has nothing to do with the underlying RAID setup. It's purely down to the setup of the controllers. Running on a single controller (by disabling the secondary) gave us the speed increase (of 33%). That was in write-back mode, however. We never got the chance to try single controller with write-through.

With the secondary brought back online, we tried disabling cache-sync, but could only run with write through mode. You can't enable write-back without cache sync. In this config, we lost the 33% speed boost.

Basically, it seems that the only config you can have to gain the 33% is to have two single-controller arrays, mirrored at the server in software.

We tried RAID5 .v. RAID0 and it made no difference whatsoever.
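
A side note on the percentages in this thread: a "33% increase" going from dual to single controller and a "1/3 reduction" going the other way are not the same size of effect. The small sketch below converts between the two; the function names are illustrative only, and the only input taken from the thread is the 33% figure.

```python
def reduction_from_increase(increase: float) -> float:
    """Fractional slowdown of the slower config, given the faster one's gain over it."""
    return increase / (1 + increase)


def increase_from_reduction(reduction: float) -> float:
    """Fractional gain of the faster config, given the slower one's loss against it."""
    return reduction / (1 - reduction)


# Single controller measured 33% faster than dual controller.
print(f"dual is {reduction_from_increase(0.33):.0%} slower than single")

# A true 1/3 reduction would mean single is this much faster than dual.
print(f"single would be {increase_from_reduction(1 / 3):.0%} faster than dual")
```

So a measured 33% gain corresponds to the dual-controller setup running about 25% slower than single controller, a little less than the "as much as 1/3" worst case quoted earlier.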

TomSimpsona at 2007-7-21 17:18:24
# 17
filesystem. (latest and greatest Solaris 10 + Cluster 3.1)
TomSimpsona at 2007-7-21 17:18:24
# 18

>

> Basically, it seems that the only config you can have

> to gain the 33% is to have two single-controller

> arrays, mirrored at the server in software.

>

> We tried RAID5 .v. RAID0 and it made no difference

> whatsoever.

Man this is a bummer. You don't expect to have to take that kind of a penalty with hw RAID.

Thanks.

dennis.jonesa at 2007-7-21 17:18:24
# 19

I cannot elaborate on the actual issue further, it's not even covered on the known issues section of the product release notes, so there is no recommended workaround:

http://www.sun.com/products-n-solutions/hardware/docs/html/817-6597-18/relnotes.html#pgfId-1108418

This is not even an issue as such, it is merely a performance overhead from the array's dual controller configuration to provide maximum data availability. The single controller configuration I talked about earlier is an option I was looking at to squeeze the maximum performance out of the array, however, this configuration is not supported with SC or by Sun ( supported with different cache setting ). The type of configuration I'm suggesting is typical with older hardware solutions like the T3, where the administrator would combine hardware RAID and Volume Manager. This isn't a bad configuration method though, because you can use VM to balance I/O over the controllers and can be handy to manage ( old school admins might appreciate that ).

I would also like to say that ALL storage manufacturers publish performance figures for hardware products that are the absolute maximum obtainable, usually calculated by measuring I/O to data that resides in the cache. Once you start using a single-threaded application like dd or mkfile, you will find the test writes data synchronously to the controller, in other words 1 I/O at a time, hence the poor performance results while testing with these tools. I consider Oracle RDBMS to need a midrange to high-end storage system, and though the 3510 is only entry level, it should be able to provide better performance than you will see from dd or mkfile.

There are tools available that can measure performance more accurately, for example vdbench and SWAT (if needed, you can include a request for these services when you start the Sun service ticket). At this point I would recommend that you either upgrade the storage system or begin to put the application into production and optimize the various solution layers to achieve the best performance possible. At least this way, if you find that Oracle is underperforming while in production, you can upgrade the storage system later on in the project; this will guarantee a performance boost as long as you select a storage system based on realistic figures.

I can provide further performance tips that may point you in the right direction if it helps, but we've got to forget about using mkfile completely!
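
To illustrate the difference between one writer and several, here is a rough sketch of a concurrent write test. It is a toy for showing the idea, not a substitute for vdbench or SWAT: the function names, file names, chunk size and sizes are all assumptions, and the temporary directory would need to be replaced with a directory on the array's filesystem to say anything about the array itself.

```python
import os
import tempfile
import threading
import time


def write_file(path: str, total_bytes: int, chunk_size: int) -> None:
    """Write total_bytes to path in chunk_size pieces, then force it to disk."""
    chunk = b"\0" * chunk_size
    with open(path, "wb") as f:
        remaining = total_bytes
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(chunk[:n])
            remaining -= n
        f.flush()
        os.fsync(f.fileno())


def timed_parallel_write(directory: str, writers: int, bytes_per_writer: int,
                         chunk_size: int = 8192) -> float:
    """Run `writers` concurrent writers; return aggregate throughput in MiB/s."""
    paths = [os.path.join(directory, f"testfile.{i}") for i in range(writers)]
    threads = [
        threading.Thread(target=write_file, args=(p, bytes_per_writer, chunk_size))
        for p in paths
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return writers * bytes_per_writer / (1024 * 1024) / elapsed


if __name__ == "__main__":
    # Assumption: a temp dir is used here; point this at the array to test it.
    with tempfile.TemporaryDirectory() as d:
        for n in (1, 8):
            rate = timed_parallel_write(d, n, 4 * 1024 * 1024)
            print(f"{n} writer(s): {rate:.1f} MiB/s")
```

Comparing the 1-writer and 8-writer figures on the same LUN gives a first impression of how much the array gains from concurrency, which a single mkfile run cannot show.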

m-lennona at 2007-7-21 17:18:24
# 20

We are looking at a 6140 config instead (with 4GB cache). Would you consider this to be a viable alternative in our scenario?

Or to put it another way, all things being equal, should I expect significantly better performance from the 6140 compared to the 3510?

Tom

TomSimpsona at 2007-7-21 17:18:24
# 21

>

> This is not even an issue as such, it is merely a

> performance overhead from the array's dual controller

> configuration to provide maximum data availability.

Does the 6140 suffer from the same issue, that dual redundant controllers cause a performance hit?

TomSimpsona at 2007-7-21 17:18:24
# 22

There is no record of performance related issues with the 6140 when configured with dual controllers. There are a number of bugs that you may want to look at:

http://www.sun.com/products-n-solutions/hardware/docs/html/819-7299-12/6140_Release_Notes_final.html#pgfId-1040886

I would consider this platform ideal for use with small to medium sized Oracle database, but I reiterate my suggestion on the performance analysis method you used on the 3510.

I mentioned some documentation earlier, but it is for the 6130 array, so it's not applicable to the 6140 system. Contact your sales channel and ask them to provide you with a detailed overview of the system and its placement in regard to the 3510.

m-lennona at 2007-7-21 17:18:24
# 23

Tom,

Just as a comparison, I've repeated your 2G mkfile test on my 3510 (dual RAID ctl). I realize it's not scientific. The firmware is at 413C which I know needs an update. The logical drive in question is 3 x 73GB (10K) drives mirrored to another 3 drives. (RAID 1) This is Solaris 9, ufs, forcedirectio.

So I didn't expect to see the same performance as you because of fewer drives but I wasn't anywhere close to 32seconds. I came in a little under 7 minutes which concerns me.

Things I noticed in looking more closely:

My LD stripe size is 128K and the cache optimization is sequential. Since this LD is used for Oracle DB files - OLTP, I'm assuming that the optimization should be set to random with a corresponding LD stripe size of 32K. Our Oracle DB block size is 8K.

I'm considering making some changes here and I thought I'd get your feedback before doing so. Specifically, redo the array with optimization random, recreate the LDs using 32K stripe size.

Thanks,

Dennis

dennis.jonesa at 2007-7-21 17:18:24
# 24

Sorry for slow reply - been kinda snowed under! - see inline....

> Just as a comparison, I've repeated your 2G mkfile
> test on my 3510 (dual RAID ctl). I realize it's not
> scientific. The firmware is at 413C which I know
> needs an update. The logical drive in question is 3 x
> 73GB (10K) drives mirrored to another 3 drives. (RAID
> 1) This is Solaris 9, ufs, forcedirectio.

For reference - we were using Solaris 10, ufs, (no forcedirectio). Firmware 413F

> So I didn't expect to see the same performance as you
> because of fewer drives but I wasn't anywhere close
> to 32seconds. I came in a little under 7 minutes
> which concerns me.

HOLY <INSERT FAVOURITE WORD HERE> !!!!!

It would concern me too! For reference, I tried the same test on my G4 Mac mini at home and it took 1 min 14 secs. If a 3510 performs more than 5 times slower than the internal drive of a small home Mac, you have to believe that things really aren't right.
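As a rough sanity check on the timings quoted in this thread, the sketch below converts them to throughput. It assumes `mkfile 2g` writes exactly 2 GiB and rounds "a little under 7 minutes" up to 420 seconds; the helper name `throughput_mb_s` is made up for illustration.

```python
# Convert the mkfile timings quoted in this thread into MB/s.
# Assumption: "mkfile 2g" writes exactly 2 GiB (2 * 1024**3 bytes).

def throughput_mb_s(size_bytes, seconds):
    """Return throughput in MiB per second."""
    return size_bytes / seconds / (1024 * 1024)

SIZE = 2 * 1024**3

# Elapsed seconds as reported by the posters.
runs = {
    "3510 RAID5 (Tom)": 32,
    "HDS AMS200 (Tom)": 18,
    "G4 Mac mini (Tom)": 74,            # 1 min 14 secs
    "3510 RAID1 (Dennis)": 7 * 60,      # "a little under 7 minutes", rounded up
}

for name, secs in runs.items():
    print(f"{name}: {throughput_mb_s(SIZE, secs):.1f} MB/s")

# How much slower Dennis's 3510 run was than the Mac mini run.
print(f"ratio: {runs['3510 RAID1 (Dennis)'] / runs['G4 Mac mini (Tom)']:.1f}x")
```

On these figures the 7 minute run works out to under 5 MB/s, which is far enough below the other results to suggest a configuration problem and not just a difference in drive count.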

Realistically, I suspect that Oracle stats will likely show that your DB is IO bound....

> Things I noticed in looking more closely:
>
> My LD stripe size is 128K and the cache optimization
> is sequential. Since this LD is used for Oracle DB
> files - OLTP, I'm assuming that the optimization
> should be set to random with a corresponding LD
> stripe size of 32K. Our Oracle DB block size is 8K.

Assuming OLTP workloads, I believe random optimisation would make more sense, and I think it is the recommended option for OLTP.

> I'm considering making some changes here and I
> thought I'd get your feedback before doing so.
> Specifically, redo the array with optimization
> random, recreate the LDs using 32K stripe size.

I'm no expert on Oracle design, but the conclusion we came to with the help of an Oracle consultant was that the DB should be on random, and the redo logs on sequential. Since you can't have both on a 3510, random seems to be the best compromise. That said, it is important to physically separate certain things onto different drives. Our plan was to create two LUNs from separate physical drives. LUN1 would have DB+undo+redo1, LUN2 would have indices+temp+redo2.

> Thanks,
>
> Dennis

TomSimpsona at 2007-7-21 17:18:24
# 25

I would consider the test you are performing completely useless. mkfile is the worst tool for testing the performance of a storage system. A single-threaded tool like mkfile sends I/O to the storage system one chunk at a time. So if it takes 8 ms to write a given amount of data using a single-threaded tool, it could take as little as 1 ms to write the same amount of data using a tool that supports 8 threads. The 3510 will only perform how you ask it to perform; in other words, if you tell it to write data one chunk at a time using mkfile, it will do it. If you ask it to write data one chunk at a time on each of 8 concurrent threads, it will do that too! Oracle is not a single-threaded application, and therefore you cannot replicate an Oracle I/O profile using mkfile or dd, so don't use them. Furthermore, Mac OS and Solaris use different filesystems.

Both of you are experiencing what is called a storage system bottleneck. It can have any number of causes, e.g. storage system configuration, filesystem configuration, kernel configuration or application configuration. The correct way to determine the cause of the bottleneck is to build an application I/O profile. Only once you, the administrator, understand how the application is using the storage system can you take corrective action to resolve the issue.
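To illustrate the single-threaded versus multi-threaded point, here is a minimal Python sketch that writes the same total amount of data using a configurable number of concurrent writer threads and times it. It is only an illustration under stated assumptions: the function names, the `iotest_N.dat` file names and the 8K chunk size are made up for this example, it does not reproduce an Oracle I/O profile, and it is not a substitute for a proper load generator.

```python
import os
import tempfile
import threading
import time

def write_worker(path, nbytes, chunk_size):
    """Write nbytes of zeros to path in chunk_size pieces, then fsync."""
    chunk = b"\0" * chunk_size
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        remaining = nbytes
        while remaining > 0:
            # os.write may write fewer bytes than requested, so track the count.
            written = os.write(fd, chunk[:min(chunk_size, remaining)])
            remaining -= written
        os.fsync(fd)  # force the data out of the OS cache before timing stops
    finally:
        os.close(fd)

def timed_write(directory, total_bytes, threads, chunk_size=8192):
    """Split total_bytes across `threads` files written concurrently.

    Returns (elapsed_seconds, bytes_written). Test files are removed afterwards.
    """
    per_thread = total_bytes // threads
    paths = [os.path.join(directory, f"iotest_{i}.dat") for i in range(threads)]
    workers = [
        threading.Thread(target=write_worker, args=(p, per_thread, chunk_size))
        for p in paths
    ]
    start = time.perf_counter()
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    elapsed = time.perf_counter() - start
    written = sum(os.path.getsize(p) for p in paths)
    for p in paths:
        os.remove(p)
    return elapsed, written

if __name__ == "__main__":
    # Small demo size; point `d` at a directory on the array for a real run.
    with tempfile.TemporaryDirectory() as d:
        for n in (1, 8):
            secs, size = timed_write(d, 16 * 1024 * 1024, n)
            print(f"{n} thread(s): {size} bytes in {secs:.3f} s")
```

Comparing the 1 thread and 8 thread timings on the same LUN gives a first hint of how much the array benefits from concurrent I/O, which a single mkfile run cannot show.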

m-lennona at 2007-7-21 17:18:24
# 26

I understand what you're saying regarding mkfile but given that we are both using similar hardware I would expect to see similar performance even with a single threaded application unless there is a major difference in the way we have things configured. I understand that it won't show me all that the 3510 is capable of. So in that sense I think there is some value to the comparison between his 3510 and mine using mkfile.

I have examined the I/O profile of our application. This is a database that has the typical random I/O of an OLTP system but is also trying to support batch processes with a lot of sequential read/writes. Currently, the batch processes are a problem. Can we get the best of both worlds or at least an improvement on the batch performance? Maybe, and it sounds like my current 3510 configuration is an issue.

Thank you again for your input.

dennis.jonesa at 2007-7-21 17:18:24