A1000 on a v240 with Solaris 9 problem

I recently started testing a v240 as a replacement for an older 280R. I moved the SCSI controller over to the v240 and hooked up the A1000 from the 280R. On the v240, I installed RAID Manager 6.22.1 along with the latest NVSRAM and firmware. All the diagnostics seem okay, and the drives work as expected. However, when running Oracle, I start seeing lots of error messages in /var/adm/messages:

<table border="0" align="center" width="90%" cellpadding="3" cellspacing="1"><tr><td class="SmallText"><b>Quote:</b></td></tr><tr><td class="quote">

Oct 13 07:56:41 mike4 scsi: [ID 107833 kern.warning] WARNING: /pci@1e,600000/scsi@2,1/sd@5,1 (sd145):

Oct 13 07:56:41 mike4 disk not responding to selection

Oct 13 07:56:41 mike4 rdriver: [ID 486355 kern.notice] ID[RAIDarray.rdriver.4003] The Array driver is returning an Errored I/O, with errno 5, on cl24_002, Lun 1, sector 1840

Oct 13 07:56:48 mike4 scsi: [ID 107833 kern.warning] WARNING: /pci@1e,600000/scsi@2,1/sd@5,1 (sd145):

Oct 13 07:56:48 mike4 Error for Command: read(10) Error Level: Retryable

Oct 13 07:56:48 mike4 scsi: [ID 107833 kern.notice] Requested Block: 19382288 Error Block: 19382288

Oct 13 07:56:48 mike4 scsi: [ID 107833 kern.notice] Vendor: Symbios Serial Number: J-@E

Oct 13 07:56:48 mike4 scsi: [ID 107833 kern.notice] Sense Key: Unit Attention

Oct 13 07:56:48 mike4 scsi: [ID 107833 kern.notice] ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0

Oct 13 07:57:09 mike4 raid: [ID 702911 user.error] AEN event Host=mike4 Ctrl=1T33638989 Dev=c4t5d0

Oct 13 07:57:09 mike4 raid: [ID 702911 user.error] ASC=A0 ASCQ=00 FRU=00 LUN=00 LUN Stat=00

Oct 13 07:57:09 mike4 raid: [ID 702911 user.error] Sense=700006000000009800000000

A0000000000000000000000000000000000000000000800000082C000000 000000000000000000000B0

531543333363338393839202020202020030104000000000000000000000 00000000000000000000000

000000000000010000000000000000000000000000000000000000000000 00000000000000000000000

00000313130313330352F30363033323300000000000000

</td></tr></table>

Below are some details on my setup:

<table border="0" align="center" width="90%" cellpadding="3" cellspacing="1"><tr><td class="SmallText"><b>Quote:</b></td></tr><tr><td class="quote">

# uname -a

SunOS mike4 5.9 Generic_118558-09 sun4u sparc SUNW,Sun-Fire-V240

# /usr/lib/osa/bin/raidutil -c c4t5d0 -i

LUNs found on c4t5d0.

LUN 0 RAID 1 104078 MB

LUN 1 RAID 1 34692 MB

LUN 2 RAID 1 34692 MB

LUN 3 RAID 1 34692 MB

Vendor ID Symbios

ProductID StorEDGE A1000

Product Revision 0003

Boot Level 03.01.04.00

Boot Level Date 04/05/01

Firmware Level 03.01.04.68

Firmware Date 06/22/01

raidutil succeeded!

# /usr/sbin/osa/healthck -a

Health Check Summary Information

cl24_002:Optimal

# pkginfo | grep osa

system SUNWosafw Open Storage Array Firmware

system SUNWosamn Open Storage Array Man Pages

system SUNWosanv Open Storage Array (nvsram)

system SUNWosar Open Storage Array (Root)

system SUNWosau Open Storage Array (Usr)

# pkginfo -l SUNWosafw

PKGINST: SUNWosafw

NAME: Open Storage Array Firmware

CATEGORY: system

ARCH: sparc

VERSION: 06.22,REV=01.54

BASEDIR: /usr

VENDOR: Sun Microsystems, Inc

DESC: Open Storage Array Firmware

PSTAMP: miran20010820150047

INSTDATE: Oct 11 2005 16:05

VSTOCK: 06.22.01.54

HOTLINE: Please contact your local service provider

STATUS: completely installed

FILES: 14 installed pathnames

3 shared pathnames

3 directories

15264 blocks used (approx)

</td></tr></table>

Any ideas what might be wrong? Thanks!

[4576 byte] By [mdemicco] at [2007-11-25 23:00:30]
# 1

The errored IO number 5 refers to the read(10) problem you had on LUN 1. Is this happening for LUN 0?

You may want to run:

/usr/lib/osa/bin/logutil > logutil.out

and look for ASC/ASCQ codes around the time of the issue. You can cross reference these codes in your /usr/lib/osa/raidcode.txt file.
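To get a quick list of which ASC/ASCQ pairs are actually showing up, something like the following may help. It is only a sketch: it assumes the `ASC=xx ASCQ=yy` field layout seen in the AEN lines of the original post, the function name `asc_pairs` is made up for illustration, and the layout of the `logutil` output itself is not shown in this thread, so the pattern may need adjusting for that file.

```shell
# asc_pairs FILE: list each distinct ASC/ASCQ pair found in "ASC=xx ASCQ=yy"
# fields, with the number of times it occurs, most frequent first.
# Hypothetical helper; assumes the AEN line format from /var/adm/messages.
asc_pairs() {
    # Pull out the two hex bytes, then count identical pairs.
    sed -n 's/.*ASC=\([0-9A-Fa-f][0-9A-Fa-f]\) ASCQ=\([0-9A-Fa-f][0-9A-Fa-f]\).*/\1 \2/p' "$1" |
        sort | uniq -c | sort -rn |
        awk '{ print $2 "/" $3 ": " $1 }'
}
```

Running `asc_pairs /var/adm/messages` prints one line per pair in the form `ASC/ASCQ: count`. Each pair can then be searched for in raidcode.txt by hand.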

You can also look up the log and get help in the RAID Manager GUI under the Maintenance and Tuning screen for your A1000 module.

The followup ASC/ASCQ code is due to read/write caching being disabled. This can be due to an old battery or some other type of problem. If this seems like too much, open a service call with Sun for more help.
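The raw `Sense=` bytes in the AEN event from the original post carry the same information. Assuming they follow the standard SCSI fixed-format sense layout (response code 0x70 or 0x71), the sense key is the low nibble of byte 2 and the ASC and ASCQ are bytes 12 and 13. A minimal sketch for pulling those out; the function name `decode_sense` is hypothetical, and the hex has to be joined into one string with no spaces or line breaks first:

```shell
# decode_sense HEX: print sense key, ASC and ASCQ from SCSI fixed-format
# sense data given as one contiguous hex string.
# Hypothetical helper, not part of RAID Manager.
decode_sense() {
    hex=$1
    # Byte 0 (chars 1-2) is the response code; 70/71 (or F0/F1 with the
    # valid bit set) indicate fixed-format sense data.
    resp=$(printf '%s' "$hex" | cut -c1-2)
    case $resp in
        70|71|F0|F1|f0|f1) ;;
        *) echo "not fixed-format sense data (response code $resp)" >&2
           return 1 ;;
    esac
    # Byte 2 (chars 5-6): the low nibble is the sense key.
    key=$(printf '%s' "$hex" | cut -c6)
    # Bytes 12 and 13 (chars 25-28) are ASC and ASCQ.
    asc=$(printf '%s' "$hex" | cut -c25-26)
    ascq=$(printf '%s' "$hex" | cut -c27-28)
    echo "sense_key=0x$key asc=0x$asc ascq=0x$ascq"
}
```

With the first 14 bytes from the post, `decode_sense 700006000000009800000000A000` prints `sense_key=0x6 asc=0xA0 ascq=0x00`. Sense key 6 is Unit Attention, and the A0/00 pair matches the `ASC=A0 ASCQ=00` fields on the preceding AEN line.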

snoogins5 at 2007-7-5 17:49:49
# 2

<table border="0" align="center" width="90%" cellpadding="3" cellspacing="1"><tr><td class="SmallText"><b>snoogins5 wrote on Fri, 14 October 2005 15:21</b></td></tr><tr><td class="quote">

The errored IO number 5 refers to the read(10) problem you had on LUN 1. Is this happening for LUN 0?

</td></tr></table>

Yes, the errno 5 appears for all LUNs 0-3. I only pasted a snippet of the error log.
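For reference, errno 5 is EIO, the generic I/O error, so the number by itself says little about the cause. To see how the errors are spread across LUNs, a rough tally along these lines could be used. It is a sketch that assumes the rdriver message wording shown in the first post; `errored_io_by_lun` is a made-up name.

```shell
# errored_io_by_lun FILE: count rdriver "Errored I/O" messages per
# RAID module and LUN. Hypothetical helper; assumes lines of the form
#   "... Errored I/O, with errno 5, on cl24_002, Lun 1, sector 1840"
errored_io_by_lun() {
    grep 'Errored I/O' "$1" |
        sed -n 's/.*errno \([0-9][0-9]*\), on \([^,]*\), Lun \([0-9][0-9]*\).*/\2 lun\3 errno=\1/p' |
        sort | uniq -c |
        awk '{ print $2, $3, $4, "count=" $1 }'
}
```

`errored_io_by_lun /var/adm/messages` prints one line per module/LUN/errno combination, for example `cl24_002 lun1 errno=5 count=2`.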

<table border="0" align="center" width="90%" cellpadding="3" cellspacing="1"><tr><td class="SmallText"><b>Quote:</b></td></tr><tr><td class="quote">

You may want to run:

/usr/lib/osa/bin/logutil > logutil.out

and look for ASC/ASCQ codes around the time of the issue. You can cross reference these codes in your /usr/lib/osa/raidcode.txt file.

The followup ASC/ASCQ code is due to read/write caching being disabled. This can be due to an old battery or some other type of problem. If this seems like too much, open a service call with Sun for more help.

</td></tr></table>

I'll give this a try and see if it tells me anything. I *just* replaced the battery a few days ago, prior to upgrading the software/firmware, so it shouldn't be that.

Thanks!

mdemicco at 2007-7-5 17:49:49