Hardware error on disk

Hello,

Three weeks ago I noticed the following message on a server:

Oct 3 05:22:16 m530e scsi: [ID 107833 kern.warning] WARNING: /pci@1c,600000/scsi@2/sd@2,0 (sd2):

Oct 3 05:22:16 m530e Error for Command: write(10) Error Level: Retryable

Oct 3 05:22:16 m530e scsi: [ID 107833 kern.notice] Requested Block: 7961728 Error Block: 7961728

Oct 3 05:22:16 m530e scsi: [ID 107833 kern.notice] Vendor: SEAGATE Serial Number: 053632G895

Oct 3 05:22:16 m530e scsi: [ID 107833 kern.notice] Sense Key: Hardware Error

Oct 3 05:22:16 m530e scsi: [ID 107833 kern.notice] ASC: 0x44 (internal target failure), ASCQ: 0x0, FRU: 0xb

The error has not repeated. The SCSI address corresponds to a submirror, but metastat shows everything is OK.

What happened? Is there a way to test whether the disk is OK? What about the block number in the message? Maybe it's just a bad block that has been marked as such (?)

Any thoughts are greatly appreciated.

[974 byte] By [BillyP] at [2007-11-26 10:53:25]
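
For anyone who wants to pull the sense data out of a message like this in a script, here is a minimal Python sketch. It only assumes the field labels seen in the lines quoted above (Sense Key, ASC, ASCQ, FRU); the helper name parse_sense is made up for illustration, and other drivers or releases may format the message differently.

```python
import re

# Two lines copied from the message quoted in the post (spacing restored).
SAMPLE_LOG = """\
Oct 3 05:22:16 m530e scsi: [ID 107833 kern.notice] Sense Key: Hardware Error
Oct 3 05:22:16 m530e scsi: [ID 107833 kern.notice] ASC: 0x44 (internal target failure), ASCQ: 0x0, FRU: 0xb
"""

# Matches the "ASC: 0x.. (text), ASCQ: 0x.., FRU: 0x.." layout seen above.
CODES = re.compile(
    r"ASC:\s*(0x[0-9a-fA-F]+)\s*\(([^)]*)\),\s*"
    r"ASCQ:\s*(0x[0-9a-fA-F]+),\s*FRU:\s*(0x[0-9a-fA-F]+)"
)


def parse_sense(log_text):
    """Return the sense fields found in the text (hypothetical helper)."""
    result = {}
    key = re.search(r"Sense Key:\s*(.+)", log_text)
    if key:
        result["sense_key"] = key.group(1).strip()
    codes = CODES.search(log_text)
    if codes:
        result["asc"] = int(codes.group(1), 16)
        result["asc_text"] = codes.group(2)
        result["ascq"] = int(codes.group(3), 16)
        result["fru"] = int(codes.group(4), 16)
    return result


if __name__ == "__main__":
    print(parse_sense(SAMPLE_LOG))
```

The point of keeping the sense key separate from the error level is that the same message can be "Retryable" and still carry a "Hardware Error" sense key, which is exactly the situation being asked about.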
# 1

That's a soft error; they're not too serious.

Do an iostat -En | grep -i soft.

That will show you hard and soft errors. I wouldn't worry about it unless you're getting hard errors

or you're getting a lot of soft errors. Lots of soft errors might indicate SCSI bus cabling issues etc.

robertcohen at 2007-7-7 3:06:22 > top of Java-index,Solaris Operating System,Solaris Essentials - General Technical Questions...
# 2
But the sense key indicates a hardware error. The actual output from iostat -En:

c1t2d0 Soft Errors: 0 Hard Errors: 1 Transport Errors: 0
Vendor: SEAGATE Product: ST373207LSUN72G Revision: 045A Serial No: xxxxxxxxxx

So it is a hardware error, right?
BillyP at 2007-7-7 3:06:22
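
Rather than grepping by eye, the per-device counters can be pulled out of iostat -En output with a few lines of Python. This is only a sketch: it assumes the "Soft Errors / Hard Errors / Transport Errors" line layout quoted in the post above, and parse_error_counters is a made-up helper name, not part of any Solaris tool.

```python
import re

# Output quoted in the post above; real `iostat -En` output has more lines per device.
SAMPLE = """\
c1t2d0 Soft Errors: 0 Hard Errors: 1 Transport Errors: 0
Vendor: SEAGATE Product: ST373207LSUN72G Revision: 045A Serial No: xxxxxxxxxx
"""

# One match per device summary line: name followed by the three counters.
ERROR_LINE = re.compile(
    r"^(\S+)\s+Soft Errors:\s*(\d+)\s+Hard Errors:\s*(\d+)"
    r"\s+Transport Errors:\s*(\d+)",
    re.MULTILINE,
)


def parse_error_counters(text):
    """Map device name -> soft/hard/transport counts (hypothetical helper)."""
    counters = {}
    for dev, soft, hard, transport in ERROR_LINE.findall(text):
        counters[dev] = {
            "soft": int(soft),
            "hard": int(hard),
            "transport": int(transport),
        }
    return counters


if __name__ == "__main__":
    for dev, counts in parse_error_counters(SAMPLE).items():
        print(dev, counts)
```

In practice the text would come from running iostat -En on the server and feeding its output to the function; the sample string is used here so the sketch runs anywhere.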
# 3

One single, solitary hard error?

It's not serious. Don't fret over it until you see hundreds of errors on the same block.

There are no perfect disk drives. Never have been, and there won't be any for the foreseeable future.

If you consider this a critical concern, then unmount the filesystem

and use format to go in and mark the block "bad".

You can do that with SCSI and FCAL disks, but cannot do that on IDE drives.

The man page for format can give you guidance on how to do that.

rukbat at 2007-7-7 3:06:22
# 4

The term soft doesn't mean software.

Both hard and soft errors are being reported by the disk.

A soft error is a retryable error. It could mean a write failed, so the data was written to a spare block instead.

A hard error is more serious.

The counters reset when the machine is rebooted.

But if they keep appearing, it's worth reporting or replacing the disk.

I don't think I'd wait for hundreds. A recurring hard error is a problem.

But I don't normally muck around with marking blocks bad. Modern disks should do that automatically.

robertcohen at 2007-7-7 3:06:22
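
To tell a one-off from a recurring hard error, one option is to record the hard error counts periodically and compare snapshots. The Python sketch below assumes you already have the counts as device -> number dictionaries; new_hard_errors and the extra device name in the demo are hypothetical. Because the counters reset at reboot, a count lower than the previous one is treated as a reset.

```python
def new_hard_errors(previous, current):
    """Return {device: increase} for devices whose hard error count went up.

    Both arguments map device name -> hard error count. A count lower than
    the previous one is treated as a counter reset (for example after a
    reboot), so the whole current count is reported as new.
    """
    increases = {}
    for dev, now in current.items():
        before = previous.get(dev, 0)
        delta = now if now < before else now - before
        if delta > 0:
            increases[dev] = delta
    return increases


if __name__ == "__main__":
    # c1t2d0 is the disk from this thread; c1t3d0 is made up for the demo.
    last_week = {"c1t2d0": 1, "c1t3d0": 0}
    today = {"c1t2d0": 3, "c1t3d0": 0}
    print(new_hard_errors(last_week, today))
```

A disk that keeps showing up in the result across several snapshots is the recurring case worth reporting; a count that stays at 1 is not.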
# 5
It's not a serious problem. While reading or writing data on a particular block, the block may be busy, so it's showing a sense key error.
Saran
Saran_India at 2007-7-7 3:06:22