Sun Fire V480 Panic

Hi,

One of my V480R servers keeps resetting with the following error dump. I suspect a h/w fault on a DIMM, but I can't narrow down the exact bad stick. I hope someone can shed some light on this problem.

WARNING: [AFT1] Bus Error (BERR) Event on CPU2 Privileged Data Access at TL=0, errID 0x0000624b.f07ffbb8

AFSR 0x00100800<PRIV,BERR>.00000000 AFAR 0x000007fb.00206200

Fault_PC 0x10031154

panic[cpu2]/thread=2a100045d20: [AFT1] errID 0x0000624b.f07ffbb8 BERR Error(s)

See previous message(s) for details

000002a100044ec0 SUNW,UltraSPARC-III+:cpu_aflt_log+4a4 (2a100044f7e, 1014a358, 1014a330, 0, 2a100045108, 2a100044fcb)

%l0-3: 000002a100045570 000002a1000451c8 0000000000000003 0000000000000010

%l4-7: 0000000000000000 0000000000000000 0000000000000000 000002a10001f910

000002a100045110 SUNW,UltraSPARC-III+:cpu_deferred_error+550 (80000000000, 1, 10080003200000, 1, 2a100045650, 2a1000451c8)

%l0-3: 0000000000000032 0000000000000000 0000000000000000 0000000000000219

%l4-7: 000002a100045740 000000000000009f 0000000000000000 000002a10001f9c0

000002a1000455a0 unix:prom_rtt+0 (30000064408, 30001f8420c, fba3af594c, 2a100045740, 300001b0e28, 2a100045810)

%l0-3: 0000000000000005 0000000000001400 0000009980001604 0000000010140658

%l4-7: 0000000000000000 000002a10004582a 0000000000000006 000002a100045650

000002a1000456f0 ce:ce_mif_read+b0 (30001d4dc90, 1, 1, 2a10004591e, 300001b0e28, 0)

%l0-3: 0000000000000001 0000030001f8420c 0000000000000001 0000030000064408

%l4-7: 0000030001d4dc90 0000030001d4dd38 000000000000001a 0000000000000000

000002a1000457a0 ce:ce_mii_read+24 (30001d4dc90, 1, 1, 2a10004591e, 0, 300001a1040)

%l0-3: 0000030001f8403c 0000030000064408 0000000000000000 00000000fffb3b0e

%l4-7: 0000030000064408 0000030001f7e000 0000030001d4dc90 0000030001f7e010

000002a100045850 ce:ce_mii_check+a8 (30001d4dc90, 794902a100045261, 24000010073f0c, fba3afd1c0, 0, 0)

%l0-3: 0000030001b3c738 000003000005a008 000002a10000fd20 0000000000000024

%l4-7: 0000000000000000 0000000000000001 0000030001d4dd20 000002a10057dba0

000002a100045920 ce:ce_intr+1ec (30001d4dc90, 30000067928, 30001b3c738, 3, 300001a0fa0, 10073f0c)

%l0-3: 0000030000067930 0000000000a4d3ed 0000030001d4dc90 00000300001a0fa0

%l4-7: 000000001bfa1000 0000030000eb3ea8 0000000000000000 0000000000000000

000002a100045a50 pcisch:pci_intr_wrapper+70 (1047f314, 240, 1, 3000019f2d0, 3000019ddd8, 30001d4bde0)

%l0-3: 0000000078064eb8 0000000000000000 0000000000000000 0000030000080338

%l4-7: 00000300000678c8 0000030000eb3ea8 0000000000000000 0000030000eb3ed0

syncing file systems...

panic[cpu2]/thread=2a100045d20: panic sync timeout

dumping to /dev/dsk/c1t0d0s1, offset 429588480

Thanks in advance.

-sonny

[2930 byte] By [sonnyfrans] at [2007-11-26 7:33:43]
# 1
How often is the system resetting (is it in a crash loop)? If you can get the system to level 0, you can change to diagnostic mode, and the system firmware will then make a more detailed analysis of the hardware.
mlennon at 2007-7-6 19:31:04 > top of Java-index,Sun Hardware,Servers - General Discussion...
# 2
The H/W diagnostics show no problem with the hardware. Is there any possibility of identifying which h/w is being addressed by the AFAR: 0x000007fb.00206200? That's the only consistent info I noticed.
sonnyfrans at 2007-7-6 19:31:04
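
A note on the AFAR question in post #2: the dump prints AFSR and AFAR as two dot-separated 32-bit hex words. The sketch below only extracts those fields from the logged line and joins the halves into single 64-bit values, which makes them easier to compare across panics. It assumes the word before the dot is the upper half, and the function name `parse_afsr_afar` is made up for illustration; it is not part of any Sun tool. It does not map the address to a device or a DIMM.

```python
import re

# Pattern for a line shaped like the one in the dump:
#   AFSR 0x00100800<PRIV,BERR>.00000000 AFAR 0x000007fb.00206200
AFSR_AFAR_RE = re.compile(
    r"AFSR 0x([0-9a-fA-F]+)<([^>]*)>\.([0-9a-fA-F]+)\s+"
    r"AFAR 0x([0-9a-fA-F]+)\.([0-9a-fA-F]+)"
)


def parse_afsr_afar(line):
    """Return the AFSR value, its decoded flag names, and the AFAR value.

    Assumption: "hi.lo" means upper 32 bits, then lower 32 bits.
    """
    m = AFSR_AFAR_RE.search(line)
    if m is None:
        raise ValueError("no AFSR/AFAR fields found in line")
    afsr_hi, flags, afsr_lo, afar_hi, afar_lo = m.groups()
    return {
        "afsr": (int(afsr_hi, 16) << 32) | int(afsr_lo, 16),
        "flags": [f for f in flags.split(",") if f],
        "afar": (int(afar_hi, 16) << 32) | int(afar_lo, 16),
    }


if __name__ == "__main__":
    info = parse_afsr_afar(
        "AFSR 0x00100800<PRIV,BERR>.00000000 AFAR 0x000007fb.00206200"
    )
    print(info["flags"], hex(info["afar"]))
```

The flag names come straight from the kernel's own message (`<PRIV,BERR>`), so nothing here decodes AFSR bits independently. When reading the address, keep in mind that the call stack in the dump shows it was accessed from `ce:ce_mif_read`, called via `pcisch:pci_intr_wrapper`.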
# 3

I'll try to help you. The information on the problem I suspect here is Sun internal, so I can't point you to it. Can you post the output of the following commands:

# /usr/platform/sun4u/sbin/prtdiag ( post CPU output only )

# uname -a

Also the event from /var/adm/messages

mlennon at 2007-7-6 19:31:04
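
For gathering "the event from /var/adm/messages" requested in post #3, one possible approach is to group log lines by the errID they mention, so that the warning and the matching panic line end up together. This is a minimal sketch; the helper name `group_by_errid` is hypothetical, and the pattern is based only on the two errID lines visible in the dump at the top of this thread.

```python
import re

# errID as printed in the dump, e.g. "errID 0x0000624b.f07ffbb8"
ERRID_RE = re.compile(r"errID (0x[0-9a-fA-F]+\.[0-9a-fA-F]+)")


def group_by_errid(lines):
    """Group log lines by the errID they contain, keeping file order."""
    events = {}
    for line in lines:
        m = ERRID_RE.search(line)
        if m:
            events.setdefault(m.group(1), []).append(line.rstrip("\n"))
    return events


if __name__ == "__main__":
    sample = [
        "WARNING: [AFT1] Bus Error (BERR) Event on CPU2 Privileged Data "
        "Access at TL=0, errID 0x0000624b.f07ffbb8\n",
        "Fault_PC 0x10031154\n",
        "panic[cpu2]/thread=2a100045d20: [AFT1] errID 0x0000624b.f07ffbb8 "
        "BERR Error(s)\n",
    ]
    for errid, matched in group_by_errid(sample).items():
        print(errid, len(matched))
```

On a live system the same function could be fed `open("/var/adm/messages")`. Lines that do not carry an errID, such as the AFSR/AFAR and Fault_PC lines in this dump, are skipped, so they still need to be copied by hand from around the matched lines.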
# 4
Hi Lennon, I am also facing a similar problem. Can you please mail me the details about this bug at ashu112@yahoo.com? Thanks in advance. Regards, Ashok
ashokh123 at 2007-7-6 19:31:04
# 5

ashokh123:

That information is NOT available to anyone except Sun employees and various Sun partners. That is what m-lennon was referring to as Sun internal.

If you need one-on-one assistance with the issue, you'll need to open a service case and have a Sun tech support employee investigate it with you. If you have service contract coverage on the system, there'll be no charge. If there isn't any warranty or service contract coverage, you will need to pay for such service.

Or... you can monitor this particular discussion thread and hope it continues far enough along to give some hints for a solution.

This is a public user-to-user discussion forum. I suggest you edit your posting and delete your email address. "Bad guys" harvest email addresses from the public Internet so that they can send out spam and computer viruses. Protect yourself, even with a Yahoo email address.

rukbat at 2007-7-6 19:31:04
# 6

Rukbat, I am prepared to offer some help through these forums to people running EOL platforms. Many data center managers currently running obsolete UNIX platforms (Itanium, PA-RISC or Alpha) may decide to run testbed servers for testing other UNIX operating systems; at these sites it can be hard to secure funds from upper-level management for an all-new platform like Solaris/SPARC. Once application performance is proven, a site can then implement a production solution using a current system (a V490, for example).

"Bad guys harvest..." is exactly the reason I no longer have my mail address visible in my profile! Additionally, I would be in breach of my partnership agreement with Sun if I emailed the information about an internal bug to anyone who is not a current customer of ours. Provide details about your issue and I will try to assist.

mlennon at 2007-7-6 19:31:04
# 7
Hi, thanks for pointing out the email id. Can you please let me know how to edit the previous post? I cannot find any link to edit the post. Thanks and Regards, Ashok
ashokh123 at 2007-7-6 19:31:04
# 8

Hello Ashok,

After logging in to the forum you can edit your own posts.

Please review the following two threads:

http://supportforum.sun.com/jive/thread.jspa?threadID=98120&messageID=336237

http://supportforum.sun.com/jive/thread.jspa?threadID=67466

The ability to edit your own posts was disabled for no apparent reason. I don't know why this feature was disabled or never enabled. Jive Forums (http://www.jivesoftware.com/poweredby/) is software used by many companies.

Just click on the "bubble with the pen" icon right of the "small red envelope" icon. You can even add a comment why and what you changed.

Michael

The links don't work as expected; scroll down to the last posting in each thread.

MAALATFT at 2007-7-6 19:31:04
# 9

Hi MAALATFT,

Thanks for those links.

But even after logging in, I don't see any "bubble with pen" icon. While replying to another post I saw that icon, but I think it gets disabled after 30 mins.

Is there any way to edit the post after 30 mins?

Thanks and Regards

Ashok

ashokh123 at 2007-7-6 19:31:04
# 10
Hello, I would suggest that you report this error via the Feedback form. (Don't expect an answer besides the automatic confirmation.) I'm able to edit some of my postings, even days later; for others of my postings, editing isn't available. Michael
MAALATFT at 2007-7-6 19:31:04
# 11

The editing feature does seem to come and go. Perhaps you could request that the post with your email address be deleted; try reporting it as a bogus message. It would also be good to get back on topic, so if you want to describe your issue, provide the output of the above commands and also a log of the event from /var/adm/messages. I can do a little research and see if I can come up with a solution for you. It is also worth keeping in mind that many systems running US II, US III and US IV will panic with an obsolete kernel revision. Details are freely available and can be found through SunSolve.

mlennon at 2007-7-6 19:31:04