SunOne 6.0sp5 corrupting files

Hello,

I was just brought into a problem today and am at somewhat of a loss. I am the system administrator, not a developer on this issue. The development team seems to think this is an OS issue. I think it has something to do with Sun One. Some kernel tuning at the OS level may be needed, but Solaris is not corrupting their files.

System Information:

v240

2x 1280 MHz US-IIIi processors

2 GB RAM

Solaris 8

SunOne 6.0sp5

Problem:

On occasion, about weekly, something happens and a number of files end up with a length of 0 bytes. It is the same set of files every time. The files are mostly images, JS files, and applets in 6 specific directories. All of them are files served by Sun One.

Some problem indicators:

Multiple connection related errors:

[14/May/2007:18:46:15] failure (13853): Connection queue full, closing socket

[15/May/2007:08:44:39] failure ( 3420): Error accepting connection -5928, oserr=130 (Connect aborted)

I think these just indicate heavy load. I should be able to alleviate them by changing the following in magnus.conf:

MaxKeepAliveConnections [higher number]

ConnQueueSize [higher number]

Is this correct? Could it cause more problems?

Then there is this error. It looks like it causes a Sun One crash every time it happens:

[18/May/2007:11:01:36] config ( 3420): SIGSEGV 11 segmentation violation

[18/May/2007:11:01:36] config ( 3420):si_signo [11]: SEGV

[18/May/2007:11:01:36] config ( 3420):si_errno [0]:

[18/May/2007:11:01:36] config ( 3420):si_code [1]: SEGV_MAPERR [addr: 0x11]

From what I can find, this looks like a Java-related error. I don't know if it is related to our code or to SunOne itself. When this causes the crash, could it be corrupting my files if SunOne has them open at the time?

My main concern is the file corruption. Unless my big problem is load related, I really don't care about the load, as we have a second server being turned up this week to share some of it. Does anyone have any thoughts or recommendations on my issue? Do I need to provide more information?

Thanks for your help,

Michael

[2260 byte] By [madesimonea] at [2007-11-27 5:01:40]
# 1

The connection queue full error can be caused by excessive load, but it's most often caused by a web application hanging. Once all the server's threads are stuck inside such a web application, the connection queue will begin to back up until it overflows. When it overflows, that message will be logged and new connections will be closed.

The connect aborted error indicates that a client initiated a connection and then aborted it. This is often caused by "rude" load balancers that use aborted connections to ping servers. It's also normal for it to occur in small numbers on the open Internet.
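To see whether the queue-full messages come in bursts (which would point at a hang) or are spread evenly through busy hours (which would point at load), it may help to tally the errors log by hour. The following is only a rough sketch in Python: the regular expression is an assumption based on the log lines quoted above, and `tally` is a hypothetical helper name, not part of the server.

```python
import re
from collections import Counter

# Assumed line layout, taken from the quoted samples:
# [14/May/2007:18:46:15] failure (13853): Connection queue full, closing socket
LINE_RE = re.compile(
    r"^\[(\d{2}/\w{3}/\d{4}):(\d{2}):\d{2}:\d{2}\]\s+(\w+)\s+\(\s*(\d+)\):\s*(.*)$"
)

def tally(lines):
    """Count queue-full, connect-aborted and SIGSEGV entries per (day, hour, kind)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip lines that do not look like log entries
        day, hour, _level, _pid, msg = m.groups()
        if "Connection queue full" in msg:
            kind = "queue_full"
        elif "Connect aborted" in msg:
            kind = "connect_aborted"
        elif "SIGSEGV" in msg:
            kind = "sigsegv"
        else:
            continue  # not one of the three messages of interest
        counts[(day, hour, kind)] += 1
    return counts
```

It could be fed the open errors log, for example `tally(open(path_to_errors_log))`, where the path depends on your instance. A cluster of `queue_full` entries in a single hour, especially shortly before a `sigsegv` entry, would be worth a closer look.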

The SIGSEGV is logged by the server's JVM support code, but it doesn't necessarily indicate a problem in Java. However, it does indicate a server crash. That should be investigated. If you have a support contract, Sun should be able to help. If you don't, you can start by looking at the core file.

None of these things should cause Sun ONE Web Server to truncate files in the document root. In fact, the server opens such files as read-only, so it's incapable of modifying them.
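Since the server itself should not be writing to those files, it may be more productive to find out exactly when they are truncated and compare that with crash times, cron jobs, and deployments. Truncating a file updates its modification time, so a periodic scan for empty files can narrow the window. This is a minimal sketch, assuming Python is available on the box; `find_empty_files` is a hypothetical helper and the directories to scan are whatever six directories are affected.

```python
import os

def find_empty_files(roots):
    """Return a sorted list of (path, mtime) for every 0-byte regular file under roots."""
    empty = []
    for root in roots:
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                try:
                    st = os.stat(path)
                except OSError:
                    continue  # file vanished or is unreadable; skip it
                if st.st_size == 0:
                    # mtime is when the file was last written or truncated
                    empty.append((path, st.st_mtime))
    return sorted(empty)
```

Run from cron every few minutes and logged with `time.ctime(mtime)`, the output should show whether the truncation lines up with the SIGSEGV timestamps or with something else entirely, such as a publishing or sync script.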

elvinga at 2007-7-12 10:19:14