Tarantella EE 4 - CPU usage goes >80%

Hi,

I have Secure Global Desktop 4.0 installed, and it works fine most of the
time, but at random a single ttaxpe process goes out of control and climbs
to over 60, 70 or even 80% CPU, blocking all other activity. I have to kill
the process before everything returns to normal. This happens about once a
day. Is there a fix for this problem?

Thank you

bye

m.piaggesi -at teamsystem.com

[426 byte] By [Mirko] at [2007-11-25 20:53:48]
# 1

Hi Mirko

Are you certain this is not normal behaviour, i.e. that the work is being
driven by the client application? If you run truss on this process, it might
tell us whether it is responding to requests or stuck in some sort of
internal spin.

When you say "blocking all other activity", can you be more precise about
what you can and can't do?

I would upgrade to 4.1 in any case. There was a similar incident in 4.0, but
with just this information I can't say whether it has anything to do with
what you are seeing.

Thanks

Barrie

barrie at 2007-7-4 19:03:38
# 2

Yes, I'm sure, because it stays at about 80% for far too long. When this
happens I can't even reach the Tarantella login form, all tarantella
commands stop responding (for example, tarantella status never returns), and
users who are already logged in to a webtop session find it very slow.

The 4.1 fix list includes this entry:

606003 Fixed the problem that causes ttaexecpe to spin in some
circumstances.

I think it could solve my problem, but I'd rather not reinstall or upgrade
my system, because I've only just upgraded from 3.32 to 4.0.

For now I've written a bash script that checks every 5 minutes whether any
ttaxpe process is using more than 70% CPU and, if so, kills it.
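
The script itself was not posted. As a rough sketch of the same idea in Python (not the original bash), the check could look like the code below. The names find_runaway and kill_runaway are made up for illustration. The 70% limit and the ttaxpe name come from the post, and the ps column list is an assumption about what the script samples.

```python
import os
import signal
import subprocess

CPU_LIMIT = 70.0          # percent; the threshold mentioned in the post
PROCESS_NAME = "ttaxpe"   # process name as given in the post


def find_runaway(ps_output, name=PROCESS_NAME, limit=CPU_LIMIT):
    """Return PIDs of processes called `name` whose %CPU exceeds `limit`.

    `ps_output` is the text printed by `ps -eo pid,pcpu,comm`
    (an assumed sampling command, not taken from the original script).
    """
    pids = []
    for line in ps_output.splitlines()[1:]:      # skip the header row
        fields = line.split(None, 2)
        if len(fields) != 3:
            continue                             # malformed or empty line
        pid, pcpu, comm = fields
        try:
            cpu = float(pcpu)
        except ValueError:
            continue                             # unparsable CPU column
        # comm may be a full path on some systems, so compare the basename
        if os.path.basename(comm.strip()) == name and cpu > limit:
            pids.append(int(pid))
    return pids


def kill_runaway():
    """Take one ps sample and send SIGTERM to every offending process."""
    out = subprocess.run(
        ["ps", "-eo", "pid,pcpu,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    for pid in find_runaway(out):
        os.kill(pid, signal.SIGTERM)
```

Calling kill_runaway() from cron every five minutes would mimic the behaviour described. Two caveats: on Linux, ps reports %CPU averaged over the lifetime of the process rather than the current load, so a short sample from top may be a better trigger there; and killing the process ends that user's application session, so this is a stopgap rather than a fix.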

thank you

bye

mirko at 2007-7-4 19:03:38
# 3

Hi Mirko

I don't think 606003 is your problem; that could only occur very rarely,
if at all. It was never reproduced in-house, or shown to be an actual
problem, but was proved to be a logical flaw in how telnet connections were
managed internally.

I'm not sure how a protocol engine process, even one consuming an entire
CPU, could manage to stop a server from responding. That doesn't make a
lot of sense as a cause on its own.

I'm still inclined to think that this rate of CPU usage is either normal
or a consequence of some other issue. Have you managed to make a truss file
of the system calls made by the ttaxpe process in this state?

I would also check the JVM tuning values you have for this system, i.e.
the size of JVM you require. The most common cause of unavailability,
other than networking problems, is the JVM in the Jserver failing to
respond because it has grown to its maximum size and is severely
constrained by the garbage collector.

4.1 is mostly bug fixes to 4.0, so I would not consider it a major change
to upgrade.

Barrie

barrie at 2007-7-4 19:03:38
# 4

Hi Barrie, thank you for your help.

I've got a server with 1.8GB RAM; my JVM memory settings were 58-150%-512.
Now I've changed them to 512-150%-1024. What do you suggest for these
settings, considering that I have about 150 webtop sessions open
simultaneously?

In fact, the xpe process doesn't block the entire server, only
Tarantella; I can still connect using ssh/telnet and access any other
service, but not Tarantella.

I tried upgrading a local copy of Tarantella 4 to 4.1; the whole process
went smoothly and I didn't lose my settings or objects. Would you suggest
upgrading, or leaving 4.0 on our server?

Thanks again

mirko at 2007-7-4 19:03:38
# 5

On 2005-08-09, mirko <m@m.com> wrote:

> Hi Barrie, thank you for your help.
>
> I've got a server with 1.8GB RAM; my JVM memory settings were 58-150%-512.
> Now I've changed them to 512-150%-1024. What do you suggest for these
> settings, considering that I have about 150 webtop sessions open
> simultaneously?

Either of these settings is probably okay. The whole tuning thing is
less of an issue with the 1.5 JVM anyway. If you only have this number
of sessions, I don't think this can be the problem.

> In fact, the xpe process doesn't block the entire server, only
> Tarantella; I can still connect using ssh/telnet and access any other
> service, but not Tarantella.

This makes more sense, but it may mean that the rogue process is keeping
the rest of the system busy, or vice versa. Can you access SGD as another
user? The PE manager (ttauxserv) uses a namespace that is user based, so
it is more likely that the rogue process has deadlocked part of this
namespace, making it impossible to log in as the user of that xpe process.
I can think of two ways to get a handle on this rogue process:

1. Gather the output of truss (strace on Linux). This may show nothing,
   but that would indicate a tight CPU loop, so it would still be useful
   information.

2. Generate a core file by sending the process a signal. This would not
   produce very useful information unless the binary was unstripped.
   Support should be able to provide you with this form of binary;
   unfortunately I can't distribute them via the newsgroup.
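
As a hedged illustration of those two options, the sketch below only builds the command lines and does not run them. The helper names trace_command and core_command are hypothetical, and the choice of SIGABRT is an assumption, since the post does not say which signal to send. The -f, -o and -p options exist in both truss and strace.

```python
import platform


def trace_command(pid, outfile, system=None):
    """Build the argv that attaches a system-call tracer to a running PID."""
    system = system or platform.system()
    # strace on Linux, truss elsewhere (e.g. Solaris reports "SunOS")
    tool = "strace" if system == "Linux" else "truss"
    # -f follows child processes, -o writes the trace to a file,
    # -p attaches to an existing process
    return [tool, "-f", "-o", outfile, "-p", str(pid)]


def core_command(pid):
    """Build the argv that asks the process to abort and dump core."""
    # SIGABRT's default action is to terminate with a core dump,
    # provided the core size limit (ulimit -c) allows one.
    return ["kill", "-s", "ABRT", str(pid)]
```

For example, trace_command(1234, "/tmp/ttaxpe.trace", system="Linux") yields an strace invocation for PID 1234; both the PID and the output path are placeholders. A trace file that stays empty while the process is busy points to a loop in user space, as described above. Note that sending the signal also ends the process.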

>

> I tried upgrading a local copy of Tarantella 4 to 4.1; the whole process
> went smoothly and I didn't lose my settings or objects. Would you suggest
> upgrading, or leaving 4.0 on our server?

4.0 was a technology change from 3.x. Although 4.0 is stable and should
run fine, 4.1 fixes problems that were discovered in the field with the
new webtop infrastructure etc. Since it doesn't radically alter the
footprint of 4.0, I think it is worth having. I can't promise it fixes this
problem, though; we would need more information to work on.

Regards

Barrie

barrie at 2007-7-4 19:03:38