Tarantella EE 4 - CPU usage goes >80%
Hi,
I have Secure Global Desktop 4.0 installed, and it works fine most of the time,
but at random a single ttaxpe process goes out of control and climbs to
over 60, 70 or even 80% CPU, blocking all other activity. I have to kill the
process, and then everything returns to normal. This happens about once a day.
Is there a fix for this problem?
Thank you
bye
m.piaggesi -at teamsystem.com
[426 byte] By [Mirko] at [2007-11-25 20:53:48]

# 1
Hi Mirko
Are you certain this is not normal behaviour, i.e. the work is being driven
by the client application? If you run truss on this process, it might
tell us whether it is responding to requests or is in some sort of internal spin.
When you say "blocking all other activity", can you be more precise about
what you can and can't do?
I would upgrade to 4.1 in any case. There was a similar incident in 4.0,
but with just this information I can't say whether it has anything to do
with what you are seeing.
Thanks
Barrie
On 2005-08-05, Mirko <m@m.com> wrote:
> Hi,
> I have Secure Global Desktop 4.0 installed, and it works fine most of the time,
> but at random a single ttaxpe process goes out of control and climbs to
> over 60, 70 or even 80% CPU, blocking all other activity. I have to kill the
> process, and then everything returns to normal. This happens about once a day.
> Is there a fix for this problem?
>
> Thank you
> bye
>
> m.piaggesi -at teamsystem.com
>
# 2
Yes, I'm sure, because it stays at about 80% for far too long. When this
happens, I can't even reach the Tarantella login form, all tarantella
commands stop responding (e.g. tarantella status never returns), and users who
are already logged in to a webtop session find it very slow, etc.
The 4.1 fix list includes this:
606003 Fixed the problem that causes ttaexecpe to spin in some
circumstances.
I think it could solve my problem, but I'd rather not reinstall/upgrade
my system because I've only just upgraded from 3.32 to 4.0.
At the moment, I've written a bash script that checks every 5 minutes whether
any ttaxpe process is using more than 70% CPU and kills it.
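The script itself was not posted, so the following is only a minimal sketch of that kind of watchdog, written in Python rather than bash. It assumes that `ps -eo pid,pcpu,comm` is available and that the process is listed as `ttaxpe`; the function names and the 70% default are illustrative, not taken from the original script.

```python
import os
import signal
import subprocess

def find_runaways(ps_output, name="ttaxpe", threshold=70.0):
    """Return PIDs of processes called `name` whose %CPU exceeds `threshold`.

    `ps_output` is the text printed by `ps -eo pid,pcpu,comm`.
    """
    pids = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)
        if len(fields) != 3:
            continue
        pid, pcpu, comm = fields
        try:
            cpu = float(pcpu)
        except ValueError:
            continue
        # comm may be a full path on some systems, so compare the basename
        if os.path.basename(comm.strip()) == name and cpu > threshold:
            pids.append(int(pid))
    return pids

def kill_runaways():
    """One watchdog pass: list processes, then SIGKILL any runaway ttaxpe."""
    out = subprocess.run(["ps", "-eo", "pid,pcpu,comm"],
                         capture_output=True, text=True, check=True).stdout
    for pid in find_runaways(out):
        os.kill(pid, signal.SIGKILL)
```

Calling `kill_runaways()` from a cron job every 5 minutes would approximate the behaviour described. One caveat: on Linux, `ps` reports %CPU averaged over the lifetime of the process, so a long-lived process that has only recently started spinning may stay under the threshold for a while.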
thank you
bye
mirko at 2007-7-4 19:03:38 >

# 3
Hi Mirko
I don't think 606003 is your problem; that could only occur very rarely,
if at all. It was never reproduced in-house, or shown to be an actual problem,
but it was proved to be a logical flaw in how telnet connections were managed
internally.
I'm not sure how a protocol engine process, even one consuming an entire
CPU, could manage to stop a server from responding; that doesn't make a
lot of sense as a cause on its own.
I'm still inclined to think that this rate of CPU usage is either normal
or a consequence of some other issue. Have you managed to make a
truss file of the system calls made by the ttaxpe process in this state?
I would also check the JVM tuning values you have for this system, i.e.
the size of JVM you require. The most common cause of unavailability,
other than networking problems, is when the JVM in the Jserver fails to
respond because it has grown to its maximum size and is severely
constrained by the garbage collector.
4.1 is mostly bug fixes to 4.0, so I would not consider it a major change
to upgrade.
Barrie
# 4
Hi Barrie, thank you for your help.
I've got a server with 1.8GB RAM; my JVM memory settings were 58-150%-512.
I've now changed them to 512-150%-1024. What do you suggest for these
settings, considering that I have about 150 webtop sessions open simultaneously?
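To put those numbers side by side, here is a rough sketch. It assumes the three values are the initial heap in MB, the growth scale factor in percent, and the maximum heap in MB; `JvmTuning`, `parse_jvm_tuning` and `heap_budget` are made-up helper names, and the per-session figure is only a crude average, not a sizing rule.

```python
from typing import NamedTuple

class JvmTuning(NamedTuple):
    initial_mb: int   # assumed: initial heap size in MB
    scale_pct: int    # assumed: growth scale factor in percent
    max_mb: int       # assumed: maximum heap size in MB

def parse_jvm_tuning(text):
    """Parse a triple written as 'initial-scale%-max', e.g. '58-150%-512'."""
    initial, scale, maximum = text.split("-")
    return JvmTuning(int(initial), int(scale.rstrip("%")), int(maximum))

def heap_budget(tuning, ram_mb, sessions):
    """Return (max heap as a fraction of RAM, max heap MB per session)."""
    return tuning.max_mb / ram_mb, tuning.max_mb / sessions

ram_mb = 1.8 * 1024  # the 1.8GB server mentioned above
for label, text in (("old", "58-150%-512"), ("new", "512-150%-1024")):
    t = parse_jvm_tuning(text)
    share, per_session = heap_budget(t, ram_mb, 150)
    print(f"{label}: max heap {t.max_mb} MB = {share:.0%} of RAM, "
          f"about {per_session:.1f} MB per session")
```

On these assumptions the new maximum is a little over half of the physical RAM, so it is worth checking what else on the machine needs memory before raising it further.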
In fact, the xpe process doesn't block the entire server, only
Tarantella; I can still connect using ssh/telnet and access any other
service, but not Tarantella.
I tried upgrading a local copy of Tarantella from 4.0 to 4.1; the whole process
went fine without trouble and I didn't lose my settings/objects. Would you
suggest that I upgrade, or leave 4.0 on our server?
Thanks again
mirko at 2007-7-4 19:03:38 >

# 5
On 2005-08-09, mirko <m@m.com> wrote:
> Hi Barrie, thank you for your help.
> I've got a server with 1.8GB RAM; my JVM memory settings were 58-150%-512.
> I've now changed them to 512-150%-1024. What do you suggest for these
> settings, considering that I have about 150 webtop sessions open simultaneously?
>
Either of these settings is probably okay. The whole tuning thing is
less of an issue with the 1.5 JVM anyway. If you only have this number
of sessions I don't think this can be the problem.
> In fact, the xpe process doesn't block the entire server, only
> Tarantella; I can still connect using ssh/telnet and access any other
> service, but not Tarantella.
This makes more sense, but it may mean that the rogue process is keeping
the rest of the system busy, or vice versa. Can you access SGD as another
user? The PE manager (ttauxserv) utilises a namespace that is user
based, so it is more likely that the rogue process has deadlocked part of
this namespace, making it impossible to log in as the user of that xpe
process. I can think of two ways to get a handle on this rogue process:
1. Gather the output of truss (strace on Linux). This may show nothing,
but that would indicate a tight CPU loop, so it would be useful info.
2. Generate a core file by sending the process a signal. This would not
produce very useful information unless the binary was unstripped.
Support should be able to provide you with this form of binary; I
can't distribute them via the newsgroup unfortunately.
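The two steps above can be sketched as small Python helpers. This is only an illustration: the helper names are hypothetical, the choice of SIGABRT as the core-producing signal is an assumption (no particular signal is named above), and the empty-trace check assumes the output file contains nothing but system call lines.

```python
import signal

def trace_command(pid, system, outfile):
    """Build the argv for attaching a system call tracer to `pid`.

    `system` is the value of platform.system(): "SunOS" selects truss,
    "Linux" selects strace. Both tools accept -p PID and -o FILE.
    """
    tool = {"SunOS": "truss", "Linux": "strace"}[system]
    return [tool, "-o", outfile, "-p", str(pid)]

def looks_like_tight_loop(trace_text):
    """True if the tracer logged no system calls at all while attached.

    A process burning CPU without making any system calls is most likely
    spinning in user code rather than servicing requests.
    """
    calls = [ln for ln in trace_text.splitlines() if ln.strip()]
    return len(calls) == 0

# Assumed choice: SIGABRT's default action is to terminate with a core dump
# (subject to the core size limit, e.g. `ulimit -c`).
CORE_SIGNAL = signal.SIGABRT
```

Run the command returned by `trace_command` for a short while against the busy process, interrupt it, and feed the contents of the output file to `looks_like_tight_loop`. A core could then be requested with `os.kill(pid, CORE_SIGNAL)`, bearing in mind that this ends the process.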
>
> I tried upgrading a local copy of Tarantella from 4.0 to 4.1; the whole process
> went fine without trouble and I didn't lose my settings/objects. Would you
> suggest that I upgrade, or leave 4.0 on our server?
>
4.0 was a technology change from 3.x. Although 4.0 is stable and should
run fine, 4.1 fixes problems that were discovered in the field with the
new webtop infrastructure etc. Since it doesn't radically alter the
footprint of 4.0, I think it is worth having. I can't promise it fixes this
problem, though; we would need more information to work on.
Regards
Barrie
