YASPP: Yet Another Stacksize Problem Post

Hi,

I've just installed the latest community version of OpenSolaris, along with the beta of Sun Studio 12, on a 16-way, 32 GB amd64 machine. Until now this machine has been running Linux with the Intel 9.1 compilers.

When I tried to run our OpenMP code, after solving a (very) few porting issues, I found that it runs fine for the "minimal size" problems. For the minimal size code to run, I have to set the stack size to at least ~45056 KB (44 MB).

However, the medium size problem needs exactly 8 times more memory. If I try to set the stack size to 360448 KB, I get an "unlimited" stack size, and the program crashes. I guess that by "unlimited" the system means I have hit the maximum limit for the stack size. Is there any way to increase it?

By franjesusa at 2007-11-26 21:56:50
# 1
Hi, how many threads are you running? Is the program built as 32-bit or 64-bit? Thanks
prash_nsa at 2007-7-10 3:53:33
# 2
16 threads (it's a 16-way machine), and the code is 64-bit. I've posted debug information along with a simple test on the OpenSolaris forums: http://www.opensolaris.org/jive/thread.jspa?threadID=26639&tstart=0
franjesusa at 2007-7-10 3:53:33
# 3

It's a bug in the compiler. Let's look at your example program from the OpenSolaris forum:

    program stack
    implicit none
    integer*8, parameter :: k=1024
    integer*8, parameter :: s=512*k*k/4
    integer,automatic :: foo(s)
    integer*8 i

    print*, 'allocating:', s/1024*4,'kbytes'

    !$omp parallel do private(i)
    do i=1,s
       foo(i)=1
    end do
    !$omp end parallel do

    pause
    print*, foo(1),foo(s)
    print*, 'worked'
    end program

If I change s to 512k, then look at the assembly code, this is what I see:

    $ f90 -O3 -m64 -xarch=sse3 ompstack.f90 -S
    $ grep 'subq.*%rsp' ompstack.s
    subq    $32,%rsp
    subq    $524392,%rsp    ;/ line : 1

That second "subq" instruction is reserving space on the stack for the array. Now if I restore s to 512M, I see this:

    devtools-x4200-0$ f90 -O3 -m64 -xarch=sse3 ompstack.f90 -S
    devtools-x4200-0$ grep 'subq.*%rsp' ompstack.s
    subq    $32,%rsp
    subq    $104,%rsp    ;/ line : 1

Looks as though the code that computes the amount of space to reserve on the stack is overflowing. I have filed bug 6541518.
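
To make the discrepancy concrete, here is a small sketch of the arithmetic behind the two listings. It assumes that the fixed part of the frame is the same in both builds (the 104 bytes left over once the 512 KB array is subtracted from 524392), and that s was set so the array occupies 512 KB and 512 MB respectively. The program and variable names are made up for illustration.

```fortran
program frame_size
  implicit none
  ! 64-bit integer kind, so the byte counts cannot overflow here
  integer, parameter :: i8 = selected_int_kind(18)
  ! Operand seen in the 512 KB build, minus the array itself
  integer(i8), parameter :: overhead = 524392_i8 - 512_i8*1024_i8
  ! What the 512 MB build would need if the same overhead applies (assumption)
  integer(i8), parameter :: expected = 512_i8*1024_i8*1024_i8 + overhead

  print *, 'fixed frame overhead (bytes):     ', overhead
  print *, 'expected subq operand for 512 MB: ', expected
end program frame_size
```

Under those assumptions the overhead comes out as 104 bytes and the expected operand as 536871016, so the emitted "subq $104" looks like the array term has dropped out entirely.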

The program will work if you remove the "automatic" keyword from the variable declaration. Most likely, your OpenMP program will also work if you remove "automatic" keywords. Alternatively, you could change the automatic variables to local allocatable variables. Of course, you would then have to insert an allocate statement at the beginning of the procedure.
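
As a rough sketch of the allocatable alternative, the test program could be rewritten along these lines. This is only an illustration, not a tested workaround for the bug: the program name and the ierr status variable are additions, and the "pause" statement is left out.

```fortran
program stack_alloc
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)   ! 64-bit integer kind
  integer(i8), parameter :: k = 1024
  integer(i8), parameter :: s = 512*k*k/4            ! 128M elements = 512 MB
  integer, allocatable :: foo(:)                     ! heap storage instead of "automatic"
  integer(i8) :: i
  integer :: ierr

  print *, 'allocating:', s/1024*4, 'kbytes'

  ! The explicit allocate replaces the stack reservation
  allocate(foo(s), stat=ierr)
  if (ierr /= 0) stop 'allocation failed'

  ! foo is shared here; each thread fills its own part of the array
  !$omp parallel do private(i)
  do i = 1, s
     foo(i) = 1
  end do
  !$omp end parallel do

  print *, foo(1), foo(s)
  deallocate(foo)
end program stack_alloc
```

Because foo lives on the heap, the array no longer counts against the stack size limit.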

igba at 2007-7-10 3:53:33
# 4
Thanks for having a look! :-) We were doing the allocate() thing, but since OpenMP recommends that you put all local arrays on the stack to improve performance, we turned them into automatic arrays. I hope it gets fixed soon. Thanks!
franjesusa at 2007-7-10 3:53:33
# 5

> We were doing the allocate() thing, but since openmp
> recommends that you put all local arrays on the stack
> to improve performance, we turned them into
> automatic.

For arrays this large, I think that the overhead of doing the allocation will be dwarfed by the time it actually takes to compute the values.

One caveat that I just remembered with respect to allocatable arrays: you can't use them directly in an OpenMP privatizing clause (e.g., private, firstprivate, reduction). You can, however, pass an allocatable array to a subroutine, which can then use its dummy argument in a privatizing clause. Yes, it's very strange. The standard is written under the assumption that each thread will allocate its own private array.
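
A minimal sketch of that subroutine trick might look like the following. The module, subroutine, and variable names are invented for the example, and the array is kept small on purpose, since private copies typically end up on each thread's stack.

```fortran
module scratch_mod
  implicit none
contains
  subroutine accumulate(scratch, n, total)
    integer, intent(in)    :: n
    integer, intent(inout) :: scratch(n)   ! explicit-shape dummy, not allocatable in here
    integer, intent(out)   :: total
    integer :: i, j

    total = 0
    ! The dummy argument may appear in the privatizing clause
    !$omp parallel do private(scratch, j) reduction(+:total)
    do i = 1, n
       do j = 1, n
          scratch(j) = i                   ! each thread writes only its own copy
       end do
       total = total + sum(scratch)        ! adds n*i for this iteration
    end do
    !$omp end parallel do
  end subroutine accumulate
end module scratch_mod

program private_dummy
  use scratch_mod
  implicit none
  integer, parameter :: n = 100
  integer, allocatable :: buf(:)           ! allocatable in the caller
  integer :: total

  allocate(buf(n))
  buf = 0
  call accumulate(buf, n, total)           ! total should equal n*n*(n+1)/2
  print *, 'total =', total
  deallocate(buf)
end program private_dummy
```

The caller owns the allocatable array, and the subroutine only ever sees an ordinary array of known size.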

> I hope it gets fixed soon.

I will try to keep this thread up to date with the status of the bug.

igba at 2007-7-10 3:53:33