YASPP: Yet Another Stacksize Problem Post
Hi,
I've just installed the latest community version of OpenSolaris, along with the beta of Sun Studio 12, on a 16-way 32 GB amd64 machine. Until now this machine has been running Linux with the Intel 9.1 compilers.
When I tried to run our OpenMP code, after solving a (very few) porting issues, I found that it runs fine for the "minimal size" problems. For the minimal-size code to run I have to set the stack size to at least ~45056 KB (44 MB).
However, the medium-size problem needs exactly 8 times more memory. If I try to set the stack size to 360448 KB, I get an "unlimited" stack size, and the program crashes. I guess that by "unlimited" the system means I have hit the maximum limit for the stack size. Is there any way to increase it?
franjesusa at 2007-11-26 21:56:50

# 3
It's a bug in the compiler. Let's look at your example program from the OpenSolaris forum:
program stack
  implicit none
  integer*8, parameter :: k=1024
  integer*8, parameter :: s=512*k*k/4   ! number of 4-byte elements (512 MB in total)
  integer,automatic :: foo(s)           ! "automatic" places foo on the stack
  integer*8 i
  print*, 'allocating:', s/1024*4,'kbytes'
  !$omp parallel do private(i)
  do i=1,s
    foo(i)=1
  end do
  !$omp end parallel do
  pause
  print*, foo(1),foo(s)
  print*, 'worked'
end program
If I change s to 512k, then look at the assembly code, this is what I see:
$ f90 -O3 -m64 -xarch=sse3 ompstack.f90 -S
$ grep 'subq.*%rsp' ompstack.s
subq $32,%rsp
subq $524392,%rsp ;/ line : 1
That second "subq" instruction is reserving space on the stack for the array. Now if I restore s to 512M, I see this:
devtools-x4200-0$ f90 -O3 -m64 -xarch=sse3 ompstack.f90 -S
devtools-x4200-0$ grep 'subq.*%rsp' ompstack.s
subq $32,%rsp
subq $104,%rsp ;/ line : 1
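It looks as though the code that computes the amount of space to reserve on the stack is overflowing: with s at 512k the compiler reserves 524392 bytes, which is the 524288-byte array plus 104 bytes, but at 512M only those 104 bytes are reserved and the array's share has vanished. I have filed bug 6541518.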
The program will work if you remove the "automatic" keyword from the variable declaration. Most likely, your OpenMP program will also work if you remove its "automatic" keywords. Alternatively, you could change the automatic variables to local allocatable variables. Of course, you would then have to insert an allocate statement at the beginning of the procedure.
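A minimal sketch of the allocatable variant of the test program above, assuming the rest of the program stays unchanged. The program name stack_alloc and the status variable ierr are illustrative and not taken from the original post.

```fortran
program stack_alloc
  implicit none
  integer(kind=8), parameter :: k = 1024
  integer(kind=8), parameter :: s = 512*k*k/4   ! same element count as the original
  integer, allocatable :: foo(:)                ! allocated with allocate(), not "automatic"
  integer(kind=8) :: i
  integer :: ierr                               ! illustrative: allocation status

  ! The allocate statement that the allocatable version requires.
  allocate(foo(s), stat=ierr)
  if (ierr /= 0) stop 'allocation failed'

  !$omp parallel do private(i)
  do i = 1, s
    foo(i) = 1
  end do
  !$omp end parallel do

  print *, foo(1), foo(s)
  deallocate(foo)
end program stack_alloc
```

Here foo is shared among the threads, as in the original loop, so only one copy of the array is ever allocated.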
igba at 2007-7-10 3:53:33

# 5
> We were doing the allocate() thing, but since openmp
> recommends that you put all local arrays on the stack
> to improve performance, we turned them into
> automatic.
For arrays this large, I think that the overhead of doing the allocation will be dwarfed by the time it actually takes to compute the values.
One caveat that I just remembered with respect to allocatable arrays: you can't use them directly in an OpenMP privatizing clause (e.g., private, firstprivate, reduction). You can, however, pass an allocatable array to a subroutine, which can then use its dummy argument in a privatizing clause. Yes, it's very strange. The standard is written under the assumption that each thread will allocate its own private array.
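A minimal sketch of the subroutine workaround described above, assuming an OpenMP implementation that has this restriction on allocatable arrays. All names (private_dummy, work, buf, m, bad) are hypothetical. The array is kept small on purpose, since private copies are commonly placed on each thread's stack.

```fortran
program private_dummy
  implicit none
  integer, allocatable :: buf(:)
  integer :: m

  m = 1000
  allocate(buf(m))
  call work(buf, m)        ! pass the allocatable array as an actual argument
  deallocate(buf)

contains

  subroutine work(a, n)
    integer, intent(in) :: n
    integer :: a(n)        ! explicit-shape dummy argument, not allocatable
    integer :: i, bad

    bad = 0
    ! The dummy argument, not the allocatable array, appears in the private clause.
    !$omp parallel private(a, i) reduction(+:bad)
    a = 0                  ! each thread initializes its own private copy
    do i = 1, n
      a(i) = a(i) + 1
    end do
    if (any(a /= 1)) bad = bad + 1
    !$omp end parallel

    print *, 'private copies that failed the check:', bad
  end subroutine work

end program private_dummy
```

Each thread fills and checks its own copy of a, so the threads do not interfere with one another even though they all run the same loop.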
> I hope it gets fixed soon.
I will try to keep this thread up to date with the status of the bug.
igba at 2007-7-10 3:53:33
