Directory 5.2 P4 crashes with virtual memory error.
One of my masters (dual-master replication) crashed last night with the following errors (sorry, this is going to be a little long...):
[05/Jan/2007:02:04:44 -0500] - ERROR<5135> - Resource Limit - conn=-1 op=-1 msgId=-1 - Memory allocation error calloc of 878636 bytes failed; errno 12
The server has probably allocated all available virtual memory. To solve this problem, make more virtual memory available to your server, or reduce the size of the server's `Maximum Entries in Cache' (cachesize) or `Maximum DB Cache Size' (dbcachesize) parameters.
can't recover; calling exit(1)
[05/Jan/2007:02:04:44 -0500] - ERROR<5135> - Resource Limit - conn=-1 op=-1 msgId=-1 - Memory allocation error calloc of 878636 bytes failed; errno 12
The server has probably allocated all available virtual memory. To solve this problem, make more virtual memory available to your server, or reduce the size of the server's `Maximum Entries in Cache' (cachesize) or `Maximum DB Cache Size' (dbcachesize) parameters.
can't recover; calling exit(1)
[05/Jan/2007:02:04:44 -0500] - ERROR<5130> - Resource Limit - conn=-1 op=-1 msgId=-1 - Memory allocation error malloc of 1083535 bytes failed; errno 12
The server has probably allocated all available virtual memory. To solve this problem, make more virtual memory available to your server, or reduce the size of the server's `Maximum Entries in Cache' (cachesize) or `Maximum DB Cache Size' (dbcachesize) parameters.
can't recover; calling exit(1)
[05/Jan/2007:02:04:44 -0500] - ERROR<5130> - Resource Limit - conn=-1 op=-1 msgId=-1 - Memory allocation error malloc of 1083535 bytes failed; errno 12
The server has probably allocated all available virtual memory. To solve this problem, make more virtual memory available to your server, or reduce the size of the server's `Maximum Entries in Cache' (cachesize) or `Maximum DB Cache Size' (dbcachesize) parameters.
can't recover; calling exit(1)
--
First it tries to allocate about 858K (878636 bytes), then about 1MB (1083535 bytes).
1. Why does it try more than once? Is that by design?
2. This server runs on AIX 5.2 with 4 CPUs, 8GB of RAM, and 6GB of paging space. Why did this happen? I do not think we ran out of paging space, because the whole server would have crashed instead. Is it fair to assume that the available paging space dropped from 6GB to below 1MB, or even 800K?
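To put the numbers from the log in perspective, here is a rough Python sketch (not something from the server itself). The request sizes are copied from the log; the names `failed_requests`, `paging_space`, and `describe` are made up for illustration, and reading "6GB" as 6 GiB is an assumption.

```python
import errno
import os

# Request sizes copied from the two distinct error messages in the log.
failed_requests = {"calloc": 878636, "malloc": 1083535}

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

# Paging space as stated above; treating "6GB" as 6 GiB is an assumption.
paging_space = 6 * GIB


def describe(nbytes):
    """Return the size in KiB and as a fraction of the paging space."""
    return nbytes / KIB, nbytes / paging_space


for call, nbytes in failed_requests.items():
    kib, share = describe(nbytes)
    print(f"{call}: {nbytes} bytes = {kib:.0f} KiB ({share:.4%} of paging space)")

# errno 12 is conventionally ENOMEM on POSIX systems; this lookup reflects
# the machine running the script, not necessarily the AIX host.
print(errno.errorcode.get(12), "-", os.strerror(12))
```

Both requests are a tiny fraction of the paging space, which is why the failure looks odd to me if paging space alone were the limit.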
These are my cache configuration values:
dn: cn=changelog,cn=ldbm database,cn=plugins,cn=config
nsslapd-cachesize: -1
nsslapd-cachememsize: 2097152
...
dn: cn=config,cn=ldbm database,cn=plugins,cn=config
nsslapd-dbcachesize: 268435456
nsslapd-import-cachesize: 268435456
nsslapd-search-tune: 57
nsslapd-disk-low-threshold: 100
nsslapd-disk-full-threshold: 10
...
dn: cn=userRoot,cn=ldbm database,cn=plugins,cn=config
nsslapd-cachesize: -1
nsslapd-cachememsize: 10485760
...
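Adding these settings up as a quick check, in a minimal Python sketch: the byte values are copied from the configuration above, while the dictionary keys and helper names are only illustrative. Leaving out `nsslapd-import-cachesize` rests on the assumption that it only matters during imports, and "8GB" of RAM is read as 8 GiB.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB

# Cache settings copied from the configuration above (values in bytes).
# nsslapd-import-cachesize is omitted on the assumption that it is only
# used while importing data.
cache_settings = {
    "changelog nsslapd-cachememsize": 2097152,
    "userRoot nsslapd-cachememsize": 10485760,
    "nsslapd-dbcachesize": 268435456,
}

# Physical memory as stated in the post; "8GB" read as 8 GiB (assumption).
physical_ram = 8 * GIB


def to_mib(nbytes):
    """Convert a byte count to MiB."""
    return nbytes / MIB


total = sum(cache_settings.values())

for name, nbytes in cache_settings.items():
    print(f"{name}: {to_mib(nbytes):.0f} MiB")
print(f"total configured cache: {to_mib(total):.0f} MiB "
      f"({total / physical_ram:.1%} of RAM)")
```

This only sums the configured limits; it says nothing about how much memory the server process actually uses at run time.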
The DS has about 400,000 entries.
The whole database uses about 2.1GB, and id2entry.db3 is about 1.9GB.
Am I looking at a memory leak issue? Is this a documented problem?
The error says to reduce the cache sizes, but they do not seem excessively large, do they?
I would appreciate any help on this (before it happens again...)
(Does anybody have any word on DS 6?)

