Re: Child process limits cumulative or instantaneous?
- Subject: Re: Child process limits cumulative or instantaneous?
- From: mm w <email@hidden>
- Date: Fri, 27 Feb 2009 09:37:06 -0800
I got a mail :), somebody doesn't like my test routine 8)
On Fri, Feb 27, 2009 at 9:30 AM, mm w <email@hidden> wrote:
> #include <stdio.h>
> #include <unistd.h>
> #include <errno.h>
> #include <stdlib.h>
>
> int _count = 0;
>
> #define forever for(;;)
>
> int
> main(int argc, const char **argv)
> {
>     int stack = 0;
>     const char *identifier;
>     pid_t pid;
>
>     forever {
>         pid = fork();
>         if (pid == -1) {
>             /* fork failed: report how far we got */
>             fprintf(stderr, "-- fork: %i %i\n", stack, _count);
>             perror("fork");
>             exit(errno);
>         } else if (pid > 0) {
>             /* parent: count the new child; note we never wait() for it */
>             identifier = "com.me.parent";
>             _count++;
>             stack++;
>         } else {
>             /* child: print and exit; unreaped, it lingers as a zombie */
>             identifier = "com.me.child";
>             fprintf(stdout, "-- %s: %i %i\n", identifier, stack, _count);
>             _exit(0);
>         }
>
>         fprintf(stdout, "-- %s: %i %i\n", identifier, stack, _count);
>     }
>
>     return 0;
> }
>
> the children are still alive. I made another test that closes each child,
> and now it's working well: it only fails at the limit, not right away
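>
> a minimal sketch of that second test (illustrative, not my exact code):
>
> #include <errno.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> #include <sys/wait.h>
>
> int
> main(void)
> {
>     for (;;) {
>         pid_t pid = fork();
>         if (pid == -1) {
>             /* with reaping, this should only trigger at a real limit */
>             perror("fork");
>             exit(errno);
>         }
>         if (pid == 0)
>             _exit(0);              /* child: exit immediately */
>         /* parent: reap the child before forking the next one,
>            so nothing accumulates in the process table */
>         waitpid(pid, NULL, 0);
>     }
>     return 0;
> }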
>
> you are creating ghosts (zombies)
>
> Best Regards
>
> On Fri, Feb 27, 2009 at 9:03 AM, Ralph Castain <email@hidden> wrote:
>> We actually call waitpid and wait for the callback, check the pid to verify
>> the child terminated, and then complete the loop.
>>
>> To allow for any OS cleanup, we also added a sleep(1) at the bottom of the
>> loop just in case we were encountering garbage collection issues...but that
>> didn't help.
>>
>>
>> On Feb 27, 2009, at 9:53 AM, Dan Heller wrote:
>>
>>> It should not be cumulative. Are you calling wait() (or wait3() or
>>> wait4() or waitpid()) and checking the return value?
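>>>
>>> For example (just a sketch; `pid' here is whatever fork() returned
>>> to the parent):
>>>
>>> int status;
>>> pid_t done;
>>>
>>> /* retry if the wait is interrupted by a signal */
>>> while ((done = waitpid(pid, &status, 0)) == -1 && errno == EINTR)
>>>     ;
>>> if (done == -1)
>>>     perror("waitpid");  /* e.g. ECHILD: no such child left to reap */
>>> else if (WIFEXITED(status))
>>>     printf("child %d exited with status %d\n",
>>>            (int)done, WEXITSTATUS(status));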
>>>
>>> Cheers,
>>> Dan
>>>
>>> On Feb 27, 2009, at 8:46 AM, mm w wrote:
>>>
>>>> I think you are in the first case. I don't think this is a bug; it's
>>>> intended behavior
>>>>
>>>> Best Regards
>>>>
>>>> On Fri, Feb 27, 2009 at 8:43 AM, Ralph Castain <email@hidden> wrote:
>>>>>
>>>>> I appreciate that info. There is plenty of swap space available. We
>>>>> are not exceeding the total number of processes under execution at
>>>>> any time, nor the total number of processes under execution by a
>>>>> single user - assuming that these values are interpreted as
>>>>> instantaneous and not cumulative.
>>>>>
>>>>> In other words, if you look at any given time, you will see that the
>>>>> total number of processes is well under the system limit, and the
>>>>> number of processes under execution by the user is only 4, which is
>>>>> well under the system limit.
>>>>>
>>>>> However, the total number of processes executed by the user
>>>>> (cumulative over the entire time the job has been executing) is over
>>>>> 263 and thus pushing the system limit IF that limit is cumulative
>>>>> and not instantaneous.
>>>>>
>>>>> Hope that helps clarify the situation
>>>>> Ralph
>>>>>
>>>>>
>>>>>
>>>>> On Feb 27, 2009, at 9:37 AM, mm w wrote:
>>>>>
>>>>>> ERRORS
>>>>>>      Fork() will fail and no child process will be created if:
>>>>>>
>>>>>>      [EAGAIN]  The system-imposed limit on the total number of
>>>>>>                processes under execution would be exceeded.  This
>>>>>>                limit is configuration-dependent.
>>>>>>
>>>>>>      [EAGAIN]  The system-imposed limit MAXUPRC (<sys/param.h>) on
>>>>>>                the total number of processes under execution by a
>>>>>>                single user would be exceeded.
>>>>>>
>>>>>>      [ENOMEM]  There is insufficient swap space for the new
>>>>>>                process.
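>>>>>>
>>>>>> you can read the actual values on your box; a quick sketch (the
>>>>>> sysctl names are the usual Darwin ones):
>>>>>>
>>>>>> #include <stdio.h>
>>>>>> #include <sys/types.h>
>>>>>> #include <sys/sysctl.h>
>>>>>> #include <sys/resource.h>
>>>>>>
>>>>>> int
>>>>>> main(void)
>>>>>> {
>>>>>>     int maxproc = 0, maxprocperuid = 0;
>>>>>>     size_t len = sizeof(maxproc);
>>>>>>     struct rlimit rl;
>>>>>>
>>>>>>     /* system-wide and per-user instantaneous process limits */
>>>>>>     sysctlbyname("kern.maxproc", &maxproc, &len, NULL, 0);
>>>>>>     len = sizeof(maxprocperuid);
>>>>>>     sysctlbyname("kern.maxprocperuid", &maxprocperuid, &len, NULL, 0);
>>>>>>
>>>>>>     /* per-process soft/hard limits on spawned processes */
>>>>>>     getrlimit(RLIMIT_NPROC, &rl);
>>>>>>
>>>>>>     printf("kern.maxproc       = %d\n", maxproc);
>>>>>>     printf("kern.maxprocperuid = %d\n", maxprocperuid);
>>>>>>     printf("RLIMIT_NPROC       = %lld / %lld\n",
>>>>>>            (long long)rl.rlim_cur, (long long)rl.rlim_max);
>>>>>>     return 0;
>>>>>> }
>>>>>>
>>>>>> if I remember right, kern.maxprocperuid defaults to 266 on a stock
>>>>>> install, which would line up with fork() failing after ~263 spawns
>>>>>> if the dead children are never reaped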
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Feb 27, 2009 at 8:13 AM, Ralph Castain <email@hidden> wrote:
>>>>>>>
>>>>>>> Hello folks
>>>>>>>
>>>>>>> I'm the run-time developer for Open MPI and am encountering a
>>>>>>> resource starvation problem that I don't understand. What we have
>>>>>>> is a test program that spawns a child process, exchanges a single
>>>>>>> message with it, and then the child terminates. We then spawn
>>>>>>> another child process and go through the same procedure.
>>>>>>>
>>>>>>> This paradigm is typical of some of our users who want to build
>>>>>>> client-server applications using MPI. In these cases, they want
>>>>>>> the job to run essentially continuously, but have a rate limiter
>>>>>>> in their application so only one client is alive at any time.
>>>>>>>
>>>>>>> We have verified that the child processes are properly
>>>>>>> terminating. We have monitored and observed that all file
>>>>>>> descriptors/pipes are being fully recovered after each cycle.
>>>>>>>
>>>>>>> However, after 263 cycles, the fork command returns an error
>>>>>>> indicating that we have exceeded the number of allowed child
>>>>>>> processes for a given process. This is fully repeatable, yet the
>>>>>>> number of child processes in existence at any time is 1, as
>>>>>>> verified by ps.
>>>>>>>
>>>>>>> Do you have any suggestions as to what could be causing this
>>>>>>> problem? Is the limit on child processes a cumulative one, or
>>>>>>> instantaneous?
>>>>>>>
>>>>>>> Appreciate any help you can give
>>>>>>> Ralph
>>>>>>>
--
-mmw
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Darwin-kernel mailing list (email@hidden)
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden