Re: [Xgrid] xgrid controller using massive resources and rebooting server



That makes sense; I had initially thought it was your submission file that might be too big. Output that large would crash the controller very easily if the controller builds a property list from it, as the crash log suggests it does. The property list is built in memory and will be at least 1 GB in size. Creating several of these will quickly hit the 2 GB limit (1 or 2 files will probably be enough!).

Given the 2 GB limit, it is probably reasonable to limit yourself to files under 100 MB. But you also have to consider that each file is first sent by the agent to the controller, then by the controller to the client. If the client and controller are the same machine, maybe smart things happen, but the overhead might still be there. Using a solid, separate fileserver is probably a better idea.
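To make the rule of thumb above concrete, here is a minimal sketch of a check a task could run before deciding how to return its results. The function names, the directory argument, and the exact threshold value are illustrative assumptions, not part of Xgrid; the 100 MB figure is only the rough guideline suggested above, not a measured limit.

```python
import os

# Assumed threshold: the ~100 MB rule of thumb from the message above.
SIZE_LIMIT_BYTES = 100 * 1024 * 1024


def total_output_size(directory):
    """Return the combined size in bytes of all regular files under directory."""
    total = 0
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            # Skip symlinks so linked files are not counted.
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def should_bypass_controller(directory, limit=SIZE_LIMIT_BYTES):
    """True if the output is large enough that it should go to a fileserver instead."""
    return total_output_size(directory) >= limit
```

A task could call `should_bypass_controller(".")` at the end of its run and, if it returns True, write its results to a separate fileserver rather than leaving them in the working directory.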

charles

On Feb 22, 2007, at 11:24 AM, Adam Kocoloski wrote:

Hi all,

So, the job I was submitting produces O(1GB) of binary output files, and I'm certain that the controller resource usage and crashes were due to trying to handle all that data. I tried adding a few lines to the end of each task where I AFP guest-mount a "drop box" on a fileserver and move my output files there instead of bringing them back through the controller. Everything is rock-solid now, and the xgridcontrollerd memory usage is stable at about 7MB.
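The "few lines at the end of each task" could look something like the sketch below. It is not the actual script used here: it assumes the drop box is already mounted at some local path (the mount step itself is left out), and the function name, the `task_id` prefix, and the paths are all hypothetical. It copies file contents only and then deletes the local copy, on the assumption that a write-only drop box may not allow metadata or listing operations.

```python
import os
import shutil


def move_outputs_to_dropbox(work_dir, dropbox_dir, task_id):
    """Move every regular file in work_dir into dropbox_dir.

    Files are renamed "<task_id>_<name>" so tasks do not overwrite each other.
    Returns the list of original file names that were moved.
    """
    moved = []
    for name in sorted(os.listdir(work_dir)):
        src = os.path.join(work_dir, name)
        if not os.path.isfile(src):
            continue  # leave subdirectories alone
        dst = os.path.join(dropbox_dir, "%s_%s" % (task_id, name))
        shutil.copyfile(src, dst)  # contents only, no metadata copy
        os.remove(src)             # nothing is left for the controller to collect
        moved.append(name)
    return moved
```

With something like `move_outputs_to_dropbox(".", "/Volumes/dropbox", "task42")` as the last step of a task (both the path and the id are placeholders), only the task's stdout and stderr streams would travel back through the controller.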

I guess this is not surprising. In working with other batch systems (Condor, LSF, SGE) I've never let the system handle anything but the streams. It'd be nice if we could quantify just how much data is reasonable to bring through the controller.

Regards,

Adam


On Feb 22, 2007, at 6:50 AM, Adam Kocoloski wrote:

Hi Charles,

On Feb 22, 2007, at 12:30 AM, Charles Parnot wrote:

how big is your batch xml file?

904 KB

Is it crashing at submission or while running the jobs?

While running the jobs. The first time it crashed I had a new job Pending with 0 tasks in Xgrid Admin (plus a couple of other running jobs), but subsequent crashes occurred while jobs were running.


--Adam
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Xgrid-users mailing list      (email@hidden)
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden


--
Xgrid-at-Stanford
Help science move fast forward:
http://cmgm.stanford.edu/~cparnot/xgrid-stanford

Charles Parnot
email@hidden







