Just a follow-up, in case some new information triggers any thoughts
for anyone:
- We're using Kerberos authentication between controller and agents.
- The system log shows agents connecting normally (no timeouts, etc.),
except that all the agents are still seen as offline.
Is it possible that agents are mis-reporting their status to the
controller or is it more likely that the controller has things
confused?
barry
On 2/11/06, Barry Wark <email@hidden> wrote:
> I've run into a strange problem (or problems) with our Xgrid controller and
> haven't found the answer in the list archives or in the Xgrid
> documentation. I'd really appreciate a pointer...
>
> We had a power failure at work which shut down the server running our
> controller (no UPS, doh!). After the server came back up, jobs
> submitted to the grid would fail with an error = "task: unexpected
> reply". Does anyone know what might have been causing this problem? I
> expected that the controller would just resubmit tasks that failed on
> their assigned agents.
>
> Anyway, I tried restarting the controller in Server Admin. After
> spinning for a while, Server Admin crashed. I'd seen this problem
> before when the Xgrid controller database was corrupted. There was a
> job running at the time of the power failure, so I thought the
> database may have gotten corrupted again. (aside: are there tools to
> repair the Xgrid database?). I trashed the database files in
> /var/xgrid/controller and it now restarts fine. All of the agents
> appear "Offline", however, in XgridAdmin even though they aren't.
> Nothing I've done, including restarting all the agents and the
> controller and removing all of the agents from the grid and letting
> them reconnect, seems to help.
>
> Can anyone point me in a useful direction?
>
> Thanks!
>
> Barry
>
> P.S. While I'm in question-asking mode, does anyone know where the
> interaction between xgridagent and the Energy Saver control panel is
> documented? It appears that computers that are running xgridagent and
> are connected to a controller don't sleep, even if they should
> according to the Energy Saver settings. I'd expect that xgridagent might
> prevent sleep while running a task, but even idle agents appear to
> prevent their host from sleeping.
>
Xgrid-users mailing list (email@hidden)