Re: Killer Application
- Subject: Re: Killer Application
- From: "Jerry W. Walker" <email@hidden>
- Date: Sat, 26 Mar 2005 00:35:45 -0500
Hi, Robert,
I don't pretend to be an expert sysadmin (or even a sysadmin, for
that matter), so my first suggestion would be to get the advice of a
really good system administrator. If you've done that and they're stumped,
and/or you're at a dead end and somewhat desperate, then read on:
It's interesting that according to the log provided, the problem
started at 1 minute and 12 seconds shy of exactly 26 hours. That and
your statement of the conditions under which it occurs lead me to ask
whether this occurs at a fixed time interval after startup (say, 26
hours) or at a specific time of day regardless of the startup time
(say, around 4:00 PM). If the latter, you might check JavaMonitor to
see if the time corresponds with a scheduled restart time for the app
or some other pre-scheduled event in the deployed configuration.
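As a quick check on that interval, here is a minimal Python sketch that pulls the timestamps out of two of the log lines quoted below and subtracts them. The helper name parse_stamp is my own, and the sketch assumes both lines carry the same time zone suffix (both are EST in this log), so the suffix is simply ignored.

```python
import re
from datetime import datetime

# Matches the "[YYYY-MM-DD HH:MM:SS ZONE]" prefix used in the wotaskd log lines.
STAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) [A-Z]+\]")

def parse_stamp(line):
    """Return the timestamp at the start of a log line, or None if there is none."""
    match = STAMP.match(line)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")

started = parse_stamp("[2005-03-21 14:40:00 EST] <main> Waiting for requests...")
failed = parse_stamp("[2005-03-22 16:38:48 EST] <WorkerThread14> Wotaskd")
print(failed - started)  # 1 day, 1:58:48
```

Running the same subtraction on the logs from a second failure would show whether the interval or the time of day is the constant.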
The log seems to show signs of either a critical system file or a
critical wotaskd file that is either permanently hosed or being
dynamically hosed. Neither, of course, should occur on an Xserve. If
the file is simply statically hosed, reinstalling the file (whether by
reinstalling WebObjects or reinstalling the system software) should
solve the problem. If, after that, the problem recurs, that suggests
both a rogue set of code (probably in your app) and a permissions
problem in the system allowing your app to overwrite part of a file
that it shouldn't be able to.
You might start up Terminal and issue the following command to try to
find this file:
sudo find / -name 'SMERXCode*'
If the find command finds the file, I would take a look at the file's
content to determine if it's a plaintext file, if you can identify the
prolog, and if you can remove its content. A simpler approach might be
to compare it to a comparable file from another Xserve that's not
having these problems. I would be cautious about replacing one with
the other, since they may be configuration dependent. But comparing them
might provide some real answers.
If you don't have sudo permissions to do this, your success will
probably be limited with this approach. If you do, expect to wait a
long time for an answer, since the find command will be sifting through
a LOT of files starting at root. If you find the file, try to determine
what it belongs to (the system or WebObjects) and customize the following
advice accordingly.
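If the full search from root is too slow, the same name match can be run from a narrower starting directory. The following Python sketch is a rough equivalent of find's -name test; the function name find_by_pattern is hypothetical, and any starting directory you pass in is your own guess about where the file might live, not something I know. Directories that can't be read are silently skipped, which is os.walk's default behavior.

```python
import fnmatch
import os

def find_by_pattern(root, pattern):
    """Yield paths under `root` whose name matches the shell-style `pattern`."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Like find's -name, test both directory names and file names.
        for name in dirnames + filenames:
            if fnmatch.fnmatchcase(name, pattern):
                yield os.path.join(dirpath, name)

# Example call (the starting directory here is only a placeholder):
# for path in find_by_pattern("/some/starting/dir", "SMERXCode*"):
#     print(path)
```

Run it under sudo if the directories involved aren't readable by your account, just as with the find command.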
In reading your statement,
"... after the service goes down, the WO server won't work any
more for any application we deploy on it."
it wasn't clear how you were able to get the error to repeat. Do you
simply restart the server, or do you have to reinstall software, then
restart?
If none of the above gave you any clues and you're in a desert without
expert Xserve sysadmin help, this is what I would suggest:
1) Reinstall the WebObjects software and your app on this server.
2) Run Disk Utility's Repair Disk Permissions command, found under First
Aid (I'm presuming that this works on Mac OS X Server, but am not
certain of that).
3) Run the application again to see if the problem recurs. If not,
you're done (YEAH!! send a large cashier's check in thanks to...). If
it does recur, then...
4) Reinstall both the system and the WO software. As long as you're
doing this, you might test the disk to see if you're dealing with a bad
drive.
5) After installation, again run Disk Utility's Repair Disk Permissions.
6) Run the application again to see if the problem recurs.
That's the extent of my advice. If that doesn't do the job, take the
answers to the questions I asked above to Apple Support and request
HELP!
Alternatively, you might start with the HELP request and then do the
other things, since it often takes a while for Apple to respond to such
a request.
Whether any of the above helps at all or not, it should be sufficient
to whet the appetites of others on this list who really know what
they're doing and can make substantial improvements on my advice. :-)
Regards,
Jerry
On Mar 25, 2005, at 8:40 AM, Robert Snyder wrote:
Hi,
We have an application that when it is deployed runs for a day, then
takes down our wotaskd on our WebObjects servers (Dual G5 XServes).
The really bad part is that after the service goes down, the WO server
won't work any more for any application we deploy on it.
Here is what the log files are showing:
-------------------------------------------------------------------------
javawoservice.sh: `wotaskd' is starting up ...
[2005-03-21 14:40:00 EST] <main> Multicast Response Disabled
[2005-03-21 14:40:00 EST] <Thread-0> Created UDP socket; listening for
requests...
[2005-03-21 14:40:00 EST] <main> The URL for webserver connect is:
http://128.118.242.69/cgi-bin/WebObjects/wotaskd.woa/-1085
The URL for direct connect is:
http://128.118.242.69:1085/cgi-bin/WebObjects/wotaskd.woa
[2005-03-21 14:40:00 EST] <main> Waiting for requests...
[Fatal Error] :1:1: Content is not allowed in prolog.
[2005-03-22 16:38:48 EST] <WorkerThread14> Wotaskd
getStatisticsForInstanceArray: Error parsing: An Internal Server Error
Has Occurred. from SMERXCode-5
[Fatal Error] :1:1: Content is not allowed in prolog.
[2005-03-22 16:41:08 EST] <WorkerThread2> Wotaskd
getStatisticsForInstanceArray: Error parsing: An Internal Server Error
Has Occurred. from SMERXCode-5
[Fatal Error] :1:1: Content is not allowed in prolog.
[2005-03-22 16:41:17 EST] <WorkerThread6> Wotaskd
getStatisticsForInstanceArray: Error parsing: An Internal Server Error
Has Occurred. from SMERXCode-5
[Fatal Error] :1:1: Content is not allowed in prolog.
[2005-03-22 16:41:35 EST] <WorkerThread13> Wotaskd
getStatisticsForInstanceArray: Error parsing: An Internal Server Error
Has Occurred. from SMERXCode-5
[Fatal Error] :1:1: Content is not allowed in prolog.
[2005-03-22 16:41:40 EST] <WorkerThread15> Wotaskd
getStatisticsForInstanceArray: Error parsing: An Internal Server Error
Has Occurred. from SMERXCode-5
[Fatal Error] :1:1: Content is not allowed in prolog.
[2005-03-22 16:42:42 EST] <WorkerThread4> Wotaskd
getStatisticsForInstanceArray: Error parsing: An Internal Server Error
Has Occurred. from SMERXCode-5
[Fatal Error] :1:1: Content is not allowed in prolog.
[2005-03-22 16:43:44 EST] <WorkerThread5> Wotaskd
getStatisticsForInstanceArray: Error parsing: An Internal Server Error
Has Occurred. from SMERXCode-5
[Fatal Error] :1:1: Content is not allowed in prolog.
[2005-03-22 16:44:47 EST] <WorkerThread9> Wotaskd
getStatisticsForInstanceArray: Error parsing: An Internal Server Error
Has Occurred. from SMERXCode-5
javawoservice.sh: `wotaskd' exited.
. `wotaskd' terminated with an error (`143'). Scaling wait time ...
. Execution not counted as a success.
bootstrap_look_up() failed (ipc/send) invalid destination port
javawoservice.sh: `wotaskd' will restart in 5 seconds ...
bootstrap_look_up() failed (ipc/send) invalid destination port
bootstrap_look_up() failed (ipc/send) invalid destination port
-------------------------------------------------------------------------
javawoservice.sh: `wotaskd' is starting up ...
bootstrap_look_up() failed (ipc/send) invalid destination port
/System/Library/WebObjects/JavaApplications/wotaskd.woa/wotaskd: line
62:
cd: bootstrap_look_up() failed (ipc/send) invalid destination port
bootstrap_look_up() failed (ipc/send) invalid destination port
/System/Library/WebObjects/JavaApplications/wotaskd.woa: No such file
or
directory
wotaskd: Unable to read "bootstrap_look_up() failed (ipc/send) invalid
destination port bootstrap_look_up() failed (ipc/send) invalid
destination port
/System/Library/WebObjects/JavaApplications/wotaskd.woa/
Contents/MacOS/MacOSClassPath.txt"! Terminating.
javawoservice.sh: `wotaskd' exited.
. `wotaskd' terminated with an error (`1'). Scaling wait time ...
. Execution not counted as a success.
bootstrap_look_up() failed (ipc/send) invalid destination port
javawoservice.sh: `wotaskd' will restart in 10 seconds ...
bootstrap_look_up() failed (ipc/send) invalid destination port
____________________________________________
Robert Snyder, Director
World Campus Data Management Services
The Pennsylvania State University
105 Mitchell Building
University Park PA 16802
Phone: 814-865-0912 Fax: 814-865-4406
E-mail: email@hidden
URL: http://www.worldcampus.psu.edu
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list (email@hidden)
Help/Unsubscribe/Update your Subscription:
email@hidden
This email sent to email@hidden
--
__ Jerry W. Walker, Partner
C o d e F a b, LLC - "High Performance Industrial Strength Internet
Enabled Systems"
email@hidden
212 465 8484 X-102 office
212 465 9178 fax