Re: distributed processing ad multiThreading
- Subject: Re: distributed processing ad multiThreading
- From: Miguel Morales <email@hidden>
- Date: Mon, 17 Sep 2001 09:04:57 -0700
Dear Robert,
I've been doing something vaguely similar (real-time
distributed/multi-threaded searches for gamma-ray bursts on Linux
machines), and I beg to differ with some of the opinions previously
expressed. I think your instinct is correct that Objective-C and
Distributed Objects (DO) is the way to go.
MPI and PVM are very well suited to a particular class of problems.
Specifically, they are usually used for programs where there is an
enormous array of information, but each piece of the array depends only
on local information. The classic example is weather modelling, where
there is an enormous number of weather grid points, but each point
depends only on its neighbors. It sounds like you have a different style
of problem: lots of independent pieces that don't fit into a large
array, so you are limited only by processing power. In my experience, DO
works beautifully for this kind of problem.
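To make the distinction concrete, here is a minimal Python sketch of the two problem shapes. It is only an illustration: the function names and the toy data are made up for this example, and it has nothing to do with MPI, PVM, or DO themselves.

```python
# Illustrative sketch only: contrasts the two problem shapes described above.

def stencil_step(grid):
    """One update of a 1-D grid where each interior point depends on
    its two neighbours (the weather-model style of problem)."""
    new = grid[:]  # copy; the two boundary points are held fixed
    for i in range(1, len(grid) - 1):
        new[i] = (grid[i - 1] + grid[i] + grid[i + 1]) / 3.0
    return new


def evaluate_independently(items, score):
    """Score each item on its own; no item needs data from any other
    (the 'lots of independent pieces' style of problem)."""
    return [score(item) for item in items]


grid = [0.0, 0.0, 9.0, 0.0, 0.0]
smoothed = stencil_step(grid)  # neighbouring values are needed at every step
scores = evaluate_independently([1, 2, 3], lambda x: x * x)
```

In the first function, splitting the grid across machines means exchanging boundary values at every step. In the second, each item could be handed to a different machine with no further communication until the score comes back.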
Take a look at GNUstep (http://www.gnustep.org/), where a
cross-platform version of the Foundation framework is available in a
complete and stable form. I have not used it on Solaris, but on Linux it
works like a charm. The caveat is that their DO system will not
communicate with Apple's, but the API is identical. Just recompile on
the Solaris machines and it should run fine. In addition, you get the
benefit of exactly the same code for both threading and distributed
communication. A node never needs to know whether the object it is
communicating with is on the local machine or not; the code is the same.
Because of this, you can test and debug on OS X, where the developer
tools are significantly better, then recompile and run on the Solaris
machines to take advantage of the processor farm. This also makes it
easy to have the number of threads running on a machine depend on its
number of processors and their speed, for load balancing.
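The idea that the calling code does not need to know where the work runs can be sketched in Python as a rough analogy. This is not DO and not the GNUstep API: `evaluate`, `evaluate_population`, and `worker_count` are hypothetical names for this example, and the standard-library `Executor` interface stands in for the proxy object.

```python
import os
from concurrent.futures import Executor, ThreadPoolExecutor


def evaluate(entity):
    """Hypothetical stand-in for an expensive evaluation."""
    return sum(x * x for x in entity)


def evaluate_population(executor: Executor, population):
    """Score every entity through whatever executor the caller supplies.

    This function does not know or care where the work actually runs.
    """
    return list(executor.map(evaluate, population))


def worker_count():
    """One worker per processor reported by the OS (at least one)."""
    return os.cpu_count() or 1


population = [[1, 2], [3, 4], [0, 5]]
with ThreadPoolExecutor(max_workers=worker_count()) as pool:
    scores = evaluate_population(pool, population)
```

Because `ThreadPoolExecutor` and `ProcessPoolExecutor` share the same `Executor` interface, swapping one for the other would leave `evaluate_population` unchanged, which is the property described above. Note that in CPython, CPU-bound work needs processes rather than threads to use several processors at once.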
Of course there would have to be a client program running on each
machine in the farm, but this should be straightforward (preferably
launched at startup, or from a cron job, so that nodes automatically
restart after failure). And for a processor farm, the overhead issue is
not as restrictive as Martin states. These machines don't need to do
anything other than your job, and if they are networked fairly well
using switches, communication is limited by the Ethernet card on each
machine.
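A back-of-the-envelope estimate, using the figures from the quoted message below, gives a feel for how much overhead is tolerable. This is only a sketch under stated assumptions: a population of 6, evaluations of equal cost that cannot be subdivided, and a per-generation communication overhead of 50 ms, which is an invented figure and not a measurement.

```python
import math

# Figures from the quoted message: about 8 hours for 10000 generations.
TOTAL_SECONDS = 8 * 3600
GENERATIONS = 10_000
POPULATION = 6  # "usually only about 5 or 6"; 6 is assumed here

seconds_per_generation = TOTAL_SECONDS / GENERATIONS
seconds_per_evaluation = seconds_per_generation / POPULATION


def generation_time(workers, overhead_seconds):
    """Estimated wall-clock time for one generation on `workers` machines.

    The population is processed in ceil(POPULATION / workers) rounds,
    plus a fixed per-generation communication overhead (hypothetical).
    """
    rounds = math.ceil(POPULATION / workers)
    return rounds * seconds_per_evaluation + overhead_seconds


serial = generation_time(1, 0.0)
parallel = generation_time(6, 0.05)  # 50 ms overhead is an assumption
speedup = serial / parallel
```

Under these assumptions a generation takes about 2.88 seconds serially, and six workers give a speedup of roughly 5.4 even with 50 ms of overhead per generation. The same model also shows that, with a population of 6, more than six machines do not help unless a single evaluation can itself be split up.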
I hope this helps, and if you want any help I may be able to provide,
just respond directly,
-Miguel
On Monday, September 17, 2001, at 06:05 AM, cocoa-dev-email@hidden wrote:
Date: Mon, 17 Sep 2001 11:34:01 +0100
From: Robert S Goldsmith <email@hidden>
To: email@hidden
Subject: distributed processing ad multiThreading
Hi everyone :)
I have a little problem and am not sure of the best way to go
about it in ObjC.
My work is in evolutionary programming, an area of research
in which the programs being written and run have usually got
a few distinct characteristics:
1) There is a 'population' of entities (instances of an
object). For my work, usually only about 5 or 6 (although
sometimes there can be thousands).
2) In base level systems, every one of these entities must
be 'evaluated' and given a score on every process step
(known as a generation).
3) Evaluation is a HUGE processing requirement. As an
example, some of my recent work takes over 8 hours to
complete only about 10000 generations - and that's in
optimised C.
4) There are a huge number of generations needed to solve
the problems being set the system.
5) This is the real problem point - every generation, almost
all of the entities have to be changed. That is, their
internal structure is redesigned based on the 'winner' of
the last generation.
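The five points above can be sketched as a small generational loop. This is a minimal Python illustration, not the poster's actual system: the fitness function, the mutation rule, and all names are hypothetical, and a real evaluation would be far more expensive.

```python
import random


def evaluate(entity):
    """Hypothetical fitness: higher is better, best possible score is 0."""
    return -sum(x * x for x in entity)


def mutate(entity, rng):
    """Return a changed copy of `entity` (small random nudges)."""
    return [x + rng.uniform(-0.5, 0.5) for x in entity]


def run(generations, population_size=6, genes=3, seed=0):
    rng = random.Random(seed)  # seeded so the run is repeatable
    population = [[rng.uniform(-5, 5) for _ in range(genes)]
                  for _ in range(population_size)]
    best_scores = []
    for _ in range(generations):
        # Points 2 and 3: every entity is evaluated each generation,
        # and this is where nearly all the processing time goes.
        scores = [evaluate(e) for e in population]
        best = max(scores)
        winner = population[scores.index(best)]
        best_scores.append(best)
        # Point 5: the other entities are rebuilt from the winner.
        population = [winner] + [mutate(winner, rng)
                                 for _ in range(population_size - 1)]
    return best_scores


history = run(generations=50)
```

The evaluations inside one generation are independent of each other, so that list comprehension is the part that could be spread over threads or machines. The generations themselves must still run one after another, because each one depends on the previous winner.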
The problem I have is that I would like to try and
parallelise the evaluation process.
I have two options. I have access to a 4-processor Solaris
machine and a dual-processor G4 if I want to try threading
in an SMP environment, and I have access to about 120 Solaris
workstations if I want to go for distributed processing.
I don't think SMP will be much of a problem to try - except
that the Apple docs on NSThread etc. are a little thin on the
ground. However, with the distributed processing, how would
I go about:
1) launching 'threads' on all the machines
2) getting them to communicate small and large amounts of data
As you can understand, I do not want a huge overhead because
time is an important consideration.
I hope someone can help :)
Thank you
Robert