My Ideal XGrid API



 Objective:

Ok, so I work in the financial industry, and part of my job is taking all the member accounts from our website, http://www.marketocracy.com, and analyzing their performance.

I did this in Perl, because it was the easiest thing to get up and running
and the analysis program is perfectly suited to a "duct-tape" language
like Perl.


However, I like to tweak the parameters to this script occasionally, which means
running these large matrices to see which set of parameters works the best.


The faster I can do this, the more money our customers make, the more money we make.

XGrid is the natural way for me to do this since I have 3 G4s and a G5 in
my office right now.


 Data in:

  The script iterates over about 200MB of data files.

 Data out:

  The output data is about 2k of summarized data.

 Terminology:

A "job" is a set of tasks that has to be done. In my case, it would be the
entire matrix I'm running.


A "task" is a useful subset of that job that can be done by one or more CPUs.
In my case it would be one element of the matrix.


    A "thread" is just a typical thread.
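
To make the job/task split concrete, here is a minimal Python sketch that expands a parameter matrix (the job) into one dictionary per element (a task). The parameter names and values are made up for illustration; they are not the real analysis parameters.

```python
from itertools import product

# Hypothetical parameter names and values, for illustration only.
PARAMETERS = {
    "lookback_days": [30, 60, 90],
    "min_trades": [5, 10],
}

def make_tasks(parameters):
    """Expand the parameter matrix (the job) into one dict per element (a task)."""
    names = sorted(parameters)
    return [dict(zip(names, values))
            for values in product(*(parameters[name] for name in names))]

job = make_tasks(PARAMETERS)
print(len(job), "tasks; first:", job[0])
```

Each dictionary could then be handed to one run of the analysis script.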

Current API:

Right now, XGrid does setup and tear down at the task level. That's not very efficient for me: I would need to put all the data files onto a file server
somewhere that every machine could access. By adding a job concept, I can get more
efficiency out of XGrid, without having to do any external management. Also,
rather than having a cluster at my disposal, I'm more likely to talk my co-workers
into installing the screen saver, so setup has to be automatic.


  What I would like XGrid to do (a push model):

    for each job:
        for each computer:
            set up job for computer
            for each processor:
                set up job for processor
        for each task:
            assign to computer or processor
        for each computer:
            for each processor:
                tear down job for processor
            tear down job for computer


In my case, the job setup for each computer would be just scp-ing the files from
one location to /tmp, while job tear down would be rm-ing them.


Now from XGrid's point of view, since machines can come and go dynamically,
I would expect it to do the setup lazily rather than all at once, so it
would actually call the setup stuff when allocating tasks:


   def allocate_task(job, task):
       if task == per_processor:
           processor = pick_processor()
           computer = processor.computer()
           if not computer.job_is_setup(job):
               job.setup_computer(computer)
               computers_used.append(computer)    # record each computer once
           if not processor.job_is_setup(job):
               job.setup_processor(processor)
               processors_used.append(processor)  # record each processor once
       elif task == per_computer:
           computer = pick_computer()
           if not computer.job_is_setup(job):
               job.setup_computer(computer)
               computers_used.append(computer)

   def job_finished(job):
       for processor in processors_used:
           job.teardown_processor(processor)
       for computer in computers_used:
           job.teardown_computer(computer)


That is, when it went to assign a task to a computer or processor, it would first see
if the job setup had been done, and if it hadn't, it would perform it. When the job
completes, any computer or processor used would get told to cleanup.
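
As a runnable illustration of the lazy setup idea, here is a minimal Python sketch. Everything in it is an assumption for illustration: the Job and LazyScheduler classes, the round-robin placement, and the machine names are invented, and nothing here is an actual Xgrid API. It tracks only the per-computer level; per-processor setup would follow the same pattern.

```python
class Job:
    """Records setup/teardown calls; a real job would copy or remove files here."""
    def __init__(self):
        self.log = []

    def setup_computer(self, computer):
        self.log.append(("setup", computer))

    def teardown_computer(self, computer):
        self.log.append(("teardown", computer))


class LazyScheduler:
    """Hypothetical scheduler that sets a computer up only on first use."""
    def __init__(self, job, computers):
        self.job = job
        self.computers = list(computers)
        self.ready = []       # computers already set up, in first-use order
        self.next_index = 0

    def allocate_task(self, task):
        # Round-robin stands in for whatever placement policy a real grid uses.
        computer = self.computers[self.next_index % len(self.computers)]
        self.next_index += 1
        if computer not in self.ready:
            self.job.setup_computer(computer)
            self.ready.append(computer)
        return computer

    def job_finished(self):
        # Clean up only the computers that were actually used.
        for computer in self.ready:
            self.job.teardown_computer(computer)
        self.ready.clear()


job = Job()
scheduler = LazyScheduler(job, ["g4-a", "g4-b", "g5"])
placements = [scheduler.allocate_task(task) for task in range(2)]
scheduler.job_finished()
print(job.log)
```

With only two tasks, the third machine is never set up or torn down, which is the point of doing the setup lazily.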



Alternative (a pull model):

The above assumed that the main XGrid controlling process would push all the files
to all the CPUs. An alternative might be something more along this line:


  xgrid_publish_file <job_id> my_data_file.dat

This command line tool would make "my_data_file.dat" available from any client machine, which basically means that if a client does:

  xgrid_retrieve_file <job_id> my_data_file.dat

Xgrid would first look to see if the file had already been copied. (That is, it
would cache it, which Xgrid doesn't do now.) If it has, it's done. If it hasn't,
Xgrid will retrieve the file to the local directory. This would make things
simpler than having to set up a file server. Perhaps you could publish a whole
directory of files.
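
The caching behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are invented, a local directory stands in for the controller's published area, and a plain file copy stands in for the network transfer.

```python
import shutil
from pathlib import Path

def publish_file(store: Path, job_id: str, source: Path) -> Path:
    """Copy a file into a shared per-job area (stand-in for the controller)."""
    target_dir = store / job_id
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(source, target_dir / source.name))

def retrieve_file(store: Path, cache: Path, job_id: str, name: str) -> Path:
    """Return a local copy, fetching from the store only on a cache miss."""
    local = cache / job_id / name
    if not local.exists():
        local.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(store / job_id / name, local)
    return local

def release_job(cache: Path, job_id: str) -> None:
    """Optional cleanup of the cached files once the job is finished."""
    shutil.rmtree(cache / job_id, ignore_errors=True)
```

A second retrieve_file call for the same job and file name returns the cached copy without touching the store again.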


The nice thing about this model is that all setup is lazy, and it
would be pretty trivial to have a preflight script for command line processes
that just did a publish for all the data files on the controller, and a retrieve on
all the clients. When Xgrid finished a job, it would (optionally, in case those files are needed for multiple jobs) delete the files.


 Except:

In my case, my Perl script isn't currently set up to split things into tasks,
because it takes a certain amount of time to load all the data files. So what I
would really need to do is:


    1. Rewrite the script to run in a command-line fashion, so it would start up,
       then read commands from standard input until told to quit. I could do this
       pretty easily.

    2. Have setup_job() launch that script as a background task.

    3. Have task_execute() send that script a message.

    4. Have teardown_job() send a "quit" message.

So I would still need some kind of job API; the file publishing stuff wouldn't
quite do it.
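
Step 1 above amounts to a small command loop. Here is a minimal Python sketch of one (the real script is Perl; Python is used here only for illustration). The load_data stand-in, the one-number-per-line command format, and the arithmetic are all assumptions.

```python
import io

def load_data():
    """Stand-in for the slow step of reading the data files."""
    return {"accounts": [1.0, 2.0, 3.0]}

def serve(commands, replies):
    """Load the data once, then answer one command per line until 'quit'."""
    data = load_data()
    for line in commands:
        command = line.strip()
        if command == "quit":
            break
        if not command:
            continue
        scale = float(command)  # hypothetical task: a single numeric parameter
        total = sum(value * scale for value in data["accounts"])
        replies.write(f"{total}\n")
        replies.flush()

# In-memory demonstration; a real worker would call serve(sys.stdin, sys.stdout).
out = io.StringIO()
serve(io.StringIO("2\n0.5\nquit\n3\n"), out)
print(out.getvalue(), end="")
```

The data is loaded once for the whole job, and anything sent after "quit" is ignored.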


 Python Plug:

PyObjC, the transparent Python-Objective-C bridge, is just a great tool. It would be wonderful if the XGrid controller could execute plugins built in it directly. Same
thing for CamelBones and Perl.


 Discussion:

The intent of this long posting was just to get everyone thinking about how
we'd like XGrid to work in the future. Pretty much all the Grid APIs have
some kind of job concept; it's kind of inherent in the problem. Some of them
(Condor for instance) have an idea about how to distribute files and such
already built in. So basically, I'd like to see:


    XGridProject --- a collection of jobs, has setup/teardown. Jobs are done
                     one at a time until completed.
    XGridJob     --- a collection of tasks, also has setup/teardown.
    XGridTask    --- a subset of a job, the smallest divisible unit. Tasks can
                     be divided per processor or per computer by Xgrid.
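
To show how the three levels could nest, here is a rough Python sketch using dataclasses. The class names drop the XGrid prefix and every field and method is hypothetical; this illustrates the shape of the proposal, not an existing API.

```python
from dataclasses import dataclass, field
from typing import Callable, List

def _noop() -> None:
    """Default setup/teardown hook that does nothing."""

@dataclass
class Task:
    run: Callable[[], object]
    per_processor: bool = True  # False would mean one task per computer

@dataclass
class Job:
    tasks: List[Task] = field(default_factory=list)
    setup: Callable[[], None] = _noop
    teardown: Callable[[], None] = _noop

    def execute(self) -> list:
        self.setup()
        try:
            return [task.run() for task in self.tasks]
        finally:
            self.teardown()  # runs even if a task raises

@dataclass
class Project:
    jobs: List[Job] = field(default_factory=list)
    setup: Callable[[], None] = _noop
    teardown: Callable[[], None] = _noop

    def execute(self) -> list:
        self.setup()
        try:
            # Jobs are done one at a time until completed.
            return [job.execute() for job in self.jobs]
        finally:
            self.teardown()

events = []
job = Job(tasks=[Task(lambda: 1), Task(lambda: 2)],
          setup=lambda: events.append("job setup"),
          teardown=lambda: events.append("job teardown"))
project = Project(jobs=[job],
                  setup=lambda: events.append("project setup"),
                  teardown=lambda: events.append("project teardown"))
results = project.execute()
```

Here the tasks run in-process and in order; in a real grid they would be farmed out, but the setup and teardown hooks would bracket them the same way.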


I'd then like to see more hooks into the file distribution mechanism. For instance,
seti@home and folding@home basically run as command line tools that take different files. I think with the above methodology, it's kind of obvious how it would be possible to run seti@home@home.


Finally, I'd like to be able to hook into the screensaver. One of the key features
of Seti@home is the cool looking screensaver which motivates people to run it. The speedometer is kind of boring.


 Other thoughts:

Having a suspend()/resume() on XGridJob might make a lot of sense for people using
XGrid in screensaver mode. For instance, folding@home uses a bunch of checkpoint files so that it can restart from a certain point.
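
A checkpoint-based suspend/resume could look roughly like the Python sketch below. It is an illustration only: the function name, the JSON checkpoint format, and the should_suspend callback are assumptions, and summing a list stands in for the real work.

```python
import json
from pathlib import Path

def run_with_checkpoint(items, checkpoint: Path, should_suspend=lambda: False):
    """Sum items, saving progress so a later call resumes where this one stopped.

    Returns the total when finished, or None if suspended part way through.
    """
    state = {"next": 0, "total": 0}
    if checkpoint.exists():
        state = json.loads(checkpoint.read_text())
    for index in range(state["next"], len(items)):
        if should_suspend():
            return None
        state["total"] += items[index]
        state["next"] = index + 1
        checkpoint.write_text(json.dumps(state))  # progress survives a suspend
    checkpoint.unlink(missing_ok=True)
    return state["total"]
```

Calling the function again with the same checkpoint path after a suspend picks up at the first unprocessed item.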


Condor does some of this automatically, but you have to use their special tool to relink, which doesn't work on OS X yet. It might be cool to have a Condor plugin for
XGrid, so that you could build a tool with Condor and submit it to XGrid.


Though I don't think Condor is open source.
xgrid-users mailing list | email@hidden
Help/Unsubscribe/Archives: http://www.lists.apple.com/mailman/listinfo/xgrid-users