Mailing Lists: Apple Mailing Lists


Re: Re: Multithreaded XML file parsing?



Some followup responses:

In the end I did go ahead and use just a single-threaded instance
parsing the XML file using SAX, and the performance was much better
than I had initially expected. However, I have a follow-up question:

One area in my code that could benefit from threading is the following:

File[] dirChildren = gmlDir.listFiles(filter);
for (File aFile : dirChildren) {
    System.out.println("Got here!");
    // Iterate through the list of files and start loading GML files:
    GMLThread thread = new GMLThread(aFile);
    thread.run();
    try {
        thread.join();
    }
    catch (InterruptedException ie) {
        ie.printStackTrace();
    }
    System.out.println("Got here! 2");
}


I'm not loading one XML file, but several in a directory. However, I can't figure out how to iterate through the File array and start several parsing threads while making the main application thread wait until they have all completed. The current code above only makes the "for each" loop wait until the thread it has spawned has completed, which isn't the behaviour I want.

Thanks,

Saad Mahamood.
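
One common way to get the behaviour asked for above is to split the loop in two: a first pass that starts every thread, and a second pass that joins them. The sketch below assumes GMLThread extends Thread; its real body is not shown in the post, so the GMLThread here is a hypothetical stand-in that only bumps a counter, and the file names in main are made up. Note also that Thread.run() executes on the calling thread, so start() is needed for the files to be parsed concurrently at all.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class ParallelGmlLoad {

    // Counts finished files; stands in for whatever the real parser produces.
    static final AtomicInteger parsedCount = new AtomicInteger();

    // Hypothetical stand-in for the poster's GMLThread, whose body is not shown.
    static class GMLThread extends Thread {
        private final File file;

        GMLThread(File file) {
            this.file = file;
        }

        @Override
        public void run() {
            // The real SAX parse of 'file' would go here.
            parsedCount.incrementAndGet();
        }
    }

    // Starts one thread per file, then blocks until every thread has finished.
    static void loadAll(File[] files) throws InterruptedException {
        List<Thread> threads = new ArrayList<Thread>();
        for (File aFile : files) {
            Thread t = new GMLThread(aFile);
            t.start();      // start(), not run(): run() would execute on this thread
            threads.add(t);
        }
        for (Thread t : threads) {
            t.join();       // second pass: wait for all of them
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Made-up file names; nothing is read from disk in this sketch.
        File[] files = { new File("a.gml"), new File("b.gml"), new File("c.gml") };
        loadAll(files);
        System.out.println("Parsed " + parsedCount.get() + " files");
    }
}
```

With a large directory, one thread per file may be too many; a fixed-size java.util.concurrent.ExecutorService with invokeAll() is a possible alternative that bounds the number of concurrent parsers. File.listFiles() can also return null if the path is not a directory, which is worth checking before the loop.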


On 18/08/06, Oisin Hurley <email@hidden> wrote:
> I was wondering if it is possible to split a very large XML file (100
> megabytes) into several smaller size chunks and then run several SAX
> parsing threads on each of the chunks?

I think the issue you might have here is that you must first parse
the file to know where you can split the file - even if you have a
clever lexer to do the splitting, you will still be seeing *all* of
the data. Then when you parse with each of the threads, you will
be seeing *all* of the data a second time. This will not help you
with speed.

Of course, you will need to avoid DOM to prevent all of the data
being put into memory :)

You should probably take a look at StAX [0] as well as the normal
SAX approaches.

  rgds
   --oh

[0] http://jcp.org/en/jsr/detail?id=173
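
For readers who have not used StAX, here is a minimal sketch of the pull-parsing style it offers, using the javax.xml.stream API defined by the JSR above. The element name "feature" and the sample document are made up for illustration and have nothing to do with the poster's GML files.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxCount {

    // Pulls events one at a time and counts start tags with the given local name.
    static int countElements(String xml, String localName) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        int count = 0;
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && localName.equals(reader.getLocalName())) {
                    count++;
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }

    public static void main(String[] args) throws XMLStreamException {
        // Made-up sample document.
        String xml = "<features><feature id=\"1\"/><feature id=\"2\"/></features>";
        System.out.println(countElements(xml, "feature"));
    }
}
```

Unlike SAX, where the parser calls back into a handler, the application here asks for the next event itself, and like SAX it does not hold the whole document in memory.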

_______________________________________________
Java-dev mailing list      (email@hidden)
References: 
 >Multithreaded XML file parsing? (From: "Saad Mahamood" <email@hidden>)
 >Re: Multithreaded XML file parsing? (From: Oisin Hurley <email@hidden>)


