Re: Multithreaded XML file parsing?



> I was wondering if it is possible to split a very large XML file (100
> megabytes) into several smaller size chunks and then run several SAX
> parsing threads on each of the chunks?

I think the issue you might have here is that you must first parse the file to know where you can split it. Even if you have a clever lexer to do the splitting, that pass will still be seeing *all* of the data. Then, when each thread parses its chunk, the threads between them will be seeing *all* of the data a second time. Every byte gets read twice, so this is unlikely to help you with speed.

Of course, you will need to avoid DOM to prevent all of the data
being put into memory :)

You should probably take a look at StAX [0] as well as the normal
SAX approaches.
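
For illustration, a minimal sketch of the StAX pull style, assuming the javax.xml.stream API from JSR 173 is on the classpath. The class name StaxCount, the helper countElements and the sample element names are made up for this example. It counts matching elements in a single streaming pass, so no tree is ever built in memory:

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxCount {

    // Hypothetical helper: counts start tags whose local name equals `name`.
    // The reader is pulled one event at a time, so memory use does not grow
    // with the size of the document.
    static int countElements(Reader in, String name) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(in);
            int count = 0;
            try {
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.START_ELEMENT
                            && name.equals(reader.getLocalName())) {
                        count++;
                    }
                }
            } finally {
                reader.close();
            }
            return count;
        } catch (XMLStreamException e) {
            // Wrapped as unchecked to keep the sketch short.
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample document; a real 100 MB file would be passed
        // in as a FileReader or InputStream instead.
        String xml = "<items><item>a</item><item>b</item><other/></items>";
        System.out.println(countElements(new StringReader(xml), "item"));
    }
}
```

Unlike SAX, the loop here asks for the next event itself rather than being called back, which can make it easier to stop early or hand a complete record off to a worker thread.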

 rgds
  --oh

[0] http://jcp.org/en/jsr/detail?id=173
Java-dev mailing list
References: 
 >Multithreaded XML file parsing? (From: "Saad Mahamood" <email@hidden>)



Copyright © 2007 Apple Inc. All rights reserved.