Mailing Lists: Apple Mailing Lists


Re: Multithreaded XML file parsing?




On Aug 17, 2006, at 9:48 PM, Saad Mahamood wrote:

Hello,

I was wondering if it is possible to split a very large XML file (100
megabytes) into several smaller chunks and then run a SAX parsing
thread on each chunk. What I am having trouble conceptualising is how
to avoid splitting the large XML file in the wrong place and leaving
two of the threads with incomplete information. I reckon I need to run
a pre-processor on the file to determine where to split it....


What is the connection to Java?
I am just attempting to learn some XML again, so my understanding here might be off. But isn't SAX more suited to parsing document-type XML files than data-type files? A 100 MB file would more likely be a data-type file.
You probably could manage to split the file. I tried a similar approach when digitally signing PDF files, using iText for PDF handling and Bouncy Castle for crypto. It turned out I did it incorrectly, and iText added the support shortly after. I mean to go back at some point and look at what I did wrong and what they did instead. It seemed to me, though, that the signature needed to be embedded into the split PDF. So I won't dispute that splitting a file can make sense.
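One way the "pre-processor" idea from the question could be sketched is shown below. It makes a single streaming pass and cuts only between complete child elements of the root, so every chunk is a well-formed document in its own right. This is an illustration under assumptions, not something from the thread: the class name XmlChunker, the method split, and the parameter recordsPerChunk are hypothetical, the records are assumed to be direct children of the root element, and it uses the StAX event API (javax.xml.stream) rather than SAX because StAX can copy events straight to a writer.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper: splits a document between complete records only.
public class XmlChunker {

    /** Returns well-formed chunks, each holding at most recordsPerChunk children of the root. */
    public static List<String> split(String xml, int recordsPerChunk) {
        if (recordsPerChunk < 1) {
            throw new IllegalArgumentException("recordsPerChunk must be at least 1");
        }
        List<String> chunks = new ArrayList<>();
        try {
            XMLEventReader reader =
                    XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
            XMLEventFactory events = XMLEventFactory.newInstance();
            XMLOutputFactory outputs = XMLOutputFactory.newInstance();
            StartElement root = null;
            StringWriter buffer = null;
            XMLEventWriter writer = null;
            int depth = 0;
            int recordsInChunk = 0;

            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()) {
                    depth++;
                    if (depth == 1) {
                        root = event.asStartElement(); // remembered, re-emitted per chunk
                        continue;
                    }
                    if (depth == 2 && writer == null) {
                        // First record of a new chunk: open it with a copy of the root tag.
                        buffer = new StringWriter();
                        writer = outputs.createXMLEventWriter(buffer);
                        writer.add(root);
                    }
                }
                if (depth >= 2 && writer != null) {
                    writer.add(event); // copy everything that belongs to a record
                }
                if (event.isEndElement()) {
                    depth--;
                    // depth == 1 here means a whole record has just been copied.
                    if (depth == 1 && ++recordsInChunk == recordsPerChunk) {
                        closeChunk(writer, root, events);
                        chunks.add(buffer.toString());
                        writer = null;
                        recordsInChunk = 0;
                    }
                }
            }
            if (writer != null) { // last, partially filled chunk
                closeChunk(writer, root, events);
                chunks.add(buffer.toString());
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not split XML", e);
        }
        return chunks;
    }

    // Writes the closing root tag so the chunk is well-formed on its own.
    private static void closeChunk(XMLEventWriter writer, StartElement root,
                                   XMLEventFactory events) throws XMLStreamException {
        QName name = root.getName();
        writer.add(events.createEndElement(
                name.getPrefix(), name.getNamespaceURI(), name.getLocalPart()));
        writer.close();
    }
}
```

Note that the splitting pass itself has to read the whole file once, sequentially, so it only pays off if the per-chunk work afterwards is considerably more expensive than the parsing.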
Still, for purposes of multi-threading I think it may be the wrong approach. The disk thrashing you would get from reading the file in multiple places at once would probably more than offset any multi-threaded performance gain. As another example, I decided a threaded classpath search was probably a bad idea for this reason: I would be doing disk accesses all over the place. I never did any profiling either way, though, which I should have. The threaded search also had the rare but bad side effect of occasionally finding a class in the wrong place if there were duplicates. So I think I currently have the multi-tasking pretty much deactivated.
Possibly the disk thrashing is what is giving you a nagging feeling that something might be wrong here?
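If disk thrashing is the worry, one possible alternative is to keep the read sequential and only parallelise the work done per record. The sketch below is a minimal illustration under assumptions, not a measured solution: the class name SequentialReadParallelWork, the method process, and the work function are hypothetical, and the records are again assumed to be direct children of the root element. A single SAX pass reads the input front to back and hands each record's text to a thread pool.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: one sequential reader, many workers.
public class SequentialReadParallelWork {

    /** Applies work to the text of every child of the root; results keep document order. */
    public static <T> List<T> process(InputStream xml, int threads, Function<String, T> work) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<T>> pending = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            private int depth = 0;
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                depth++;
                if (depth == 2) {
                    text.setLength(0); // a new record begins
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (depth >= 2) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (depth == 2) {
                    // Record complete: the parsing thread moves on while a worker handles it.
                    String record = text.toString();
                    Callable<T> task = () -> work.apply(record);
                    pending.add(pool.submit(task));
                }
                depth--;
            }
        };

        try {
            // The only thread that touches the input is this one, reading front to back.
            SAXParserFactory.newInstance().newSAXParser().parse(xml, handler);
            List<T> results = new ArrayList<>();
            for (Future<T> future : pending) {
                results.add(future.get());
            }
            return results;
        } catch (ParserConfigurationException | SAXException | IOException | ExecutionException e) {
            throw new IllegalStateException("Processing failed", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for workers", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

For a 100 MB input, the unbounded list of pending tasks would probably need to be replaced with a bounded queue so the reader cannot run far ahead of the workers. Whether any of this beats a plain single-threaded parse is something only profiling would show.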



Mike Hall mikehall at spacestar dot net http://www.spacestar.net/users/mikehall http://sourceforge.net/projects/macnative




Java-dev mailing list



