Re: Read lines from very large text file
  • Subject: Re: Read lines from very large text file
  • From: Greg Parker <email@hidden>
  • Date: Mon, 2 Feb 2009 20:31:44 -0800

On Feb 2, 2009, at 6:02 PM, Seth Willits wrote:
Before opening the file, either determine, guess, or be told what the encoding is. With that encoding, convert your delimiter string into raw bytes, then do a byte-for-byte comparison on the file to find occurrences of that delimiter.

On Feb 2, 2009, at 7:50 PM, Joar Wingfors wrote:
How do you know what delimiter string to use? That's another thing you'd have to determine, guess, or be told, right? In general I would guess that in this case it would almost always be impossible and/or inappropriate to attempt to determine either of these two, and that you would simply have to default to something reasonable.

That's right, though heuristics work better for the guess-line-ending problem than they do for the guess-encoding problem. If you scan the first few KB of a sufficiently-long file and see exactly one kind of line ending, it's a good bet that you're right.
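The line-ending heuristic above could be sketched roughly as follows. This is only an illustration in Python, not code from the thread: the names `guess_line_ending` and `guess_file_line_ending` and the 4096-byte sample size are assumptions, and it assumes an ASCII-compatible encoding such as UTF-8, where CR and LF are single bytes.

```python
def guess_line_ending(sample: bytes):
    """Return the one line ending seen in `sample`, or None if none or several.

    Assumes an ASCII-compatible encoding (CR and LF are single bytes).
    """
    # A trailing CR may be the first half of a CRLF cut off by the sample
    # boundary, so ignore it rather than count it as a lone CR.
    if sample.endswith(b"\r"):
        sample = sample[:-1]

    crlf = sample.count(b"\r\n")
    lone_lf = sample.count(b"\n") - crlf   # LF not preceded by CR
    lone_cr = sample.count(b"\r") - crlf   # CR not followed by LF

    candidates = ((b"\r\n", crlf), (b"\n", lone_lf), (b"\r", lone_cr))
    seen = [ending for ending, count in candidates if count > 0]

    # Only trust the guess when exactly one kind of line ending appears.
    return seen[0] if len(seen) == 1 else None


def guess_file_line_ending(path, sample_size=4096):
    """Apply the heuristic to the first `sample_size` bytes of a file."""
    with open(path, "rb") as f:
        return guess_line_ending(f.read(sample_size))
```

Returning `None` for a mixed or delimiter-free sample leaves the "default to something reasonable" decision to the caller instead of hiding it inside the heuristic.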



If you have an encoding where characters are not of fixed width, is it generally safe to assume that the byte signature of the valid delimiter strings for that encoding cannot also be found as a subpattern of some combination of other characters? Perhaps that would always be a safe assumption; I'm no expert on string encodings and line delimiters.

Safe in some encodings, unsafe in others. I'm pretty sure that UTF-8 is safe - that the byte sequence of a valid UTF-8 character never appears inside the byte sequence of any other valid UTF-8 character.
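To make the byte-for-byte approach concrete, here is a hedged Python sketch of scanning a large file in chunks for a delimiter that has already been converted to raw bytes. The names `read_chunks` and `split_on_delimiter` and the 64 KB chunk size are assumptions for illustration, not anything from the thread. It is only appropriate for encodings where the search is safe, such as UTF-8. The last few lines show one case where it is not: in UTF-16LE the bytes of a newline can appear straddling two unrelated characters.

```python
def read_chunks(path, chunk_size=64 * 1024):
    """Yield successive raw byte chunks so the whole file never sits in memory."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return
            yield chunk


def split_on_delimiter(chunks, delim: bytes):
    """Yield raw byte records separated by `delim` from an iterable of chunks."""
    pending = b""
    for chunk in chunks:
        pending += chunk
        # Everything before the last delimiter is complete; the tail may be
        # a partial record (or a partial delimiter) and waits for more data.
        *complete, pending = pending.split(delim)
        yield from complete
    if pending:
        yield pending  # final record with no trailing delimiter


def iter_lines(path, delimiter="\n", encoding="utf-8"):
    """Decode each record only after it has been cut out at the byte level."""
    delim = delimiter.encode(encoding)
    for raw in split_on_delimiter(read_chunks(path), delim):
        yield raw.decode(encoding)


# Counterexample for an unsafe encoding: this text contains no newline,
# yet its UTF-16LE bytes contain the UTF-16LE bytes of "\n".
tricky_bytes = "\u0a41\u0100".encode("utf-16-le")    # 41 0A 00 01
newline_utf16 = "\n".encode("utf-16-le")             # 0A 00
false_hit = newline_utf16 in tricky_bytes            # True: a misaligned match
```

Decoding happens per record rather than per chunk, so a multi-byte character split across a chunk boundary is reassembled before it is decoded.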



-- Greg Parker email@hidden Runtime Wrangler




