Re: Retrieving the EXIF date/time from 250k images
- Subject: Re: Retrieving the EXIF date/time from 250k images
- From: Alex Zavatone via Cocoa-dev <email@hidden>
- Date: Wed, 17 Aug 2022 14:27:09 -0500
Hi Jim. You did exactly what I did. You found the level of diminishing
returns on threads.
Back in 2008, while working on FiOS TV for Verizon, I wrote the
design-to-development automation pipeline that automated export of graphic
assets from Illustrator, Photoshop and ImageOptim by Kornel Lesiński. It was
written in AppleScript (kill me now) wrapped in AppleScript-Obj-C in Xcode 3.x
IIRC. In spawning processes for image optimization, I tried the “make a thread
for each image” approach and quickly ran into what you did.
BUT, if they were spawned, suspended and then added to an operation queue, you
can let the operation queue do the heavy lifting. Just as you mentioned.
One issue you could have run into: if writing to the destination media took
longer than the image processing itself, the writes would become the
bottleneck.
Thanks for posting the processInfo.processorCount. That’s what I was thinking
about but couldn’t remember.
FYI, ImageOptim is a fun tool for smallifying images in both lossless and lossy
modes. It includes one special trick I discovered that gets extra
smallification. In the images I tested, we went from ~18% smaller to 24% and
then 28% smaller using my technique. ImageOptim runs multiple tools with
multiple parameter iterations and compares the image sizes to pick the
smallest, then repeats that with my technique until no more smallification can
be achieved. It’s fun. It’s free. And Kornel is a great guy for making this
available, if this matters to you. Here’s the link.
https://imageoptim.com/mac
Cheers,
Alex Zavatone
> On Aug 17, 2022, at 1:32 PM, James Crate via Cocoa-dev
> <email@hidden> wrote:
>
> I have an app that does some image processing, and when I tried to use GCD it
> created several hundred threads which didn’t work very well. NSOperationQueue
> allows you to set the max concurrent operations, and the batch exporting
> process fully utilizes all logical cores on the CPU.
>
> opsQueue.maxConcurrentOperationCount =
> NSProcessInfo.processInfo.processorCount;
>
> Maybe I was using GCD wrong, or maybe reading, processing, and writing
> several hundred images is not a good fit for a GCD concurrent queue? In any
> case NSOperationQueue is easy to use and works well.
>
> Jim Crate
>
>
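For anyone following along, here is a minimal sketch of the capped queue Jim
describes. It assumes ARC, and the helper name ProcessPaths, the work block
and the NSLock around the results array are my own illustration, not Jim's
actual code. Only maxConcurrentOperationCount and processorCount come from
his message.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: runs `work` once per path on an operation queue capped
// at the number of logical cores, and returns the results in input order.
static NSArray<NSString *> *ProcessPaths(NSArray<NSString *> *paths,
                                         NSString *(^work)(NSString *path)) {
    NSOperationQueue *opsQueue = [[NSOperationQueue alloc] init];
    opsQueue.maxConcurrentOperationCount =
        (NSInteger)NSProcessInfo.processInfo.processorCount;

    // Pre-fill the result array so each operation replaces its own slot.
    NSMutableArray<NSString *> *results =
        [NSMutableArray arrayWithCapacity:paths.count];
    for (NSUInteger i = 0; i < paths.count; i++) {
        [results addObject:@""];
    }
    NSLock *lock = [[NSLock alloc] init];

    [paths enumerateObjectsUsingBlock:^(NSString *path, NSUInteger idx, BOOL *stop) {
        [opsQueue addOperationWithBlock:^{
            // The expensive part runs outside the lock.
            NSString *value = work(path) ?: @"";
            // NSMutableArray is not thread-safe, so guard the write.
            [lock lock];
            results[idx] = value;
            [lock unlock];
        }];
    }];

    [opsQueue waitUntilAllOperationsAreFinished];
    return [results copy];
}
```

Because each operation writes into the slot for its own index, the output
keeps the input order and no sort is needed afterwards.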
>> On Aug 16, 2022, at 3:37 PM, Jack Brindle via Cocoa-dev
>> <email@hidden> wrote:
>>
>> Instead of using NSOperationQueue, I would use GCD to handle the tasks.
>> Create a new concurrent queue
>> (dispatch_queue_create(label, DISPATCH_QUEUE_CONCURRENT)), then enqueue the
>> individual items to the queue for processing (dispatch_async(), using the
>> queue created above). Everything can be handled in blocks, including the
>> completion routines. As Christian says the problem then is that data may not
>> be in the original order so you will probably want to sort the returned
>> objects when done. This should significantly speed up the time to do the
>> whole task.
>>
>> Jack
>>
>>
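A rough sketch of what Jack outlines, assuming ARC. The helper name
MapConcurrently, the queue labels and the work block are hypothetical, and I
use dispatch_group_async (the group-tracked variant of dispatch_async) so the
caller can wait for everything to finish before sorting.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: maps `work` over `items` on a concurrent GCD queue and
// returns the collected values in sorted order.
static NSArray<NSString *> *MapConcurrently(NSArray<NSString *> *items,
                                            NSString *(^work)(NSString *item)) {
    dispatch_queue_t workQueue =
        dispatch_queue_create("example.exif.work", DISPATCH_QUEUE_CONCURRENT);
    // A serial queue guards the shared results array.
    dispatch_queue_t resultsQueue =
        dispatch_queue_create("example.exif.results", DISPATCH_QUEUE_SERIAL);
    dispatch_group_t group = dispatch_group_create();

    NSMutableArray<NSString *> *results = [NSMutableArray array];
    for (NSString *item in items) {
        dispatch_group_async(group, workQueue, ^{
            NSString *value = work(item);
            if (value == nil) { return; }
            dispatch_sync(resultsQueue, ^{ [results addObject:value]; });
        });
    }
    // Block until every enqueued item has been processed.
    dispatch_group_wait(group, DISPATCH_TIME_FOREVER);

    // Completion order is arbitrary, so sort before returning.
    return [results sortedArrayUsingSelector:@selector(compare:)];
}
```

Note that nothing here limits how many blocks run at once, which is the
situation Jim reports running into with several hundred images.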
>>> On Aug 16, 2022, at 12:26 PM, Steve Christensen via Cocoa-dev
>>> <email@hidden> wrote:
>>>
>>> You mentioned creating and managing threads on your own, but that’s what
>>> NSOperationQueue (and the lower-level DispatchQueue) does. It also will be
>>> more efficient with thread management since it has an intimate
>>> understanding of the capabilities of the processor, etc., and will work to
>>> do the “right thing” on a per-device basis.
>>>
>>> By using NSOperationQueue and keeping each of the queue operations
>>> focused on a single file, you’re not complicating the management of what
>>> to do next, since most of that is handled for you. Let NSOperationQueue do
>>> the heavy lifting (scheduling work) and focus on your part of the task
>>> (performing the work).
>>>
>>> Steve
>>>
>>>> On Aug 16, 2022, at 8:41 AM, Gabriel Zachmann <email@hidden>
>>>> wrote:
>>>>
>>>> That is a good idea. Thanks a lot!
>>>>
>>>> Maybe, I can turn this into more fine-grained, dynamic load balancing (or
>>>> latency hiding), as follows:
>>>> create a number of threads (workers);
>>>> as soon as a worker is finished with their "current" image, it gets the
>>>> next one (a piece of work) out of the list, processes it, and stores the
>>>> iso_date in the output array (dates_and_times).
>>>> Access to both the pointer to the next piece of work and the output
>>>> array would need to be made exclusive, of course.
>>>>
>>>> Best regards, Gabriel
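Gabriel's worker scheme could look roughly like this, assuming ARC and macOS
10.12 or later for -[NSThread initWithBlock:]. The helper name RunWorkers and
the extractDate block are hypothetical stand-ins for the real EXIF lookup,
and datesAndTimes mirrors his dates_and_times array. A dispatch group is used
only to wait for the threads, since NSThread has no join.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: a fixed pool of worker threads pulls indices from a
// shared counter until the list of images is exhausted.
static NSArray<NSString *> *RunWorkers(NSArray<NSString *> *images,
                                       NSUInteger workerCount,
                                       NSString *(^extractDate)(NSString *image)) {
    NSMutableArray<NSString *> *datesAndTimes =
        [NSMutableArray arrayWithCapacity:images.count];
    for (NSUInteger i = 0; i < images.count; i++) {
        [datesAndTimes addObject:@""];
    }

    NSLock *lock = [[NSLock alloc] init];   // guards nextIndex and the output
    __block NSUInteger nextIndex = 0;       // the "next piece of work"
    dispatch_group_t done = dispatch_group_create();

    for (NSUInteger w = 0; w < workerCount; w++) {
        dispatch_group_enter(done);
        NSThread *worker = [[NSThread alloc] initWithBlock:^{
            for (;;) {
                @autoreleasepool {
                    // Claim the next index exclusively.
                    [lock lock];
                    NSUInteger mine = nextIndex++;
                    [lock unlock];
                    if (mine >= images.count) { break; }

                    // Do the slow work without holding the lock.
                    NSString *isoDate = extractDate(images[mine]) ?: @"";

                    // Store the result exclusively.
                    [lock lock];
                    datesAndTimes[mine] = isoDate;
                    [lock unlock];
                }
            }
            dispatch_group_leave(done);
        }];
        [worker start];
    }

    dispatch_group_wait(done, DISPATCH_TIME_FOREVER);
    return [datesAndTimes copy];
}
```

Each worker grabs a new index as soon as it finishes the previous one, so a
slow image only delays the thread that picked it up.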
_______________________________________________
Cocoa-dev mailing list (email@hidden)
Please do not post admin requests or moderator comments to the list.
Contact the moderators at cocoa-dev-admins(at)lists.apple.com
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden