Re: Prioritize my own app's disk access
- Subject: Re: Prioritize my own app's disk access
- From: Jonathan Taylor <email@hidden>
- Date: Wed, 06 Jul 2016 11:06:09 +0100
Thanks for your reply Alastair. Definitely interested in thinking about your suggestions - some responses below that will hopefully help clarify:
> The first thing to state is that you *can’t* write code of this type with the attitude that “dropping frames is not an option”. Fundamentally, the problem you have is that if you generate video data faster than it can be saved to disk, there is only so much video data you can buffer up before you start swapping, and if you swap you will be dead in the water: it will kill performance to the extent that you will not be able to save data as quickly as you could before, and the result will be catastrophic, with far more frames dropped than if you simply accepted that the machine you were on might not be fast enough and would occasionally have to drop a frame.
I should clarify exactly what I mean here. Under normal circumstances I know from measurements that the I/O can keep up with the maximum rate at which frames can come in. I very rarely see any backlog reported at all, though I might occasionally see a transient glitch (if CPU load momentarily spikes) with a backlog of the order of 10 MB that is soon cleared. With that as the status quo, and 8 GB of RAM available, something has gone badly, badly wrong if we end up in VM swap chaos.
When I say "dropping frames is not an option", what I mean is that a single lost frame would be fairly catastrophic for the scientific experiment, so my priorities, in order, are:
(1) ensure the machine specs leave plenty of headroom above my actual requirements;
(2) do anything relatively simple I can to make my code efficient and to mark threads/operations/etc. as high or low priority where possible (a rough sketch of what I mean follows below);
(3) identify things the user should avoid doing (which looks like it includes transferring data off the machine while a recording session is in progress - hence this email thread!);
(4) not worry too much about what to do once we *have* already ended up with a catastrophic backlog (i.e. whether to drop frames or do something else), because at that point we have failed, in the sense that the scientific experiment will basically need to be re-run.
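For what it's worth, the sort of thing I have in mind for (2) looks roughly like the Swift sketch below. The queue labels, QoS classes and function names are purely illustrative, not lifted from my actual code.

import Foundation
import Dispatch

// Illustrative only: give the disk-writing queue a higher quality-of-service
// class than the optional analysis work, so writes stay ahead of anything
// that can tolerate being delayed.
let diskQueue = DispatchQueue(label: "com.example.capture.diskWriter",
                              qos: .userInitiated)
let analysisQueue = DispatchQueue(label: "com.example.capture.analysis",
                                  qos: .utility,
                                  attributes: .concurrent)

func frameArrived(_ frame: Data, destination url: URL) {
    diskQueue.async {
        try? frame.write(to: url)      // TIFF frame + metadata; never blocks the capture path
    }
    analysisQueue.async {
        // optional realtime image analysis on the same frame
    }
}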
I should also clarify that (in spite of my other email thread running in parallel to this) I am not doing any complex encoding of the data being streamed to disk - these are just basic TIFF images and metadata. The encoding I referred to in my other thread is optional offline processing of previously-recorded data.
> The right way to approach this type of real time encoding problem is as follows:
>
> 1. Use statically allocated buffers (or dynamically allocated once at encoder or program startup). DO NOT dynamically allocate buffers as you generate data.
>
> 2. Knowing the rate at which you generate video data, decide on the maximum write latency you need to be able to tolerate. This (plus a bit as you need some free to encode into) will tell you the total size of buffer(s) you need.
OK.
> 3. *Either*
>
> (i) Allocate a ring buffer of the size required, then interleave encoding and issuing I/O requests. You should keep track of where the as-yet-unwritten data starts in your buffer, so you know when your encoder is about to hit that point. Or
>
> (ii) Allocate a ring *of* fixed size buffers totalling the size required; start encoding into the first one, then when finished, issue an I/O request for that buffer and continue encoding into the next one. You should keep track of which buffers are in use, so you can detect when you run out.
>
> 4. When issuing I/O requests, DO NOT use blocking I/O from the encoder thread. You want to be able to continue to fetch video from your camera and generate data *while* I/O takes place. GCD is a good option here, or you could use a separate I/O thread with a semaphore, or any other asynchronous I/O mechanism (e.g. POSIX AIO, libuv and so on).
>
> 5. If you find yourself running out of buffers, drop frames until buffer space is available, and display the number of frame drops to the user. This is *much* better than attempting to use dynamic buffers and then ending up swapping, which is I think what’s happening to you (having read your later e-mails).
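Just to check I have understood steps 3(ii) to 5 correctly, here is a rough Swift sketch of how I read them: a fixed budget of in-flight frames enforced with a counting semaphore, asynchronous writes on a dedicated queue, and frames dropped (and counted) once the budget is exhausted. It skips the statically pre-allocated buffers of step 1 for brevity, and all names and numbers are illustrative, not from my real code.

import Foundation
import Dispatch

final class BoundedFrameWriter {
    private let ioQueue = DispatchQueue(label: "frame.io")   // serial writer queue (step 4)
    private let slots: DispatchSemaphore                     // counts free buffer slots
    private(set) var droppedFrames = 0                       // assumes a single capture thread

    init(maxBufferedFrames: Int) {
        slots = DispatchSemaphore(value: maxBufferedFrames)
    }

    // Called from the capture path; must never block.
    func enqueue(frame: Data, to url: URL) {
        guard slots.wait(timeout: .now()) == .success else {
            droppedFrames += 1                                // step 5: out of space, drop and count
            return
        }
        ioQueue.async {
            try? frame.write(to: url)                         // I/O happens off the capture thread
            self.slots.signal()                               // slot becomes free once the write completes
        }
    }
}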
I am making good use of GCD here (and like it very much!). There are quite a few queues involved, and one is a dedicated disk-writing queue. The main CPU-intensive work going on in parallel with this is some realtime image analysis, but this is running on a concurrent queue.
Hopefully the detail above explains why I really do not want to drop frames or use a fixed-size ring buffer. Effectively I have a buffer pool, but if I ever exhaust it then (a) something has gone badly wrong, and (b) I would prefer to expand the pool as a last-ditch attempt to cope with the backlog rather than terminate the experiment right then and there.
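To illustrate what I mean by expanding rather than dropping, here is a rough sketch; the pool sizing, names and logging are illustrative assumptions, not my real code.

import Foundation

// A pre-allocated pool of frame buffers that expands, rather than dropping
// frames, as a last-ditch response to a backlog that should never normally happen.
final class ExpandingBufferPool {
    private var free: [NSMutableData]
    private let frameBytes: Int
    private let lock = NSLock()

    init(initialCount: Int, frameBytes: Int) {
        self.frameBytes = frameBytes
        free = (0..<initialCount).map { _ in NSMutableData(length: frameBytes)! }
    }

    func takeBuffer() -> NSMutableData {
        lock.lock(); defer { lock.unlock() }
        if let buffer = free.popLast() { return buffer }
        // Pool exhausted: something has already gone badly wrong, so grow the
        // pool (the new buffer joins it when returned) and make the event visible.
        NSLog("Frame buffer pool exhausted - expanding to absorb backlog")
        return NSMutableData(length: frameBytes)!
    }

    func returnBuffer(_ buffer: NSMutableData) {
        lock.lock(); defer { lock.unlock() }
        free.append(buffer)
    }
}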
> Without knowing exactly how much video data you’re generating and what encoder you’re using (if any), it’s difficult to be any more specific, but hopefully this gives you some useful pointers.
As I say, there is no encoding going on in this particular workflow. Absolute maximum data rates are of the order of 50 MB/s, but (and this is a non-optimal point, though one I would prefer to stick with) the data is split out into a sequence of separate files, some of which are as small as ~100 kB.
Thanks very much for all your comments
Jonny