Re: large scale (audio) file I/O on OS X : help or insight requested
- Subject: Re: large scale (audio) file I/O on OS X : help or insight requested
- From: Hamilton Feltman <email@hidden>
- Date: Wed, 04 Mar 2015 22:45:56 -0800
It’s completely related to block size. Running this test with large block sizes, I could reach the drive’s maximum sequential read speed. But seeking between hundreds of files and reading only 16k from each at a time is going to be extremely seek-bound on an HDD. SSDs aren’t exempt either, since random reads usually benchmark at about 10% of sequential read speed. If Windows and Linux aren’t showing this, they must be doing some extra reading and stashing the data away somewhere. With caches really disabled, these reads all amount to random reads of varying block sizes.
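For reference, the access pattern being measured boils down to the loop below. This is only a minimal sketch of my understanding, not the attached readtest.c; in particular the file names and the use of F_NOCACHE to defeat the buffer cache are my assumptions about how the tool disables caching.

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define NFILES    256
#define BLOCKSIZE 16384

int main(void)
{
    static int fds[NFILES];
    char *buf = malloc(BLOCKSIZE);
    char path[64];

    for (int i = 0; i < NFILES; i++) {
        snprintf(path, sizeof(path), "testfile-%d", i);  /* hypothetical names */
        fds[i] = open(path, O_RDONLY);
        if (fds[i] < 0) { perror(path); return 1; }
        fcntl(fds[i], F_NOCACHE, 1);  /* bypass the buffer cache; my guess at
                                         how the caches were disabled */
    }

    /* Round-robin: one block from each file per pass, which is effectively
       what a DAW's disk thread asks for.  Each read lands far from the
       previous one, so on an HDD nearly every read is also a seek. */
    int active = NFILES;
    while (active > 0) {
        active = 0;
        for (int i = 0; i < NFILES; i++)
            if (read(fds[i], buf, BLOCKSIZE) > 0)
                active++;
    }

    for (int i = 0; i < NFILES; i++) close(fds[i]);
    free(buf);
    return 0;
}

Time that outer loop, divide bytes by seconds, and you get MB/sec figures like the ones below.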
Below are the results on 2 different drives, 256 files of 10MB each. I had to kill the second run toward the end because it was taking too long and the drive sounded like it was being eaten alive.
Test 1:
Internal SSD (Crucial MX100)
500.711 MB/sec write
418.255 MB/sec read
Overtone:READTEST pwnified$ ./run-readtest.sh -d readtest_10_256 -n 256 -f 10 1048576 524288 262144 131072 65536 32768 16384
# Re-using files in readtest_10_256
# Blocksize 1048576
# Min: 362.4322 MB/sec Avg: 408.9372 MB/sec || Max: 0.706 sec
# Max Track count: 2233 @ 48000SPS
# Sus Track count: 1979 @ 48000SPS
1048576 362.4322 408.9372 0.7063 0.03024
# Blocksize 524288
# Min: 300.4145 MB/sec Avg: 346.3327 MB/sec || Max: 0.426 sec
# Max Track count: 1891 @ 48000SPS
# Sus Track count: 1640 @ 48000SPS
524288 300.4145 346.3327 0.4261 0.01553
# Blocksize 262144
# Min: 230.6564 MB/sec Avg: 268.7554 MB/sec || Max: 0.277 sec
# Max Track count: 1467 @ 48000SPS
# Sus Track count: 1259 @ 48000SPS
262144 230.6564 268.7554 0.2775 0.00811
# Blocksize 131072
# Min: 167.6279 MB/sec Avg: 253.5452 MB/sec || Max: 0.191 sec
# Max Track count: 1384 @ 48000SPS
# Sus Track count: 915 @ 48000SPS
131072 167.6279 253.5452 0.1909 0.01259
# Blocksize 65536
# Min: 137.1824 MB/sec Avg: 202.4644 MB/sec || Max: 0.117 sec
# Max Track count: 1105 @ 48000SPS
# Sus Track count: 749 @ 48000SPS
65536 137.1824 202.4644 0.1166 0.00586
# Blocksize 32768
# Min: 73.5551 MB/sec Avg: 136.8140 MB/sec || Max: 0.109 sec
# Max Track count: 747 @ 48000SPS
# Sus Track count: 401 @ 48000SPS
32768 73.5551 136.8140 0.1088 0.00506
# Blocksize 16384
# Min: 44.4558 MB/sec Avg: 81.6032 MB/sec || Max: 0.090 sec
# Max Track count: 445 @ 48000SPS
# Sus Track count: 242 @ 48000SPS
16384 44.4558 81.6032 0.0900 0.00409
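In case the track-count lines look mysterious: they appear to fall straight out of the bandwidth figures, assuming one track means 32-bit samples at 48000 SPS (192000 bytes/sec) and MiB-based rates. That's my reading of the output, not anything documented; checking against the 1MB-block numbers above:

#include <stdio.h>

int main(void)
{
    /* 48000 samples/sec * 4 bytes (assumed 32-bit) = 192000 bytes/sec/track */
    printf("Max: %.0f tracks\n", 408.9372 * 1048576 / 192000);  /* 2233, from Avg */
    printf("Sus: %.0f tracks\n", 362.4322 * 1048576 / 192000);  /* 1979, from Min */
    return 0;
}

So "Max" seems to come from the average bandwidth and "Sus" (sustained) from the minimum.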
Test 2:
External 3TB WD Reds on USB 3 in RAID 0
105.761 MB/sec write
122.768 MB/sec read
Overtone:READTEST pwnified$
Overtone:READTEST pwnified$
Overtone:READTEST pwnified$ cp -r readtest_10_256 /Volumes/StarTech\ 3TB/Etcetera
Overtone:READTEST pwnified$ ln -s /Volumes/StarTech\ 3TB/Etcetera/readtest_10_256 readtest_10_256_ext
Overtone:READTEST pwnified$
Overtone:READTEST pwnified$ ./run-readtest.sh -d readtest_10_256_ext -n 256 -f 10 1048576 524288 262144 131072 65536 32768 16384
# Re-using files in readtest_10_256_ext
# Blocksize 1048576
# Min: 57.2084 MB/sec Avg: 66.3175 MB/sec || Max: 4.475 sec
# Max Track count: 362 @ 48000SPS
# Sus Track count: 312 @ 48000SPS
1048576 57.2084 66.3175 4.4749 0.31024
# Blocksize 524288
# Min: 43.4194 MB/sec Avg: 43.9487 MB/sec || Max: 2.948 sec
# Max Track count: 240 @ 48000SPS
# Sus Track count: 237 @ 48000SPS
524288 43.4194 43.9487 2.9480 0.02069
# Blocksize 262144
# Min: 25.0147 MB/sec Avg: 25.6101 MB/sec || Max: 2.558 sec
# Max Track count: 139 @ 48000SPS
# Sus Track count: 136 @ 48000SPS
262144 25.0147 25.6101 2.5585 0.04261
# Blocksize 131072
# Min: 10.5466 MB/sec Avg: 13.7875 MB/sec || Max: 3.034 sec
# Max Track count: 75 @ 48000SPS
# Sus Track count: 57 @ 48000SPS
131072 10.5466 13.7875 3.0342 0.08241
# Blocksize 65536
# Min: 5.3326 MB/sec Avg: 7.0131 MB/sec || Max: 3.000 sec
# Max Track count: 38 @ 48000SPS
# Sus Track count: 29 @ 48000SPS
65536 5.3326 7.0131 3.0004 0.05982
# Blocksize 32768
^C
Overtone:READTEST pwnified$ ./run-readtest.sh -d readtest_10_256_ext -n 256 -f 10 2621440
# Re-using files in readtest_10_256_ext
# Blocksize 2621440
# Min: 95.0108 MB/sec Avg: 96.5428 MB/sec || Max: 6.736 sec
# Max Track count: 527 @ 48000SPS
# Sus Track count: 518 @ 48000SPS
2621440 95.0108 96.5428 6.7361 0.07149
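Note what that last run shows: the same array that crawled at 5-7 MB/sec with 64k reads delivers ~95 MB/sec with 2.5MB reads. The workaround that suggests is to trade memory for seeks: give each track a large staging buffer and refill it in one big read per pass. A rough sketch of the idea, purely my own illustration and nothing to do with how Ardour or any other DAW actually does it:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define CHUNK (2560 * 1024)  /* 2.5MB, where the USB3 array got back to
                                ~95 MB/sec in the run above */

struct track_reader {
    int    fd;    /* opened O_RDONLY; F_NOCACHE if you want honest numbers */
    char  *buf;   /* CHUNK bytes of staging */
    size_t fill;  /* valid bytes currently in buf */
    size_t pos;   /* consumer offset into buf */
};

/* Disk-thread side: one big sequential read per refill, so 256 tracks cost
   the drive 256 seeks per pass instead of tens of thousands. */
static int refill(struct track_reader *t)
{
    ssize_t n = read(t->fd, t->buf, CHUNK);
    if (n <= 0) return -1;  /* EOF or error */
    t->fill = (size_t)n;
    t->pos  = 0;
    return 0;
}

int main(int argc, char **argv)
{
    if (argc < 2) { fprintf(stderr, "usage: %s file\n", argv[0]); return 1; }

    struct track_reader t = { .fd  = open(argv[1], O_RDONLY),
                              .buf = malloc(CHUNK), .fill = 0, .pos = 0 };
    if (t.fd < 0 || !t.buf) return 1;
    fcntl(t.fd, F_NOCACHE, 1);

    /* The audio thread would consume small (e.g. 16k) slices out of t.buf
       via t.pos; only refill() ever touches the disk. */
    size_t total = 0;
    while (refill(&t) == 0)
        total += t.fill;
    printf("read %zu bytes in %d-byte chunks\n", total, CHUNK);

    free(t.buf);
    close(t.fd);
    return 0;
}

The memory cost is real (256 tracks x 2.5MB = 640MB of buffers), but that's the trade the numbers above are pointing at.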
> On Mar 3, 2015, at 4:20 PM, Paul Davis <email@hidden> wrote:
>
> Many readers of the list will know who I am, but for those who don't let me start by just mentioning that I'm the lead developer and designer of Ardour, an open source GPL'ed DAW for Linux, OS X and Windows, and also the original author of JACK, a cross-platform API for low latency, realtime audio and MIDI, including inter-application and network routing.
>
> Over in Ardour-land, we've wrestled for a couple of years now with less than ideal support for OS X and are just finally beginning to resolve many/most of the areas where we were falling down.
>
> But there's an area where we've really run into a brick wall, despite the extremely deep and amazingly talented pool of people we have as developers, people who know *nix operating systems inside out and upside down.
>
> That area is: getting acceptable disk I/O bandwidth on OS X when reading many files. For comparison: we can easily handle 1024 tracks on a single spinning disk on Linux. We can get close to similar performance on Windows. We can even (gasp!) get in the same ballpark on Windows running as a guest OS on OS X (i.e. the entire Windows filesystem is just one file from OS X's perspective).
>
> But on OS X itself, we are unable to get consistent, reliable performance when reading many files at once. Our code is, of course, entirely cross-platform, and so we are not using any Apple specific APIs (or Linux-specific or Windows-specific). The same code works well for recording (writing) files (though OS X does still perform worse than other platforms).
>
> We've used various dtrace-based tools to try to analyse what is going on: these only confirmed for us that we have a problem, but provided no insight into its cause.
>
> Things became so inexplicable that we decided to break out the problem from Ardour itself and write a small test application that would let us easily collect data from many different systems. We've now tested on the order of a half-dozen different OS X systems, and they all show the same basic bad behaviour:
>
> * heavy dependence of sustained streaming bandwidth on the number of files being read
> (i.e. sustained streaming bandwidth is high when reading 10 files, but can be very low
> when reading 128 files; This dependence is low on Windows and non-existent
> on Linux)
>
> * periodic drops of sustained streaming bandwidth of as much as a factor of 50, which can
> last for several seconds (e.g. a disk that can peak at 100MB/sec fails to deliver
> better than 5MB/sec for a noticeable period).
>
> * a requirement to read larger blocksizes to get the same bandwidth as on other platforms
>
> Our test application is small, less than 130 lines of code. It uses the POSIX API to read a specified blocksize from each of N files, and reports the observed I/O bandwidth. It comes with a companion shell script (about 60 lines) which sets up the files to be read and then runs the executable with each of a series of specified blocksizes.
>
> I've attached both files (source code for the executable, plus the shell script). The executable has an unfortunate reliance right now on glib in order to get the cross-platform g_get_monotonic_time(). If you want to run them, the build command is at the top of the executable, and then you would just do:
>
> ./run-readtest.sh -d FOLDER-FOR-TEST-FILES -n NUMBER-OF-FILES -f FILESIZE-IN-BYTES blocksize1 blocksize2 ....
>
> (You'll also need a pretty large chunk of disk space available for the test files.) The blocksize list is typically powers of 2 starting at 16384 and going up to 4MB. We only see problems when NUMBER-OF-FILES is on the order of 128 or above. To be a useful test, the filesizes need to be at least 10MB or so. The full test takes several (2-20) minutes depending on the overall speed of your disk.
>
> But you don't have to run them: you could just read the source code to see if any lightbulbs go off. Why would this code fail so badly once the number of files gets up to 100+? Why can we predictably make any version of OS X deliver barely 5MB/sec from a disk that can sustain 100MB/sec? Are there some tricks to making this sort of thing work on OS X that anyone is aware of? Keep in mind that this same code works reliably, solidly and predictably on Windows and Linux (and even works reliably on Windows as a guest OS on OS X).
>
> We do know, by simple inspection, that other DAWs, from Reaper to Logic, appear to have solved this problem. Alas, unlike Ardour, they don't generally share their code, so we have no idea whether they did something clever, or whether we are doing something stupid.
>
> I, and the rest of the Ardour community, would be very grateful for any insights anyone can offer.
>
> thanks,
> --p
>
>
> <readtest.c><run-readtest.sh>
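One aside on the glib dependency Paul mentions: on OS X a monotonic microsecond clock is available straight from Mach, so readtest.c could presumably drop glib with something like the following. This is my sketch, not tested against the posted code:

#include <mach/mach_time.h>
#include <stdint.h>

/* Possible stand-in for g_get_monotonic_time() on OS X.  mach_absolute_time()
   ticks are scaled to nanoseconds by the numer/denom ratio (both 1 on Intel
   Macs, so the multiply won't overflow in practice). */
static int64_t monotonic_usecs(void)
{
    static mach_timebase_info_data_t tb;
    if (tb.denom == 0)
        mach_timebase_info(&tb);
    uint64_t ns = mach_absolute_time() * tb.numer / tb.denom;
    return (int64_t)(ns / 1000);
}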