void* f_Prefetcher(void* threadid) {
    while (AllowThreadToRun) {
        pthread_mutex_lock( &lock_ahead );
        // READING FROM A FILE DESCRIPTOR, IN HERE
        // END READING
        pthread_mutex_unlock( &lock_processing );
    }
    pthread_exit( 0 );
}

void ThreadTester() {
    // FIRST NON THREADED READ IN HERE
    for (int i = 0; i < LoopMax; i++) {
        pthread_mutex_lock( &lock_processing );
        // PROCESSING IN HERE
        // END PROCESSING
        AllowThreadToRun = (i+1 < LoopMax);
        pthread_mutex_unlock( &lock_ahead );
    }
}

Something like that?
Basically, it does this. First, we read some data, non-threaded style.
Then we create a thread, which pre-fetches some data. The pre-fetcher
cannot run multiple loops by itself; on every loop it waits until
the main processing code has done its processing.
The main processing code, of course, in return, will wait until the
pre-fetch is done, before it processes the data. Except on the first
loop of course, because we did a non-threaded read.
The code won't look like this, but essentially this is what will be
going on :)
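
For reference, here is a minimal sketch of the same handoff written with one
mutex and a condition variable instead of the two mutexes. The reason for the
swap is an assumption about portability: POSIX leaves it undefined what happens
when a default mutex is unlocked by a thread that did not lock it, which the
double-lock version relies on. Every name here (run_pipeline, fake_read,
fake_process, the two buffers) is hypothetical, and the "read" and "process"
steps are stand-ins, not the real code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

enum { BLOCK_SIZE = 256 * 1024 };

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t changed = PTHREAD_COND_INITIALIZER;
static bool requested, ready, quit;   /* all guarded by `lock` */
static int next_index;                /* which block to prefetch next */
static unsigned char buf_a[BLOCK_SIZE], buf_b[BLOCK_SIZE];
static unsigned char *front = buf_a, *back = buf_b;

/* Stand-in for the real read(): fills the block with (index + 1). */
static void fake_read(unsigned char *dst, int index) {
    memset(dst, (index + 1) & 0xff, BLOCK_SIZE);
}

/* Stand-in for the slow processing step: returns the first byte. */
static long fake_process(const unsigned char *src) {
    return src[0];
}

static void *prefetcher(void *arg) {
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (!requested && !quit)          /* sleep until asked */
            pthread_cond_wait(&changed, &lock);
        if (quit) { pthread_mutex_unlock(&lock); break; }
        requested = false;
        int index = next_index;
        pthread_mutex_unlock(&lock);

        fake_read(back, index);              /* I/O happens outside the lock */

        pthread_mutex_lock(&lock);
        ready = true;                        /* hand the block over */
        pthread_cond_broadcast(&changed);
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* Processes `blocks` blocks and returns the sum of fake_process() results. */
long run_pipeline(int blocks) {
    assert(blocks > 0);
    pthread_t tid;
    long total = 0;

    requested = ready = quit = false;
    front = buf_a; back = buf_b;
    fake_read(front, 0);                     /* first read, non-threaded */
    if (pthread_create(&tid, NULL, prefetcher, NULL) != 0) return -1;

    for (int i = 0; i < blocks; i++) {
        bool more = (i + 1 < blocks);
        if (more) {                          /* kick off the next read */
            pthread_mutex_lock(&lock);
            next_index = i + 1;
            requested = true;
            pthread_cond_broadcast(&changed);
            pthread_mutex_unlock(&lock);
        }
        total += fake_process(front);        /* overlaps with the read */
        if (more) {                          /* wait for the read, then swap */
            pthread_mutex_lock(&lock);
            while (!ready)
                pthread_cond_wait(&changed, &lock);
            ready = false;
            unsigned char *tmp = front; front = back; back = tmp;
            pthread_mutex_unlock(&lock);
        }
    }
    pthread_mutex_lock(&lock);               /* tell the prefetcher to exit */
    quit = true;
    pthread_cond_broadcast(&changed);
    pthread_mutex_unlock(&lock);
    pthread_join(tid, NULL);
    return total;
}
```

The thread is still created once and reused for every block, so the goal of
not creating and destroying threads all the time is kept. The two buffers are
only there so the pre-fetcher never writes into the block being processed.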
OK... I have a question for you.
Assume that we read from the file in blocks of at least 256KB.
And assuming that the processing is quite slow, so will almost always
finish long after the pre-fetcher will finish...
Will this code help me speed up reads? :D Will it help me get "Reads
for free"? (or almost for free).
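
One rough way to put a number on "for free", assuming a deliberately simple
model: every read takes the same time, every processing step takes the same
time, and locking overhead and disk or CPU contention are ignored. The
function names and the example timings are made up for illustration.

```c
#include <assert.h>

/* No overlap: every block pays for its read and then its processing. */
static double serial_time(int blocks, double read_s, double process_s) {
    assert(blocks > 0);
    return blocks * (read_s + process_s);
}

/* One block prefetched while the previous one is processed.
   The first read and the last processing step cannot overlap with anything;
   each of the (blocks - 1) steps in between costs whichever is slower. */
static double pipelined_time(int blocks, double read_s, double process_s) {
    assert(blocks > 0);
    double slower = read_s > process_s ? read_s : process_s;
    return read_s + (blocks - 1) * slower + process_s;
}
```

With 10 blocks, 2 time units per read and 8 per processing step, this model
gives 100 units serially and 82 overlapped: every read except the first is
hidden behind processing. If processing is the slower step, the best case is
saving (blocks - 1) reads, and no more than that.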
Remember, I am using two mutexes, calling a lock function in two
places, calling unlock in two places, and there will be 2 threads.
(The main thread and the one I create).
I chose this double lock design, because in theory... it will mean I
don't need to create and destroy threads all the time! I can keep on
using the main thread. And this... I think... should be more CPU
efficient, than a lock, correct?
How will this design scale across different Macs? Like single, duo, or
8 core? It's just two threads per app... And won't get more. For 8
core computers, I envisage my algorithm running perhaps 8 processes,
or maybe 7 to give the computer some breathing space :) And they'll
communicate via sockets. This enables me to re-use the same code that
lets my app use cores on one Mac, and still run it on multiple Macs!
So two Macs will run faster than one :) So basically, I got the use of
extra cores covered with sockets, and a client-server model. The data
to exchange is minimal. My algorithm is like a search engine... less
data is returned to the client from the server than Google returns on
one of its search engine pages.
Sorry about this really basic question here. It's not rocket science,
I know :)
When I have three threads running (read+process+write), it'll be even
more fun. I should have four locks by then!