Re: Why multi-threading ?
- Subject: Re: Why multi-threading ?
- From: Dietrich Epp <email@hidden>
- Date: Tue, 27 May 2003 04:23:14 -0700
On Monday, May 26, 2003, at 17:48 US/Pacific, publiclook wrote:
> In my experience with multi-threading going back to Ada 83 and most
> recently with Java and C++ systems, why does anybody bother ?
> Let me clarify: In my experience, multi-threading makes writing
> correct code many times harder. It is nearly impossible to document
> your intentions in the source code because almost by definition,
> someone reading your code will be unaware of other things happening
> concurrently when the code they are looking at executes.
> Multi-threading adds a huge overhead to common operations due to
> system calls for mutexes/semaphores and in a true multi-processor
> system plays havoc with processor caches etc. For example, if two
> processors are simultaneously using data in the same cache line then
> typically neither processor can cache the data and memory accesses go
> from being a few cycles to a massive number of cycles. Doesn't the
> memory access impairment cancel any benefit of multi-threading in the
> common cases?
> If there are long running background "tasks" to perform, why not run
> them as separate tasks with separate protected memory ? Why not use
> message queues, signals, distributed messaging, mach messaging or
> whatever as the communication between tasks if you are going to end up
> using them between threads anyway ? Protected memory is a GOOD thing
> that was a long time coming to Mac OS and it seems the first thing
> people want to do is deliberately bypass it with multi-threading
> instead of multi-tasking.
> So why does anybody bother with multi-threading ? What am I missing ?
First of all, exactly how does multithreading affect the correctness of
the code? Nobody is making this very clear to me.
Situation:
One program I wrote drew the output of a certain function. You press a
button, and it draws something based on the parameters. Simple. But
it took a long time -- maybe a minute or two, depending on how smooth
you wanted the output to be. While this was happening, I wanted to be
able to get a quick preview of another set of parameters.
Solution:
One thread is created each time you ask the program to draw the output.
The thread calls a function every so often to update the progress bar
and check a flag to see if it's been cancelled. When it finishes, it
sends a message back to the main thread to draw the picture to the screen.
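A minimal sketch of that arrangement, in Python rather than the original program's code, which is not shown in this post. The names render, start_render, cancelled and messages are hypothetical, and the "drawing" is a stand-in computation. The worker checks a cancel flag on every step, reports progress, and hands the finished result back through a queue that the main thread would drain from its event loop.

```python
import queue
import threading


def render(params, steps, cancelled, messages):
    """Hypothetical slow drawing job, meant to run on a worker thread."""
    rows = []
    for step in range(steps):
        if cancelled.is_set():            # check the cancel flag every so often
            messages.put(("cancelled", None))
            return
        rows.append(params * step)        # stand-in for the real computation
        messages.put(("progress", (step + 1) / steps))
    messages.put(("done", rows))          # hand the finished picture back


def start_render(params, steps=100):
    """Start one worker per request; the caller keeps the flag and the queue."""
    cancelled = threading.Event()
    messages = queue.Queue()
    worker = threading.Thread(
        target=render, args=(params, steps, cancelled, messages), daemon=True
    )
    worker.start()
    return worker, cancelled, messages
```

The main thread never touches the worker's data while it is being built. It only reads messages from the queue and calls cancelled.set() if the user asks for a different preview.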
Situation:
Another program I wrote uses the network heavily. It's got several
connections, some of which need immediate response. It also has a GUI.
Solution:
One thread deals with the network, often blocked. One thread deals
with everything else. This way I don't get disconnected when I fail to
pong because the user is running a script, or saving a file, etc.
If this were single-threaded, I would have to do something like polling
which is pretty damn inefficient. Instead, when data is available, the
network thread unblocks and processes it without any intervention from
the UI thread.
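A rough sketch of the blocking network thread, again in Python and with made-up names (network_loop, start_network_thread, handle). The PING/PONG strings are only an assumption about the protocol, and treating each recv() as one whole message is a simplification that a real stream protocol would need to replace with proper framing.

```python
import socket
import threading


def network_loop(conn, handle):
    """Runs on its own thread; sleeps inside recv() until data arrives."""
    while True:
        data = conn.recv(4096)            # blocks, using no CPU while waiting
        if not data:                      # empty read means the peer closed
            break
        if data.strip() == b"PING":
            conn.sendall(b"PONG\n")       # answer at once, whatever the UI is doing
        else:
            handle(data)                  # pass everything else to the application


def start_network_thread(conn, handle):
    """Start the network thread for an already-connected socket."""
    thread = threading.Thread(target=network_loop, args=(conn, handle), daemon=True)
    thread.start()
    return thread
```

Nothing here polls. The thread wakes only when the kernel has data for it, so a long-running script or file save on the other thread does not delay the pong.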
Basically, it boils down to this: different threads usually don't
operate on the same data. The cases where they would are the ones where
it is trivial to make the code single-threaded anyway. The cases where
you want to do two *different* things at the same time are when you
should multithread.
Trying to interleave different operations without threading is just
asking for trouble.
In the first example, spawning another process would require writing
another tool (as opposed to just writing a single function) or playing
around with fork() which is damn ugly in my opinion, and dealing with
communication between processes. I'd have to serialize the image over
a pipe, deal with shared memory, or write it to disk. None of these
options are as elegant or easy to implement as the multi-threaded
version.
In the second case, making another process doesn't help, because there
is not much difference between inter-process communication and network
communication, or at least not much difference between the natural way
to implement both in this particular case. Without another process or
thread, the network has to be polled intermittently. Polling either
increases latency or has to happen often enough to be inconvenient
and hurt performance. Instead, with multiple threads, the network
thread blocks until network data arrives, then processes it without
intervention from the UI thread.
I know that some of the problems are associated with fundamental Unix
flaws, such as the lack of a unified synchronization and IPC primitive
-- you can't, for example, wait for either a semaphore or a socket --
and I know that many of these flaws are not inherent in the underlying
Mach kernel, but multi-threading is a simpler solution than either
interleaving operations in a single thread or using multiple processes.
Multi-threading in general is also more portable than relying on
services such as run loops to interleave operations on a single
thread.
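To illustrate the synchronization gap mentioned above: select() waits on file descriptors, not semaphores, so the usual single-threaded workaround is to turn the "semaphore" into a descriptor. The sketch below is one common form of that trick, not anything from the programs described here. Wakeup and wait_for_either are hypothetical names.

```python
import select
import socket


class Wakeup:
    """Hypothetical helper: a semaphore-like signal that select() can wait on."""

    def __init__(self):
        self._reader, self._writer = socket.socketpair()

    def fileno(self):
        return self._reader.fileno()      # lets select() treat this like a socket

    def post(self):
        self._writer.send(b"\0")          # one byte per signal

    def consume(self):
        self._reader.recv(1)              # take one pending signal


def wait_for_either(sock, wakeup, timeout=None):
    """Block until the socket has data, the wakeup was posted, or time runs out."""
    readable, _, _ = select.select([sock, wakeup], [], [], timeout)
    events = []
    if sock in readable:
        events.append("socket")
    if wakeup in readable:
        wakeup.consume()
        events.append("wakeup")
    return events
```

This works, but it is extra plumbing for every kind of event the loop has to wait on, which is part of why a dedicated blocking thread can be the simpler design.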
One last situation is when long operations are being performed on
different sets of data, such as video frames, which can be split
between a number of threads equal to the number of processors.
However, these programs are usually written by someone who has
lots of experience, training, and access to SMP boxes.
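The partitioning itself can be sketched briefly. This is an illustration only: process_frame and process_frames are hypothetical names, and the per-frame work is a trivial pixel inversion. Note that in CPython the global interpreter lock keeps pure-Python code like this from actually running on several processors at once, so the sketch shows the structure (one thread per processor, each owning a disjoint set of frames) rather than a real speedup.

```python
import os
import threading


def process_frame(frame):
    """Hypothetical per-frame operation; here it just inverts 8-bit pixels."""
    return [255 - pixel for pixel in frame]


def process_frames(frames, workers=None):
    """Split the frames between as many threads as there are processors."""
    workers = workers or os.cpu_count() or 1
    results = [None] * len(frames)

    def run(start):
        # Each thread takes every workers-th frame and writes only its own
        # slots, so no two threads ever touch the same data.
        for i in range(start, len(frames), workers):
            results[i] = process_frame(frames[i])

    threads = [threading.Thread(target=run, args=(n,)) for n in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Because each thread writes to its own slots in results, no lock is needed, which matches the point made earlier that different threads usually don't operate on the same data.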
_______________________________________________
cocoa-dev mailing list | email@hidden