Sean,
Thanks for your input. I have a follow-up question or two. Would
this issue persist if one had used OpenMP parallelization? I would
think so, but I have not tested that extensively yet.
Also, benchmarking my codes on an SGI Altix (an Itanium 2 based
SMP system) yielded near-perfect scaling. So I imagine that the
SGI Altix motherboard has a separate memory controller for each
processor, or something like that.
Gaurav
===== Original Message From "Sean C. Garrick" <email@hidden> =====
I've done a fair amount of benchmarking with my own MPI codes on
different machines and have noticed the same thing. This is a known issue.
It has to do with the limited memory bandwidth on the Apple G5
motherboard. Each processor has to share the bus and memory controller
which accesses main memory.
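
One rough way to picture the shared-bus explanation is an Amdahl-style model in which the time spent waiting on main memory does not shrink when a second process is added, while the remaining compute time divides evenly. The sketch below is only that: the model itself and the function names `shared_bus_speedup` and `memory_fraction_for` are illustrative assumptions, not measurements of any G5 or Xeon board.

```python
def shared_bus_speedup(memory_fraction: float, processes: int) -> float:
    """Predicted speed-up when memory traffic cannot be parallelized.

    memory_fraction is the share of the single-process run time spent
    waiting on main memory (0.0 to 1.0). ASSUMPTION: that share stays
    constant because all processes share one bus and memory controller;
    the remaining time divides evenly among the processes.
    """
    if not 0.0 <= memory_fraction <= 1.0:
        raise ValueError("memory_fraction must be between 0 and 1")
    if processes < 1:
        raise ValueError("processes must be at least 1")
    return 1.0 / (memory_fraction + (1.0 - memory_fraction) / processes)


def memory_fraction_for(speedup: float, processes: int) -> float:
    """Invert the model: memory-bound share implied by a measured speed-up."""
    if processes < 2:
        raise ValueError("need at least 2 processes to invert the model")
    if speedup <= 0.0:
        raise ValueError("speedup must be positive")
    return (1.0 / speedup - 1.0 / processes) / (1.0 - 1.0 / processes)


if __name__ == "__main__":
    # Sweep a few hypothetical memory-bound shares for a dual-processor run.
    for fraction in (0.0, 0.1, 0.25, 0.5):
        print(f"memory share {fraction:.2f} -> "
              f"speed-up {shared_bus_speedup(fraction, 2):.2f}")

    # Memory-bound share that this model would imply for a speed-up of 1.5
    # on two processors (the figure from the original question).
    print(f"implied memory share: {memory_fraction_for(1.5, 2):.2f}")
```

Under this model the memory-bound share grows with the working set once it no longer fits in cache, which would be consistent with the scaling getting worse for larger simulations. Real hardware overlaps computation and memory access, so the numbers are indicative at best.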
This is also true on Xeon motherboards, and there the performance is even
worse. It is not true on AMD Opteron motherboards, however. There I get
speed-ups of 1.85 or so. And just so everyone knows, these runs are executed
using "mpirun -np 1 or 2 executable.out"
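
To put the speed-up figures in this thread on a common footing, they can be converted to parallel efficiency (speed-up divided by process count). The snippet below is a minimal sketch using only the two numbers quoted in the thread, 1.85 on the Opteron here and 1.5 on the dual-processor Mac in the original message below; the function name `parallel_efficiency` and the dictionary labels are illustrative.

```python
def parallel_efficiency(speedup: float, processes: int) -> float:
    """Return speed-up divided by process count (1.0 means perfect scaling)."""
    if processes < 1:
        raise ValueError("processes must be at least 1")
    return speedup / processes


# Speed-ups reported in this thread for two processes on one dual-CPU board.
# The labels are shorthand for this example, not exact machine descriptions.
reported = {
    "dual-processor Mac, ~300MB run": 1.5,
    "AMD Opteron": 1.85,
}

if __name__ == "__main__":
    for machine, speedup in reported.items():
        efficiency = parallel_efficiency(speedup, 2)
        print(f"{machine}: speed-up {speedup:.2f}, efficiency {efficiency:.1%}")
```

That works out to 75% efficiency for the Mac run against 92.5% for the Opteron.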
I was hoping that Apple would give each processor its own memory
controller on the XServe G5 motherboard but alas it did not happen.
Sean
On Mar 17, 2004, at 9:04 PM, Gaurav Khanna wrote:
Hi all
I have a question on the performance of MPI (MPICH,
in this case) on an SMP (shared memory) system. I have
a code that I parallelized with MPI myself, and it scales
near perfectly on small distributed clusters. More
explicitly, if I run this code on 2 single-processor Macs
in parallel, I get nearly twice the speed of one.
However, if I run the same code on a dual-processor
Mac (G4 or G5) by configuring MPI to treat the machine
as 2 computers, I get a much poorer gain in speed
over a single processor. Moreover, the larger the
simulation I attempt (in terms of memory), the worse
the problem gets. On a run using approximately 300MB
of RAM, I'm down to a speed-up factor of 1.5
using a dual processor over a single processor.
I even tried to reconfigure and recompile MPICH for shared
memory communication (using -comm=shared) but saw no
improvement.
I tried a totally different and unrelated code (that is also
known to scale well) and I'm getting pretty much the same
deal. I even (very briefly though) tried LAM-MPI with no
significant difference.
Am I missing something? Has anyone noticed this as well?
Note that the problem becomes significant only for *large*
simulations, say 300MB or more. Any advice would be
appreciated. Maybe this is a generic occurrence when you
use MPI on an SMP machine instead of OpenMP?
I'll try a similar test on an IBM Power4 system (p690) that
I have access to ..
Regards
Gaurav
_______________________________________________
scitech mailing list | email@hidden
Help/Unsubscribe/Archives:
http://www.lists.apple.com/mailman/listinfo/scitech
Do not post admin requests to the list. They will be ignored.
References: RE: MPI on SMP system (From: Gaurav Khanna <email@hidden>)