Using too much memory crashes entire machine
- Subject: Using too much memory crashes entire machine
- From: Tor Andre Myrvoll <email@hidden>
- Date: Wed, 15 Feb 2006 11:18:52 +0100
Hi,
I hope this is the right mailing list for my problem. If not, please
redirect me.
I have written a piece of simulation software in C++ that relies
heavily on the Accelerate framework - in particular the BLAS
implementation. The software is compiled for the ppc64 architecture,
i.e. 64 bit.
I can control how much memory the program uses, but needless to say -
the more of my main memory I allow it to use, the faster the BLAS
routines run due to the internal threading used in the library.
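To make "controlling how much memory the program uses" concrete, here is a
minimal sketch of one way such a budget could be turned into a matrix size.
It is purely illustrative and is not taken from the actual simulation code:
the function name max_square_dim, the idea of a fixed byte budget, and the
assumption that the working set is a few dense square matrices of doubles
(for example A, B and C in a matrix product) are all hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the original program): returns the largest n
// such that `matrices` dense n-by-n matrices of doubles fit in `budget_bytes`.
std::size_t max_square_dim(std::uint64_t budget_bytes, unsigned matrices) {
    if (matrices == 0) return 0;

    // Bytes, then elements, available to each matrix.
    const std::uint64_t per_matrix = budget_bytes / matrices;
    const std::uint64_t elems = per_matrix / sizeof(double);

    // Start from the floating-point square root, then correct any rounding
    // error with exact integer arithmetic.
    std::uint64_t n =
        static_cast<std::uint64_t>(std::sqrt(static_cast<long double>(elems)));
    while (n * n > elems) --n;
    while ((n + 1) * (n + 1) <= elems) ++n;

    return static_cast<std::size_t>(n);
}
```

Under these assumptions, a 6 GB budget shared by three matrices would be
requested as max_square_dim(6ULL << 30, 3), and the result would then be
used as the dimension passed to the BLAS calls.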
The problem is that if I let the program use 6-7 GB out of my total
of 8 GB, the machine will eventually crash (see the example crash
dump below; the CPU causing the panic varies from time to time). Note
that the machine is not swapping when these crashes occur.
Now, I would be willing to blame this on bad memory, but the machine
runs for weeks on end performing other heavy computational tasks
(although in 32-bit mode). The same problem also occurs on a separate
dual G5 with 8 GB of memory.
So - does anyone have any opinions on what might be going on here? Is
it a hardware problem (occurring on both machines), can it be a
problem with my software (the same setup runs fine on a dual Opteron
machine with 8 GB of memory), or is it a problem with OS X/Darwin?
Any input is appreciated!
--
- Tor André Myrvoll
Researcher, Dep. of Electronics and Telecomm.,
Norwegian University of Science and Technology
Crash Dump:
==========
panic(cpu 2 caller 0x000A8D80): Uncorrectable machine check: pc =
FFFFFFFFFFFF8260, msr = 900000000004F030, dsisr = 00200000, dar =
00000000B0011C00
AsyncSrc = 0000000000000000, CoreFIR = 0000000000000000
L2FIR = 0000000000000000, BusFir = 0000000000000000
Latest stack backtrace for cpu 2:
Backtrace:
0x00095718 0x00095C30 0x0002683C 0x000A8D80 0x000A84C8
0x000ABD00
Proceeding back via exception chain:
Exception state (sv=0x5849D780)
PC=0xFFFF8260; MSR=0x0004F030; DAR=0xB0011C00;
DSISR=0x00200000; LR=0x00845194; R1=0x8AD13BB0;
XCP=0x00000008 (0x200 - Machine check)
Kernel version:
Darwin Kernel Version 8.4.0: Tue Jan 3 18:22:10 PST 2006;
root:xnu-792.6.56.obj~1/RELEASE_PPC
Model: PowerMac11,2, BootROM 5.2.7f1, 4 processors, PowerPC G5 (1.1),
2.5 GHz, 8 GB
Graphics: NVIDIA GeForce 6600, GeForce 6600, PCI, 256 MB
Memory Module: DIMM0/J6700, 1 GB, DDR2 SDRAM ECC, PC2-4200E-444
Memory Module: DIMM1/J6800, 1 GB, DDR2 SDRAM ECC, PC2-4200E-444
Memory Module: DIMM2/J6900, 1 GB, DDR2 SDRAM ECC, PC2-4200E-444
Memory Module: DIMM3/J7000, 1 GB, DDR2 SDRAM ECC, PC2-4200E-444
Memory Module: DIMM4/J7100, 1 GB, DDR2 SDRAM ECC, PC2-4200E-444
Memory Module: DIMM5/J7200, 1 GB, DDR2 SDRAM ECC, PC2-4200E-444
Memory Module: DIMM6/J7300, 1 GB, DDR2 SDRAM ECC, PC2-4200E-444
Memory Module: DIMM7/J7400, 1 GB, DDR2 SDRAM ECC, PC2-4200E-444
Network Service: Innebygd Ethernet 1, Ethernet, en0
PCI Card: GeForce 6600, Display, SLOT-1
PCI Card: bcom5714, network, GIGE
PCI Card: bcom5714, network, GIGE
Serial ATA Device: Hitachi HDS725050KLA360, 465.76 GB
Parallel ATA Device: HL-DT-ST DVD-RW GWA-4165B,
USB Device: Hub in Apple Pro Keyboard, Mitsumi Electric,
Up to 12 Mb/sec, 500 mA
USB Device: Apple Pro Keyboard, Mitsumi Electric,
Up to 12 Mb/sec, 250 mA
FireWire Device: unknown_device, unknown_value, Up to 400 Mb/sec
Darwin-dev mailing list (email@hidden)