Re: [syslogd] Performance issue?
site_archiver@lists.apple.com Delivered-To: darwin-dev@lists.apple.com So what can you do other than eagerly await 10.5.6? -- Marc Majka On 22 Oct, 2008, at 03:28, Stéphane Sudre wrote: Here's the scenario: After doing this, I quit the process calling printf. Logs may get lost . Real Memory (in Activity Monitor) can goes up as much as 860 MB. ---------------------------------------------------------------------------------- Total number in stack (recursive counted multiple, when >=5): Sort by top of stack, same collapsed (when >= 5): semaphore_wait_signal_trap 4930 mach_msg_trap 2465 asl_store_compact 2043 asl_store_truncate 342 read 38 write$NOCANCEL$UNIX2003 18 lseek 10 Sample analysis of process 928 written to file /dev/stdout _______________________________________________ Do not post admin requests to the list. They will be ignored. Darwin-dev mailing list (Darwin-dev@lists.apple.com) Help/Unsubscribe/Update your Subscription: http://lists.apple.com/mailman/options/darwin-dev/majka%40apple.com _______________________________________________ Do not post admin requests to the list. They will be ignored. Darwin-dev mailing list (Darwin-dev@lists.apple.com) Help/Unsubscribe/Update your Subscription: http://lists.apple.com/mailman/options/darwin-dev/site_archiver%40lists.appl... Yes, syslogd has some known performance problems in 10.5.5 and earlier versions. We are addressing them in 10.5.6. syslogd is probably getting overwhelmed with messages. It tries to keep the back-end data store (/var/log/asl.db) from growing too large. It monitors the size of the file, and when it grows too large, it deletes messages (oldest first) to shrink the file. This works OK when the rate of messages being received is low, but when things get busy, it hits a couple of problems: The search and deletion are slow, and the task of compressing out "holes" (i.e. fragmentation) are slow. While syslogd is slowly searching, deleting, and compressing (and using lots of CPU in the process), lots of new messages arrive and are buffered in memory. As soon as the search/delete/compress phase is over, syslogd goes into overdrive saving all the new messages keeping the the CPU load high. If it's *really* getting hammered, it doesn't take long to max out the file size, and it all happens again. To fix this, we streamlined the data store code, and let syslogd focus on the task of just saving messages to file. Messages are saved to multiple files (in /var/log/asl). syslogd limits the size of each file - new ones are opened when the previous one gets too large. A helper process (aslmanager) launched as required by launchd is responsible for managing the size of the whole data store. aslmanager will delete messages and optionally save them to a long-term archive. The new data architecture eliminates most of the cost of deletion and fragmentation / compression. - Try to eliminate processes that flood syslogd with messages that you really don't need. You may be able to use client-side filtering using setlogmask() in the code, or using "syslog -c" to adjust the log mask of a process that's logging a lot of messages. - Adjust the filter that syslogd uses to determine which messages it saves in the data store. You can do this using "sudo syslog -c syslogd -X", where X is a cutoff log level. By default, syslogd saves all messages with priority levels in the range 0 (Emergency) to 5 (Notice). To save just messages in the range 0 to 2 (Critical) you would use "-2". Using "syslog -c" to make this adjustment only lasts until a restart. To make the change take effect on every reboot, edit the syslogd plist file: /System/Library/LaunchDaemons/ com.apple.syslogd.plist, and add <string>-c</string> <string>2</string> (or whatever cutoff level you choose) to the ProgramArguments. - Increase the size of the asl data store, so that it doesn't need to run the search/delete/compress operation as frequently. This can be done by setting the "-db_max" command line option in the plist (as above). See the syslogd man page for details. Is the following case a performance issue with syslogd on Mac OS X 10.5.5? I'm dumping a byte buffer (which weights 4.7 MB approximately) into stdout using printf calls. The output looks like the one of hexdump - C. There's a '\n' for each new line. This may be done multiple times. syslogd then eats 100% of CPU and does not seem to want to stop but will after quite a while (more than 5 minutes). In the meantime Console.app is not responding and if it's launched, it does not refresh the Console Messages contents. Time to create new shell window in Terminal.app is bigger than usual. It takes 5-6 seconds. The size of /var/log/asl.db may change or not while syslogd is using 100% CPU. If I run a sample on syslogd, it shows that it's either asl_store_save or asl_store_compact that are taking 99.x% of CPU Time. Sampling process 928 for 3 seconds with 1 millisecond of run time between samples Sampling completed, processing symbols... Analysis of sampling syslogd (pid 928) every 1 millisecond Call graph: 2465 Thread_2503 2465 start 2465 main 2465 klog_in_close 2465 db_archive 2465 asl_store_truncate 2123 asl_store_open 2108 asl_store_compact 1622 asl_store_compact 1576 asl_store_compact 27 fread 27 __srefill 26 read 26 read 1 __srefill 16 fseek 15 _fseeko 8 __sflush 8 write$NOCANCEL$UNIX2003 8 write$NOCANCEL$UNIX2003 6 lseek 6 lseek 1 __bzero 1 __bzero 1 flockfile 1 __spin_lock 1 __spin_lock 3 fwrite$UNIX2003 3 __sfvwrite 3 __sfvwrite 467 asl_store_compact 13 fseek 11 _fseeko 6 __sflush 6 write$NOCANCEL$UNIX2003 6 write$NOCANCEL$UNIX2003 4 lseek 4 lseek 1 _sseek 1 _sseek 1 flockfile 1 __spin_lock 1 __spin_lock 1 funlockfile 1 funlockfile 6 fread 6 __srefill 6 read 6 read 15 asl_store_open 15 asl_store_open 8 fread 7 __srefill 6 read 6 read 1 _sread 1 _sread 1 __memcpy 1 __memcpy 6 fseek 6 _fseeko 4 __sflush 4 write$NOCANCEL$UNIX2003 4 write$NOCANCEL$UNIX2003 1 __bzero 1 __bzero 1 _fseeko 1 asl_store_open 342 asl_store_truncate 2465 Thread_2603 2465 thread_start 2465 _pthread_start 2465 database_server 2465 asl_ipc_server 2465 asl_ipc_server_routine 2465 __asl_server_query 2465 db_query 2465 pthread_mutex_lock 2465 semaphore_wait_signal_trap 2465 semaphore_wait_signal_trap 2465 Thread_2703 2465 thread_start 2465 _pthread_start 2465 db_worker 2465 pthread_mutex_lock 2465 semaphore_wait_signal_trap 2465 semaphore_wait_signal_trap 2465 Thread_2803 2465 thread_start 2465 _pthread_start 2465 launchd_drain 2465 _vprocmgr_log_drain 2465 vproc_mig_log_drain 2465 mach_msg 2465 mach_msg_trap 2465 mach_msg_trap This email sent to majka@apple.com This email sent to site_archiver@lists.apple.com
participants (1)
-
Marc Majka