Mark Gritter (markgritter) wrote,
Mark Gritter

syslogd follies

I ran into something at work that makes no sense to me.

We observed that messages to syslog were taking a long time to show up. What triggered our interest was messages from the end of a previous test run showing up the next one, causing spurious failure reports!

As an experiment I pushed 300 messages at syslog fairly rapidly (over the network) and observed that the last message took about 15 seconds to work its way through. But tcpdump showed that the messages were being delivered to the server promptly. Something in the syslog daemon was slowing things up. The culprit turned out to be that syslogd is, by default, configured to do synchronous writes to its log file. So as each message came in, it was written to the file, the daemon waited for the write to complete, and then proceeded to handle the next message.

Synchronous writes make sense if you want to ensure that messages don't get lost due to power failure. When you're logging locally this might make some sort of sense. The case you want to avoid is:

   application  ----->   syslog daemon   ------>  disk buffer   ------>  disk
                                                  M6 M5 M4 M3            M1 M2       
                                             Important message that tells you what
                                             went wrong just before the machine catches 
                                             on fire, now mysteriously disappeared although
                                             the application thought it had been delivered.

But for messages coming in from over the network you can get yourself into this situation:

   remote app  ----> network buffer ---->  syslog daemon   ---->  disk buffer   ---->  disk
                      M6 M5 M4                                         M3              M1 M2       

So when the machine catches fire, you've still lost the message because it was sitting in memory, it was just sitting in a different piece of the machine's memory. Meanwhile you're progressing very, very slowly because you're writingly only one message at a time and waiting for it to make it to disk.

There's a reasonable solution here which is to completely drain the network buffer before doing your synchronous disk write! Then you move M6,M5,M4 to disk in a single operation instead of three operations, while you haven't compromised whatever property it was you were trying to preserve. (To cap it all off, syslog over the network is not reliable anyway, so making very sure that the message makes it to disk is useless.)

But our syslogd doesn't do that, so we may just end up switching it to do asynchronous writes.
Tags: geek
  • Post a new comment


    default userpic

    Your reply will be screened

    Your IP address will be recorded 

    When you submit the form an invisible reCAPTCHA check will be performed.
    You must follow the Privacy Policy and Google Terms of use.