python-twisted log rotation

Warning: I just learned how to spell python-twisted today, so it’s entirely plausible I’m advocating bonghits below. If you have real answers, especially for developers, please let me know.

Let’s say you’re a sysadmin deploying or a developer writing a twisted app. It’s likely the app is writing a bunch of logs somewhere. The problem is that the default logging settings for twistd are set to write an unbounded number of log files, so over time, it’s possible to fill up your filesystem with old logs.

As a lazy developer (or more likely lazy bug fixer who is just trying to fix something and move on with life without having to learn yet another huge framework because let’s face it, you have gtk, glib, firefox, and kernel bugs that need fixing), you google and find the quick basic help on logging in twisted. You read:

If you are using twistd to run your daemon, it will take care of calling startLogging for you, and will also rotate log files.

So you think “top banana”, and merrily invoke your program with:

from twisted.python.release import sh
sh("/usr/bin/twistd --nodaemon --logfile=%s" % "myapp.log")

And you’re done and it’s time to go to the pub. Not so fast, Speedracer. Let’s take a look at the defaults (I checked twisted 10.0 and 11.0):

137	class LogFile(BaseLogFile):
138	    """
139	    A log file that can be rotated.
141	    A rotateLength of None disables automatic log rotation.
142	    """
143	    def __init__(self, name, directory, rotateLength=1000000, defaultMode=None,
144	                 maxRotatedFiles=None):
234	class DailyLogFile(BaseLogFile):
235	    """A log file that is rotated daily (at or after midnight localtime)
236	    """

See that maxRotatedFiles=None? It means you will eventually hit -ENOSPC, and pandas will be sad.

A little more digging, and reading through the twisted application framework howto, you get the hint on how to modify the default logging behavior. The example says:

The logging behavior can be customized through an API accessible from .tac files. The ILogObserver component can be set on an Application in order to customize the default log observer that twistd will use.

Ok, so you look at the example, and then you say to yourself, that’s pretty good, but Sidnei da Silva could improve it, and then you bug Sidnei for a few hours, asking him rudimentary questions about whether the whitespace in Python really matters or not, and then he politely gives you the following rune (all the ugliness below comes from me, anything elegant comes from Sidnei):

from twisted.application.service import Application
from twisted.python import log
from twisted.python.logfile import DailyLogFile

application = Application("foo")

And that kinda works without any fancy observers or anything, but then you realize that your codebase has the following pattern repeated across many hundreds of source files:

from twisted.python import log
log.msg("unique message 1 out of 47890478932423987234")

[This is the point where I just gave up trying to learn twisted in 45 minutes or less, and also wanted to stress eat a cheeseburger that is 180% beef, 120% bacon while forgetting everything I thought I knew about how percentages worked.]

You think to yourself, “this is Linux, and I wouldn’t be using this OS unless my stubbornness rating was +8” so you go bash around on google for a while before remembering that Linux has a tool to take care of all this for you already called logrotate. So you stop trying to wrestle with twisted and beg your downstream packagers to fix it for you. Example below.

Luckily, twisted’s built-in log rotation mechanism matches up with what logrotate expects, namely, that when it rotates log files, it will do things like:

foo.log -> foo.log.1
foo.log.1 -> foo.log.2

So, all you have to do is teach logrotate about your application’s logfiles, and you are done. The beauty is, all the standard logrotate commands Just WorkTM.

On debian and Ubuntu, you can drop a new file into /etc/logrotate.d/. Not sure how the other distros package it.

Here’s an example file:

# cat /etc/logrotate.d/myapp
/var/log/myapp.log {
        rotate 0

This says:

  1. Pay attention to this file, /var/log/myapp.log. twisted will try and rotate it to /var/log/myapp.log.1, myapp.log.2, etc.
  2. We don’t want many log files around. In fact, we will only keep around myapp.log and myapp.log.1. Anything older gets auto-deleted
  3. Don’t try to create myapp.log. Let the myapp create it on its own
  4. Important! Don’t rename myapp.log to myapp.log.0. Rather, just copy it to myapp.log.1, and then clear out myapp.log so it’s empty. This is good because myapp might assume that its file descriptor is still valid and continue trying to write to it. Without this command, myapp might lose data because it’ll assume the original log file is still around and continue to write to it, even though it’s gone. It’s a little dangerous because it’s possible that you might lose some logged data between the time that logrotate copies the new log file and truncates the old file. Caveat administrator.
  5. Dear logrotate, please do not puke if you do not find myapp.log, it’s ok, I’ll give you a hug.

And now, twisted and logrotate are playing nicely with each other. On Ubuntu, logrotate runs as a daily cronjob, so twisted shouldn’t get too far ahead by creating too many extra log files. Of course, if it does, you can just create an hourly cronjob or something even more special, but probably the real answer is to discover why myapp is creating more than 1MB of logs so quickly.

One last tip, you can experiment with the behaviors above by playing with twisted and logrotate at the command line, without needing a reboot.

To force twisted to rotate its log files:

# kill -SIGUSR1 

To check what logrotate will then do:

# logrotate -f -d /etc/logrotate.d/myapp

Beware, the above doesn’t actually do anything, it just pretends. So to make it actually do stuff, but with verbose output:

# logrotate -f -v /etc/logrotate.d/myapp

And that, friends, is a thousand words on how to solve what should be a pretty simple problem but turns out to be way harder than necessary because you don’t know yet another framework. Good luck. As always, I recommend moAR bacon.