Marc's notes on starting daemons

So you have almost finished your great Linux server application. While you were writing it, it was fine to run it interactively from a terminal and watch the debug output scroll by. But now it is almost time to release it, and your users will want to run it without a controlling terminal: in other words, as a daemon.

The easy (and in my view nasty) way to background a daemon is to simply append an ampersand when running it, which makes the shell do the work. For example:

yourd &

More experienced system administrators might type

nohup yourd < /dev/null &> /dev/null & disown %1

But you can do better.

The simplest improvement is to use the daemon() library function. After this function returns, your process will have detached from the terminal and, provided the nochdir argument was zero, changed its current directory to /.
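As a minimal sketch of such a call (the helper name detach_with_daemon is made up for illustration): it assumes a libc that provides daemon(), which is a BSD extension also found in glibc rather than a POSIX function. Both arguments are 0, so the working directory becomes / and stdin, stdout and stderr are redirected to /dev/null.

```c
#define _DEFAULT_SOURCE 1   /* asks glibc to declare daemon() */
#include <stdio.h>
#include <stdlib.h>         /* daemon() is declared here on some BSD-derived systems */
#include <unistd.h>         /* daemon() is declared here on glibc */

/* Hypothetical helper: detach from the terminal using daemon(3).
 *   nochdir = 0 -> change the working directory to "/"
 *   noclose = 0 -> redirect stdin, stdout and stderr to /dev/null
 * Returns 0 in the detached process, -1 if detaching failed. */
int detach_with_daemon(void)
{
    if (daemon(0, 0) == -1) {
        perror("daemon");
        return -1;
    }
    return 0;
}
```

Note that daemon() forks internally, so getpid() returns a different value after the call than before it.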

Why change the working directory to the root directory? Well, it allows the system administrator to mount and unmount directories without your daemon getting in the way. If you have seen an error message such as

umount: /usr: device is busy

you will be familiar with the problem. The root directory may not always be the best choice - sometimes it is more appropriate to change to the directory containing all your daemon's data files. Then if the sysadmin tries to unmount that directory, the device busy message may be a good reminder that there exists a process which still needs that data.

While the daemon() function is much better than a simple shell ampersand, backgrounding your server process manually allows you more control. And the task isn't that complicated, if you are aware of a convention and some concepts.

The important convention is that a daemon should go into the background only when it is ready to service requests. You may have noticed that when you start a well-written daemon you don't get your prompt back immediately; there is a short delay. So why is this important? Two reasons:

The first reason relates to startup dependencies. Imagine that there exists another daemon which needs your daemon to do its work. If you start your daemon with /etc/rc2.d/S40foo and the dependent daemon as /etc/rc2.d/S41bar, then the initscripts will only start S41bar when S40foo has returned. Now if your daemon backgrounds immediately, then S41bar may already start while your daemon is still parsing its configuration file. So it may happen that S41bar tries to contact your daemon which isn't yet listening. This then manifests itself as a mysterious startup failure of S41bar, the kind that only happens on some systems and only occasionally - a really nasty problem to debug.
The second reason is that error reporting needs to be done properly. Again imagine a daemon which backgrounds immediately. Imagine it being invoked from a shell prompt with the wrong command-line option. It backgrounds immediately, so the sysadmin thinks that everything has gone well. But after backgrounding, the parser notices a problem. Writing to the terminal isn't an option anymore as the daemon has now detached. Oops.

So the trick is to perform as much initialisation as possible before backgrounding. This makes it possible to report any failures to the controlling terminal - directly to the sysadmin starting the server. Of course this doesn't exclude writing failure reports to the system log as well.

There is one complication to delaying backgrounding: the backgrounding step invariably requires a fork() system call, which splits the process into two copies. The child process can detach from the terminal and run the server logic, while the parent exits, which wakes up the shell. The problem is that the child process runs under a different process id. So if your daemon had written a pid file and only then called the daemon() function, the content of the pid file (supposedly the process id of the running daemon) would always be wrong.
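To make the ordering concrete, here is a small sketch of a pid file writer. The function name and the idea of passing the path as a parameter are assumptions for illustration only. The point is simply that it records getpid() at the moment it is called, so it has to run in the child after the last fork().

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: record the calling process's id in `path`.
 * Call it in the child, after the final fork(). Written any earlier,
 * the file would name the parent, which exits once startup is done.
 * Returns 0 on success, -1 on failure. */
int write_pid_file(const char *path)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (f == NULL)
        return -1;
    ok = fprintf(f, "%ld\n", (long)getpid()) > 0;
    if (fclose(f) != 0)     /* fclose() flushes, so a full disk shows up here */
        ok = 0;
    return ok ? 0 : -1;
}
```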

There are several more reasons why you should run your initialisation logic under the same process as the main server logic. So while you want to do the actual detaching from the terminal as late as possible, you would want to run the fork() system call quite early on in the initialisation. In my opinion the point at which a daemon should issue the fork() call is shortly after parsing the command line arguments, but before running any other setup logic.

In summary the rule is: A good daemon should fork soon, but have the parent process exit only once the child is ready to service requests.

One way to do this is to use signals. This method is used by sysklogd - the parent process waits until killed by the child. But signals are easy to get wrong: if you don't register the signal handler strictly before it becomes possible for a signal to arrive, your parent process will not run the proper signal handler. Instead it may crash or just ignore the signal, depending on what the grandparent process (usually the shell) had configured. Another problem: how do error messages get from the child back to the parent and so to the terminal? After all, a signal is only a flag, not a message.
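A rough sketch of the signal variant follows. It is not taken from sysklogd; the function name, the choice of SIGUSR1 as the "ready" signal and the use of SIGCHLD to notice a dead child are all assumptions made for illustration. It installs its handlers and blocks both signals before fork(), then waits with sigsuspend(), which is one way to close the window described above.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t got_ready;   /* set when SIGUSR1 arrives */
static volatile sig_atomic_t got_child;   /* set when SIGCHLD arrives */

static void on_signal(int sig)
{
    if (sig == SIGUSR1)
        got_ready = 1;
    else
        got_child = 1;
}

/* Hypothetical helper. Returns 0 in the child. In the parent it returns the
 * child's pid once the child has sent SIGUSR1, or -1 if fork() failed or the
 * child died before reporting readiness. */
pid_t fork_wait_ready(void)
{
    struct sigaction sa;
    sigset_t block, old, wait_mask;
    pid_t pid;
    int dead = 0;

    /* Install our own handlers first, so the outcome does not depend on
     * dispositions inherited from the shell. */
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR1, &sa, NULL) == -1 ||
        sigaction(SIGCHLD, &sa, NULL) == -1)
        return -1;

    /* Block both signals before fork() so neither can be delivered early. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    sigaddset(&block, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &block, &old) == -1)
        return -1;
    got_ready = got_child = 0;

    pid = fork();
    if (pid <= 0) {                        /* child, or fork() failed */
        sigprocmask(SIG_SETMASK, &old, NULL);
        return pid;                        /* the child inherits the handlers */
    }

    /* Parent: sigsuspend() unblocks and sleeps in one atomic step. */
    wait_mask = old;
    sigdelset(&wait_mask, SIGUSR1);
    sigdelset(&wait_mask, SIGCHLD);
    while (!got_ready && !dead) {
        if (got_child) {
            int status;
            got_child = 0;
            dead = (waitpid(pid, &status, WNOHANG) == pid);
        } else {
            sigsuspend(&wait_mask);
        }
    }
    sigprocmask(SIG_SETMASK, &old, NULL);
    return got_ready ? pid : -1;
}
```

In this sketch the child would call kill(getppid(), SIGUSR1) once it is ready, and the parent would exit with success or failure depending on the return value. The second problem remains: nothing here carries an error message back to the terminal.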

My preferred solution involves creating a pipe between the child and parent process. Early on during startup, the daemon splits into a parent process which listens to the pipe and displays (and possibly logs) whatever it can read from the pipe. The child process ensures that its writing pipe file descriptor becomes the standard error descriptor, so that any error messages generated during setup can easily be relayed to the parent. Once the child process is ready to service requests, it closes its error file descriptor. The parent process notices the end of file condition. Before the parent exits it briefly checks if the child process has terminated abnormally so that it can report a crash; otherwise the parent exits normally to indicate that the daemon is now ready to do work.
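The following is a minimal sketch of the pipe scheme just described, written only to illustrate the idea; it is not the author's own function. The name fork_with_relay, the exit_code out-parameter and the grace period of roughly 100 ms before the final check are assumptions. Unlike a real implementation, the sketch returns in the parent instead of exiting, leaving the exit call to the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Parent side: copy whatever the child writes to our stderr, until EOF. */
static void relay_to_stderr(int fd)
{
    char buf[512];
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n == 0)
            return;                        /* EOF: every write end is closed */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return;
        }
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(STDERR_FILENO, buf + off, (size_t)(n - off));
            if (w <= 0)
                break;                     /* terminal gone: drop this chunk */
            off += w;
        }
    }
}

/* Hypothetical helper. Returns 0 in the child, whose stderr is now the pipe.
 * Returns -1 if pipe() or fork() failed. In the parent it returns the child's
 * pid after EOF on the pipe, with *exit_code set to the status to exit with. */
pid_t fork_with_relay(int *exit_code)
{
    int fds[2];
    pid_t pid;

    if (pipe(fds) == -1)
        return -1;
    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                        /* child */
        close(fds[0]);
        if (dup2(fds[1], STDERR_FILENO) == -1)
            _exit(EXIT_FAILURE);
        if (fds[1] != STDERR_FILENO)
            close(fds[1]);
        return 0;
    }

    close(fds[1]);        /* otherwise the parent would never see EOF */
    relay_to_stderr(fds[0]);
    close(fds[0]);

    /* EOF means the child closed stderr (ready) or died. Assumption: poll for
     * about 100 ms, since a dying child closes its descriptors slightly
     * before waitpid() can report it. */
    *exit_code = EXIT_SUCCESS;
    for (int i = 0; i < 10; i++) {
        int status;
        struct timespec ts = { 0, 10 * 1000 * 1000 };
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid && !(WIFEXITED(status) && WEXITSTATUS(status) == 0))
            *exit_code = EXIT_FAILURE;
        if (r != 0)
            break;
        nanosleep(&ts, NULL);
    }
    return pid;
}
```

A caller would exit with *exit_code when the return value is positive, and with a failure status when it is -1. When it is 0 (the child) the caller finishes its setup, writes any complaints to stderr, and calls close(STDERR_FILENO) once it is ready to serve.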

I have used this approach for a couple of years and seem to have had good results. Of course, there is the chance it may have deficiencies - please let me know if you spot any. Below is the C function which implements the logic I have described. I call it fork_parent() because when it returns the process has acquired a parent which disappears as soon as stderr is closed with fclose() or STDERR_FILENO is closed with close().

I really like Free Software and usually release my code under the GPL, but in this case I am prepared to make an exception and release the C fork-parent code to the public domain, disclaiming all warranties and guarantees. Credit me if you like, but this isn't mandatory. This note itself is released under a Creative Commons share-alike license.
