Fair points. And thanks, by the way, for actually advancing the discussion. Init...

Fair points. And thanks, by the way, for actually advancing the discussion.

Init can and does manage processes. Somewhat crudely, mostly via the 'respawn' directive. One thing it isn't particularly good at is telling if a process is doing something useful (say, serving out web pages successfully), but it will let you know that it's running. There was a semi-popular hack some years back to run sshd out of init (via respawn) to ensure you always had an SSH daemon on your box (Dustin mentions this). The downside is that while it will ensure sshd is running, it doesn't give you much flexibility over the process (you've got to edit inittab and 'init q' to make changes).

What monit and kin can do, above and beyond process-level monitoring, is check that the service attributes of a process are sane. That a webserver, say, kicks out a 200 OK response rather than a 4## or 5## error, and restart the service if this isn't the case. Checking for correct operation can be more useful than simply verifying a process is running (though going too far overboard in defining "correctness" can also cause problems).

For realtime/HA tools, attacking things on the single-system level is probably the wrong way to roll. You want a load balancer in front of multiple hosts with response detection -- is host A still up or not? Whether or not this ties into mitigation (restart) or alerting (notifications to staff) is another matter.

There are also places other than init you can watch things from. /proc contains within it multitudes, including a lot of interesting/useful process state. Daemons can be written with control/monitoring sockets instrumented directly into themselves. Debuggers, strace, ltrace, dtrace, and systemtap all provide resolution inside a running process/thread. Creating something sane, effective, efficient, and sufficient out of all these tools ... interesting problem.