This article presents my software reed-alert, which checks user-defined states and sends user-defined notifications. I made it really easy to use but still configurable and extensible.
reed-alert is not a monitoring tool that produces graphs or stores values. It does a job sysadmins are looking for because there is no alternative product (the alternatives come from very large infrastructure tools like Zabbix, so they are not comparable).
From its configuration file, reed-alert will check various states and, if a check fails, will trigger a command to send a notification (totally user-defined).
This is open-source and free software released under the MIT license. You can get it with the following commands:
git clone git://bitreich.org/reed-alert
cd reed-alert
A README file is available as documentation to describe how to use it, but we will see here how to get started quickly.
You will find a few files there. reed-alert is Common LISP software, and I chose (for good reasons, I hope) to make the configuration file plain Common LISP.
There is a configuration file that looks like a real-world example, named config.lisp.sample, and another configuration file I use for testing, named example.lisp, which contains a lot of cases.
In order to use reed-alert we only need to create a new configuration file and then add a cron job.
We are going to see how to configure reed-alert. You can find more explanations or details in the README file.
We have to configure two kinds of parameters. First, we need to set up a way to receive alerts; the easiest way to do so is to send a mail with the “mail” command. Alerts are declared with the function alert, which takes as parameters the alert name and the command to be executed. Some variables are replaced with values from the probe (see the README file for the list); they look like %date% or %params%.
In Common LISP, a function is called by writing an opening parenthesis before its name; everything up to the matching closing parenthesis is its parameters.
(alert mail "echo 'problem on %hostname%' | mail email@example.com")
One should take care about nesting quotes here.
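To make the substitution and the quoting more concrete, here is a small shell sketch of what happens to such a command string. It is only an illustration: reed-alert performs its own replacement, and the helper expand_template, the host name and the date below are made up for the example.

```shell
#!/bin/sh
# Illustration only: expand %name% placeholders in a command template with sed.
# expand_template is a hypothetical helper, not part of reed-alert, and the
# values passed to it are invented.
expand_template() {
    # $1: template, $2: value for %hostname%, $3: value for %date%
    printf '%s\n' "$1" | sed -e "s|%hostname%|$2|g" -e "s|%date%|$3|g"
}

template="echo 'problem on %hostname% at %date%'"
command_line=$(expand_template "$template" "myhost" "2024-01-01")
echo "$command_line"        # the command line a shell would then run
sh -c "$command_line"       # the single quotes keep the message as one argument
```

The outer double quotes belong to the string and the inner single quotes belong to the shell command, which is the same nesting as in the alert declaration above.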
reed-alert will fork a shell to start the command, so pipes and redirections work. You can be creative when writing alerts that:
- use an SMS service
- run a script to post on a forum
- publish a file on a server
- send text to IRC with the ii client
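As one more idea in the same spirit, an alert command could simply append a line to a log file. The sketch below is hypothetical: log_alert and ALERT_LOG are names chosen for this example and are not part of reed-alert.

```shell
#!/bin/sh
# Hypothetical notifier: append one line per alert to a log file.
# ALERT_LOG and log_alert are illustrative names, not part of reed-alert.
ALERT_LOG="${ALERT_LOG:-$(mktemp)}"

log_alert() {
    # $1: host name, $2: free-form description of the problem
    printf '%s problem on %s: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$1" "$2" >> "$ALERT_LOG"
}

log_alert "myhost" "disk usage above limit"
cat "$ALERT_LOG"
```

Saved as a script, something like this could be called from an alert command with '%hostname%' as its first argument, assuming the script path is adapted to your system.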
Now that we have some alerts, we will configure some checks to make reed-alert useful. It uses probes, which are pre-defined checks with parameters. A probe could be “has this file not been updated for N minutes?” or “is the disk space usage of partition X more than Y?”
I chose to name the function that makes a check “=>”. It isn’t a word, and it suggests a list item or something going forward. Both previous examples, using our previous mail notifier, would look like:
(=> mail file-updated :path "/program/file.generated" :limit "10")
(=> mail disk-usage :limit 90)
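To give an intuition of what a check like “has this file not been updated for N minutes?” amounts to, here is a rough shell equivalent. It is not how the file-updated probe is implemented; is_stale is a hypothetical helper, and the sketch assumes a find that supports -mmin (GNU and BSD find do).

```shell
#!/bin/sh
# Rough shell equivalent of a "file not updated for N minutes" check.
# is_stale is a hypothetical helper, unrelated to reed-alert's own code.
is_stale() {
    # $1: path, $2: limit in minutes; returns 0 if the file is older than that
    [ -n "$(find "$1" -mmin +"$2" 2>/dev/null)" ]
}

workdir=$(mktemp -d)
touch "$workdir/fresh"                   # modified just now
touch -t 200001010000 "$workdir/old"     # modification time set to the year 2000

is_stale "$workdir/fresh" 10 && fresh=stale || fresh=ok
is_stale "$workdir/old" 10 && old=stale || old=ok
echo "fresh=$fresh old=$old"
```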
It’s also possible to use shell commands and check the return code using the command probe, allowing the user to define useful checks.
(=> mail command :command "echo '/is-this-gopher-server-up?' | nc -w 3 dataswamp.org 70" :desc "dataswamp.org gopher server")
We use echo + netcat to check if a connection to a socket works. The :desc keyword will give a nicer name in the output instead of just “COMMAND”.
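Since the command probe looks at the return code, it helps to remember that a shell pipeline reports the status of its last command, so in the example above it is the status of nc that counts. The sketch below shows this with true and false as stand-ins; check_status is a hypothetical helper, and treating 0 as success is the usual Unix convention, assumed here.

```shell
#!/bin/sh
# Stand-ins for a check: `true` exits 0 (success), `false` exits 1 (failure).
# check_status is a hypothetical helper, not part of reed-alert.
check_status() {
    # Run the given command line in a child shell and print its return code.
    sh -c "$1" >/dev/null 2>&1
    echo "$?"
}

ok=$(check_status "echo hello | true")     # last command of the pipeline succeeds
ko=$(check_status "echo hello | false")    # last command of the pipeline fails
echo "ok=$ok ko=$ko"
```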
We have written the minimum required to configure reed-alert. The configuration file now requires only two more lines, so your my-config.lisp file should look like the following:
(load "functions.lisp")
(alert mail "echo 'problem on %hostname%' | mail firstname.lastname@example.org")
(=> mail file-updated :path "/program/file.generated" :limit "10")
(=> mail disk-usage :limit 90)
(quit)
The first line is mandatory to make things work; the last line prevents the Lisp interpreter from staying in the REPL (interactive user input).
Now, you can start it every 5 minutes from a crontab with this:
*/5 * * * * ( cd /home/user/projects/reed-alert/ ; sbcl --load my-config.lisp )
If you prefer to use ecl:
*/5 * * * * ( cd /home/user/projects/reed-alert/ ; ecl --load my-config.lisp )
The time between each run is up to you, depending on what you monitor.
By default, when a check returns a failure, reed-alert will only trigger the associated notifier once it reaches the 3rd failure. It will then notify again when the service is back (the variable %state% is replaced by start or end, so you know whether the problem starts or stops).
This prevents reed-alert from sending a notification each time it checks; most users have absolutely no need for that.
The number of failures before triggering can be modified by using the keyword “:try” as in the following example:
(=> mail disk-usage :limit 90 :try 1)
In this case, you will get notified at the first failure.
The number of failures of each check is stored in files (one per check) in the “states/” directory of the reed-alert working directory.
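The logic of counting failures and notifying only at a threshold can be sketched in a few lines of shell. This is not reed-alert's code (which is Common LISP): record_result, the content of the state files and the messages are assumptions made for the illustration, and so is the choice to announce a recovery only after a failure was announced.

```shell
#!/bin/sh
# Sketch of "notify at the Nth consecutive failure, then again on recovery".
# Not reed-alert's code: record_result, the state file layout and the
# messages are assumptions made for illustration.
STATE_DIR=$(mktemp -d)
TRIES=3

record_result() {
    # $1: check name, $2: return code of the check (0 means success)
    state_file="$STATE_DIR/$1"
    count=0
    [ -f "$state_file" ] && count=$(cat "$state_file")
    if [ "$2" -ne 0 ]; then
        count=$((count + 1))
        echo "$count" > "$state_file"
        # Notify only when the threshold is reached, not on every run.
        [ "$count" -eq "$TRIES" ] && echo "notify: $1 start"
    else
        # Recovery is only announced if a failure was announced before.
        [ "$count" -ge "$TRIES" ] && echo "notify: $1 end"
        echo 0 > "$state_file"
    fi
    return 0
}

# Four failed runs followed by one successful run.
output=$(for rc in 1 1 1 1 0; do record_result disk-usage "$rc"; done)
echo "$output"
```

With a threshold of 3, the four failures produce a single "start" notification and the final success produces a single "end" notification.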