Simple shell script replacing Pingdom for checking domain health
Pingdom is a service that automatically checks the health of your domains. It returns up/down information, response times, status codes and more. If something goes wrong it sends you notifications. Here I’d like to show a simple shell script that I’ve been using that replicates some of that functionality and notifies by email when something goes wrong.
Background/Why
After a while of “doing web” you usually end up with dozens of online assets and web addresses. Having a way to automatically check these domains and addresses for show-stopping errors at a regular interval is highly beneficial.
I don’t need most of the features that Pingdom offers, so I opted for a custom-built solution for all my self-hosted domains. In my experience TLS certificate problems are quite common, especially when multiple websites share the same server/VPS and you use Let’s Encrypt for automatic renewal. An automatic check lets me notice and fix such problems before they keep a domain down for long.
The script
The script below works by fetching every address in $DOMAINS with curl and checking the curl exit code. Once all addresses have been checked, it sends a single email listing every domain whose exit code was non-zero (failed), along with that exit code.
#!/bin/bash

# replace with the address that should receive the error report
ERROR_EMAIL="[email protected]"
DOMAINS=(
    "https://tedeh.net"
    "https://www.tedeh.net"
)
ERRORED_DOMAINS=()

# joins all remaining arguments, using the first argument as the separator
function join_by { local IFS="$1"; shift; echo "$*"; }

# prints the timestamp used at the start of every log line
function now { date +"%Y-%m-%d %H:%M:%S"; }

for DOMAIN in "${DOMAINS[@]}"
do
    # fetch the address and discard the body; only the exit code matters
    curl --silent --output /dev/null "$DOMAIN"
    STATUS="$?"
    echo "$(now) $DOMAIN curl-exit-code=$STATUS"
    if [ "$STATUS" -ne 0 ]; then
        ERRORED_DOMAINS+=("$DOMAIN curl-exit-code=$STATUS")
    fi
done

if [ ${#ERRORED_DOMAINS[@]} -eq 0 ]; then
    echo "$(now) No errors found"
else
    echo "$(now) Errors found, emailing $ERROR_EMAIL"
    # one failed domain per line in the email body
    BODY=$(join_by $'\n' "${ERRORED_DOMAINS[@]}")
    echo "$BODY" | mail -E -s "check_domains error" "$ERROR_EMAIL"
fi
This script will send an email for TLS certificate problems and other connection errors. Certificate problems include common name mismatch, expired certificate, etc. This is quite basic but at least for me covers most of what I’d like to know. Note that curl exits with 0 as long as it receives an HTTP response, so a 404 or 500 page is not reported as an error. In a future version it would be good to also assert a 2xx HTTP status code if curl succeeds.
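The 2xx assertion could be sketched roughly like this. It is not part of the published script: the helper names is_2xx and check_url are made up for illustration, and the sketch relies on curl's --write-out '%{http_code}' option, which prints 000 when no HTTP response was received.

```shell
# Succeeds (returns 0) only if the given HTTP status code is in the 2xx range.
is_2xx() {
    case "$1" in
        2[0-9][0-9]) return 0 ;;
        *)           return 1 ;;
    esac
}

# Fetches a URL and prints "<curl exit code> <HTTP status>".
# The HTTP status is 000 when curl could not complete the request.
check_url() {
    local url="$1" http_code exit_code
    http_code=$(curl --silent --output /dev/null --write-out '%{http_code}' "$url")
    exit_code=$?
    echo "$exit_code $http_code"
}

# Example usage inside the loop (bash; needs network access, so left commented out):
# read -r STATUS HTTP_CODE < <(check_url "$DOMAIN")
# if [ "$STATUS" -ne 0 ] || ! is_2xx "$HTTP_CODE"; then
#     ERRORED_DOMAINS+=("$DOMAIN curl-exit-code=$STATUS http-status=$HTTP_CODE")
# fi
```

A shorter alternative is curl's --fail flag, which makes curl exit with code 22 for HTTP responses of 400 and above, although that lets redirects (3xx) pass as successful.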
A full list of what the curl exit codes mean can be found in the “EXIT CODES” section of the curl man page.
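To make the emails easier to read, the most common exit codes could be translated into short messages. The function below is only a sketch: the name curl_exit_message is made up, the wording paraphrases the curl man page, and only a handful of codes I run into most often are covered.

```shell
# Prints a short human-readable description of a curl exit code.
# Only a few common codes are listed; everything else falls through
# to a generic message.
curl_exit_message() {
    case "$1" in
        0)  echo "OK" ;;
        6)  echo "could not resolve host" ;;
        7)  echo "failed to connect to host" ;;
        28) echo "operation timed out" ;;
        35) echo "TLS/SSL handshake failed" ;;
        52) echo "empty reply from server" ;;
        60) echo "TLS certificate could not be verified" ;;
        *)  echo "curl exit code $1 (see the curl man page)" ;;
    esac
}

# Example usage inside the loop:
# ERRORED_DOMAINS+=("$DOMAIN $(curl_exit_message "$STATUS")")
```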
The script is published as a GitHub Gist that you can edit/fork too!
Running schedule
Pingdom checks domains every 30 minutes. Since this script is meant to be run with cron, anyone is free to set their own schedule. The easiest option might be to put the script in /etc/cron.hourly, turn on the executable flag and have it run automatically every 60 minutes.
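Installing the script into the hourly directory could look something like the sketch below. The file name check_domains.sh and the helper install_hourly are assumptions for illustration. The extension is stripped because, on Debian-based systems at least, run-parts skips files whose names contain a dot.

```shell
# Copies a script into a cron directory with the executable flag set.
# $1: path to the script, $2: target directory (defaults to /etc/cron.hourly)
install_hourly() {
    local script="$1" cron_dir="${2:-/etc/cron.hourly}" name
    name=$(basename "$script")
    name="${name%.sh}"    # drop the ".sh" extension so run-parts picks the file up
    install -m 755 "$script" "$cron_dir/$name"
}

# Example usage (run as root; check_domains.sh is an assumed file name):
# install_hourly ./check_domains.sh
```

For a tighter schedule, a regular crontab entry with an expression such as */30 * * * * would run the script every 30 minutes instead.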
The overhead of running these checks is minimal even with hundreds of domains, so a tighter schedule should not be a problem for most servers.
Improvements
There are plenty of ways this code could be improved.
- Error on a non 2xx HTTP status code (also reported over email)
- Automatic translation of curl exit codes
- State (send ONE down email when a domain goes down and ONE up email when it comes back up)
- Fancy web interface
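As a rough idea of how the state item could work, the sketch below keeps one marker file per domain that is currently down and reports only transitions. Everything here is hypothetical: the STATE_DIR location, the function name state_transition and the marker file naming are my own choices, not part of the published script.

```shell
# Directory holding one marker file per domain that is currently down.
# The default location is an assumption; override STATE_DIR as needed.
STATE_DIR="${STATE_DIR:-/var/tmp/check_domains}"

# Prints "down" the first time a domain fails, "up" the first time it
# succeeds again, and nothing when the state is unchanged.
# $1: domain/address, $2: curl exit code
state_transition() {
    local domain="$1" status="$2" marker
    # turn the address into a safe file name, e.g. https___tedeh.net
    marker="$STATE_DIR/$(printf '%s' "$domain" | tr -c 'A-Za-z0-9._-' '_')"
    mkdir -p "$STATE_DIR"
    if [ "$status" -ne 0 ]; then
        if [ ! -e "$marker" ]; then
            touch "$marker"
            echo "down"
        fi
    elif [ -e "$marker" ]; then
        rm -f "$marker"
        echo "up"
    fi
}

# Example usage inside the loop:
# CHANGE=$(state_transition "$DOMAIN" "$STATUS")
# [ -n "$CHANGE" ] && ERRORED_DOMAINS+=("$DOMAIN is $CHANGE curl-exit-code=$STATUS")
```

With this in place an email would only go out when a domain changes state, instead of on every run while it stays down.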
Do you have any ideas on how to improve the script? Please comment below or edit the GitHub Gist.