Quite often one wants to run a periodic serial process from a Linux shell, where the process must never run concurrently with itself.
There are many ways to achieve this. One good way is to use flock, but it may not be available on your system of choice. To decide whether we can grab a unique lock for our process we need an atomic operation, and mkdir is a pretty good choice: it either creates the directory or fails because the directory already exists, with nothing in between. mkdir can therefore be used to create a semaphore directory that every invocation of our process checks.
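As a minimal sketch of the idea (not the full script), the snippet below tries to create the same directory twice. The names demo_base and demo_lock are hypothetical and exist only for this illustration.

```shell
# Sketch: mkdir as an atomic "test and set".
# demo_base and demo_lock are hypothetical names used only here.
demo_base=$(mktemp -d)
demo_lock="$demo_base/demo.lock"

# First attempt: the directory does not exist yet, so mkdir succeeds.
if mkdir "$demo_lock" 2>/dev/null; then
    first="acquired"
else
    first="busy"
fi

# Second attempt: the directory already exists, so mkdir fails.
if mkdir "$demo_lock" 2>/dev/null; then
    second="acquired"
else
    second="busy"
fi

echo "first attempt: $first, second attempt: $second"

# Tidy up the scratch directory.
rm -rf "$demo_base"
```

Note that plain mkdir is required here. mkdir -p succeeds even when the directory already exists, so it cannot be used as a lock.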
The script below calls mkdir and, if it successfully creates the directory, writes its own PID to a file inside it. When the script finishes and exits, it removes the directory. We also set traps so that the cleanup still happens for a few common signals (SIGHUP, SIGINT, SIGQUIT and SIGTERM).
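The cleanup-on-exit behaviour can be sketched on its own. The example runs a stand-in "job" in a subshell so that its EXIT trap fires when the subshell ends; demo_base and demo_lock are again hypothetical names.

```shell
# Sketch: remove the lock directory from an EXIT trap.
# demo_base and demo_lock are hypothetical names used only here.
demo_base=$(mktemp -d)
demo_lock="$demo_base/demo.lock"

(
    mkdir "$demo_lock" || exit 1
    # Runs whenever this subshell exits, whatever the exit status.
    trap 'rm -rf "$demo_lock"' EXIT
    # Turn SIGTERM into an ordinary exit so the EXIT trap still runs.
    trap 'exit 143' TERM
    # ... the real work would go here ...
    exit 0
)

# After the subshell has ended, the lock directory should be gone.
if [ -d "$demo_lock" ]; then
    cleaned="no"
else
    cleaned="yes"
fi
echo "lock directory cleaned up: $cleaned"

rm -rf "$demo_base"
```

A trap cannot catch SIGKILL, which is why the script also needs a way to detect a lock left behind by a process that was killed outright.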
If the script cannot acquire the lock because the directory already exists, it works out why. If the process with the recorded PID is still running, it emails a warning that a job may be overrunning unexpectedly. If that process no longer exists, it concludes the job was killed quite rudely, removes the lock directory and emails an alert. In both cases the current invocation exits; once an orphaned lock has been cleared, the job runs at the next invocation.
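The liveness test relies on kill -0, which sends no signal and only reports whether the target process could be signalled. A rough sketch follows, with pid_alive as a hypothetical helper name that does not appear in the script.

```shell
# Sketch: test whether a PID still refers to a running process.
# pid_alive is a hypothetical helper, not part of the original script.
pid_alive() {
    kill -0 "$1" 2>/dev/null
}

# The current shell is certainly running.
if pid_alive "$$"; then
    self="alive"
else
    self="gone"
fi

# Start a short background job, wait for it to finish, then probe its PID.
sleep 0 &
old_pid=$!
wait "$old_pid"
if pid_alive "$old_pid"; then
    old="alive"
else
    old="gone"
fi

echo "current shell: $self, finished job: $old"
```

Two caveats apply. kill -0 also fails when the process belongs to another user and the caller lacks permission to signal it, so the check assumes every invocation runs as the same user. PIDs can also be reused, so in rare cases a stale PID may match an unrelated process.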
You invoke the script with the lock directory name, so you can deliberately run several instances of your process at once, each locked under a different name. This gives you several process 'streams', say A, B and C, where A is locked against all other occurrences of A, B against B, C against C and so on. For example, you may have a process that runs for each customer and that should only be running once per customer at any given moment, while several customers may run simultaneously. Just call the lock script with customerA, customerB etc. as the argument.
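The per-name behaviour can be illustrated with a stripped-down stand-in for the lock function. Here try_lock, attempt and lock_base are hypothetical names; lock_base plays the role that APP_HOME plays in the script.

```shell
# Sketch: independent locks keyed by name.
# lock_base, try_lock and attempt are hypothetical names used only here.
lock_base=$(mktemp -d)

# Succeeds only if no lock with this name exists yet.
try_lock() {
    mkdir "$lock_base/$1.lock" 2>/dev/null
}

# Report the outcome of a lock attempt as a word.
attempt() {
    if try_lock "$1"; then
        echo "acquired"
    else
        echo "busy"
    fi
}

a_first=$(attempt customerA)    # no lock for customerA yet
b_first=$(attempt customerB)    # customerB is independent of customerA
a_second=$(attempt customerA)   # customerA is already locked

echo "customerA: $a_first, customerB: $b_first, customerA again: $a_second"

rm -rf "$lock_base"
```

Each name maps to its own directory, so a lock held for one customer never blocks another.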
This is limited to a single server, and possible improvements would include enhancing it to cope with locking processes running on multiple servers, by using some shared resource (mounted filesystems, database) for the locking semaphore.
#!/bin/bash
#
# locking function: This function must be called
# with <lockdirname> as the FIRST and only argument.
# This is for the bash file-locking mechanism
# exit codes adhere to http://tldp.org/LDP/abs/html/exitcodes.html#EXITCODESREF
lock () {
    USAGE="usage: lock <lockdirname>"
    NOOPTION="You must specify the lockdir name. Exiting"
    [ -z "$1" ] && echo "$USAGE" && echo "$NOOPTION" && exit 64
    EXECUTION=$1
    SUPPORTMAIL=root
    export APP_HOME="$(dirname "$0")"
    [ -z "$APP_HOME" ] && echo "Could not determine base directory - Exiting" && exit 71
    LOCKDIR="${APP_HOME}/$EXECUTION.lock"
    # mkdir is atomic: it either creates the directory or fails because it exists
    if mkdir "$LOCKDIR"
    then
        echo >&2 "$0: successfully acquired lockdir $LOCKDIR at $(date)"
        # Remove LOCKDIR when the script finishes, or when it receives a signal
        trap '{ echo Cleaning up lockdir $LOCKDIR ...; rm -rf "$LOCKDIR"; echo done, exiting; }' EXIT # remove directory when script finishes
        trap "{ echo Caught SIGHUP; exit 129; }" SIGHUP # exit with 128+n
        trap "{ echo Caught SIGINT; exit 130; }" SIGINT # exit with 128+n
        trap "{ echo Caught SIGQUIT; exit 131; }" SIGQUIT # exit with 128+n
        trap "{ echo Caught SIGTERM; exit 143; }" SIGTERM # exit with 128+n
        # put PID of this process into the $LOCKDIR, so we can check this if the next invocation fails to run
        echo $$ > "$LOCKDIR/PID"
    else
        echo >&2 "$0: WARNING! $LOCKDIR present - aborting process - reason follows:"
        PID=$(cat "$LOCKDIR/PID" 2>/dev/null)
        # kill -0 sends no signal; it only tests whether the process can be signalled
        if [ -n "$PID" ] && kill -0 "$PID" 2>/dev/null
        then
            # process is still running
            echo >&2 "$0: REASON - there is a lingering process, $PID"
            echo -e "$0: PID= $PID \n $(ps -lyf "$PID")" | mail -s "Aborting $0 - old process still running (error 1001)" $SUPPORTMAIL
            # NB: exit statuses are 8-bit, so the caller sees 233 (1001 mod 256)
            exit 1001
        else
            # process is not running, but lock file not deleted?
            echo >&2 "$0: REASON - orphan lockdir. Host process $PID is gone, so lockdir will now be deleted."
            echo "$0: Lockdir will be deleted. Process should run at next invocation" | mail -s "Aborting $0 - orphan lockdir (error 1002)" $SUPPORTMAIL
            rm -rf "$LOCKDIR"
            # NB: the caller sees 234 (1002 mod 256)
            exit 1002
        fi
    fi
    # End
    return 0
}