Cron Job Tarball mySQL Dump File Archives

11 Apr 2009
Posted by m1mic

In a previous post, I walked the user through how to setup a cron job to automatically create a nightly mySQL backup (see http://ejhildreth.com/content/setting-cron-mysql-backups). This is great but has one huge flaw: The database backups keep piling up unless I manually remove them. Since scripting makes the computer work for me, I fired up VIM and created some new shell scripts.

Since I like having multiple backups, I like to keep a rolling three months of them to fall back on. From the command line, I typically use the find command and let it check the dates for me. Using the type f parameter (-type f), I can tell find that I am looking only for regular files. Additionally, I can pass the -mtime parameter to the find command to tell it how many days in the past I want it to look for. If I am looking for files that are older than today's date, I would use -mtime +0. If I am looking for files that are newer than today's date, I would use -mtime -0. The number that follows the -mtime parameter indicates how many days (24 hour time periods from the current time) I want to base my search on. Example:

find -type f -mtime +0

Next, I want to make sure that only the mySQL backup files are in my tarball (.sql file extension). Again, the find command has a parameter that you can pass it that will let you narrow down the results; -name. The -name parameter can take wildcards to really help focus in on the files that I am looking for. To find only .sql files in my directory, I can use *.sql. Example:

find -type f -mtime +0 -name '*.sql'

Now that I can search out the specific files that I am looking for within a specific time frame, I need to package them all together inside of a tarball (tar command). Luckily, the Linux command line allows you to string commands together using a pipe (a vertical line: |). However, the find command and the tar command cannot be strung together by themselves. Find will return each of the filenames it finds (pun intended!) one at a time. Since there is a chance that there is multiple files, using a straight pipe to string the tar command on to the end will confuse it. Tar is looking for the filenames to be passed to it one at a time. Enter xargs. The xargs command will take the output from the find command and pass it to the tar command in a way that it can interpret it and use it correctly.

find -type f -mtime +0 -name '*.sql' | tar -czf filename.tar.gz

(the -czf paramater tells the tar command to create a new archive, zip that archive using gzip, and put the results in its own file.)

Now that I have all the commands in place, I need to put in to a shell script so I can have the system run it automatically. Also, to distinguish one tarball from another, I want to automatically add a timestamp to each file name. Since I am going to run the script as a monthly cron job, I only need to tell the script to look for files older than one days old server time. When the script runs succefully, I like to keep a log of it inside of a text file to easily scan when the cronjob ran.

#!/bin/bash
# Creating a tarball of all files older than 1 day old
# Running this script on a monthly cronjob
cd /exampleDirectory

# Getting the current time to use as a timestamp in the file name
# Using Year|Month|Day format
now_timestamp=`date +%Y%m%d`

# Finding all the .sql files in the specified directory and archiving them in a gzip tarball
# Appending the now_timestamp to the end of the filename
find -type f -mtime +0 -name '*.sql' | tar -czf filename-$now_timestamp.tar.gz

# Getting a more specific timestamp for the log file in case the script is ran more than once on a given day
# Using Year|Month|Day-Hour|Minute|Second format
log_timestamp=`date +%Y%m%d-%H%M%S`

# If the script makes it this far, append the timestamp to the end of a txt file
echo $log_timestamp -- "Tarball Created Successfully" >> /exampleDirectory/exampleFile.txt

Now that I have my script in place, I need to let cron know about it so it can run it each month for me. To update cron, use crontab -e from the command line and enter in the following (depending on your setup, you may have to run as root):

00 00 1 * * /pathToTheShellScript

This is telling cron to run the script at the 00 minute mark, at the 00 hour mark, on the 1st of every month (i.e. at midnight on the first day of every month).