21 Apr 2009

Backup using encryption and gmail

Today I was a bit bored, so I wrote a script to back up my personal svn repository. I use my svn server for my personal projects and to keep a history of my personal documents, so I don't want to lose the data in my repository.

The repository is already on a RAID1, but I don't think this is enough. I use Google Apps for hosting my mail (it's GMail) and this gives me more disk space than I could use on ordinary email, so I've decided to use it for my backups, both because I have the space and because I assume Google has a better backup system than anything I could afford.

I've written a script that compresses and encrypts the content of a directory; the resulting archive is then emailed to my backup email account.

The script is set to run once a day, but the email is only sent if something has changed since the last backup was sent. This is to save space.
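
As an aside, the same "only send when something changed" check can be done without keeping the whole previous tar archive around, by storing just a checksum. This is a minimal sketch and not what my script does: the function name archive_changed and the state file argument are made up for illustration, and it assumes sha256sum from GNU coreutils is installed.

```shell
#!/bin/bash
# archive_changed ARCHIVE STATE_FILE
# Returns 0 if ARCHIVE differs from the checksum stored in STATE_FILE
# (and then updates STATE_FILE), 1 if nothing has changed, and 2 if the
# checksum could not be computed.
# Both names are hypothetical; they are not part of the script below.
archive_changed() {
    local archive="$1" state_file="$2" new_sum old_sum=""
    new_sum=$(sha256sum "$archive") || return 2
    new_sum=${new_sum%% *}          # keep only the hash, drop the file name
    if [ -f "$state_file" ]; then
        old_sum=$(cat "$state_file")
    fi
    if [ "$new_sum" = "$old_sum" ]; then
        return 1
    fi
    echo "$new_sum" > "$state_file"
    return 0
}
```

The trade-off is that only a few bytes of state are kept between runs instead of a full copy of the archive.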

I encrypt the archive because email isn't all that private. The data isn't sensitive, but I would still like to minimize the risk of it falling into other people's hands.

The script splits the backup into 20MB pieces because of the size limit on Gmail's SMTP server.
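
To make the splitting step concrete, here is a small sketch of cutting a file into fixed-size pieces and counting them. The helper name split_and_count is made up for this example, and it assumes GNU split, whose default suffixes are aa, ab, ac and so on.

```shell
#!/bin/bash
# split_and_count FILE SIZE
# Splits FILE into pieces of at most SIZE (anything GNU split's -b accepts)
# named FILE_aa, FILE_ab, ... and prints how many pieces were written.
# Hypothetical helper, shown for illustration only.
split_and_count() {
    local file="$1" size="$2" piece count=0
    split -b "$size" "$file" "${file}_" || return 1
    for piece in "${file}"_[a-z]*; do
        # Skip the literal pattern when no piece matched at all.
        if [ -e "$piece" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

With GNU split, a size of 20MB means 20 x 1000 x 1000 bytes, while 20M means 20 x 1024 x 1024 bytes.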

Well, enough text, here is my script:

#!/bin/bash

mkdir -p ${HOME}/tmp

DATE=`date +%Y%m%d-%H:%M:%S`
BACKUP_DIR="svn" # Name of directory to backup, NOT PATH.
PARENT_DIR="/home" # Path to the parent dir for the backup directory

BASE_FILENAME="${HOME}/tmp/svn_backup" # Base path for the working files for this backup
NEW_TAR_FILENAME="${BASE_FILENAME}.tar" # Working path to the tar archive
OLD_TAR_FILENAME="${BASE_FILENAME}_old.tar" # Path to the old tar archive, this will be used to see if changes have happened since last time.
NEW_LZMA_FILENAME="${BASE_FILENAME}-${DATE}.tar.lzma" # Path to the lzma-compressed archive of the backup.
NEW_GPG_FILENAME="${NEW_LZMA_FILENAME}.gpg" # Path to the gpg encrypted file containing the backup

EMAIL="target@domain.com" # Target email for our backup

SPLIT_SIZE="20MB" # How big do we want the pieces to be? I chose 20MB since that's the maximum file size gmail allows.

# Change dir to the parent dir.
cd "${PARENT_DIR}" || exit 1 # Stop if the parent dir is missing
echo "Generating tar-archive"
tar cpf "${NEW_TAR_FILENAME}" "${BACKUP_DIR}" # Name the archive explicitly with f

if [ -f ${OLD_TAR_FILENAME} ]; then
	# If an old copy of the archive exists, check if the new one is different
	echo "Old tar-archive exists."
	diff -q ${NEW_TAR_FILENAME} ${OLD_TAR_FILENAME} && \
		echo "They are the same, nothing has changed stopping." && \
		rm -f "${NEW_TAR_FILENAME}" && exit 0
fi

echo "Compressing backup with lzma: `date`"
lzma -z -9 - < ${NEW_TAR_FILENAME} > ${NEW_LZMA_FILENAME}

echo "Encrypting backup with gpg: `date`"
gpg -r "m_abs@mabs.dk" -e - < ${NEW_LZMA_FILENAME} > ${NEW_GPG_FILENAME}
rm -v ${NEW_LZMA_FILENAME}

echo "Splitting encrypted file: `date`"
split -b ${SPLIT_SIZE} ${NEW_GPG_FILENAME} "${NEW_GPG_FILENAME}_"
rm -v ${NEW_GPG_FILENAME}

# How many pieces are there? This is needed because I want to write it in the email.
c=0;
for filename in ${NEW_GPG_FILENAME}_[a-z]*; do
	let "c=${c}+1";
done

# Send every piece of this backup to the target email.
i=0;
for filename in ${NEW_GPG_FILENAME}_[a-z]*; do
	let "i=${i}+1";
	echo "Sending email ${i}: `date`"
	echo "SVN backup - ${DATE}" | mutt -s "SVN backup part ${i}/${c} - ${DATE}" ${EMAIL} -a ${filename}
	rm -v ${filename}
done

mv -v ${NEW_TAR_FILENAME} ${OLD_TAR_FILENAME}
exit 0

Save it to a file, place the file in /etc/cron.daily (remember to name the file according to the run-parts specification, see man run-parts) and make it executable.
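
The naming rule is easy to trip over; a name ending in .sh, for example, is silently skipped. As a rough sketch, assuming the Debian-style run-parts default (names may contain only ASCII letters, digits, underscores and hyphens), a name can be checked like this. The function name is made up, and your run-parts may follow different rules, so the man page has the final word.

```shell
#!/bin/bash
# valid_run_parts_name NAME
# Returns 0 if NAME is non-empty and contains only ASCII letters, digits,
# underscores and hyphens (the assumed Debian-style run-parts default).
# Hypothetical helper, for illustration only.
valid_run_parts_name() (
    LC_ALL=C    # keep the character ranges strictly ASCII (subshell only)
    [[ "$1" =~ ^[A-Za-z0-9_-]+$ ]]
)

# Example: report whether two candidate names would be accepted.
for name in svn-backup svn_backup.sh; do
    if valid_run_parts_name "$name"; then
        echo "$name: ok"
    else
        echo "$name: would be skipped"
    fi
done
```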

For this script to work I needed two packages installed on my Gentoo installation: mutt and ssmtp.

I'm using Gmail's SMTP server to send the emails. I've created a separate email account for this so that I don't have to use my everyday account. This is my /etc/ssmtp/ssmtp.conf:

# The person who gets all mail for userids < 1000
# Make this empty to disable rewriting.
root=postmaster

# The place where the mail goes. The actual machine name is required
# no MX records are consulted. Commonly mailhosts are named mail.domain.com
# The example will fit if you are in domain.com and your mailhub is so named.
mailhub=mail

root=sender@domain.com
mailhub=smtp.gmail.com:587
rewriteDomain=
hostname=sender@domain.com
UseSTARTTLS=YES
AuthUser=sender@domain.com
AuthPass=password
FromLineOverride=YES

To encrypt in this script I use GPG and a self-generated key.
To generate the key, run the command "gpg --gen-key" and follow the wizard. Then export the public key from the machine you created it on and import it on the server that is going to run the script.
Exporting the key:

host# gpg --list-keys
/home/user/.gnupg/pubring.gpg
-----------------------------
pub   1024D/112A2538 2009-04-21
uid                  Firstname Lastname <user@domain.com>
sub   4096g/AEFA304A 2009-04-21
host# gpg --export "Firstname Lastname <user@domain.com>" > key.pub

Once you have transferred the key to the server, import it and set the trust level of the key to ultimate, or the backup script will stop and wait for you to confirm that you really trust the key.

server# gpg --import key.pub
server# gpg --edit-key user@domain.com
Command> trust
Please decide how far you trust this user to correctly verify other users' keys
(by looking at passports, checking fingerprints from different sources, etc.)

  1 = I don't know or won't say
  2 = I do NOT trust
  3 = I trust marginally
  4 = I trust fully
  5 = I trust ultimately
  m = back to the main menu

Your decision? 5
Do you really want to set this key to ultimate trust? (y/N) y
Command> quit
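
If you would rather not go through the interactive menu, gpg can also read trust settings from standard input with --import-ownertrust. Each record is the key's fingerprint (40 hex characters, no spaces), a colon, the trust level and another colon, where 6 stands for ultimate trust. A minimal sketch; the function name and the FPR variable in the usage line are placeholders, not real values:

```shell
#!/bin/bash
# ownertrust_record FINGERPRINT
# Prints one record in the format read by gpg --import-ownertrust,
# marking the key as ultimately trusted (level 6).
# Spaces in the fingerprint are removed first.
# Hypothetical helper, for illustration only.
ownertrust_record() {
    local fingerprint="${1// /}"
    printf '%s:6:\n' "$fingerprint"
}

# Usage (FPR is a placeholder, take the real value from `gpg --fingerprint`):
#   ownertrust_record "$FPR" | gpg --import-ownertrust
```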

Restoring the backup:
Get the files from the emails; this can take a while if there are many pieces. Then join the pieces, decrypt the result with your private key and unpack it:

host# basename=svn_backup-20090421-20:20:33.tar.lzma.gpg

host# for filename in ${basename}_[a-z]*; do
cat $filename >> $basename
done

host# gpg -d < ${basename} | lzma -d | tar xvpf -

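This should do the trick. I use a for loop rather than just "cat $basename* >> $basename" because with a Danish locale "aa" is treated as å, the last letter of the Danish alphabet, and this messes up the file order for cat. Note that the shell sorts wildcard matches by the current locale too, so if the pieces still come out in the wrong order, run the commands with LC_ALL=C set.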
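
For completeness, here is a sketch of joining the pieces in a way that does not depend on the locale. It assumes the pieces follow the naming my script produces (the base name plus _aa, _ab and so on); the function name join_pieces is made up. The function body runs in a subshell with LC_ALL=C, which makes the shell sort the wildcard matches in plain byte order.

```shell
#!/bin/bash
# join_pieces BASENAME
# Concatenates BASENAME_aa, BASENAME_ab, ... into BASENAME in byte order.
# The ( ... ) body is a subshell, so LC_ALL=C does not leak to the caller.
# Hypothetical helper, for illustration only.
join_pieces() (
    base="$1"
    LC_ALL=C                  # byte-order sorting for the glob below
    : > "$base"               # start from an empty output file
    for piece in "${base}"_[a-z]*; do
        [ -e "$piece" ] || continue   # no pieces found: nothing to do
        cat "$piece" >> "$base"
    done
)
```

Because the pattern requires an underscore and a letter after the base name, the output file itself is never picked up as a piece.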

