Tuesday, June 21, 2011

My Backup script for Mac

Recently I bought a 1-terabyte external disk to replace my old 250 MB disk.

The first thing I did was format the disk and create a single partition to store all my information. Since I wanted to use the disk with both Mac and PC, I formatted it as NTFS. I also acquired a Mac utility that supports writing to NTFS.

The disk worked fine until I tried to use backup tools for Mac. Most of them require a disk in a Mac format, but mine is NTFS. One solution would be to create two partitions, one for Mac and one for PC, but the Mac partition might not be readable from the PC, and I still want compatibility with both systems.

My backup requirements for Mac were simple:
  • Back up only data files
  • Keep only a snapshot of the data as of the time the backup is made
  • Do not back up the OS or apps. In case of failure, those can be restored from the installation disk and packages.

I was about to reformat the disk with two partitions when I realized I already had 500MB of information on it. I risked losing part of that information because I had no other disk to hold it during the format operation.

So I added another requirement:
  • The backup files must be readable from both systems, in case I want to review them on either one.

This simplifies the backup, since I’m not really interested in keeping versions of files.

I started looking for scripts and backup ideas, but none of those I found satisfied me entirely.

In the end I decided to create my own. The easiest approach seemed to be the RSYNC command to copy my Mac’s data.

Initially, my backup script only ran RSYNC on the folder I wanted to back up, creating a copy of it on the external disk.

The first time, it took hours to back up all the files.

The second time, it again took hours.

No matter which files had changed, RSYNC always took hours to finish.

I didn’t investigate deeply, but I’m convinced the RSYNC command can’t properly handle files stored on an NTFS volume: whenever RSYNC started, it detected many files as changed even though not a single change had been made.

Disappointed by RSYNC, I created my own script to sync files.

The operations required of the script:

  1. Receive 2 folders, source and destination
  2. Validate those folders
  3. Compare current files to detect differences
  4. Copy from source to destination the newly created files
  5. Delete from destination the deleted files
  6. Update destination if changes were made to source

All of this can be done with existing BASH commands.

The script in review

The basic script usage is:

./myBackup.sh sourceDir destDir

or

./myBackup.sh sourceDir destDir X

Given the command, the script starts. We will later discuss the functions.

First validate the parameters

validateSource "$1"
...
validateDest "$2"

If source does not exist, the script won’t continue.
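The snippet above elides what happens between the two calls. As a rough sketch only, one way a script could stop on an invalid source is a helper that reports failure through its return status. The name requireSource is hypothetical and not part of the original script.

```shell
# Hypothetical helper (not from the original script): returns 0 and stores
# the directory in the global variable "source" when "$1" is an existing
# directory; returns 1 and leaves "source" empty otherwise.
requireSource() {
    source=""
    if [ -z "$1" ]; then
        echo "Usage: $0 <sourceDir> [<destinationDir>]" >&2
        return 1
    fi
    if [ ! -d "$1" ]; then
        echo "\"$1\" directory does not exist." >&2
        return 1
    fi
    source=$1
}

# In a main script the guard could then be a single line (left as a
# comment here so the sketch can be loaded without ending the shell):
# requireSource "$1" || exit 1
```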

The list of files from both directories is retrieved using the FIND command.

cd "$source"
find -s * > "$currentDir/__source__.lis"
...
cd "$dest"
find -s * > "$currentDir/__dest__.lis"
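Two details of this listing are worth knowing. The -s (sorted) option belongs to the BSD find shipped with Mac, and the * pattern is expanded by the shell, so hidden entries at the top level are left out. The sketch below is an assumption-laden alternative rather than the script's own code: it lists everything under a directory with "find ." and sorts the result with sort, so both lists come out in the same order. The helper name listTree is hypothetical.

```shell
# Hypothetical helper: print every file and directory below "$1",
# relative to "$1", one per line, in byte-wise sorted order.
listTree() {
    (
        cd "$1" || exit 1
        # "find ." also reports hidden entries, which "*" would skip;
        # sed strips the leading "./" so names look like the originals.
        find . -mindepth 1 | sed 's|^\./||' | LC_ALL=C sort
    )
}

# Possible use, mirroring the two lists built above:
# listTree "$source" > "$currentDir/__source__.lis"
# listTree "$dest"   > "$currentDir/__dest__.lis"
```

Because a directory name is always a prefix of the paths inside it, each directory still appears before its contents in the sorted list. File names containing newlines would break this line-based approach, as they would the original.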

Using the DIFF command we can get the differences.

diff __source__.lis __dest__.lis > __diff__.lis

Now the __diff__.lis file holds the list of differences. Reading that file, we can process the first two synchronization operations: deleting removed files and copying new ones.

I had problems performing both operations in a single pass over the file, so I read the differences file twice: once to delete files and once to copy them.

First loop is to delete files. The files deleted from source are deleted from destination in this loop.

while IFS= read -r line
do
    # < Copy file from source to dest
    # > Delete file from dest
    if [ "${line:0:1}" == ">" ]
    then
        delDest "$dest" "${line:2}"
    fi
done < __diff__.lis

Second loop is to copy files. The files created in source and absent in destination are copied in this loop.

while IFS= read -r line
do
    # < Copy file from source to dest
    # > Delete file from dest
    if [ "${line:0:1}" == "<" ]
    then
        copySource2Dest "$source" "$dest" "${line:2}"
    fi
done < __diff__.lis
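Both loops rely on the prefixes DIFF puts in front of each line: "< " marks a line found only in the first file (the source list) and "> " a line found only in the second (the destination list). The remaining lines, such as "2d1" or "---", only describe the change. To make that mapping visible, here is a small sketch that prints the action each line implies without touching any file. The helper name classifyDiff is hypothetical.

```shell
# Hypothetical helper: read "diff" output on stdin and print the action
# each line implies. Nothing is copied or deleted.
classifyDiff() {
    while IFS= read -r line; do
        case "$line" in
            "< "*) echo "COPY ${line#< }" ;;    # only in the source list
            "> "*) echo "DELETE ${line#> }" ;;  # only in the destination list
            *) ;;                               # change commands and separators
        esac
    done
}

# Example with two small lists:
# printf '%s\n' a.txt b.txt c.txt > __source__.lis
# printf '%s\n' a.txt c.txt d.txt > __dest__.lis
# diff __source__.lis __dest__.lis | classifyDiff
```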

To detect whether a file has changed, the script checks the following:
  • Compare modification date
  • Compare file size
  • If required, compare checksum

For this comparison, we loop over the file list of the source directory.

while IFS= read -r line
do
    {process here}
done < __source__.lis

Within the loop, we get the properties of each file. Directories are skipped in this validation.

getFileProps "$source" "$line"
sourceCreate=$st_birthtime
sourceChange=$st_mtime
sourceSize=$st_size
getFileProps "$dest" "$line"
destCreate=$st_birthtime
destChange=$st_mtime
destSize=$st_size

Once we retrieve the parameters, the comparisons are very straightforward.

# Assume the files are equal until a check says otherwise
different=0
if [ "$sourceChange" -ne "$destChange" ]
then
    different=1
fi
if [ "$sourceSize" -ne "$destSize" ]
then
    different=1
fi

The script accepts three parameters. The third can be any value; its presence requests a checksum comparison of each file (source and destination). It is stored in the variable $deepSearch.

if [ -n "$deepSearch" ]
then
    # cksum prints "<checksum> <size> <name>"; set -- puts the checksum in $1
    set -- $(cksum "$source/$line")
    sourceCheckSum=$1
    set -- $(cksum "$dest/$line")
    destCheckSum=$1
    if [ "$sourceCheckSum" -ne "$destCheckSum" ]
    then
        different=1
    fi
fi
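One side effect of this technique is that SET replaces the script's own positional parameters, so $1 and $2 no longer hold the directories afterwards. As an alternative sketch, not the original method, the checksum can be captured inside small functions instead. The names fileChecksum and checksumsDiffer are hypothetical.

```shell
# Hypothetical helper: print only the checksum that cksum reports for "$1".
# Feeding the file through stdin keeps the file name out of the output.
fileChecksum() {
    cksum < "$1" | cut -d ' ' -f 1
}

# Hypothetical helper: succeed (return 0) when the two files have
# different checksums.
checksumsDiffer() {
    [ "$(fileChecksum "$1")" != "$(fileChecksum "$2")" ]
}

# Possible use inside the comparison loop:
# if [ -n "$deepSearch" ] && checksumsDiffer "$source/$line" "$dest/$line"
# then
#     different=1
# fi
```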

If any of these checks tells us the files are different, the file needs to be copied.

if [ "$different" -eq 1 ]
then
    needEnter=0
    copySource2Dest "$source" "$dest" "$line"
fi

Finally, just delete the temporary files.

rm __source__.lis
rm __dest__.lis
rm __diff__.lis

Supporting functions

If a source directory parameter was given, validate that it really is a directory.

function validateSource
{
    localSource=$1
    if [ "$localSource" == "" ]
    then
        echo "Usage:"
        echo " $0 <sourceDir> [<destinationDir>]"
        echo " "
        localSource=""
    elif [ ! -d "$localSource" ]
    then
        echo "\"$localSource\" directory does not exist."
        localSource=""
    fi
    source=$localSource
}

The destination directory is not strictly required, but it is highly recommended. If it was not given, a default backup directory is selected.

function validateDest
{
    localDest=$1
    if [ "$localDest" == "" ]
    then
        localDest=/bak
    fi
    if [ ! -d "$localDest" ]
    then
        # -p also creates any missing parent directories
        mkdir -p "$localDest"
    fi
    dest=$localDest
}

Sometimes we need to copy files from source to destination. For a directory, create it in the destination; for a file, copy it and set its last-modified date to that of the source, which can be done with TOUCH.

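function copySource2Dest
{
    # if it is a directory, create it
    # if it is a file, copy it
    if [ -d "$1/$3" ]
    then
        # -p avoids an error when the directory already exists
        mkdir -p "$2/$3"
    else
        cp "$1/$3" "$2/$3"
        if [ -f "$2/$3" ]
        then
            # give the copy the same modification date as the source
            touch -r "$1/$3" "$2/$3"
        fi
    fi
}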

When a file or directory has been deleted from the source, do the same in the destination. Before deleting, check that it still exists in the destination.

function delDest
{
    if [ -d "$1/$2" ]
    then
        rm -R "$1/$2"
    elif [ -f "$1/$2" ]
    then
        rm "$1/$2"
    fi
}

To compare source and destination files, we use the properties returned by the STAT command. The returned values will be used by the script to detect changes.

function getFileProps
{
    export $(stat -s "$1/$2")
}
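The -s option used here belongs to the BSD STAT shipped with Mac; it prints the properties as variable assignments such as st_mtime and st_size. GNU STAT has no equivalent option. For anyone who wants the same date and size comparison on other systems, the sketch below asks for just those two values and tries the GNU syntax first, then the BSD one. It is an illustration under those assumptions, not part of the original script, and the names fileTimeAndSize and propsDiffer are hypothetical.

```shell
# Hypothetical helper: print "<modification time> <size>" for "$1".
# GNU stat takes a format with -c, BSD/Mac stat with -f; try one, then
# fall back to the other.
fileTimeAndSize() {
    stat -c '%Y %s' "$1" 2>/dev/null || stat -f '%m %z' "$1"
}

# Hypothetical helper: succeed (return 0) when the modification time or
# the size differs between the two files.
propsDiffer() {
    [ "$(fileTimeAndSize "$1")" != "$(fileTimeAndSize "$2")" ]
}

# Possible use inside the comparison loop:
# if propsDiffer "$source/$line" "$dest/$line"
# then
#     different=1
# fi
```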

Installation

I have copied the script to /usr/bin so that it can be used from any directory. Otherwise, the script needs to be in the directory from which it is called or somewhere in the PATH.

Future development

A way to keep previous versions of files is highly recommended. Once versioning is implemented, a restore mechanism will also be required.

So far the script is working fine; still, any suggestion is welcome.

The Script

