Bacula's autolabeling feature is not fault-tolerant

When Bacula needs a new tape in a pool, it can automatically label a blank one and then write to it. Unfortunately, the catalog entry is created before Bacula knows whether the new tape was labeled successfully. So if an unsuitable tape (e.g. one that is already labeled or belongs to another pool) is in the drive, Bacula creates a catalog entry in the Media table for a new tape and then waits forever for that tape, which doesn’t really exist, to be mounted.

In this case you have to cancel the corresponding backup jobs and manually delete the catalog entry with the delete command in bconsole.
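The cleanup can be sketched as a small helper that only prints the bconsole commands, so they can be reviewed before anything is deleted. The job id 123 and the volume name daily-0005 are hypothetical placeholders, and the trailing "yes" on the delete command is an assumption about the installed version; without it bconsole asks for confirmation.

```shell
#!/bin/sh
# Prints the bconsole commands for cleaning up after a failed autolabel.
# Job id and volume name are hypothetical placeholders; look up the real
# ones in bconsole first (e.g. with "status dir" and "list media").
cleanup_commands() {
    jobid=$1
    volume=$2
    echo "cancel jobid=${jobid}"
    # "yes" is assumed to skip the confirmation prompt; drop it to confirm by hand.
    echo "delete volume=${volume} yes"
}

# Review the output first, then pipe it into bconsole, for example:
#   cleanup_commands 123 daily-0005 | /etc/bacula/bconsole
cleanup_commands 123 daily-0005
```

Printing instead of executing keeps the destructive step under the operator's control.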

Work related to Bacula

I originally wrote this article in 2006 and last updated it on 2006-09-12. It was originally available at http://www.georglutz.de/wiki/Bacula. The information below might be rather outdated by now and I don’t plan to work on it further, but I put it in the blog for archival purposes under this particular date – Georg in June 2011.

“Bacula is a set of computer programs that permit you (or the system administrator) to manage backup, recovery, and verification of computer data across a network of computers of different kinds.” For more information on Bacula itself see http://www.bacula.org .

GFS backup with bacula

GFS is a rotating backup scheme and stands for Grandfather-Father-Son. It means that there are three pools (sets) of backup volumes (media). Each volume belongs to one set, and the volumes of each set are rotated independently.

GFS might not be feasible in all scenarios, but at the site I was admin for it has been running for several years now. Note also that for the following instructions and hints you need to understand the basic concepts of bacula and adapt the files to your individual setup.

Unfortunately, up to now (2006-03-31) Bacula does not support GFS the way I would like it to. There is a GFS-like example in the Bacula documentation, but it differs from what I understand by GFS:

Date: Tue, 11 Apr 2006 14:23:48 +0200
From: Georg Lutz <glist@gmx.net>
To: bacula-users@lists.sourceforge.net
Subject: Re: [Bacula-users] Re: GFS rotating howto

On 2006-04-10, Mark Nienberg wrote:
>
> How is your method different from the Daily, Weekly, Monthly Tape Usage
> Example in the Bacula Documentation?
>

1. The scheduling is different. In Bacula you cannot schedule on the
last workday of a month or week because Bacula doesn't know of workdays
and holidays. The internal scheduling mechanism does also not know of
exceptions. At least at my site there are some days where no operator is
here to change tapes and where no backup should take place (because the
data is not altered much anyway on these days).

2. The volume recycling is different. I cannot use a recycling based on
fixed dates (VolumeRetention), because backup does not take place on
fixed dates (exceptions, holidays). My recycling mechanism is: Only
overwrite the tape with the oldest data, nothing else. If there are
holidays for one week, you end up with 7 tapes which can be overwritten
in any particular order when you use the example in the bacula
documentation.

3. The tape change is different. As I said in the howto, for security
reasons we change tapes daily. Monthly and weekly tapes are stored
offsite. So if the site burns down we can restore at least the state
before the last weekend. Also I want to have the backup done in a
definite timeframe during the night. It's not acceptable for me when the
operator needs to change tapes in the morning and the backup is done
during the day. Of course you can change this behaviour easily when you
set "Maximum Volume Jobs = 2" (for files and catalog).

However, that does not mean that my backup scheme is somehow better than
others. What backup method one uses depends on the specific needs. I
think most admins prefer a "fire and forget" solution with minimal human
intervention. This often means that the tape resides in the
streamer/changer until the backup software notifies the operators to
insert some other tape.

However, you can tweak the scripts presented here a bit to work that way.

For the scripts presented here the following assumptions are made:

  • backups are done on a regular daily basis
  • there is enough space on a single tape/volume to hold one full backup run
  • for security reasons for each backup run a different volume is used
  • there are three sets: daily, weekly, monthly
  • the monthly backup volume is used on the last workday of the month
  • the weekly backup volume is used on the last workday of the week
  • the daily backup volume is used on all other workdays

bacula's configuration files

As bacula already makes use of pools, all you have to do is add the following lines to the bacula director configuration file /etc/bacula/bacula-dir.conf:

Pool {
  Name = monthly
  Pool Type = Backup
  # should be set to the nr of jobs for each backup run
  # forces bacula to use a different volume each time
  # at least 2 (catalog and files)
  Maximum Volume Jobs = 2
  # bacula doc says that we should not do this ;-)
  Purge Oldest Volume = yes
  Accept Any Volume = no
}

Pool {
  Name = weekly
  Pool Type = Backup
  Maximum Volume Jobs = 2
  Purge Oldest Volume = yes
  Accept Any Volume = no
}

Pool {
  Name = daily
  Pool Type = Backup
  Maximum Volume Jobs = 2
  Purge Oldest Volume = yes
  Accept Any Volume = no
}

Normally you might want to create 4 volumes in the daily pool (for Monday to Thursday), at least 4 volumes in the weekly pool (to cover a whole month) and some volumes in the monthly pool. At my site monthly volumes are never overwritten but kept as an archive.

We don’t use the internal scheduler, so you don’t need to use the Schedule statement in the job description section.

scheduling

As we don’t use the internal scheduler, we need to trigger the backup process via an external script. This also has the advantage that we can define days where no backup should take place because no operator can change volumes (e.g. on holidays). To define exceptions I use the file backup_exceptions.txt. Each line holds one date in ISO 8601 format. A valid backup_exceptions.txt is e.g.:

2006-01-01
2006-12-31

To start the backup process manually you would normally have to type commands in the bconsole program. The following shell script run_bacula.sh (run_bacula.sh.gz) is just a wrapper for bconsole:

#!/bin/sh
# Wrapper around bconsole: starts the file and catalog jobs in the pool
# that matches the requested backup type (month, week or day).
backup_type=$1
cd /etc/bacula || exit 1

case "$backup_type" in
month)
    ./bconsole << EOF
run job=StandardSicherung pool=monthly level=Full yes
run job=BackupCatalog pool=monthly yes
EOF
    ;;
week)
    ./bconsole << EOF
run job=StandardSicherung pool=weekly level=Full yes
run job=BackupCatalog pool=weekly yes
EOF
    ;;
day)
    ./bconsole << EOF
run job=StandardSicherung pool=daily level=Differential yes
run job=BackupCatalog pool=daily yes
EOF
    ;;
*)
    # Unknown or missing argument: do not start anything.
    echo "usage: $0 month|week|day" >&2
    exit 1
    ;;
esac

Note that you must change it to fit your job definitions and backup level strategy.

The actual calculation of which type of backup (monthly, weekly, daily) should be started is done in backup.pl (backup.pl.gz). If you have heard of GFS before, the algorithm should be fairly clear. backup.pl can be called by cron without arguments at the time the daily backup should normally start.
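backup.pl itself is not reproduced here. As an illustration only, the decision rules from the assumptions list can be sketched in shell as follows. This is not the original Perl code: it assumes GNU date (for the -d option), treats Monday to Friday as workdays, lets the monthly rule win when a day is the last workday of both the week and the month, and uses a hypothetical default path for the exceptions file.

```shell
#!/bin/bash
# Sketch of the GFS decision; requires GNU date for "-d".
# EXCEPTIONS is a hypothetical location of backup_exceptions.txt.
EXCEPTIONS=${EXCEPTIONS:-/etc/bacula/backup_exceptions.txt}

# Succeeds if the ISO date is Monday to Friday and not listed as an exception.
is_workday() {
    local day=$1
    [ "$(date -d "$day" +%u)" -le 5 ] || return 1
    if [ -f "$EXCEPTIONS" ] && grep -qxF "$day" "$EXCEPTIONS"; then
        return 1
    fi
    return 0
}

# Prints the first workday after the given ISO date.
next_workday() {
    local day=$1
    while :; do
        day=$(date -d "$day + 1 day" +%F)
        if is_workday "$day"; then
            echo "$day"
            return 0
        fi
    done
}

# Prints month, week, day or none for the given ISO date.
gfs_type() {
    local today=$1 next
    if ! is_workday "$today"; then
        echo none
        return 0
    fi
    next=$(next_workday "$today")
    if [ "$(date -d "$today" +%Y-%m)" != "$(date -d "$next" +%Y-%m)" ]; then
        echo month    # next workday falls into another month
    elif [ "$(date -d "$today" +%G-%V)" != "$(date -d "$next" +%G-%V)" ]; then
        echo week     # next workday falls into another ISO week
    else
        echo day
    fi
}
```

The printed values match the arguments run_bacula.sh expects, so a cron job could call gfs_type "$(date +%F)" and pass any result other than none on to the wrapper.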

backup plan/protocol

An important question for the backup operator is which tape bacula wants to use for the next backup run in a particular pool. If you use sqlite as catalog storage and the GFS scheme presented here, the question can be answered by the shell script bacula_next_tape (bacula_next_tape.gz):

#!/bin/sh
# Prints, for each pool, the volume that is next in line to be overwritten.
sqlitebin=/usr/lib/bacula/sqlite/sqlite
database=/var/bacula/bacula.db

pools="daily weekly"

for pool in $pools; do
        selectPool="SELECT PoolId FROM Pool WHERE Name='${pool}';"
        poolId=$(echo "$selectPool" | "$sqlitebin" "$database")
        if [ -z "$poolId" ]; then
                echo "error: pool $pool not found"
                exit 1
        fi

        # Prefer the oldest volume that is already marked as Recycle.
        selectVolume="SELECT VolumeName FROM Media WHERE PoolId=$poolId AND VolStatus='Recycle' ORDER BY LastWritten LIMIT 1;"
        volumeName=$(echo "$selectVolume" | "$sqlitebin" "$database")

        if [ -z "$volumeName" ]; then
                # Otherwise fall back to the oldest volume with status Used.
                selectVolume="SELECT VolumeName FROM Media WHERE PoolId=$poolId AND VolStatus='Used' ORDER BY LastWritten LIMIT 1;"
                volumeName=$(echo "$selectVolume" | "$sqlitebin" "$database")
                if [ -z "$volumeName" ]; then
                        echo "error: no oldest volume available for pool $pool"
                fi
        fi

        echo "pool ${pool}: $volumeName"
done

There is also a web interface for the script above. In the archive bacula_gfs_webscripts.tar.gz you can find 5 files:

  • index.php: Start page with an overview of bacula: which tape is going to be used next, the exception dates, links to backup_plan.php etc.
  • backup_plan.php: Generates a backup protocol (PDF file) for a given month on which the operator has to confirm the correctness of the backup (needed e.g. for quality management).
  • functions.inc.php: General functions used by the other files.
  • backup_exceptions.txt: Text file listing the dates on which the backup should not be run. One entry (ISO 8601 date) per line.
  • backup_exception.php: Web interface for entering exception dates in backup_exceptions.txt.

backup_plan.php needs fpdf (http://www.fpdf.org); see the source code. You have to make bacula_next_tape and the sqlite catalog somehow accessible to the webserver user. At this site this is not a major security issue, but you might prefer another solution.
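One possible alternative, sketched here under assumptions, is to run bacula_next_tape from a privileged cron job and publish only its output as a plain file, so the webserver user never touches the catalog. Both default paths and the function name publish_next_tape are hypothetical, and the PHP pages would have to be changed to read that file.

```shell
#!/bin/sh
# Runs the next-tape script and publishes its output as a world-readable file.
# Both default paths are hypothetical; adjust them to the local setup.
publish_next_tape() {
    cmd=${1:-/etc/bacula/bacula_next_tape}
    out=${2:-/var/www/bacula/next_tape.txt}
    tmp=$(mktemp "${out}.XXXXXX") || return 1
    if "$cmd" > "$tmp"; then
        chmod 644 "$tmp"
        # Rename within one directory, so readers never see a half-written file.
        mv "$tmp" "$out"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

The temporary file is created next to the target so that the final mv is a rename on the same filesystem.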

The above files are hardcoded in German and match a particular environment. Just see them as a suggestion/template for your own scripts.

Here you can see index.php in action:

 

Here is the web interface for the exception file:

 

And here is the PDF generated by backup_plan.php:

RPM installation of Bacula and the database scripts

After an RPM installation of Bacula, the scripts in /etc/bacula that take care of the database no longer work correctly.

The reason is that in these files, among other things, the path to the database client programs of MySQL, PostgreSQL and SQLite is defined relative to the path /root/rpmbuild/BUILD/bacula-1.38.11. However, this path exists only during the build process; it does not exist on a production machine. The interim workaround is to adjust the paths manually. The affected files are make_catalog_backup, make_sqlite_tables, create_sqlite_database and grant_sqlite_privileges.

For the SQLite variant of version 1.38.11 I have created a patch.
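To see which scripts on a given machine still contain the build path, a search like the following may help. It is a minimal sketch: the function name find_build_paths is made up for illustration, and /etc/bacula as default directory is taken from the description above.

```shell
#!/bin/sh
# Lists the files in a directory that still mention the RPM build directory.
# Returns non-zero if no file contains the path.
find_build_paths() {
    dir=${1:-/etc/bacula}
    grep -l '/root/rpmbuild/BUILD' "$dir"/* 2>/dev/null
}
```

Each file it prints needs its paths adjusted by hand or by applying the patch.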