Here I'm republishing an old blog post of mine originally from October 2016. The article has been slightly improved.

Background: This is the final part of the tutorial. Originally I had written a different part 5 post but re-worked it almost completely as I hadn't been happy with it. Also I intended to revisit this topic at some point, but considering that I never got any feedback, I figured that it was not too interesting for my readers. So don't look for a part 6 or something.

Bacula on FreeBSD (pt. 5): A day at the pool

This is part five of my Bacula tutorial. The first part covered some basics as well as installing Bacula and starting the three daemons that it consists of.

Part two dealt with modifying the default configuration files in a way that allowed all components of Bacula to interact with each other locally. A deliberate configuration error was debugged and finally a test backup was done (without knowing details like what exactly would even be backed up!) just to ensure that communication between the daemons really works.

In part three, the configuration was cleaned up and split into smaller parts, the first self-created resources (fileset, device and storage) were added and a backup job customized using the bconsole.

Part four detailed restoring files from backup and discussed jobs as well as volumes, labels and pools.

Bacula on FreeBSD (pt. 1): Introduction - Bacula backup basics

Bacula on FreeBSD (pt. 2): Bconsole - ruling the Director

Bacula on FreeBSD (pt. 3): Customizing configuration

Bacula on FreeBSD (pt. 4): Jobs, volumes, pools & a restore

Part five will show how to create a new _storage pool_. We'll use memory disks so that we can simulate partitions. Then we'll talk about _jobs_ again and have a closer look at _volumes_ in our pool and some of the states they can have. Finally we'll briefly touch upon the topic of _recycling volumes_. You'll need to know all that (and actually some more) prior to planning your real pool(s).

The fifth part continues where the fourth one left off. During this tutorial series we use _VirtualBox VMs_ managed by _Vagrant_ to make things as simple as possible. If you don't know how to use them, have a look at two articles I wrote for a detailed introduction to using Vagrant and on how to prepare a VM template with FreeBSD that is used throughout all parts of this Bacula tutorial.

Vagrant: Creating a FreeBSD 11 base box (virtualbox) - pt. 1

Vagrant: Creating a FreeBSD 11 base box (virtualbox) - pt. 2

Preparations for a storage pool

Load the latest snapshot, enter the VM and switch to root:

% cd ~/vagrant/backuphost
% vagrant snapshot restore tut_4_end
% vagrant ssh
% sudo su -

Our VM has a simple partition scheme where essentially everything is put in one partition. A volume that provides storage backed by a file will keep growing until the disk is full. To enforce restrictions we have to create a pool. Also our VM certainly does not have a tape drive or anything like that. Still I'd like to simulate filling a medium! How can we do that? By creating file-backed virtual storage for which we can set a fixed size.

This time we'll back up a directory with a bit more data in it. I suggest _/usr/bin_. Let's see how big it is (mind this size! It may be different on your system, especially if you're using a version newer than the 11.0 release that I'm using here):

# du -sh /usr/bin
143M /usr/bin

Ok, next we're going to create a sparse file of 300 MB in size (so that a bit more than two full backups would fit in there) as well as a directory for the mountpoint:

# truncate -s 300M /var/backup/ufs0
# mkdir /mnt/storage0
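To see what "sparse" means in practice, here is a small sketch that creates a file of the same nominal size in a scratch directory and compares the size it claims to have with the space it actually occupies. The function name `sparse_demo` and the temporary directory are my own choices for illustration; the tutorial itself creates the file directly under /var/backup.

```sh
# Create a sparse file in a scratch directory and report its sizes.
# Prints: "<apparent size in bytes> <allocated size in KB>"
sparse_demo() {
    size=$1                                   # e.g. 300M
    dir=$(mktemp -d) || return 1
    truncate -s "$size" "$dir/ufs0" || { rm -rf "$dir"; return 1; }
    # Size the file claims to have (5th column of ls -l).
    apparent=$(ls -l "$dir/ufs0" | awk '{ print $5 }')
    # Space actually allocated on disk, in kilobytes.
    allocated=$(du -k "$dir/ufs0" | awk '{ print $1 }')
    rm -rf "$dir"
    echo "$apparent $allocated"
}
```

Calling `sparse_demo 300M` should report the full nominal size in the first column, while the second column stays small on filesystems that support sparse files. The space is only claimed once data is actually written.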

Now we need to make the storage accessible as a _device_ rather than just a file. FreeBSD calls these devices _memory disks_ because most of the time you'll want such a device to be backed by RAM (to benefit from its speed), but file-backed storage is also possible. As a first step we need to create the device and, as a second one, put a filesystem on it. Then in step three we can mount it on a directory to make it available in the machine's global filesystem hierarchy. Fortunately FreeBSD is pretty good at handling memory disks and comes with a command that can do all three steps at once:

# mdmfs -F /var/backup/ufs0 -S -n -p 700 -w bacula:bacula md0 /mnt/storage0
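For reference, the single mdmfs call corresponds roughly to the separate steps below. This is a sketch based on my reading of the mdmfs(8), mdconfig(8) and newfs(8) manual pages, not something captured from the tutorial VM. The `run` wrapper, the `DRY_RUN` variable and the function name `setup_storage0` are hypothetical helpers that let you print the commands instead of executing them.

```sh
# Print each command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Manual alternative to the mdmfs call: attach, format, mount, set ownership.
setup_storage0() {
    # Step 1: attach the file as vnode-backed memory disk md0.
    run mdconfig -a -t vnode -f /var/backup/ufs0 -u 0 || return 1
    # Step 2: create a UFS filesystem without a .snap directory;
    # soft updates stay off because -U is not passed.
    run newfs -n /dev/md0 || return 1
    # Step 3: mount it on the directory created earlier.
    run mount /dev/md0 /mnt/storage0 || return 1
    # Ownership and mode, as requested via -w and -p in the mdmfs call.
    run chown bacula:bacula /mnt/storage0 || return 1
    run chmod 700 /mnt/storage0
}
```

Setting `DRY_RUN=1` before calling `setup_storage0` only prints the five commands. Note that this is an alternative to the mdmfs call, not an additional step: if md0 is already attached, the first command will fail.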

Let's check if that worked:

# mount
[... some output ...]

See the mount? Looks like we have our file-backed storage in place.
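If you'd rather not scan the whole list by eye, a tiny filter can pick out the relevant line. This is just a convenience sketch: `mounted_on` is a hypothetical helper, and it assumes the usual "device on mountpoint ..." layout of the mount output and paths without blanks.

```sh
# Read "mount" output on stdin and print the device mounted on the given directory.
# Prints nothing if the directory is not a mountpoint.
mounted_on() {
    awk -v mp="$1" '$2 == "on" && $3 == mp { print $1 }'
}
```

Used as `mount | mounted_on /mnt/storage0`, it should print the memory disk device created above, or nothing at all if the mount did not work.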

Configuration changes

Now we can create a new fileset to back up _/usr/bin_:

# vi /usr/local/etc/bacula/includes/dir_fileset.conf

Add the following lines at the end of the file:

FileSet {
Name = "usr-bin"
Include {
Options {
signature = MD5
}
File = /usr/bin
}
}

Save the file and exit. Of course we need to tell the sd where to store the backups - we need to create another device:

# vi /usr/local/etc/bacula/includes/sd_device.conf

Add this device resource at the end of the file:

Device {
Name = Stor0-Test
Media Type = Test
Archive Device = /mnt/storage0
LabelMedia = yes;
Random Access = Yes;
AutomaticMount = yes;
RemovableMedia = no;
AlwaysOpen = no;
Maximum Concurrent Jobs = 5
}

And the director needs to know about this storage as well:

# vi /usr/local/etc/bacula/includes/dir_storage.conf

Add this block at the end of the file:

Storage {
Name = Test
Address = localhost
SDPort = 9103
Password = "sdPASSWORD"
Device = Stor0-Test
Media Type = Test
Maximum Concurrent Jobs = 10
}

We've done all this before but the next step is new. We're going to create a pool:

# vi /usr/local/etc/bacula/includes/dir_pool.conf

Again add some lines to the file:

Pool {
Name = Testpool
Pool Type = Backup
Recycle = no
Maximum Volume Bytes = 90M
}

Just set _recycle_ to "no" for now - you'll see what that does in a minute. We also want to force a maximum volume size that is smaller than the amount of data in one backup, so that two volumes will be needed for one full backup. And to avoid having to use "mod" at the bconsole all the time, we'll add a new job for convenience:

# vi /usr/local/etc/bacula/includes/dir_job.conf

Here's the job resource to put at the end of the file:

Job {
Name = "TestJob"
JobDefs = "DefaultJob"
Level = Full
FileSet = usr-bin
Storage = Test
Pool = Testpool
}

Jobs again

Ok. Our basic preparations are done. Let's restart the daemons and then try and see what happens if we run our new backup job:

# service bacula-dir restart
# service bacula-sd restart
# bconsole
* run
4
yes
* mes
[...]
29-Oct 12:48 backuphost.local-sd JobId 4: Job Testjob.2016-10-31_12.48.33_03 is waiting. Cannot find any appendable volumes.
Please use the "label" command to create a new Volume for:
[...]

The job cannot start because there are no volumes that Bacula could use. We could now create volumes using the _label_ command as we've done before. But we configured our device with the _LabelMedia_ directive set to "yes"! It should create volumes automatically as needed! Time to look at the configuration again. But let's first take the chance to cancel the currently pending job.

We already know the _JobId_ from the message above. But let's pretend we didn't know. We'll ask the director for an overview of jobs (both past and present):

* list jobs

There we have a simple table. Let's see what information it holds. First we have a unique _JobId_ for each job. The second column holds the name of the backup job - this will usually be the name of the client, or _RestoreFiles_ for a restore job. But as you can see in our case, "TestJob" will work as well (however, in a production environment you really should stick to names that hint at which client they belong to, as things would get confusing really fast otherwise). _StartTime_ is self-explanatory. _Type_ is "B" for _backup_ and "R" for _restore_ in our example. _Level_ reflects that we've only run full backup level jobs so far. _JobFiles_ and _JobBytes_ are self-explanatory again. And in the case of _JobStatus_, a "T" means _terminated_, an "R" _running_, "A" is for _aborted_ (canceled), etc. Now we'll cancel the job that is waiting for a new volume:

* cancel 4
yes
* mes
[...]
Backup Canceled
[...]
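The status letters from the job table are easy to forget, so here is a small lookup sketch. It only covers the codes that come up in this part of the tutorial, and the wording of the descriptions reflects my understanding of Bacula's status codes rather than an official list. The function name `job_status_name` is made up for this example.

```sh
# Translate a JobStatus letter from "list jobs" into words.
# Only the codes mentioned in this tutorial part are covered.
job_status_name() {
    case "$1" in
        T) echo "terminated normally" ;;
        R) echo "running" ;;
        A) echo "canceled by user" ;;
        f) echo "fatal error" ;;
        *) echo "unknown ($1)" ;;
    esac
}
```

Note that the codes are case-sensitive: a lowercase "f" is a different status than an uppercase "F".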

So we've canceled the job. Let's exit the console now and take a look at the pool configuration again:

* exit
# vi /usr/local/etc/bacula/includes/dir_pool.conf

Our pool resource is missing a directive that specifies how the label names are to be composed. We should add that one line real quick:

Label Format = "Test-"
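A missing directive like this is easy to overlook, so a quick pre-flight check before restarting the director might be worthwhile. The following is a rough sketch, not a real configuration parser: `pool_has_label_format` is a hypothetical helper that assumes one directive per line and a resource that starts with `Pool {` on a line of its own.

```sh
# Succeeds (exit 0) if the Pool resource named $2 in file $1 contains a
# Label Format directive; fails otherwise. Hypothetical helper, not part of Bacula.
pool_has_label_format() {
    awk -v want="$2" '
        /^[ \t]*Pool[ \t]*[{]/ { inpool = 1; name = ""; has = 0; next }
        inpool && /^[ \t]*Name[ \t]*=/ {
            name = $0
            sub(/^[^=]*=[ \t]*/, "", name)   # drop everything up to the equals sign
            gsub(/[" \t]/, "", name)         # drop quotes and stray blanks
        }
        inpool && /^[ \t]*Label[ \t]+Format[ \t]*=/ { has = 1 }
        inpool && /^[ \t]*[}]/ {
            if (name == want && has) found = 1
            inpool = 0
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}
```

For example, `pool_has_label_format /usr/local/etc/bacula/includes/dir_pool.conf Testpool || echo "Label Format missing"` would have warned us before the first attempt to run the job.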

Now we need to restart the dir and invoke the bconsole again:

# service bacula-dir restart
# bconsole

Turn up the volume(s)!

Then we can take a look at the volumes that we have so far:

* list volumes
[...]
Pool: Testpool
No results to list.
[...]

The pool _Testpool_ is empty right now. Take a look at the pools used for our previous jobs and get an idea of what they look like. Now let's run the test job again and see what happens:

* run
4
yes
* mes
[...]
Labeled new Volume "Test-0003" on file device "Stor0-Test" (/mnt/storage0).
[...]

So auto labeling obviously works. This is one huge benefit of using a pool! But there are others like limiting the volume sizes and many more. Did the backup complete successfully by now?

* mes
[...]

Backup OK

[...]

It did. Time to look at the volumes again; look for the column _VolStatus_:

* list volumes
[...]
Full
Append
[...]

The first volume has a _VolStatus_ of _Full_ and the second one has _Append_ which means that more backup data can be written to it. We'll do that by simply running our test job again:

* run
4
yes
* mes
[...]
WARNING: device is full! Please add more disk space then ...
Please mount append Volume "Test-0005" or label a new one for:
[...]

We've created a really small memory disk for /mnt/storage0 and it is full before the second full backup could be completed. But how's that? There's 300 MB of space and the fileset backs up less than 150 MB! Is there so much overhead? No, not really. What has happened here is that we restricted the pool to volumes of 90 MB each. The three volumes occupy 270 MB - and while there's 30 MB more left on the disk, that's too little space to create another volume on it! So what do we do now? First have a look at the volume list again:

* list volumes
[...]
Full
Full
Full
[...]
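The arithmetic behind those three full volumes can be sketched with a few lines of shell. This is a back-of-the-envelope model only: it uses the size reported by du, ignores any overhead added by Bacula or the filesystem, and the function name `fill_volumes` is made up for this example.

```sh
# Model how successive full backups fill size-capped volumes.
# Arguments: backup size, maximum volume size, number of backups (all in MB).
# Prints one line per volume: "<number> <used MB> <Full|Append>".
fill_volumes() {
    remaining=$(( $1 * $3 ))
    n=1
    while [ "$remaining" -gt 0 ]; do
        if [ "$remaining" -ge "$2" ]; then
            # Enough data left to fill this volume up to its limit.
            echo "$n $2 Full"
            remaining=$(( remaining - $2 ))
        else
            # The rest fits; the volume can take more data later.
            echo "$n $remaining Append"
            remaining=0
        fi
        n=$(( n + 1 ))
    done
}
```

With the numbers from this tutorial, `fill_volumes 143 90 1` gives one full volume and one with 53 MB in state Append, which matches what we saw after the first backup. `fill_volumes 143 90 2` gives three full volumes plus a fourth one for the last 16 MB, and that is where the small disk gets in the way.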

Coming full cycle

All of them are listed as _Full_. But there's some more interesting info there. Notice the _Recycle_ column? We've forbidden Bacula to recycle old volumes when we defined the pool resource. We can fix that, right? Let's exit the bconsole and edit the configuration file:

* exit
# vi /usr/local/etc/bacula/includes/dir_pool.conf

Change the respective line to:

Recycle = yes

Then save the file and exit the editor. The configuration changed and so the dir needs to reinitialize. To do so we restart it.

# service bacula-dir restart

Seems like it's not responding? Hit _CTRL+C_ to cancel. Let's stop and start it instead:

# service bacula-dir stop
# service bacula-dir start

That worked. Now we enter the bconsole again and take a look at the volumes:

# bconsole
* list volumes

Huh? That configuration change didn't work! The recycling flag is still set to 0. Why? There is an easy answer to that: Because this value is not read from the configuration! It comes from the catalog. The configuration setting is applied at the moment a new volume is created. Once the volume exists, changing the configuration setting has no effect on it. Of course we are not out of luck here. We can modify the flag using the bconsole (and yes, that asterisk right before the _MediaId_ (the three in this case) is NOT a prompt symbol; type it in!):

* update
1
7
4
*3
yes
18

Let's see if that worked:

* list volumes
[... some output ...]

It did! So will Bacula now reuse the old volume and overwrite all data on it? No it won't. Bacula knows that there's data on it because it keeps track of all that in the catalog. And it tries to preserve that data even though the volume allows recycling.
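As a side note, the menu walk for setting the recycle flag can probably be replaced by a single command line, since bconsole reads commands from standard input. The sketch below builds that line. It is based on the keyword form of the update command as I understand it from the Bacula console documentation, and I have not run it in this tutorial's VM. The function name `recycle_flag_command` is hypothetical, and the volume name Test-0003 assumes that MediaId 3 is the volume that was labeled earlier.

```sh
# Build the one-line bconsole command that sets the recycle flag of a volume.
# $1 = volume name, $2 = yes or no
recycle_flag_command() {
    printf 'update volume=%s Recycle=%s\n' "$1" "$2"
}
```

Something like `recycle_flag_command Test-0003 yes | bconsole` would then do in one go what took several menu selections above. This is handy if you ever need to change the flag on many volumes from a script.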

Purging a volume

However we can tell Bacula to get rid of the catalog data that references this volume. To do so, we purge job and volume information for that volume from the catalog:

purge [purge jobs volume]
3
4
*3
* list volumes
[...]
Purged
Full
Full
[...]

See how the _VolStatus_ changed? It's purged *and* recycling is enabled. That means that Bacula will reuse the volume. But will our job resume automatically? Let's take a look:

* list jobs
[... output ...]

Oh no! What's that _JobStatus_? It's "f" for _failed_ (a fatal error)! What happened? Well, we stopped the director, remember? That killed the running job! So to try out recycling we need to run another backup job. Let's start the job now:

* run
4
yes

Now take a look at the volumes again:

[...]
Recycle
Full
Full
[...]

The VolStatus changed to _Recycle_ and Bacula will reuse the old volume. In theory we'd have to purge one more volume for our backup to succeed. But since this is just an example job to show off some important things, we're actually done at this point. Enough for today.

Save the current status for later:

* exit
# shutdown -p now
% vagrant status
% vagrant snapshot save tut_5_end

Intermission

Now you know the basics of pool creation and some of the features that come with it. You've also purged and recycled a volume and should have a better understanding of how Bacula works in general. There's a lot more to pools however and the next post in this series should probably go into retention periods and discuss a topic that we've only touched upon so far: The Catalog.

However this post concludes my "Bacula October" and I'll end this tutorial series here. It takes a lot of effort and time to write these posts and while I hope that this is of some use to somebody, I have no idea whether it is or not. For that reason I might or might not take this topic up again in the future. I had planned to simulate multiple backup clients with Vagrant, to do encrypted backups and so on. But now I'm looking forward to writing about something else again! Of course feel free to comment on any of the parts if you liked (or want to tell me why you didn't like) this tutorial.
