About me: My name is Solène Rapenne, pronouns she/her. I like learning and sharing knowledge. Hobbies: '(BSD OpenBSD Lisp cmdline gaming internet-stuff). I love percent and lambda characters. OpenBSD developer solene@.

Contact me: solene on Freenode, solene+www at dataswamp dot org or @solene@bsd.network (mastodon). If for some reason you want to give me some money, I accept paypal at the address donate@perso.pw.

Backup software: borg vs restic

Written by Solène, on 21 May 2021.
Tags: #backup #openbsd #unix

Comments on Mastodon

Introduction §

Backups are important, lot of our life is now related to digital data and it's important to take care of them because computers are unreliable, can be stolen and mistakes happen. I really like two programs which are restic and borg, they have nearly the same features but it's hard to decide between both, this is an attempt to understand the differences for my use case.

Restic §

Restic is a backup software written in Go with a "push" workflow, it supports data deduplication within a repository and multiple systems using the same repository and also encryption.

Restic can backup to a remote sftp server but also many network services storage like S3/Minio and even more when using with the program rclone (which can turn any supported backend into a compatible restic backend). Restic seems compatible with Windows (I didn't try).

restic website

Borg §

Borg is a backup software written in Python with a "push" workflow, it supports encryption, data deduplication within a repository and compression. You can backup to a remote server using ssh but the remote server requires borg to be installed.

It's a very good and reliable backup software. It has a companion app named "borgmatic" to automate the backup process and snapshots managements (daily/hourly/monthly ... and integrity checking).

*BSD specific note: borg can honor the "nodump" flag in the filesystem to skip saving those files.

borgbackup website

borgmatic website

Experiment §

I've been making a backup of my /home/ partition (minus some directories that has been excluded in both cases) using borg and restic. I always performed the restic backup and then the borg backup, measuring bandwidth for each and execution time for each.

There are five steps: init for the first backup of lot of data, little changes twice, which is basically opening firefox, browsing a few pages, closing it, refreshing my emails in claws-mail (this changes a lot of small files) and use the computer for an hour. There is a massive change as fourth step, I found a few game installers that I unzipped, producing lot of small files instead of one big file and finally, 24h of normal use between the fourth and last step which is a good representation of a daily backup.

Data §

				restic	borg
Data transmitted (MB)
---------------------
Backup 1 (init)			62860	53730
Backup 2 (little changes)	15	26
Backup 3 (little changes)	168	171
Backup 4 (massive changes)	4820	3910
Backup 5 (typical day of use)	66	44
		
Local cache size (MB)
---------------------
Backup 1 (init)			161	45
Backup 2 (little changes)	163	45
Backup 3 (little changes)	207	46
Backup 4 (massive changes)	211	47
Backup 5 (typical day of use)	216	47
		
Backup time (seconds)
---------------------
Backup 1 (init)			2139	2999
Backup 2 (little changes)	38	131
Backup 3 (little changes)	43	114
Backup 4 (massive changes)	201	355
Backup 5 (typical day of use)	50	110

Repository size (GB)		65	56

Analysis §

Borg was a lot slower than restic but in my experiment the remote ssh server is a dual core atom system, borg is using a process on the other end to manage the data, so maybe that CPU was slowing the backup process. Nevertheless, in my real use case, borg is effectively slower.

Most of the time, borg was more bandwidth effective than restic: it saved 15% of bandwidth for the first backup and 18% after some big changes, but in some cases it used a bit more bandwidth. I have no explanation for this, I guess it depends how file chunks are calculated, if a big database file is changing then one may be able to save only the difference and not the whole file. Borg is also compressing the data (using lz4 by default), this may explain the bandwidth saving that doesn't work for binary data.

The local cache (typically in /root/.cache/) was a lot bigger for restic than for borg, and was increasing slightly at each new backup while borg cache never changed much.

Finally, the whole repo size holding all the snapshots has a different size for restic and borg, respectively 65 GB and 56 GB, which makes a 14% difference between each which may due to the compression done by borg.

Other backup software §

I tested Restic and Borg because they are both good software using the "push" workflow (local computer sends the data) making full snapshots of every backup, but there are many other backup solution available.

- duplicity: fully scriptable, works over many remote protocols but requires a full snapshot and then incremental snapshots to work, when you need to make a new full snapshot it will take a lot of space which is not always convenient. Supports GPG encrypted backup stored over FTP, this is useful for some dedicated server offering 100GB of free FTP.

- burp: not very well known, the setup uses TLS certificates for encryption, requires a burp server and a burp client

- rsnapshot: based on rsync, automate the rotation of backups, use hard links to avoid data duplication for files that didn't change between two backups, it pulls data from servers from a central backup system.

- backuppc: a perl app that will pull data from servers to its repository, not really easy to use

- bacula: enterprise grade solution that I never got to work because it's really complicated but can support many things, even saving on tapes

Conclusion §

In this benchmark, borg is clearly slower but was the most storage and bandwidth efficient. On the other hand, restic is easier to deploy (static binary) and supports a simple sftp server while borg requires borg installed on both sides.

A biggest difference between restic and borg, is that restic supports multiples systems backup in the same repository, allowing a massive data deduplication gain across machines, while a borg repository is for single system (it could work with multiples systems but they should not backup at the same time and they would have to rebuild the local cache every time which is slow).

I'll stick with borg because the backup time isn't a real issue given it's not dramatically slower than restic and that I really enjoy using borgmatic to automatically manage the backups.

For doing backups to a remote server over the Internet, the bandwidth efficiency would be my main concern of all the differences, borg seems a clear winner here.