About the author

My name is Solène Rapenne. I like to learn and share my knowledge with others. With this blog I can share my experiences and issues. Some of my interests : '(BSD Lisp Emacs cli-tool Web-infrastructure Gaming Crossbow). I love % and lambda characters.

Contact : solene on Freenode or solene+www at dataswamp dot org

This website is generated using cl-yag. A gopher version is available here

Monitor your systems with reed-alert

Written by Solène, on 17 January 2018.
Tags: #unix #monitoring #lisp

This article will present my software reed-alert, which checks user-defined states and sends user-defined notifications. I made it really easy to use while still being configurable and extensible.

Description

reed-alert is not a monitoring tool producing graphs or storing values. It does a job sysadmins are looking for, because there is no alternative product (the alternatives come from very large infrastructures like Zabbix, so they are not comparable).

From its configuration file, reed-alert will check various states and then, if a check fails, will trigger a command to send a notification (totally user-defined).

Fetch it

This is open-source and free software released under the MIT license; you can get it with the following command :

git clone git://bitreich.org/reed-alert
cd reed-alert

A README file is available as documentation to describe how to use it, but we will see here how to get started quickly.

You will find a few files there. reed-alert is Common LISP software, and for (I hope) good reasons the configuration file is plain Common LISP.

There is a config file looking like a real world example named config.lisp.sample, and another configuration file I use for testing named example.lisp containing lots of cases.

Let’s start

In order to use reed-alert we only need to create a new configuration file and then add a cron job.

Configuration

We are going to see how to configure reed-alert. You can find more explanations or details in the README file.

Alerts

We have to configure two kinds of parameters. First, we need to set up a way to receive alerts; the easiest way is to send a mail with the “mail” command. Alerts are declared with the function alert, taking as parameters the alert name and the command to execute. Some variables in the command are replaced with values from the probe, like %date% or %params%; the full list of these variables is in the README file.

In Common LISP, a function is called by opening a parenthesis before its name; everything until the closing parenthesis is its parameters.

Example :

(alert mail "echo 'problem on %hostname%' | mail me@example.com")

One should be careful with nested quotes here.

reed-alert will fork a shell to start the command, so pipes and redirections work. You can be creative when writing alerts that :

  • use an SMS service
  • run a script to post on a forum
  • publish a file on a server
  • send text to irc with the ii client (sketched below)
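
For example, here is a minimal sketch of an alert going through the ii irc client; the FIFO path is hypothetical and depends on where ii keeps its per-channel directories :

(alert irc "echo 'problem on %hostname%' > ~/irc/irc.freenode.net/#monitoring/in")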

Checks

Now that we have some alerts, we will configure some checks in order to make reed-alert useful. It uses probes, which are pre-defined checks with parameters; a probe could be “has this file been updated in the last N minutes ?” or “is the disk space usage of partition X more than Y ?”

I chose to name the check function “=>”; it isn’t a real name, but it recalls an arrow or something going forward. Both previous examples, using our mail notifier from earlier, would look like :

(=> mail file-updated :path "/program/file.generated" :limit "10")
(=> mail disk-usage   :limit 90)

It’s also possible to run shell commands and check their return code using the command probe, allowing the user to define custom checks.

(=> mail command :command "echo '/is-this-gopher-server-up?' | nc -w 3 dataswamp.org 70"
                 :desc "dataswamp.org gopher server")

We use echo + netcat to check if a connection to a socket works. The :desc keyword gives the check a nicer name in the output instead of just “COMMAND”.

Finishing touches

We wrote the minimum required to configure reed-alert; the configuration file now needs only two more lines, so your my-config.lisp file should look like the following :

(load "functions.lisp")
(alert mail "echo 'problem on %hostname%' | mail me@example.com")
(=> mail file-updated :path "/program/file.generated" :limit "10")
(=> mail disk-usage   :limit 90)
(quit)

The first line is mandatory to make things work; the last line prevents the lisp interpreter from staying in the REPL (interactive user input).

Now, you can start it every 5 minutes from a crontab with this :

*/5 * * * * ( cd /home/user/projects/reed-alert/ ; sbcl --load my-config.lisp )

If you prefer to use ecl :

*/5 * * * * ( cd /home/user/projects/reed-alert/ ; ecl --load my-config.lisp )

The time between each run is up to you, depending on what you monitor.

Important

By default, when a check returns a failure, reed-alert will only trigger the associated notifier once it reaches the 3rd failure. Then, it will notify again when the service is back (the variable %state% is replaced by start or end, so you know whether the problem starts or stops).

This is to prevent reed-alert from sending a notification each time it checks; most users have absolutely no need for that.

The number of failures before triggering can be modified with the keyword “:try”, as in the following example :

(=> mail disk-usage :limit 90 :try 1)

In this case, you will get notified at the first failure.

The failure count of each check is stored in a file (one per check) in the “states/” directory of the reed-alert working directory.

New cl-yag version

Written by Solène, on 16 December 2017.
Tags: #unix

Introduction

cl-yag is a static website generator. It's software used to publish a website and/or a gopher hole from a list of articles. As the developer of cl-yag, I'm happy to announce that a new version has been released.

New features

The new version, numbered 0.6, brings lots of new features :

  • support for a different markup language per article
  • configurable date format
  • configurable gopher output format
  • ships with the default theme "clyma", minimalist but responsive (the one used on this website)
  • easier to use
  • full user documentation

The code is available at git://bitreich.org/cl-yag; the program requires sbcl or ecl to work.

Per article markup language

The feature I'm most proud of is allowing a different markup language per article. While on my blog I chose to use markdown, it's sometimes not suited for more elaborate articles, like the one about LISP containing code, which was written in org-mode and then converted to markdown manually to fit cl-yag. Now, the user can declare a named "converter", which is a command line with pattern replacement, to produce the html file. We can imagine a lot of things with this, even producing a gallery with a find + awk command. Now, I can use markdown by default and specify when I want to use org-mode or something else.

This is the way to declare a converter, taking org-mode as an example; it is not very simple, because emacs is not script friendly :

(converter :name :org-mode  :extension ".org"
	   :command (concatenate 'string
				 "emacs data/%IN --batch --eval '(with-temp-buffer (org-mode) "
				 "(insert-file \"%IN\") (org-html-export-as-html nil nil nil t)"
				 "(princ (buffer-string)))' --kill | tee %OUT"))

And here is an easy way to produce a gallery with awk from a .txt file containing a list of image paths.

(converter :name :gallery :extension ".txt"
	   :command (concatenate 'string
				 "awk 'BEGIN { print \"<div class=\\\"gallery\\\">\"} "
				 "{ print \"<img src=\\\"static/images/\"$1\"\\\" />\" } "
				 " END { print  \"</div>\"}' data/%IN | tee %OUT"))

The concatenate function is only used to improve the presentation, to split the command into multiple lines and make it easier to read. It's possible to write the whole command on one line.

The patterns %IN and %OUT are replaced by the input file name and the output file name when the command is executed.

For an easier example, the default markdown converter looks like this, calling the multimarkdown command :

(converter :name :markdown :extension ".md"
	   :command "multimarkdown -t html -o %OUT data/%IN")

It's really easy (I hope !) to add the new converters you need with this feature.

Date format configurable

One problem I had with cl-yag is that it's plain vanilla Common LISP without libraries, so it's easier to fetch and use, but it lacks elaborate libraries, like one to parse and format dates. Before this release, I was writing the date in plain text, like "14 December 2017", in the date field of a blog post. It was easy to use, but not really usable in the RSS feed in the pubDate attribute, and if I wanted to change the display of the date for some reason, I would have to rewrite everything.

Now, the date is simply in the format "YYYYMMDD", like "20171231" for the 31st of December 2017. And in the configuration variable, there is a :date-format keyword to define the date display. This variable is a string allowing pattern replacement of the following variables :

%DayNumber : day of the month in number, from 1 to 31
%DayName : day of the week, from Monday to Sunday; names are written in english in the source code and can be translated
%MonthNumber : month in number, from 1 to 12
%MonthName : month name, from January to December; names are written in english in the source code and can be translated
%Year : year

Currently, at the time of writing, I use the value "%DayNumber %MonthName %Year".

A :gopher-format keyword exists in the configuration file to configure the date format in the gopher export. It can be different from the html one.
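
For illustration, the two keywords side by side could look like this (only a sketch; see config.lisp.sample for the exact surrounding structure) :

:date-format   "%DayNumber %MonthName %Year"
:gopher-format "%Year-%MonthNumber-%DayNumber"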

More Gopher configuration

There are cases where the gopher server uses an unusual syntax compared to most servers. I wanted to make it configurable, so the user can easily use cl-yag without having to mess with the code. I provide the default for geomyidae, and another syntax is available in comments. There is also a configurable value to indicate where to store the gopher page menu; it's not always gophermap, it could be index.gph or whatever you need.

Easier to use

A comparison of code will make it easier to understand. There was a little change in the way blog posts are declared :

From

(defparameter *articles*
  (list
   (list :id "third-article"  :title "My third article" :tag "me" :date "20171205")
   (list :id "second-article" :title "Another article"  :tag "me" :date "20171204")
   (list :id "first-article"  :title "My first article" :tag "me" :date "20171201")
   ))

to

(post :id "third-article"  :title "My third article" :tag "me" :date "20171205")
(post :id "second-article" :title "Another article"  :tag "me" :date "20171204")
(post :id "first-article"  :title "My first article" :tag "me" :date "20171201")

Each post is now independently declared, and I plan to add a "page" function to create static pages, but this is going to be for the next version !

Future work

I am very happy to hack on cl-yag, I want to continue improving it but I should really think about each feature I want to add. I want to keep it really simple even if it limits the features.

I want to allow the creation of static pages like "About me", "Legal" or "websites I liked" that integrate well in the template. The user may not want all the static page links to go to the same place in the template, or to use the same template. I'm thinking about this.

Also, I think the gopher generation could be improved, but I still have no idea how.

Other themes may come in the default configuration, allowing the user to choose between themes. But for now, I don't plan to bring a theme using javascript.

How to type using only one hand : keyboard mirroring

Written by Solène, on 12 December 2017.
Tags: #unix

Hello

Today is a bit special because I’m writing with a mirror keyboard layout. I use only half my keyboard to type all characters. To make things harder, the layout is qwerty while I usually use azerty (I’m used to qwerty but it doesn’t help).

Here, “caps lock” is a modifier key that must be pressed to obtain characters of the other side. As a mirror, one will find ‘p’ instead of ‘q’ or ‘h’ instead of ‘g’ while pressing caps lock.

It’s even possible to type backspace to delete characters, or to get a newline. Not all punctuation is available through this, only ‘.<|¦>’",’.

While I type this I get a bit faster and it becomes easier and easier. It’s definitely worth it if you can’t use two hands.

This has been made possible by Randall Munroe. To enable it, just download the file here and type :

xkbcomp mirrorlayout.kbd $DISPLAY

Backspace is typed with tilde and return with space, using the modifier of course.

I’ve spent approximately 15 minutes writing this, but the time spent hasn’t been linear; it’s much more fluent now !

Mirrorboard: A one-handed keyboard layout for the lazy by Randall Munroe

Bandwidth limit / queue on OpenBSD 6.1

Written by Solène, on 25 April 2017.
Tags: #openbsd #unix #network

Today I will explain how to limit traffic with OpenBSD and PF. This is not hard at all if you want something easy; the man page pf.conf(5), in the QUEUEING section, is pretty good, but it may be disturbing when you don’t understand how it works. This is not something I master and I’m not sure of the behaviour in some cases, but the following example works, as I tested it ! :)

Use case

Internet is down at home, so I want to use my phone as a 4G router through my OpenBSD laptop, which will act as a router. I don’t want the quota (some Gb) to be eaten in a few seconds; this connection allows downloading at up to 10 Mb/s, so it can go quickly !

We will limit the total bandwidth to 1M (~110 kB/s) for people behind the NAT. It will be slow, but we will be sure that nothing behind the NAT, like a program updating, cloud stuff synchronizing or videos auto-playing, will consume our quota.

Edit /etc/pf.conf according to your network :

internet="urndis0"
lan="em0"

# we define our available bandwidth
queue main on $lan bandwidth 100M

# we will limit to 1M but allow bursts of 3M
# during 200 ms when initiating connections, to keep the web a bit interactive
queue limited parent main bandwidth 1M min 0K max 1M burst 3M for 200ms default

set skip on lo

# we do NAT here
match out on egress inet from !(egress:network) to any nat-to (egress:0)

block all
pass out quick inet

# we apply the queue here on EVERYTHING coming from the internet
pass in on $lan set queue limited

This ONLY defines a queue for DOWNLOADING. You can only set the queue on the lan interface; this won’t work on egress (the network interface having internet) because you can’t limit what comes into an interface : when you want to limit it, it’s already there.

Per protocol ?

You can define queues per remote port by creating new queues and doing something like this :

pass in on $lan proto tcp port ssh set queue ssh
pass in on $lan proto tcp port www set queue web
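
These rules assume the ssh and web queues exist; here is a minimal sketch of their definitions, reusing the 100M parent from above (the bandwidth values are arbitrary) :

queue ssh parent main bandwidth 10M
queue web parent main bandwidth 50M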

Per host ?

As before, you can apply queues per host IP/range rather than per protocol, or even mix both if you want. For example, to limit one machine (a sketch, assuming a LAN host at 192.168.1.10) :
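
pass in on $lan from 192.168.1.10 set queue limited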

Warning

The queueing system changed in OpenBSD 5.5; everything you can read on the internet about ALTQ no longer applies.

Markup languages comparison

Written by Solène, on 13 April 2017.
Tags: #unix

For fun, here are a few examples of the same output in different markup languages. The list isn’t exhaustive, of course.

This is org-mode :

* This is a title level 1

+ first item
+ second item
+ third item with a [[http://dataswamp.org][link]]

** title level 2

Blah blah blah blah blah
blah blah blah *bold* here

#+BEGIN_SRC lisp
(let ((hello (init-string)))
   (format t "~A~%" (+ 1 hello))
   (print hello))
#+END_SRC

This is markdown :

# this is title level 1

+ first item
+ second item
+ third item with a [Link](http://dataswamp.org)

## Title level 2

Blah blah blah blah blah
blah blah blah **bold** here

    (let ((hello (init-string)))
       (format t "~A~%" (+ 1 hello))
       (print hello))

or

```
(let ((hello (init-string)))
   (format t "~A~%" (+ 1 hello))
   (print hello))
```

This is HTML :

<h1>This is title level 1</h1>
<ul>
  <li>first item</li>
  <li>second item</li>
  <li>third item with a <a href="http://dataswamp.org">link</a></li>
</ul>

<h2>Title level 2</h2>

<p>Blah blah blah blah blah
  blah blah blah <strong>bold</strong> here

<pre><code>(let ((hello (init-string)))
   (format t "~A~%" (+ 1 hello))
   (print hello))</code></pre>

This is LaTeX :

\documentclass{article}

\begin{document}

\section{This is title level 1}

\begin{itemize}
\item First item
\item Second item
\item Third item
\end{itemize}

\subsection{Title level 2}

Blah blah blah blah blah
blah blah blah \textbf{bold} here

\begin{verbatim}
(let ((hello (init-string)))
    (format t "~A~%" (+ 1 hello))
    (print hello))
\end{verbatim}

\end{document}

OpenBSD 6.1 released

Written by Solène, on 11 April 2017.
Tags: #openbsd #unix

Today OpenBSD 6.1 has been released, I won’t copy & paste the change list but, in a few words, it gets better.

Link to the official announcement

I already upgraded a few servers, using both methods. One was with a bsd.rd upgrade, but that requires physical access to the server; the other method, well explained in the upgrade guide, requires untarring the files and moving some files around. I recommend using bsd.rd if possible.

Connect to pfsense box console by usb

Written by Solène, on 10 April 2017.
Tags: #unix #network #openbsd

Hello,

I have a pfsense appliance (Netgate 2440) with a usb console port; while it used to be a serial port, devices now seem to have a usb one. If you plug a usb wire from an openbsd box into it, you will see this in your dmesg :

uslcom0 at uhub0 port 5 configuration 1 interface 0 "Silicon Labs CP2104 USB to UART Bridge Controller" rev 2.00/1.00 addr 7
ucom0 at uslcom0 portno 0

To connect to it from OpenBSD, use the following command :

doas cu -l /dev/cuaU0 -s 115200

And you’re done

List of useful tools

Written by Solène, on 22 March 2017.
Tags: #unix

Here is a list of software that I find useful; I will update this list every time I find a new tool. This is not an exhaustive list, these are only programs I enjoy using :

Backup Tool

  • duplicity
  • borg backup
  • restore/dump

File synchronization tool

  • unison
  • rsync

File sharing tool / “Cloud”

  • nextcloud / owncloud
  • seafile
  • pydio
  • syncthing (works as peer-to-peer without a master)
  • sparkleshare (uses a git repository so I would recommend storing only text files)

Editors

  • emacs
  • vim
  • jedit

Web browsers using keyboard

  • qutebrowser
  • xombrero (obsolete)
  • firefox with vimperator extension

Todo list / Personal Agenda…

  • productivity
  • when
  • org-mode (within emacs)

Mail client

  • mu4e (inside emacs, requires the use of offlineimap or mbsync to fetch mails)
  • alpine
  • sylpheed

Network

  • curl
  • bwm-ng (to see bandwidth usage in real time)
  • mtr (traceroute with a gui that updates every n seconds)

Files integrity

  • bitrot
  • par2cmdline
  • aide

Image viewer

  • sxiv
  • feh

Stuff

  • entr (run a command when a file changes)
  • rdesktop (RDP client to connect to Windows VM)
  • xclip (read/set your X clipboard from a script)
  • autossh (to create tunnels that stay up)
  • mosh (connects to your ssh server with local input and better resilience)
  • ncdu (watch file system usage interactively in cmdline)
  • mupdf (PDF viewer)
  • pdftk (PDF manipulation tool)
  • x2x (share your mouse/keyboard between multiple computers through ssh)
  • profanity (XMPP cmdline client)
  • prosody (XMPP server)
  • pgmodeler (PostgreSQL database visualization tool)

How to check your data integrity ?

Written by Solène, on 17 March 2017.
Tags: #unix #security #data

Today, the topic is data degradation : bit rot, bitrotting, damaged files or whatever you call it. It’s when your data gets corrupted over time, due to a disk fault or some unknown reason.

What is data degradation ?

I shamelessly paste one line from wikipedia : “Data degradation is the gradual corruption of computer data due to an accumulation of non-critical failures in a data storage device. The phenomenon is also known as data decay or data rot.”

Data degradation on Wikipedia

So, how do we know we encountered bit rot ?

bit rot = (checksum changed) && NOT (modification time changed)

While an updated file could be mistaken for bit rot, there is a difference :

update = (checksum changed) && (modification time changed)

How to check if we encounter bitrot ?

There is no way to prevent bitrot. But there are some ways to detect it, so you can restore a corrupted file from a backup, or repair it with the right tool (you can’t repair a file with a hammer, except if it’s some kind of HammerFS ! :D )

In the following I will describe software I found to check for (or even repair) bitrot. If you know other tools which are not in this list, I would be happy to hear about them, please mail me.

In the following examples, I will use this method to generate bitrot on a file :

% touch -d "2017-03-16T21:04:00" my_data/some_file_that_will_be_corrupted
% generate_checksum_database_with_tool
% echo "a" >> my_data/some_file_that_will_be_corrupted
% touch -d "2017-03-16T21:04:00" my_data/some_file_that_will_be_corrupted
% start_tool_for_checking

We generate the checksum database, then we alter a file by adding an “a” at its end, and we restore the modification and access time of the file. Then, we start the tool to check for data corruption.

The first touch is only for convenience; we could get the modification time with the stat command and pass the same value to touch after modifying the file.
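
Here is a sketch of that stat-based variant, using BSD stat flags (GNU stat uses different options, like -c '%y') :

% old=$(stat -f "%Sm" -t "%Y-%m-%dT%H:%M:%S" my_data/some_file_that_will_be_corrupted)
% echo "a" >> my_data/some_file_that_will_be_corrupted
% touch -d "$old" my_data/some_file_that_will_be_corrupted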

bitrot

This is a python script and it’s very easy to use. It will scan a directory and create a database with the checksums of the files and their modification dates.

Initialization usage :

% cd /home/my_data/
% bitrot
Finished. 199.41 MiB of data read. 0 errors found.
189 entries in the database, 189 new, 0 updated, 0 renamed, 0 missing.
Updating bitrot.sha512... done.
% echo $?
0

Verify usage (case OK) :

% cd /home/my_data/
% bitrot
Checking bitrot.db integrity... ok.
Finished. 199.41 MiB of data read. 0 errors found.
189 entries in the database, 0 new, 0 updated, 0 renamed, 0 missing.
% echo $?
0

Exit status is 0, so our data are not damaged.

Verify usage (case Error) :

% cd /home/my_data/
% bitrot
Checking bitrot.db integrity... ok.
error: SHA1 mismatch for ./sometextfile.txt: expected 17b4d7bf382057dc3344ea230a595064b579396f, got db4a8d7e27bb9ad02982c0686cab327b146ba80d. Last good hash checked on 2017-03-16 21:04:39.
Finished. 199.41 MiB of data read. 1 errors found.
189 entries in the database, 0 new, 0 updated, 0 renamed, 0 missing.
error: There were 1 errors found.
% echo $?
1

This is the output when something is wrong. As the exit status of bitrot isn’t 0 when it fails, it’s easy to write a script running it every day/week/month.
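
For example, a minimal cron-friendly wrapper could look like this (paths and mail address are hypothetical) :

#!/bin/sh
# run bitrot and mail the log only when it reports corruption
cd /home/my_data || exit 1
if ! bitrot > /tmp/bitrot.log 2>&1; then
    mail -s "bitrot errors on $(hostname)" me@example.com < /tmp/bitrot.log
fi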

Github page

bitrot is available in OpenBSD ports in sysutils/bitrot since 6.1 release.

par2cmdline

This tool works with PAR2 archives (see below for more information about what PAR is) and, from them, it will be able to check your data integrity AND repair it.

While it has some pros, like being able to repair data, the con is that it’s not very easy to use. I would use this one for checking the integrity of long term archives that won’t change. The main drawback comes from the PAR specifications : the archives are created from a filelist. If you have a directory with your files and you add new files, you will need to recompute ALL the PAR archives because the filelist changed, or create new PAR archives only for the new files, but that will make the verify process more complicated. It doesn’t seem suitable to create new archives for every bunch of files added to the directory.

PAR2 lets you choose the percentage of a file you will be able to repair; by default it will create the archives to be able to repair up to 5% of each file. That means you don’t need a whole copy of the files (though going without a backup would be a bad idea), only approximately an extra 5% of your data to store.
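
The redundancy level is set at creation time with the -r flag; for example, for 10% (a sketch reusing the same archive layout as the example below) :

% par2 create -r10 -a integrity_archive -R my_data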

Create usage :

% cd /home/
% par2 create -a integrity_archive -R my_data
Skipping 0 byte file: /home/my_data/empty_file

Block size: 3812
Source file count: 17
Source block count: 2000
Redundancy: 5%
Recovery block count: 100
Recovery file count: 7

Opening: my_data/[....]
[text cut here]
Opening: my_data/[....]

Computing Reed Solomon matrix.
Constructing: done.
Wrote 381200 bytes to disk
Writing recovery packets
Writing verification packets
Done

% echo $?
0

% ls -1
integrity_archive.par2
integrity_archive.vol000+01.par2
integrity_archive.vol001+02.par2
integrity_archive.vol003+04.par2
integrity_archive.vol007+08.par2
integrity_archive.vol015+16.par2
integrity_archive.vol031+32.par2
integrity_archive.vol063+37.par2
my_data

Verify usage (OK) :

% par2 verify integrity_archive.par2 
Loading "integrity_archive.par2".
Loaded 36 new packets
Loading "integrity_archive.vol000+01.par2".
Loaded 1 new packets including 1 recovery blocks
Loading "integrity_archive.vol001+02.par2".
Loaded 2 new packets including 2 recovery blocks
Loading "integrity_archive.vol003+04.par2".
Loaded 4 new packets including 4 recovery blocks
Loading "integrity_archive.vol007+08.par2".
Loaded 8 new packets including 8 recovery blocks
Loading "integrity_archive.vol015+16.par2".
Loaded 16 new packets including 16 recovery blocks
Loading "integrity_archive.vol031+32.par2".
Loaded 32 new packets including 32 recovery blocks
Loading "integrity_archive.vol063+37.par2".
Loaded 37 new packets including 37 recovery blocks
Loading "integrity_archive.par2".
No new packets found

There are 17 recoverable files and 0 other files.
The block size used was 3812 bytes.
There are a total of 2000 data blocks.
The total size of the data files is 7595275 bytes.

Verifying source files:

Target: "my_data/....." - found.
[...cut here...]
Target: "my_data/....." - found.


All files are correct, repair is not required.
% echo $?
0

Verify usage (with error) :

% par2 verify integrity_archive.par.par2
Loading "integrity_archive.par.par2".
Loaded 36 new packets
Loading "integrity_archive.par.vol000+01.par2".
Loaded 1 new packets including 1 recovery blocks
Loading "integrity_archive.par.vol001+02.par2".
Loaded 2 new packets including 2 recovery blocks
Loading "integrity_archive.par.vol003+04.par2".
Loaded 4 new packets including 4 recovery blocks
Loading "integrity_archive.par.vol007+08.par2".
Loaded 8 new packets including 8 recovery blocks
Loading "integrity_archive.par.vol015+16.par2".
Loaded 16 new packets including 16 recovery blocks
Loading "integrity_archive.par.vol031+32.par2".
Loaded 32 new packets including 32 recovery blocks
Loading "integrity_archive.par.vol063+37.par2".
Loaded 37 new packets including 37 recovery blocks
Loading "integrity_archive.par.par2".
No new packets found

There are 17 recoverable files and 0 other files.
The block size used was 3812 bytes.
There are a total of 2000 data blocks.
The total size of the data files is 7595275 bytes.

Verifying source files:


Target: "my_data/....." - found.
[...cut here...]
Target: "my_data/....." - found.
Target: "my_data/Ebooks/Lovecraft/Quete Onirique de Kadath l'Inconnue.epub" - damaged. Found 95 of 95 data blocks.

Scanning extra files:


Repair is required.
1 file(s) exist but are damaged.
16 file(s) are ok.
You have 2000 out of 2000 data blocks available.
You have 100 recovery blocks available.
Repair is possible.
You have an excess of 100 recovery blocks.
None of the recovery blocks will be used for the repair.

% echo $?
1

Repair usage :

% par2 repair integrity_archive.par.par2      
Loading "integrity_archive.par.par2".
Loaded 36 new packets
Loading "integrity_archive.par.vol000+01.par2".
Loaded 1 new packets including 1 recovery blocks
Loading "integrity_archive.par.vol001+02.par2".
Loaded 2 new packets including 2 recovery blocks
Loading "integrity_archive.par.vol003+04.par2".
Loaded 4 new packets including 4 recovery blocks
Loading "integrity_archive.par.vol007+08.par2".
Loaded 8 new packets including 8 recovery blocks
Loading "integrity_archive.par.vol015+16.par2".
Loaded 16 new packets including 16 recovery blocks
Loading "integrity_archive.par.vol031+32.par2".
Loaded 32 new packets including 32 recovery blocks
Loading "integrity_archive.par.vol063+37.par2".
Loaded 37 new packets including 37 recovery blocks
Loading "integrity_archive.par.par2".
No new packets found

There are 17 recoverable files and 0 other files.
The block size used was 3812 bytes.
There are a total of 2000 data blocks.
The total size of the data files is 7595275 bytes.

Verifying source files:

Target: "my_data/....." - found.
[...cut here...]
Target: "my_data/....." - found.
Target: "my_data/Ebooks/Lovecraft/Quete Onirique de Kadath l'Inconnue.epub" - damaged. Found 95 of 95 data blocks.

Scanning extra files:


Repair is required.
1 file(s) exist but are damaged.
16 file(s) are ok.
You have 2000 out of 2000 data blocks available.
You have 100 recovery blocks available.
Repair is possible.
You have an excess of 100 recovery blocks.
None of the recovery blocks will be used for the repair.


Wrote 361069 bytes to disk

Verifying repaired files:

Target: "my_data/Ebooks/Lovecraft/Quete Onirique de Kadath l'Inconnue.epub" - found.

Repair complete.

% echo $?
0

par2cmdline is only one implementation doing the job; other tools working with PAR archives exist. They should all be able to work with the same PAR files.

Parchive on Wikipedia

Github page

par2cmdline is available in OpenBSD ports in archivers/par2cmdline.

If you find a way to add new files to existing archives, please mail me.

mtree

One can write a little script using mtree (in the base system on OpenBSD and FreeBSD) which will create a file with the checksum of every file in the specified directories. If the mtree output differs since last time, we can send a mail with the difference. This process is done in the base install of OpenBSD for /etc and some other files, to warn you if they changed.
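
A minimal sketch of that idea (the spec file path is arbitrary, and keyword names may vary slightly between BSDs) :

% mtree -c -K sha256digest -p /home/my_data > /var/db/my_data.mtree
% mtree -K sha256digest -p /home/my_data < /var/db/my_data.mtree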

While it’s suited for directories like /etc, in my opinion, this is not the best tool for doing integrity checks.

ZFS

I would like to talk about ZFS and data integrity, because this is where ZFS is very good. If you are using ZFS, you may not need any other software to take care of your data. When you write a file, ZFS will also store its checksum as metadata. By default, the option “checksum” is activated on datasets, but you may want to disable it for better performance.

There is a command to ask ZFS to check the integrity of the files. Warning : scrub is very I/O intensive and can take from hours to days or even weeks to complete, depending on your CPU, disks and the amount of data to scrub :

# zpool scrub zpool

The scrub command will recompute the checksum of every file in the ZFS pool; if something is wrong, it will try to repair it if possible. A repair is possible in the following cases :

If you have multiple disks like raid-Z or raid-1 (mirror), ZFS will look on the different disks for a non-corrupted version of the file; if it finds one, it will restore it on the disk(s) where it’s corrupted.

If you have set the ZFS option “copies” to 2 or 3 (1 = default), the file is written 2 or 3 times on the disk. Each file of the dataset will be allocated 2 or 3 times on the disk, so take care if you want to use it on a dataset containing heavy files ! If ZFS finds that a version of a file is corrupted, it will check the other copies of it and try to restore the corrupted file if possible.
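
Setting this option looks like the following (the dataset name is hypothetical) :

# zfs set copies=2 zpool/home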

You can see the percentage of the filesystem already scrubbed with

zpool status zpool

and the scrub can be stopped with

zpool scrub -s zpool

AIDE

Its name is an acronym for “Advanced Intrusion Detection Environment”; it’s a complicated piece of software which can be used to check for bitrot. I would not recommend using it if you only need bitrot detection.

Here is a few hints if you want to use it for checking your file integrity :

/etc/aide.conf

/home/my_data/ R
# Rule definition
All=m+s+i+sha256
summarize_changes=yes

This config file will create a database of all files in /home/my_data/ (R for recursive). The “All” line lists the checks we do on each file. For bitrot checking, we want to check the modification time, size, checksum and inode of the files. The summarize_changes option permits having a list of changes if something is wrong.

This is the most basic config file you can have. You will then have to run aide to create the database, and later run it again to create a new database and compare the two. It doesn’t update its database itself; you will have to move the old database and tell it where to find the older database.
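
In practice, the cycle looks roughly like this (the database file names depend on your aide configuration; these are common defaults) :

% aide --init              # create the initial database
% mv aide.db.new aide.db   # move it where aide reads it
% aide --check             # later : compare the current state against it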

My use case

I have different kinds of data. On one side, I have static data like pictures, clips or music, things that won’t change over time; on the other side I have my mails, documents and folders where the content changes regularly (creation, deletion, modification). I am able to afford a backup of 100% of my data, with some history of the backup over a few days, so I’m not interested in file repairing.

I want to be warned quickly if a file gets corrupted, so I can still find it in my backup history, as I don’t keep every version of my files for too long. I chose to go with the python tool bitrot; it’s very easy to use and it doesn’t become a mess with my folders getting updated often.

I would go with par2cmdline if I were not able to backup all my data. Having 5% or 10% of redundancy for my files should be enough to restore them in case of corruption without taking too much space.

Port of the week : rss2email

Written by Solène, on 24 January 2017.
Tags: #portoftheweek #unix

This is the kind of Port of the week I like. This is a piece of software I just discovered and fell in love with. The tool r2e, which is the port mail/rss2email on OpenBSD, is a small python utility that solves a problem : how to deal with RSS feeds ?

Until last week, I was using a “web app” named selfoss which aggregated my RSS feeds and displayed them on a web page, where I was able to filter by read/unread/marked and also by source. It is a good tool that does the job well, but I wanted something that doesn’t rely on a web browser. Here comes r2e !

This simple software will send you a mail for each new entry in your RSS feeds. It’s really easy to configure and set up. Just look at how I configured mine :

$ r2e new my-address+rss@my-domain.com
$ r2e add "http://undeadly.org/cgi?action=rss"
$ r2e add "https://dataswamp.org/~solene/rss.xml"
$ r2e add "https://www.dragonflydigest.com/feed"
$ r2e add "http://phoronix.com/rss.php"

Add this in your crontab to check new RSS items every 10 minutes :

*/10 * * * * /usr/local/bin/r2e run

Add a rule for my-address+rss to store mails in a separate folder, and you’re done !

NOTE : you can use r2e run --no-send for the first time, it will create the database and won’t send you mails for current items in feeds.

Convert mailbox to maildir with dovecot

Written by Solène, on 17 January 2017.
Tags: #unix

I have been using the mbox format for a few years on my personal mail server. For those who don’t know what mbox is, it consists of only one file per folder you have in your mail client, each file containing all the mails of the corresponding folder. It’s extremely inefficient when you backup the mail directory, because everything must be copied each time. Also, it reduces the caching possibilities of the server, because folders with lots of mails with attachments may not fit in the cache.

Instead, I switched to maildir, which is a format where every mail is a regular file on the file system. This takes a lot of inodes but at least it’s easier to backup, or to deal with for analysis.

Here is how to switch from mbox to maildir with a dovecot tool.

dsync -u solene mirror mbox:~/mail/:INBOX=~/mail/inbox

That’s all ! In this case, my mbox folder was ~/mail/ and my INBOX file was ~/mail/inbox. It took me some time to find where my INBOX really was; at first I tried a few things that didn’t work, and tried a perl conversion tool named mb2md.pl which was able to extract some stuff, but a lot of mails were broken. So I went back to getting dsync working.

If you want to migrate, here is the whole process :

service smtpd stop  # stop the mail server, no need to receive mails now

modify dovecot/conf.d/10-mail.conf, replace the first line
mail_location = mbox:~/mail:INBOX=/var/mail/%u   # BEFORE
mail_location = maildir:~/maildir                 # AFTER

service dovecot restart

dsync -u solene mirror mbox:~/mail/:INBOX=~/mail/inbox

service smtpd start

Port of the week : entr

Written by Solène, on 07 January 2017.
Tags: #unix

entr is a command line tool that lets you run an arbitrary command on file change. This is useful when you are working on something that requires processing each time you modify it.

Recently, I have used it while editing a man page. At first, I had to run mandoc each time I modified the file to check the rendering. As this was the first time I edited a man page, I had to modify it a lot to get what I wanted. I remembered about entr, and this is how you use it :

ls stagit.1 | entr mandoc /_

This simple command will run “mandoc stagit.1” each time stagit.1 is modified. The file names must be given to entr on stdin, and the character sequence /_ is then replaced by the file name (like {} in find).
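
As the file list comes from stdin, another classic usage is to combine entr with find; for example, to rebuild a project each time a C source file changes :

find . -name '*.c' | entr make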

The man page of entr is very well documented if you need more examples.

Port of the week : dnscrypt-proxy

Written by Solène, on 19 October 2016.
Tags: #unix #security #portoftheweek #dns

Today I will talk about net/dnscrypt-proxy. It lets you encrypt your DNS traffic between your resolver and the remote DNS recursive server. More and more countries and internet providers use DNS to block some websites, and now they tend to do “man in the middle” with DNS answers, so you can’t just use a remote DNS server you find on the internet. While a remote dnscrypt DNS server can still be affected by such “man in the middle” hijacking, there is very little chance DNS traffic is altered in datacenters / dedicated server hosting.

The article also deals with unbound as a DNS cache, because dnscrypt is a bit slow, and asking for the same domain multiple times in a few minutes is a waste of cpu/network/time for everyone. So I recommend setting up a DNS cache on your side (which also permits using it on a LAN).

At the time I write this article, there is a very good explanation about how to install it in the file named dnscrypt-proxy-1.7.0p1 in the folder /usr/local/share/doc/pkg-readmes/. The following article is made from this file. The file on OpenBSD 6.0 doesn’t speak about unbound (while in -current, the next 6.1, this is explained).

While I write for OpenBSD, this can easily be adapted to anything else Unix-like.

Resolv.conf

Modify your resolv.conf file to this

/etc/resolv.conf :

nameserver 127.0.0.1
lookup file bind
options edns0

If you use dhcp, you can force 127.0.0.1 as the nameserver with the following line in the dhclient config file :

Dhcp ?

/etc/dhclient.conf :

supersede domain-name-servers 127.0.0.1;

Unbound

Now, we need to modify the unbound config to tell it to ask for DNS at 127.0.0.1 port 40. Please adapt your config; I will just add what is mandatory. The unbound configuration file isn’t in /etc because unbound is chrooted :

/var/unbound/etc/unbound.conf :

server:
    # ↓ this line is MANDATORY ↓
    do-not-query-localhost: no

forward-zone:
    name: "."
    forward-addr: 127.0.0.1@40
    # ↑ address dnscrypt listen on

If you want to allow others to resolve through your unbound daemon, see the parameters interface and access-control : you will need to tell unbound to bind on external interfaces and allow requests on them, as in the sketch below.
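
A minimal sketch, assuming a LAN in 192.168.1.0/24 :

server:
    interface: 0.0.0.0
    access-control: 192.168.1.0/24 allow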

Dnscrypt-proxy

Now we need to configure dnscrypt. Pick a server in the list /usr/local/share/dnscrypt-proxy/dnscrypt-resolvers.csv; the name is the first column.

As root, type the following (or use doas/sudo); in this example we choose dnscrypt.eu-nl as the DNS provider :

rcctl enable dnscrypt_proxy
rcctl set dnscrypt_proxy flags -E -m1 -R dnscrypt.eu-nl -a 127.0.0.1:40
rcctl start dnscrypt_proxy

Conclusion

You should now be able to resolve addresses through dnscrypt. You can use tcpdump on your external interface to verify it : you should not see traffic on udp port 53 anymore.
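
For example (a sketch; replace em0 with your external interface) :

tcpdump -n -i em0 udp port 53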

If you want to use dig hostname -p 40 @127.0.0.1 to make DNS requests to dnscrypt without unbound, you will need net/isc-bind, which provides /usr/local/bin/dig. OpenBSD base dig can’t use a port other than 53.

How to publish a git repository on http

Written by Solène, on 07 October 2016.
Tags: #unix #git

Here is a how-to for making a git repository available for cloning through a simple http server. This method only allows people to fetch the repository, not to push. I wanted to set this up to publish my code; I don’t plan to have any commits on it from other people at this time, so it’s enough.

In a folder publicly available from your http server, clone your repository in bare mode, as explained in the git book (https://git-scm.com/book/tr/v2/Git-on-the-Server-The-Protocols) :

cd /var/www/htdocs/some-path/
git clone --bare /path/to/git_project gitproject.git
cd gitproject.git
git update-server-info
mv hooks/post-update.sample hooks/post-update
chmod o+x hooks/post-update
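
The post-update hook enabled above refreshes the metadata needed by the dumb http protocol on every update; the sample hook shipped with git essentially boils down to :

#!/bin/sh
exec git update-server-info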

Then you will be able to clone the repository with

git clone https://your-hostname/some-path/gitproject.git

I lost time because I did not execute git update-server-info, so cloning wasn’t possible.

Port of the week : rlwrap

Written by Solène, on 04 October 2016.
Tags: #unix #shell #portoftheweek

Today I will present misc/rlwrap, which is a utility for command-line software that doesn’t provide a nice readline input. By using rlwrap, you will be able to use telnet, a language REPL or any command-line tool where you input text, with a history of what you type, the ability to use emacs bindings like C-a C-e M-Ret, etc… I use it often with telnet or sbcl.

Usage :

rlwrap telnet host port

Redirect stdin into a variable in shell

Written by Solène, on 12 September 2016.
Tags: #shell #unix

If you want to write a script reading stdin and putting it into a variable, there is a very easy way to proceed :

#!/bin/sh
# read all of stdin into a variable; quoting preserves the newlines
var=$(cat)
echo "$var"
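
For example, assuming the script is saved as myscript.sh and made executable :

$ echo "hello world" | ./myscript.sh
hello world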

That’s all