Introduction to IPFS

Written by Solène, on 17 April 2021.
Tags: #openbsd #ipfs

1. introduction to IPFS
2. Using a public gateway
3. Using your own node
- 3. 1. Installing the node on OpenBSD
- 3. 2. Installing the node on NixOS
- 3. 3. Testing your gateway
4. OpenBSD experiment
5. Using DNS
6. Conclusion
7. Interesting IPFS resources

1. introduction to IPFS §

IPFS is a distributed storage network protocol that comes with a public network. Anyone can run a peer and access content from IPFS and then relay the content while it's in your cache.

Gateways are websites used to allow accessing content of IPFS through http, there are several public gateways allowing to get data from IPFS without being a peer.

Every publish content has an unique CID to identify it, we usually add a /ipfs/ to it like in /ipfs/QmRVD1V8eYQuNQdfRzmMVMA6cy1WqJfzHu3uM7CZasD7j1. The CID is unique and if someone add the same file from another peer, they will get the same hash as you.

If you add a whole directory in IPFS, the top directory hash will depend on the hash of its content, this mean if you want to share a directory like a blog, you will need to publish the CID every time you change the content, as it's not practical at all, there is an alternative for making the process more dynamic.

A peer can publish data in a long name called an IPNS. The IPNS string will never change (it's tied to a private key) but you can associate a CID to it and update the value when you want and then tell other peers the value changed (it's called publishing). The IPNS notation used is looking like /ipns/k51qzi5uqu5dmebzq75vx3z23lsixir3cxi26ckl409ylblbjigjb1oluj3f2z.ipns, you can access an IPNS content with public gateways with a different notation.

- IPNS gateway use example: https://k51qzi5uqu5dmebzq75vx3z23lsixir3cxi26ckl409ylblbjigjb1oluj3f2z.ipns.dweb.link/

- IPFS gateway use example: https://ipfs.io/ipfs/QmRVD1V8eYQuNQdfRzmMVMA6cy1WqJfzHu3uM7CZasD7j1/

The IPFS link will ALWAYS return the same content because it's a defined hash to a specific resource. The IPNS link can be updated to have a newer CID over time, allowing people to bookmark the location and browse it for updates later.

2. Using a public gateway §

There are many public gateways you can use to fetch content.

Health check of public gateways, useful to pick one

You will find two kind of gateways url, one like "https://$domain/" and other like "https://$something_very_long.ipfs.$domain/", for the first one you need to append your /ipfs/something or /ipns/something requests like in the previous examples. For the latter, in web browser it only works with ipns because web browsers think the CID is a domain and will change the case of the letters and it's not long valid. When using an ipns like this, be careful to change the .ipfs. by .ipns. in the url to tell the gateway what kind of request you are doing.

3. Using your own node §

First, be aware that there is no real bandwidth control mechanism and that IPFS is known to create too many connections that small routers can't handle. On OpenBSD it's possible to mitigate this behavior using queuing. It's possible to use a "lowpower" profile that will be less demanding on network and resources but be aware this will degrade IPFS performance. I found that after a few hours of bootstrapping and reaching many peers, the bandwidth usage becomes less significant but it's may be an issue for DSL connections like mine.

When you create your own node, you can use its own gateway or the command line client. When you request a data that doesn't belong to your node, it will be downloaded from known peers able to distribute the blocks and then you will keep it in cache until your cache reach the defined limited and the garbage collector comes to make some room. This mean when you get a content, you will start distributing it, but nobody will use your node for content you never fetched first.

When you have data, you can "pin" it so it will never be removed from cache, and if you pin a directory CID, the content will be downloaded so you have a whole mirror of it. When you add data to your node, it's automatically pinned by default.

The default ports are 4001 (the one you need to expose over the internet and potentially forwarding if you are behind a NAT), the Web GUI is available at http://localhost:5001/ and the gateway is available at http://localhost:8080/

3.1. Installing the node on OpenBSD §

To make it brief, there are instructions in the provided pkg-readme but I will give a few advice (that I may add to the pkg-readme later).

pkg_add go-ipfs
su -l -s /bin/sh _go-ipfs -c "IPFS_PATH=/var/go-ipfs /usr/local/bin/ipfs init"
rcctl enable go_ipfs

# recommended settings
rcctl set go_ipfs flags --routing=dhtclient --enable-namesys-pubsub

cat <<EOF >> /etc/login.conf
go_ipfs:\
	:openfiles=2048:\
	:tc=daemon:
EOF
rcctl start go_ipfs

You can change the profile to lowpower with "env IPFS_PATH=/var/go-ipfs/ ipfs config profile apply lowpower", you can also list profiles with the ipfs command.

I recommend using queues in PF to limit the bandwidth usage, for my DSL connection I've set a maximum of 450K and it doesn't disrupt my network anymore. I explained how to proceed with queuing and bandwidth limitations in a previous article.

3.2. Installing the node on NixOS §

Installing IPFS is easy on NixOS thanks to its declarative way. The system has a local IPv4 of 192.168.1.150 and a public IP of 136.214.64.44 (fake IP here). it is started with a 50GB maximum for cache. The gateway will be available on the local network on http://192.168.1.150:8080/.

services.ipfs.enable = true;
services.ipfs.enableGC = true;
services.ipfs.gatewayAddress = "/ip4/192.168.1.150/tcp/8080";
services.ipfs.extraFlags = [ "--enable-namesys-pubsub" ];
services.ipfs.extraConfig = {
    Datastore = { StorageMax = "50GB"; };
    Routing = { Type = "dhtclient"; };
};
services.ipfs.swarmAddress = [
        "/ip4/0.0.0.0/tcp/4001"
        "/ip4/136.214.64.44/tcp/4001"
        "/ip4/136.214.64.44/udp/4001/quic"
        "/ip4/0.0.0.0/udp/4001/quic"
];

3.3. Testing your gateway §

Let's say your gateway is http://localhost:8080/ for making simpler incoming examples. If you want to request the data /ipfs/QmRVD1V8eYQuNQdfRzmMVMA6cy1WqJfzHu3uM7CZasD7j1 , you just have to add this to your gateway, like this: http://localhost:8080/ipfs/QmRVD1V8eYQuNQdfRzmMVMA6cy1WqJfzHu3uM7CZasD7j1 and you will get access to your file.

When using ipns, it's quite the same, for /ipns/blog.perso.pw/ you can request http://localhost:8080/ipns/blog.perso.pw/ and then you can browse my blog.

4. OpenBSD experiment §

To make all of this really useful, I started an experiment: distributing OpenBSD amd64 -current and 6.9 both with sets and packages over IPFS. Basically, I have a server making a rsync of both sets once a day, will add them to the local IPFS node, get the CID of the top directory and then publish the CID under an IPNS. Note that I have to create an index.html file in the packages sets because IPFS doesn't handle directory listing very well.

The following examples will have to be changed if you don't use a local gateway, replace localhost:8080 by your favorite IPFS gateway.

You can upgrade your packages with this command:

env PKG_PATH=http://localhost:8080/ipns/k51qzi5uqu5dmebzq75vx3z23lsixir3cxi26ckl409ylblbjigjb1oluj3f2z/pub/OpenBSD/snapshots/packages/amd64/ pkg_add -Dsnap -u

You can switch to latest snapshot:

sysupgrade -s http://localhost:8080/ipns/k51qzi5uqu5dmebzq75vx3z23lsixir3cxi26ckl409ylblbjigjb1oluj3f2z/pub/OpenBSD/

While it may be slow to update at first, if you have many systems, running a local gateway used by all your computers will allow to have a cache of downloaded packages, making the whole process faster.

I made a "versions.txt" file in the top directory of the repository, it contains the date and CID of every publication, this can be used to fetch a package from an older set if it's still available on the network (because I don't plan to keep all sets, I have a limited disk).

You can simply use the url http://localhost:8080/ipns/k51qzi5uqu5dmebzq75vx3z23lsixir3cxi26ckl409ylblbjigjb1oluj3f2z/pub/OpenBSD/ in the file /etc/installurl to globally use IPFS for pkg_add or sysupgrade without specifying the url every time.

5. Using DNS §

It's possible to use a DNS entry to associate an IPFS resource to a domain name by using dnslink. The entry would look like:

_dnslink.blog	IN	TXT	"dnslink=/ipfs/somehashhere"

Using an /ipfs/ syntax will be faster to resolve for IPFS nodes but you will need to update your DNS every time you update your content over IPFS.

To avoid manipulating your DNS every so often (you could use an API to automate this by the way), you can use an /ipns/ record.

_dnslink.blog	IN	TXT	"dnslink=/ipns/something"

This way, I made my blog available under the hostname blog.perso.pw but it has no A or CNAME so it work only in an IPFS context (like a web browser with IPFS companion extension). Using a public gateway, the url becomes https://ipfs.io/ipns/blog.perso.pw/ and it will download the last CID associated to blog.perso.pw.

6. Conclusion §

IPFS is a wonderful piece of technology but in practice it's quite slow for DSL users and may not work well if you don't need a local cache. I do really love it though so I will continue running the OpenBSD experiment.

Please write me if you have any feedback or that you use my OpenBSD IPFS repository. I would be interested to know about people's experiences.