Bandwidth management in go-IPFS

Comment on Mastodon

Introduction

In this article I will explain a few important parameters for the reference IPFS node server go-ipfs in order to manage the bandwidth correctly for your usage.

Configuration File

The configuration file of go-ipfs is set by default to $HOME/.ipfs/config but if IPFS_PATH is set it will be $IPFS_PATH/.config

Tweaks

There are many tweaks possible in the configuration file, but there are pros and cons for each one so I can't tell you what values you want. I will rather explain what you can change and in which situation you would want it.

Connections number

By default, go-ipfs will keep a number of connections to peers between 600 and 900 and new connections will last at least 20 seconds. This may totally overwhelm your router to have to manage that quantity of TCP sessions.

The HighWater will define the maximum sessions you want to exist, so this may be the most important setting here. On the other hand, the LowWater will define the number of connections you want to keep all the time, so it will drain bandwidth if you keep it high.

I would say if you care about your bandwidth usage, keep the LowWater low like 50 and have the HighWater quite high and a short GracePeriod, this will allow go-ipfs to be quiet when unused but responsive (able to connect to many peers to find a content) when you need it.

Documentation about Swarm.ConnMgr

DHT Routing

IPFS use a distributed hash table to find peers (it's the common way to proceed in P2P networks), but your node can act as a client and only fetch the DHT from other peer or be active and distribute it to other peer.

If you have a low power server (CPU) and that you are limited in your bandwidth, you should use the value "dhtclient" to no distribute the DHT. You can configure this in the configuration file or use --routing=dhtclient in the command line.

Documentation about Routing.type

Reprovider

Strategy

This may be the most important choice you have to make for your IPFS node. With the Reprovider.Strategy setting you can choose to be part of the IPFS network and upload data you have locally, only upload data you pinned or upload nothing.

If you want to actively contribute to the network and you have enough bandwidth, keep the default "all" value, so every data available in your data store will be served to clients over IPFS.

If you self host data on your IPFS node but you don't have much bandwidth, I would recommend setting this value to "pinned" so only the data pinned in your IPFS store will be available. Remember that pinned data will never be removed from the store by the garbage collector and files you add to IPFS from the command line or web GUI are automatically pinned, the pinned data are usually data we care about and that we want to keep and/or distribute.

Finally, you can set it to empty and your IPFS node will never upload any data to anyone which could be consider as unfair in a peer to peer network but under some quota limited or high latency connection it would make sense to not upload anything.

Documentation about Reprovider.Strategy

Interval

While you can choose what kind of data you want your node to relay as a part of the IPFS network, you can choose how often your node will publish the content of the data hold in its data store.

The default is 12 hours, meaning every 12 hours your node will publish the list of everything available for upload to the other peers. If you care about bandwidth and your content doesn't change often, you can increase this value, on the other hand if you may want to publish more often if your data store is rapidly changing.

If you don't want to publish your content, you can set it to "0", then you would still be able to publish it manually using the IPFS command line.

Documentation about Reprovider.Interval

Gateway management

If you want to provide your data over a public gateway, you may not want everyone to use this gateway to download IPFS content because of legal concerns, resource limits or you simply don't want that.

You can set Gateway.NoFetch to make your gateway to only distribute files available in the node data store. Meaning it will act as an http·s server for your own data but the gateway can't be used to get any other data. It's a convenient way to publish content over IPFS and make it available from a gateway you trust while keeping control over the data relayed.

Documentation about Gateway.NoFetch

Conclusion

There are many settings here for various use case. I'm running an IPFS node on a dedicated server but also another one at home and they have a very different configuration.

My home connection is limited to 900 kb/s which make IPFS very unfriendly to my ISP router and bandwidth usage.

Unfortunately, go-ipfs doesn't provide an easy way to set download and upload limit, that would be very useful.