IPv6 anonymization portability problems

IPv6 addresses are 128 bits long, which allows for far more addresses than the 32 bits of IPv4.

However, this width also causes problems when working with IPv6 addresses in code. I ran into them while working on an IPv6 anonymization function for rsyslog.

Some systems support an unsigned 128-bit integer type when using the GCC or clang compilers. However, many systems do not support this data type.

Since rsyslog tries to cater to as many systems as possible, the implementation has to work on all platforms. As such, we have to use two unsigned 64-bit integers instead of one 128-bit integer.
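To illustrate the two-integer approach, here is a minimal sketch and not the actual rsyslog code. The struct ipv6_addr and the function ipv6_zero_low_bits are hypothetical names made up for this example. The sketch zeroes the lowest bits of an address that is stored as two 64-bit halves:

```c
#include <stdint.h>

/* Hypothetical type for this sketch: a 128-bit IPv6 address
 * split into two unsigned 64-bit halves. */
struct ipv6_addr {
    uint64_t high; /* bits 127..64 */
    uint64_t low;  /* bits 63..0 */
};

/* Zero the lowest `bits` bits of the address (values above 128
 * are treated like 128). Shifts are always kept below 64, because
 * shifting a 64-bit value by 64 or more is undefined in C. */
static void ipv6_zero_low_bits(struct ipv6_addr *addr, unsigned bits)
{
    if (bits >= 128) {
        addr->high = 0;
        addr->low = 0;
    } else if (bits >= 64) {
        /* The whole low half is cleared, plus part of the high half. */
        addr->low = 0;
        if (bits > 64)
            addr->high &= ~UINT64_C(0) << (bits - 64);
    } else if (bits > 0) {
        /* Only part of the low half is affected. */
        addr->low &= ~UINT64_C(0) << bits;
    }
}
```

The branch on 64 is the price of portability: every operation that crosses the boundary between the two halves has to be handled by hand.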

The main problem this causes is that 128-bit operations built from two 64-bit integers are implemented at the software level, as opposed to the hardware level that a native 128-bit integer is implemented on. This creates a conflict between portability and speed, since a software-level implementation is slower than a hardware-level one.

I have decided to make the first implementation as portable as possible, but I might try to speed up the anonymization later. This might be possible by checking whether the system supports a 128-bit integer and, if it does, using it instead of two 64-bit integers.
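One way such a check could look is sketched below. It assumes the detection happens at compile time through the __SIZEOF_INT128__ macro, which GCC and clang predefine when the unsigned __int128 type is available. The struct and function names are hypothetical, and this is not how rsyslog necessarily does or will do it:

```c
#include <stdint.h>

/* Hypothetical type for this sketch: an IPv6 address as two halves. */
struct ipv6_addr {
    uint64_t high; /* bits 127..64 */
    uint64_t low;  /* bits 63..0 */
};

/* Zero the lowest `bits` bits of the address, using a native 128-bit
 * integer where the compiler offers one and two 64-bit integers
 * everywhere else. */
static void ipv6_zero_low_bits(struct ipv6_addr *addr, unsigned bits)
{
    if (bits > 128)
        bits = 128;
#ifdef __SIZEOF_INT128__
    /* Native path: combine the halves, mask once, split again. */
    unsigned __int128 v = ((unsigned __int128)addr->high << 64) | addr->low;
    v = (bits == 128) ? 0 : v & (~(unsigned __int128)0 << bits);
    addr->high = (uint64_t)(v >> 64);
    addr->low = (uint64_t)v;
#else
    /* Portable path: handle each 64-bit half separately. */
    if (bits >= 64) {
        addr->low = 0;
        addr->high = (bits == 128)
            ? 0
            : addr->high & (~UINT64_C(0) << (bits - 64));
    } else {
        addr->low &= ~UINT64_C(0) << bits;
    }
#endif
}
```

Both paths give the same result, so callers do not need to know which one was compiled in. Whether the native path is actually faster would still have to be measured.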


Rewriting mmanon: an update

I have completed the rewrite of the IPv4-function of the mmanon module.
I have managed to keep the original parameters available, although the "rewrite" option is now called "zero". This will not break older configurations that are already in use, as the old parameter names still work. However, having the "ipv4." prefix attached to the parameters for IPv4 anonymization is going to improve clarity once the IPv6 feature follows.

But what's new? I have added the option to randomize bits in an IP address as a form of anonymization. It is also now possible to anonymize IP addresses randomly while still mapping any given IP address to the same (randomly generated) alias every time.
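The idea behind randomizing bits can be sketched as follows. This is a simplified illustration and not the mmanon code: the function name ipv4_randomize_low_bits is made up, and the random value is passed in by the caller so that the example stays deterministic.

```c
#include <stdint.h>

/* Replace the lowest `bits` bits of an IPv4 address (held as a
 * 32-bit integer) with the corresponding bits of `rnd`. The upper
 * bits of the address are kept unchanged. */
static uint32_t ipv4_randomize_low_bits(uint32_t addr, unsigned bits,
                                        uint32_t rnd)
{
    uint32_t mask;

    if (bits == 0)
        return addr;
    /* Avoid shifting by 32, which is undefined for a 32-bit value. */
    mask = (bits >= 32) ? UINT32_MAX : ((UINT32_C(1) << bits) - 1);
    return (addr & ~mask) | (rnd & mask);
}
```

For example, with the address 192.168.1.66 (0xC0A80142), 8 bits and the random value 0xDEADBEEF, the result is 0xC0A801EF, that is 192.168.1.239. For the consistent variant, one would additionally have to remember which alias was generated for which original address, for instance in a lookup table, so that the same input always produces the same output.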

Next, I plan to implement IPv6 functionality for the mmanon module, with parameters similar to the IPv4 ones. After that, I might add other features, such as the ability to configure other separators for IP addresses (for example '-') or an option to reverse the direction of anonymization.


Rewriting mmanon

Currently, rsyslog's mmanon module has the task of anonymizing IP addresses. However, since it is only able to anonymize IPv4 addresses, I decided to overhaul the module so that it can also work with IPv6 addresses (see this feature request). In doing this, I also noticed some bugs in the IPv4 part of the module.

Now there are two options: try to fix these bugs or rewrite the function. While it might seem like more work, I have decided to rewrite the function, since I was already planning to add new configuration options, such as an option to randomize IP addresses.

I have already worked on a similar tool for liblognorm which has never been released due to a lack of time on my part. An UNFINISHED VERSION of this work is available in my private repository.

Since that tool already has some of the options I plan to implement, I will reuse it as part of the new mmanon IPv4 anonymization.

But what about the current options? I will try to implement as many as possible in the initial rewrite. However, some may only be reintroduced later, since I also plan to start on the IPv6 anonymization as soon as possible. I think that not too many people were using the module and, consequently, its options. At least, searching the GitHub issue tracker for mmanon did not bring up many issues, and most of them are not really related to mmanon. There are also only a few questions regarding mmanon on the mailing list. I think this supports my point, since some of the bugs affect major parts of the function and would likely have been reported by regular users.

If you have used or are currently using the module, I would be glad if you could tell me what configurations you are using and what is important to you when using the module. I have created a GitHub issue tracker for this purpose.


Improving rsyslog debug output