In sed matching \d might not be what you would expect

A friend asked me the other day whether a certain “search and replace” operation over a credit card number could be done with sed: Given a number like 5105 1051 0510 5100, replace the first three components with something and leave the last one intact.

So my first take on this was:

# echo 5105 1051 0510 5100 | sed -e 's/^\([0-9]\{4\} \)\{3\}/lala /'
lala 5100

which works, but is not very legible. So here is a version that takes advantage of the -r flag (extended regular expressions), if your sed supports it:

# echo 5105 1051 0510 5100 | sed -re 's/^([[:digit:]]{4} ){3}/lala /' 
lala 5100

So my friend asked, why not use \d instead of [[:digit:]] (or even [0-9])?

# echo 5105 1051 0510 5100 | sed -re 's/^(\d{4} ){3}/lala /' 
5105 1051 0510 5100

Why does this not work? Because in GNU sed \d is not shorthand for a digit: \dNNN stands for the character whose decimal value is NNN. As the manual points out:

In addition, this version of sed supports several escape characters (some of which are multi-character) to insert non-printable characters in scripts (\a, \c, \d, \o, \r, \t, \v, \x). These can cause similar problems with scripts written for other seds.

There. I guess that is why I still do not make much use of the -r flag and prefer to escape parentheses when doing matches in sed.
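For comparison, here is a minimal Python sketch of the same substitution. Python's re module does treat \d as a digit class, which is probably where my friend's expectation came from. The function name mask_card is made up for this example.

```python
import re

# In Python's re module \d is a digit class, unlike GNU sed,
# where \dNNN stands for the character with decimal value NNN.
PATTERN = re.compile(r"^(\d{4} ){3}")


def mask_card(number, replacement="lala "):
    """Replace the first three 4-digit groups and keep the last one."""
    return PATTERN.sub(replacement, number)


if __name__ == "__main__":
    print(mask_card("5105 1051 0510 5100"))  # prints: lala 5100
```

Input that does not start with three groups of four digits is returned unchanged.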


Confessions of a Necromancer

Confessions of a Necromancer by Pieter Hintjens

I knew of Hintjens’s work (Xitami, ZeroMQ, etc) but not much more of him. The book popped up in a Slack I am a member of while discussing Torvalds’s decision to take a step back and work on himself.

Hintjens writes a technical memoir. At least that is the first part of the book. And because he writes about the era of computing I grew up in, I like it. He reminded me of technologies, tricks and methods I had long forgotten. I even learned new old stuff that I had never come across.

And then there is the second part of the book. The most important and most interesting one. What can I say about it? Not much, I am afraid. I can only declare my respect for his effort to document the process and his voyage towards the end. I envy his clarity, even though I cannot even begin to imagine what it cost to maintain it during the cancer treatment.

Highly touching.


PS: I am trying to see whether using Goodreads to write my thoughts on books I read is a thing that I like.

vagrant, ansible local and docker

This is a minor annoyance to people who want to work with docker on their vagrant boxes and provision them with the ansible_local provisioner.

To have docker installed in your box, you simply need to enable the docker provisioner in your Vagrantfile:

config.vm.provision "docker", run: "once"

Since you’re using the ansible_local provisioner, you might skip this and write a task that installs docker from wherever suits you, but I prefer this, as vagrant knows how best to install docker on the box.

Now obviously you can have the provisioner pull images for you, but suppose that, for whatever crazy reason, you want to hand most, if not all, of the provisioning to ansible. So you want to use, among others, the docker_image module, and you write something like:

- name: install python-docker
  become: true
  package:  # assumed: the distro-agnostic package module
    name: python-docker
    state: present

- name: install docker images
  docker_image:
    name: busybox

Well, this is going to greet you with an error message when you up the machine for the first time:

TASK [install docker images] ***************************************************
fatal: [default]: FAILED! => {"changed": false, "msg": "Error connecting: Error while fetching server API version: ('Connection aborted.', error(13, 'Permission denied'))"}
to retry, use: --limit @/vagrant/ansible.retry

Whereas when you happily run vagrant provision right away:

TASK [install docker images] ***************************************************
changed: [default]

Why does this happen? Because even though the installation of docker makes the vagrant user a member of the docker group, this only takes effect at the next login.
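If you want to see the difference for yourself, here is a small Python sketch that compares what the group database says with what the running process actually has. The function name group_state is made up for this example, and it only works on Unix-like systems.

```python
import grp
import os
import pwd


def group_state(group, user=None):
    """Return (listed_in_database, active_in_this_process) for a group."""
    if user is None:
        user = pwd.getpwuid(os.getuid()).pw_name
    try:
        entry = grp.getgrnam(group)
    except KeyError:
        # The group does not exist on this machine at all.
        return (False, False)
    # gr_mem lists supplementary members only, not users whose primary group this is.
    listed = user in entry.gr_mem
    # os.getgroups() reflects the groups that were fixed when the session started.
    active = entry.gr_gid in os.getgroups() or entry.gr_gid == os.getgid()
    return (listed, active)


if __name__ == "__main__":
    listed, active = group_state("docker")
    if listed and not active:
        print("in the docker group, but only after the next login")
    else:
        print("listed:", listed, "active:", active)
```

Run inside a session that was opened before docker was installed, I would expect it to report the user as listed but not yet active.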

The quickest way to bypass this is to run that part of your first ansible provisioning as the superuser:

- name: install docker images
  become: true
  docker_image:
    name: busybox

I am using the docker_image module only as an example here for lack of a better example with other docker modules on a Saturday morning. Pulling images is something that is of course very easy to do with the vagrant docker provisioner itself.

default: Running ansible-playbook…

PLAY [all] *********************************************************************

TASK [Gathering Facts] *********************************************************
ok: [default]

TASK [install python-docker] ***************************************************
changed: [default]

TASK [install docker images] ***************************************************
changed: [default]

PLAY RECAP *********************************************************************
default : ok=3 changed=2 unreachable=0 failed=0

Vagrant, ansible_local and pip

I try to provision my Vagrant boxes with the ansible_local provisioner. The other day I was using the pip ansible module while booting the box, but was getting errors while installing packages. It turns out that the pip version I had when I created the environment needed an upgrade. Sure, you can run a pip install pip --upgrade from the command line, but how do you do so within a playbook? Pretty easily, it seems:

- hosts: all
  tasks:
    - name: create the needed virtual environment and upgrade pip
      pip:
        chdir: /home/vagrant
        virtualenv: work
        virtualenv_command: /usr/bin/python3 -mvenv
        name: pip
        extra_args: --upgrade

    - name: now install the requirements
      pip:
        chdir: /home/vagrant
        virtualenv: work
        virtualenv_command: /usr/bin/python3 -mvenv
        requirements: /vagrant/requirements.txt

(Link to pastebin here in case the YAML above does not render correctly for you.)
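For those who prefer to see the steps spelled out, here is a rough Python sketch of the commands the two tasks above boil down to. This is my reading of it, not what the pip module literally runs, and the function names are made up for this example.

```python
import subprocess
import sys
from pathlib import Path


def provisioning_commands(venv_dir, requirements):
    """Return, in order, the commands that mirror the two playbook tasks."""
    python = str(Path(venv_dir) / "bin" / "python")
    return [
        # create the virtual environment
        [sys.executable, "-m", "venv", str(venv_dir)],
        # upgrade pip inside it before anything else
        [python, "-m", "pip", "install", "--upgrade", "pip"],
        # only then install the requirements
        [python, "-m", "pip", "install", "-r", str(requirements)],
    ]


def provision(venv_dir, requirements):
    """Run the commands, stopping at the first failure."""
    for command in provisioning_commands(venv_dir, requirements):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    provision("/home/vagrant/work", "/vagrant/requirements.txt")
```

The order is the whole point: pip gets upgraded in its own step, so the requirements are installed by the new version.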

I hope it helps you too.

Happy sysadmin day

Hello and happy SysAdmin day. The baby swing below is from 2011. While at rest it looks like a safe swing, it is not. The chains latch too close to the middle and it is very easy for the seat to revolve around a second horizontal axis while swinging. You can understand how I know.

It is SysAdmin day today. We make sure the chains latch properly so your software runs without extra revolutions.

You’re welcome :)

baby swing


unbound, python and conditional replies based on source IP address

We’re using unbound internally for DNS resolution. It works smoothly and allows for some DNS tricks when you want a bit of split-brain trickery without a complete split-brain deployment. The other day we needed to send out conditional replies based on the IP address of the querying machine. Unbound comes with a python module, but it has some of the weirdest, most unhelpful documentation ever. I am not alone in believing this.

It is very hard to figure out the source IP address of a DNS query using the unbound python library. My first pointer on how to do so was on ServerFault.  I have uploaded my own version of an operate function at pastebin. The code in question that you need to consider is:

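# Find out source IP address
rl = qstate.mesh_info.reply_list
while rl:
  if rl.query_reply:
    q = rl.query_reply
  # reply_list is a linked list, so move on to the next entry
  rl = rl.next

# Careful with this conditional: q may never have been set, or may lack addr
try:
  addr = q.addr
except (NameError, AttributeError):
  addr = None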

The try … except handling is needed because I found out that sometimes the q.addr may not be defined and thus further down the line you may be bitten by an abnormal exit of your script.
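To make the loop easier to play with outside of unbound, here is a self-contained Python sketch that does the same walk over stand-in objects. The SimpleNamespace objects are fakes that only mimic the attribute names used above (query_reply, addr, next). They are not the real unbound structures, and client_address is a made-up name. Note that this version returns at the first entry that has a query_reply.

```python
from types import SimpleNamespace


def client_address(reply_list):
    """Walk a linked list of replies and return the first client address, or None."""
    node = reply_list
    while node is not None:
        reply = getattr(node, "query_reply", None)
        if reply is not None:
            # addr may be missing, so fall back to None instead of raising.
            return getattr(reply, "addr", None)
        node = getattr(node, "next", None)
    return None


if __name__ == "__main__":
    # Stand-in objects, NOT the real unbound structures.
    tail = SimpleNamespace(query_reply=SimpleNamespace(addr="192.0.2.10"), next=None)
    head = SimpleNamespace(query_reply=None, next=tail)
    print(client_address(head))  # prints: 192.0.2.10
```

Using getattr with a default keeps the "not defined" case from turning into an exception further down the line.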

Update: two friends have suggested that I replace the while loop with a more Pythonic generator expression:

q = next((x for x in qstate.mesh_info.reply_list if x.query_reply), None)
try: addr = q.query_reply.addr
except AttributeError: addr = None

One of them actually has a pretty cool pastebin about it.
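Once you have the source address, the conditional part is plain Python. Here is a minimal sketch using the standard ipaddress module. The networks and the addresses returned are made up for illustration, and answer_for is a hypothetical helper, not something the unbound module provides.

```python
import ipaddress

# Hypothetical policy: these networks and answers are made up for illustration.
INTERNAL_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def answer_for(addr, internal="10.1.2.3", external="203.0.113.5"):
    """Pick the address to answer with, based on the client's source IP."""
    if addr is None:
        # No source address could be determined: be conservative.
        return external
    try:
        client = ipaddress.ip_address(addr)
    except ValueError:
        # Not a parsable IP address.
        return external
    if any(client in network for network in INTERNAL_NETWORKS):
        return internal
    return external


if __name__ == "__main__":
    print(answer_for("10.20.30.40"))  # prints: 10.1.2.3
    print(answer_for("8.8.8.8"))      # prints: 203.0.113.5
```

Falling back to the external answer whenever the address is missing or unparsable seemed the safer default to me.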

Your first steps installing Graylog

A new colleague needed some help to set up a Graylog installation. He had never done this before, so he asked for assistance. What follows is a rehash of an email I sent him on how to proceed and build knowledge on the subject:

So initially I had zero knowledge of Graylog. What I did to familiarize myself with it was to download an OVA file with a prepared virtual machine and run it via VMware Fusion. The same VM can also be imported to VirtualBox and even to AWS, although they also provide ready AMIs for deployment in AWS.
Keep in mind that this is a full installation of what Graylog needs to work with and it also comes with a handy little script named “graylog-ctl” that manipulates a lot of configuration. The big catch is that graylog-ctl is not part of any standard Graylog deployment. It only comes with the OVA and the AMI images.
So after I had some fun with it on a VM on my workstation, reading the documentation and testing stuff, I had an initial deployment of the AMI image in AWS. But this is not an installation that can scale.  Which brings us to the next steps:
  • For Graylog to work you need to provide it with a MongoDB and an ElasticSearch database. It is your choice whether these will be clustered for high availability or not, whether they will run in the same machine or not. You control the complete architecture. So in my case I made the following decisions:
  • I am running a MongoDB replica set using three VMs. This is a standard setup as it is described in the MongoDB online documentation. Since it is not password protected, it only accepts connections from the Graylog instance. I used AWS security groups for that.
  • I am using an ElasticSearch cluster with three VMs where the nodes are both data and masters. If you can, use seven nodes: three masters (lower-end machines, since they do not run queries and do not index any data) and four data nodes (higher-end machines). Again, since this is not password protected, I used AWS security groups to allow access only from the Graylog instance.
  • I am running a single Graylog instance on a separate VM. Currently it only listens for syslog stuff.  When the need arises, I will add two more nodes to increase the availability.  I think I changed as many as four or five lines in the main configuration file. Graylog uses MongoDB to store its configuration, which includes anything you configure via the web interface.
  • Pay extra attention to the versions of ElasticSearch and MongoDB that your Graylog version requires. Use exactly what is mentioned in the documentation. For example in my case I am not running ES 6.x but the latest 5.x.
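Since the instance only listens for syslog, a quick way to check that messages arrive is to send one yourself. Here is a minimal Python sketch using the standard library's SysLogHandler over UDP. The hostname and port below are placeholders, so use whatever your own syslog input listens on, and syslog_logger is a made-up name.

```python
import logging
import logging.handlers


def syslog_logger(host, port=514, name="graylog-smoke-test"):
    """Return a logger that ships records to a syslog UDP input."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # SysLogHandler sends over UDP unless told otherwise.
    handler = logging.handlers.SysLogHandler(address=(host, port))
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    # graylog.example.internal and 1514 are placeholders, not real settings.
    log = syslog_logger("graylog.example.internal", 1514)
    log.info("hello from the first week of testing")
```

If the message shows up in a search a few seconds later, the input and the path to it are working.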
Now it is time to up your game. Once you see that your installation is working you have to decide whether to password protect access to MongoDB and ElasticSearch and whether to encrypt traffic between all those instances or not. I say give it a go.
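Before you start tightening things, it can help to confirm who can actually reach the backends. Here is a small Python sketch that tries a TCP connection to each one. The ports are the usual defaults (27017 for MongoDB, 9200 for ElasticSearch over HTTP), the hostnames are placeholders, and port_open is a made-up helper.

```python
import socket

# Default ports: MongoDB 27017, ElasticSearch HTTP 9200. Hostnames are placeholders.
BACKENDS = {
    "mongodb": ("mongo1.example.internal", 27017),
    "elasticsearch": ("es1.example.internal", 9200),
}


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or the name did not resolve.
        return False


if __name__ == "__main__":
    for name, (host, port) in BACKENDS.items():
        state = "reachable" if port_open(host, port) else "NOT reachable"
        print(name, state)
```

Run from the Graylog instance, both should be reachable. Run from any other machine, neither should be, if the security groups do what you think they do.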
I’ve not even touched issues like database management for Mongo and Elastic, backing them up, restoring, deleting indices, etc., because this is a post about going from zero to your first week of testing Graylog.  There is plenty of stuff out there to take you to the next level, once you get used to the complexity of the software involved.
Should you need any more help, ping me anytime.