Wasting time with gawk while parsing lsof output

So I wanted to parse lsof, to see on what ports was a machine accepting connections. Normally one would write something like:

# lsof -Pn -i | grep LISTEN | awk '{print $9}' | cut -d: -f2 | sort -n | uniq
22
111
6066
7011
7015
7077
8080
10050
35735
37480
39118
44262
44444
52539

You get a sorted list of the open ports and are done with it. But why invoke four different programs to do extraction and sorting, when gawk is a complete programming language? Yes it is possible to do it with gawk in one go (and learn something in the process):

# lsof -Pn -i | awk '/LISTEN/ { split($9, a, ":"); b[a[2]] = 1; } END { n = asorti(b, c, "@ind_num_asc"); for (i = 1; i <= n; i++) { print c[i]; } }'
22
111
6066
7011
7015
7077
8080
10050
35735
37480
39118
44262
44444
52539

The /LISTEN/ effectively greps the lsof output for lines containing LISTEN and executes on them the code in curly braces to its right. Which splits the 9th column into an array using : as a delimiter. In awk arrays are indexed from 1 and the indices are strings (make a note of that).

END is a special match that executes the code in curly braces to its right after we’ve finished reading the input data. So, here is where the printing is done. Using the asorti() function we obtain a new array, indexed based on the values of the indices. We use @ind_num_asc to ensure that the order is 1, 5, 10, 15 and not 1, 10, 15, 5 as it would, should the indices be treated as strings. Finally, we can print the elements from the new array.

This would not be easily possible with awk / nawk, because as the gawk manual says:

In most awk implementations, sorting an array requires writing a sort() function. This can be educational for exploring different sorting algorithms, but usually that’s not the point of the program. gawk provides the built-in asort() and asorti() functions.

Somehow this reminds me of Knuth vs McIlory but of course I am neither.

Advertisements

fizzbuzz

Years ago, while browsing the original wiki site, I stumbled upon the fizzbuzz test:

“Write a program that prints the numbers from 1 to 100. But for multiples of three print “Fizz” instead of the number and for the multiples of five print “Buzz”. For numbers which are multiples of both three and five print “FizzBuzz”.”

Imagine my surprise when reading that it was designed to make most programming candidates fail the test. From time to time I have the fine opportunity to introduce people who have never coded before in their lives to Python. Within the first two hours they have managed to produce a fizzbuzz version that looks like this:

for i in range(1, 101):
  if i % 3 == 0 and i % 5 == 0:
    print("fizzbuzz")
  elif i % 3 == 0:
    print("fizz")
  elif i % 5 == 0:
    print("buzz")
  else:
    print(i)

I like the myth that 99.5% of candidates fail fizzbuzz. I tell students that they now can celebrate their achievement. But this is not where I stop with the test. I can now tell them about functions and then have them make one like:

def fizzbuzz(start, stop):
  :

where I have them modify their code above to make it a function. And afterwards they can learn about named parameters with default values:

def fizzbuzz(start=1, stop=100, three=3, five=5):
  :

Notice above how one can change the values of 3 and 5 from the original test and try a different pair. And yes, I leave the functions as exercises to the reader :)

But the best sparks in their eyes come when they remember that I had taught them certain properties of strings some three hours ago, like:

>>> 'fizz' + 'buzz'
'fizzbuzz'
>>> 'fizz' * 1
'fizz'
>>> 'fizz' * True
'fizz'
>>> 'fizz' * 0
''
>>> 'fizz' * False
''
>>> 'fizz' + ''
'fizz'
>>>

So their first observation that 'fizz' + 'buzz' equals 'fizzbuzz' is followed by what is summed up by the table:

String Sum of strings Expanded sum
‘fizzbuzz’ ‘fizz’ + ‘buzz’ ‘fizz’ * True + ‘buzz’ * True
‘fizz’ ‘fizz’ + ” ‘fizz’ * True + ‘buzz’ * False
‘buzz’ ” + ‘fizz’ ‘fizz’ * False + ‘buzz’ * True
” + ” ‘fizz’ * False + ‘buzz’ * False

Which makes them write something like that in the end:

def fizzbuzz(x, y):
  return 'fizz' * x + 'buzz' * y

for i in range(1, 101):
  print(fizzbuzz(i % 3 == 0, i % 5 == 0) or i)

How many things one can learn within a day starting from zero using the humble (and sometimes humbling) fizzbuzz.

A handy configuration snippet that I am using with the nginx ingress controller

One of the most common ways to implement Ingress on Kubernetes is the nginx ingress controller. The nginx ingress controller is configured via annotations that modify the default behavior of the controller. That way for example by using the configuration snipper you can add to the controller nginx directives that would go to a location block on a normal nginx.

In fact whenever I am spinning up an nginx ingress I now always add the following annotation:

nginx.ingress.kubernetes.io/configuration-snippet: #deny all;

Whenever I need for some emergency reason or whatever to block incoming traffic to the served site, I can do it immediately with kubectl edit ingress and simply uncommenting the hash, rather than googling that time for the specific annotation name.

PS: If you want to define a whitelist properly, it is best that you use nginx.ingress.kubernetes.io/whitelist-source-range.

Happy Sysadmin Day.

Well, it is that day of the year again. I remembered that I used to run a greek translation of the site once upon a time. And I just saw that someone else has picked up the torch. Good.

I did some Rancher2 work today. Somewhat fulfilling I can say. That magic when things you do fall into place and you keep your people happy.

We work on top of layers, on top of layer, on top of yet more layers of abstraction. The machine is so far away that we may not even need to know the details of it to make it work (“The line is a dot to you” like Joey said). Nor do our horror stories look like this. Yet while they are equally epic, they’re not just as legendary.

[Each year I keep saying to myself I won’t write a blog post, but always the next day I do, I write one and change the date to match the Day… ]

Sometimes you need JDK8 in your docker image …

… and depending the base image this may not be available, even from the ppa repositories. Moreover, you cannot download it directly from Oracle without a username / password anymore, so this may break automation, if you do not host your own repositories. Have no fear, sdkman can come to the rescue and fetch for you a supported JDK8 version. You need to pay some extra care with JAVA_HOME and PATH though:

FROM python:3.6-slim
RUN apt-get update && apt-get upgrade -y
RUN apt-get install -y curl zip unzip
RUN curl -fsSL https://get.sdkman.io -o get-sdkman.sh
RUN sh ./get-sdkman.sh
RUN bash -c "source /root/.sdkman/bin/sdkman-init.sh && sdk install java 8.0.212-amzn"
ENV JAVA_HOME=/root/.sdkman/candidates/java/current
ENV PATH=$JAVA_HOME/bin:$PATH
CMD bash

Since curl | sudo bash is a topic of contention, do your homework and decide whether you’re going to use it as displayed in the Dockerfile above, or employ some extra verification mechanism of your choice.

minikube instead of docker-machine

docker-machine is a cool project that allows you to work with different docker environments, especially when your machine or OS (like Windows 10 Home) does not allow for native docker. My primary use case for docker-machine was spinning up a VirtualBox host and using it to run the docker service.

It seems that docker-machine is not in so much active development any more and I decided to switch to another project for my particular use case: Windows 10 Home with VirtualBox. One solution is minikube:

minikube start
minikube docker-env | Invoke-Expression # Windows PowerShell
eval $(minikube docker-env) # bash shell
docker pull python:3
docker images

and you’re set to go.