Kafka, dotnet and SASL_SSL

This is similar to my previous post, only now the question is: how do you connect to a Kafka server using dotnet and SASL_SSL? Like this:

// based on https://github.com/confluentinc/confluent-kafka-dotnet/blob/v1.0.0/examples/Producer/Program.cs

using Confluent.Kafka;
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;
using System.Collections.Generic;


namespace Confluent.Kafka.Examples.ProducerExample
{
    public class Program
    {
        public static async Task Main(string[] args)
        {
            string topicName = "test-topic";

            var config = new ProducerConfig {
                BootstrapServers = "kafka-server.example.com:19094",
                SecurityProtocol = SecurityProtocol.SaslSsl,
                SslCaLocation = "ca-cert",
                SaslMechanism = SaslMechanism.Plain,
                SaslUsername = "USERNAME",
                SaslPassword = "PASSWORD",
                Acks = Acks.Leader,
                CompressionType = CompressionType.Lz4,
            };

            using (var producer = new ProducerBuilder<string, string>(config).Build())
            {
                for (int i = 0; i < 1000000; i++)
                {
                    var message = $"Event {i}";

                    try
                    {
                        // Note: Awaiting the asynchronous produce request
                        // below prevents flow of execution from proceeding
                        // until the acknowledgement from the broker is
                        // received (at the expense of low throughput).

                        var deliveryReport = await producer.ProduceAsync(topicName, new Message<string, string> { Key = null, Value = message } );
                        // Console.WriteLine($"delivered to: {deliveryReport.TopicPartitionOffset}");

                        // Let's not await then
                        // producer.ProduceAsync(topicName, new Message<string, string> { Key = null, Value = message } );
                        // Console.WriteLine($"Event {i} sent.");
                    }
                    catch (ProduceException<string, string> e)
                    {
                        Console.WriteLine($"failed to deliver message: {e.Message} [{e.Error.Code}]");
                    }
                }

                // producer.Flush(TimeSpan.FromSeconds(120));

                // Since we are producing synchronously, at this point there will be no messages
                // in-flight and no delivery reports waiting to be acknowledged, so there is no
                // need to call producer.Flush before disposing the producer.
            }
        }
    }
}

Since I am a total .NET newbie, I usually docker run -it --rm microsoft/dotnet and experiment from there.
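
For comparison, the ProducerConfig properties above correspond to plain librdkafka setting names, which other clients built on librdkafka accept directly. Below is a minimal Python sketch of that mapping. The helper names sasl_ssl_config and send_one are made up for illustration, the host, CA file and credentials are the placeholders from the listing above, and send_one assumes the confluent-kafka Python package is installed and a broker is reachable.

```python
def sasl_ssl_config(bootstrap_servers, username, password, ca_location):
    """Return librdkafka-style settings mirroring the ProducerConfig above.

    Hypothetical helper, written only to show the property names.
    """
    return {
        "bootstrap.servers": bootstrap_servers,
        "security.protocol": "SASL_SSL",
        "ssl.ca.location": ca_location,
        "sasl.mechanisms": "PLAIN",
        "sasl.username": username,
        "sasl.password": password,
        "acks": 1,                  # same meaning as Acks.Leader
        "compression.type": "lz4",  # same meaning as CompressionType.Lz4
    }


def send_one(config, topic, value):
    """Send a single message and wait for it to be delivered.

    Assumption: the confluent-kafka package is installed; it is imported
    here so that building the config does not require it.
    """
    from confluent_kafka import Producer

    producer = Producer(config)
    producer.produce(topic, value=value)
    producer.flush(10)  # wait up to 10 seconds for delivery


config = sasl_ssl_config(
    "kafka-server.example.com:19094", "USERNAME", "PASSWORD", "ca-cert"
)
# With a real broker and credentials one would then call:
# send_one(config, "test-topic", "Event 0")
```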

Kafka, PHP and SASL_SSL

When you want to connect to a Kafka cluster from PHP, there are numerous examples showing how to use php-rdkafka, but they all connect unauthenticated. What happens when you need to let a customer connect to a Kafka setup and IP whitelisting is not enough? There is not much easy-to-find information out there.

Why not correct this by combing through various web pages and the librdkafka source code:

<?php

$conf = new RdKafka\Conf();
$conf->set('security.protocol', 'SASL_SSL');
$conf->set('sasl.mechanisms', 'PLAIN');
$conf->set('sasl.username', 'USERNAME_HERE');
$conf->set('sasl.password', 'PASSWORD_HERE');
$conf->set('ssl.ca.location', '/usr/local/etc/ca-cert.pem');
$conf->set('ssl.cipher.suites', 'TLSv1.2');

$rk = new RdKafka\Producer($conf);
$rk->addBrokers("SASL_SSL://kafka-1.example.com:19094");
$rk->addBrokers("SASL_SSL://kafka-2.example.com:19094");
$rk->addBrokers("SASL_SSL://kafka-3.example.com:19094");

$topic = $rk->newTopic("kafka-test-topic");

for ($i = 0; $i < 10; $i++) {
    $topic->produce(RD_KAFKA_PARTITION_UA, 0, "Message $i");
    $rk->poll(0);
}

while ($rk->getOutQLen() > 0) {
    $rk->poll(50);
}

?>

Still, this may not be enough if your Kafka server is on OpenSSL-1.0.2 (CentOS 7, for example) and your PHP client is on OpenSSL-1.1.0 (like the php:7.2-cli docker image). In that case you need to edit your client’s openssl.cnf and comment out the following line:

;CipherString = DEFAULT@SECLEVEL=2

Wasting time with gawk while parsing lsof output

So I wanted to parse lsof output, to see which ports a machine was accepting connections on. Normally one would write something like:

# lsof -Pn -i | grep LISTEN | awk '{print $9}' | cut -d: -f2 | sort -n | uniq
22
111
6066
7011
7015
7077
8080
10050
35735
37480
39118
44262
44444
52539

You get a sorted list of the open ports and are done with it. But why invoke five different programs (grep, awk, cut, sort and uniq) to do extraction and sorting, when gawk is a complete programming language? Yes, it is possible to do it with gawk in one go (and learn something in the process):

# lsof -Pn -i | awk '/LISTEN/ { split($9, a, ":"); b[a[2]] = 1; } END { n = asorti(b, c, "@ind_num_asc"); for (i = 1; i <= n; i++) { print c[i]; } }'
22
111
6066
7011
7015
7077
8080
10050
35735
37480
39118
44262
44444
52539

The /LISTEN/ effectively greps the lsof output for lines containing LISTEN and executes on them the code in curly braces to its right. That code splits the 9th column into the array a using : as a delimiter, and then uses the port number, a[2], as an index into the array b. Since an index can exist only once, duplicates disappear for free. split() numbers the elements from 1, and array indices in awk are always strings (make a note of that).

END is a special pattern whose code in curly braces runs after we’ve finished reading the input data. So, here is where the printing is done. The asorti() function fills a new array, c, with the indices of b in sorted order and returns how many there are. We pass "@ind_num_asc" so that the indices are compared as numbers and the order is 1, 5, 10, 15 and not 1, 10, 15, 5, as it would be if they were compared as strings. Finally, we can print the elements of the new array.
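
For readers who think more easily in Python, the same logic can be sketched as below. This is a comparison, not a replacement for the one-liner: the function name listening_ports and the sample lsof lines are made up for illustration. The set plays the role of the array b, and sorted(..., key=int) does what "@ind_num_asc" does in gawk.

```python
def listening_ports(lsof_lines):
    """Return the unique ports of LISTEN sockets, sorted numerically."""
    ports = set()                        # like b[...] = 1: duplicates collapse
    for line in lsof_lines:
        if "LISTEN" not in line:         # like the /LISTEN/ pattern
            continue
        name = line.split()[8]           # 9th column, e.g. "*:22"
        ports.add(name.split(":")[1])    # like split($9, a, ":") and a[2]
    return sorted(ports, key=int)        # numeric order, ports stay strings


# Hypothetical lsof -Pn -i output, trimmed to a handful of lines
sample = [
    "COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME",
    "sshd 900 root 3u IPv4 11111 0t0 TCP *:22 (LISTEN)",
    "java 1200 app 45u IPv4 22222 0t0 TCP *:8080 (LISTEN)",
    "java 1200 app 46u IPv4 22223 0t0 TCP 10.0.0.5:8080->10.0.0.9:51000 (ESTABLISHED)",
    "rpcbind 700 rpc 8u IPv4 33333 0t0 TCP *:111 (LISTEN)",
    "zabbix 800 zbx 4u IPv4 44444 0t0 TCP *:10050 (LISTEN)",
    "java 1201 app 45u IPv4 22229 0t0 TCP *:8080 (LISTEN)",
]

numeric_order = listening_ports(sample)
# Sorting the same strings without key=int shows the problem that
# "@ind_num_asc" avoids: "10050" would sort before "111" and "22".
string_order = sorted(set(numeric_order))

for port in numeric_order:
    print(port)
```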

This would not be easily possible with awk / nawk, because as the gawk manual says:

In most awk implementations, sorting an array requires writing a sort() function. This can be educational for exploring different sorting algorithms, but usually that’s not the point of the program. gawk provides the built-in asort() and asorti() functions.

Somehow this reminds me of Knuth vs McIlroy but of course I am neither.

fizzbuzz

Years ago, while browsing the original wiki site, I stumbled upon the fizzbuzz test:

“Write a program that prints the numbers from 1 to 100. But for multiples of three print “Fizz” instead of the number and for the multiples of five print “Buzz”. For numbers which are multiples of both three and five print “FizzBuzz”.”

Imagine my surprise when reading that it was designed to make most programming candidates fail the test. From time to time I have the fine opportunity to introduce people who have never coded before in their lives to Python. Within the first two hours they have managed to produce a fizzbuzz version that looks like this:

for i in range(1, 101):
  if i % 3 == 0 and i % 5 == 0:
    print("fizzbuzz")
  elif i % 3 == 0:
    print("fizz")
  elif i % 5 == 0:
    print("buzz")
  else:
    print(i)

I like the myth that 99.5% of candidates fail fizzbuzz. I tell students that they can now celebrate their achievement. But this is not where I stop with the test. I can now tell them about functions and then have them make one like:

def fizzbuzz(start, stop):
  :

where I have them modify their code above to make it a function. And afterwards they can learn about named parameters with default values:

def fizzbuzz(start=1, stop=100, three=3, five=5):
  :

Notice above how one can change the values of 3 and 5 from the original test and try a different pair. And yes, I leave the functions as exercises to the reader :)
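
For readers who want to compare notes after trying the exercise, here is one possible solution. It is only a sketch under a few assumptions of mine: stop is treated as inclusive, so that the defaults reproduce the original 1 to 100 test, and the function returns a list of strings instead of printing, which makes it easier to check.

```python
def fizzbuzz(start=1, stop=100, three=3, five=5):
    """Return the fizzbuzz lines for start..stop (stop included)."""
    lines = []
    for i in range(start, stop + 1):
        if i % three == 0 and i % five == 0:
            lines.append("fizzbuzz")
        elif i % three == 0:
            lines.append("fizz")
        elif i % five == 0:
            lines.append("buzz")
        else:
            lines.append(str(i))
    return lines


# The classic test, printed one entry per line
print("\n".join(fizzbuzz()))

# A different pair, passed as named parameters
print(fizzbuzz(stop=4, three=2, five=4))
```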

But the best sparks in their eyes come when they remember that I had taught them certain properties of strings some three hours ago, like:

>>> 'fizz' + 'buzz'
'fizzbuzz'
>>> 'fizz' * 1
'fizz'
>>> 'fizz' * True
'fizz'
>>> 'fizz' * 0
''
>>> 'fizz' * False
''
>>> 'fizz' + ''
'fizz'
>>>

So their first observation that 'fizz' + 'buzz' equals 'fizzbuzz' is followed by what is summed up by the table:

String       Sum of strings    Expanded sum
'fizzbuzz'   'fizz' + 'buzz'   'fizz' * True + 'buzz' * True
'fizz'       'fizz' + ''       'fizz' * True + 'buzz' * False
'buzz'       '' + 'buzz'       'fizz' * False + 'buzz' * True
''           '' + ''           'fizz' * False + 'buzz' * False

Which makes them write something like this in the end:

def fizzbuzz(x, y):
  return 'fizz' * x + 'buzz' * y

for i in range(1, 101):
  print(fizzbuzz(i % 3 == 0, i % 5 == 0) or i)
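
The trailing "or i" works because an empty string is falsy in Python, so when neither condition holds the expression falls through to the number itself. A quick way to convince yourself that the short version behaves like the if/elif one is to compare them over the whole range. The function names with_branches and with_strings below are mine, chosen just for this check.

```python
def with_branches(i):
    """The if/elif version, returning the value instead of printing it."""
    if i % 3 == 0 and i % 5 == 0:
        return "fizzbuzz"
    elif i % 3 == 0:
        return "fizz"
    elif i % 5 == 0:
        return "buzz"
    return i


def with_strings(i):
    """The string-multiplication version; '' is falsy, so 'or i' kicks in."""
    return "fizz" * (i % 3 == 0) + "buzz" * (i % 5 == 0) or i


# Collect every number from 1 to 100 where the two versions disagree
mismatches = [i for i in range(1, 101) if with_branches(i) != with_strings(i)]
print(mismatches)
```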

How many things one can learn within a day starting from zero using the humble (and sometimes humbling) fizzbuzz.

A handy configuration snippet that I am using with the nginx ingress controller

One of the most common ways to implement Ingress on Kubernetes is the nginx ingress controller. It is configured via annotations that modify its default behavior. For example, with the configuration-snippet annotation you can add to the controller nginx directives that would go into a location block on a normal nginx.

In fact whenever I am spinning up an nginx ingress I now always add the following annotation:

nginx.ingress.kubernetes.io/configuration-snippet: "#deny all;"

Whenever I need, for some emergency reason or other, to block incoming traffic to the served site, I can do it immediately with kubectl edit ingress by simply removing the hash, rather than googling for the specific annotation name at that moment.

PS: If you want to define a whitelist properly, it is best that you use nginx.ingress.kubernetes.io/whitelist-source-range.

Happy Sysadmin Day.

Well, it is that day of the year again. I remembered that I used to run a Greek translation of the site once upon a time. And I just saw that someone else has picked up the torch. Good.

I did some Rancher2 work today. Somewhat fulfilling I can say. That magic when things you do fall into place and you keep your people happy.

We work on top of layers, on top of layers, on top of yet more layers of abstraction. The machine is so far away that we may not even need to know the details of it to make it work (“The line is a dot to you” like Joey said). Nor do our horror stories look like this. Yet while they are equally epic, they are not quite as legendary.

[Each year I keep saying to myself I won’t write a blog post, but always the next day I do, I write one and change the date to match the Day… ]

Sometimes you need JDK8 in your docker image …

… and depending on the base image this may not be available, even from the PPA repositories. Moreover, you cannot download it directly from Oracle without a username / password anymore, so this may break automation if you do not host your own repositories. Have no fear, sdkman can come to the rescue and fetch a supported JDK8 version for you. You need to take some extra care with JAVA_HOME and PATH though:

FROM python:3.6-slim
RUN apt-get update && apt-get upgrade -y
RUN apt-get install -y curl zip unzip
RUN curl -fsSL https://get.sdkman.io -o get-sdkman.sh
RUN sh ./get-sdkman.sh
RUN bash -c "source /root/.sdkman/bin/sdkman-init.sh && sdk install java 8.0.212-amzn"
ENV JAVA_HOME=/root/.sdkman/candidates/java/current
ENV PATH=$JAVA_HOME/bin:$PATH
CMD bash

Since curl | sudo bash is a topic of contention, do your homework and decide whether you’re going to use it as displayed in the Dockerfile above, or employ some extra verification mechanism of your choice.
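
One possible verification mechanism, since the base image above already ships Python, is to pin a SHA-256 digest of an installer script you have reviewed and refuse to run anything else. The sketch below is mine and not something sdkman documents: the function name sha256_matches is made up, and the script bytes are a stand-in for the contents of the downloaded get-sdkman.sh. In a real build the pinned digest would be recorded once after the review and stored alongside the Dockerfile.

```python
import hashlib


def sha256_matches(data: bytes, expected_hex: str) -> bool:
    """Return True if the SHA-256 digest of data equals the pinned digest."""
    return hashlib.sha256(data).hexdigest() == expected_hex.lower()


# Stand-in for the bytes of a downloaded installer script (hypothetical)
reviewed_script = b"#!/bin/bash\necho 'installing'\n"

# In practice this value is written down once, after reviewing the script
pinned_digest = hashlib.sha256(reviewed_script).hexdigest()

# The same bytes pass; a script that changed upstream does not
unchanged_ok = sha256_matches(reviewed_script, pinned_digest)
tampered_ok = sha256_matches(reviewed_script + b"curl evil\n", pinned_digest)

print(unchanged_ok, tampered_ok)
```

The trade-off is that every legitimate update of the installer breaks the build until someone reviews the new script and updates the pinned digest, which is arguably the point.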