aws ssm describe-instance-information for quick ansible dynamic inventory

The aws ssm agent is very useful when working both with EC2 instances and with machinery outside AWS. Once you add an outside instance by installing and configuring the SSM agent, be it on-premises or a VM at another provider, you can tag it for further granularity with aws ssm add-tags-to-resource --resource-type ManagedInstance --resource-id mi-WXYZWXYZ --tags Key=onpremise,Value=true --region eu-west-1, where mi-WXYZWXYZ is the instance ID you see in SSM’s managed instances list (alternatively, you can get this list with aws ssm describe-instance-information, along with lots of other information).
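
For a quick look at what SSM knows about the tagged machines, something along these lines (same region and tag as above; the selected fields are just an example) lists the managed instance IDs with their ping status and IP address:

aws ssm describe-instance-information --region eu-west-1 \
  --filter Key=tag:onpremise,Values=true \
  --query 'InstanceInformationList[].[InstanceId,PingStatus,IPAddress]' \
  --output table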

It may be the case that sometimes you want to apply a certain change with ansible to those machines that live outside AWS. Yes, you can run ansible playbooks via SSM directly, but this requires ansible installed on said machines. If you need the simplest of dynamic inventories, just enough to run $ ansible -u user -i ./lala all -m ping, here is the crudest version of ./lala, one that happily ignores the --list argument:

#!/bin/bash
# Crudest possible inventory: every on-premises SSM-managed instance,
# quoted and comma-joined into a flat JSON host list.
printf "%s%s%s" \
'{ "all": { "hosts": [' \
"$(aws ssm describe-instance-information --region eu-west-1 --filter Key=tag:onpremise,Values=true --query "InstanceInformationList[].IPAddress" --output text | tr '[:blank:]' '\n' | sed 's/.*/"&"/' | paste -sd, -)" \
'] } }'

You can go all the way scripting something like this for a proper solution though.
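
Somewhere in between, a rough sketch of a --list-aware variant could look like the following; the onpremise group name is my own choice and the filter is the same as above:

#!/bin/bash
# Rough --list-aware inventory sketch, not a production solution.
case "$1" in
  --host)
    printf '{}\n' ;;   # no per-host variables
  *)
    # --list (and anything else): one group with every tagged instance
    ips=$(aws ssm describe-instance-information --region eu-west-1 \
      --filter Key=tag:onpremise,Values=true \
      --query "InstanceInformationList[].IPAddress" --output text \
      | tr '[:blank:]' '\n' | sed 's/.*/"&"/' | paste -sd, -)
    printf '{ "onpremise": { "hosts": [%s] }, "_meta": { "hostvars": {} } }\n' "$ips"
    ;;
esac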

Why printf instead of echo above? Because jpmens suggested so.

Running monit inside Kubernetes

Sometimes you may want to run monit inside a Kubernetes cluster just to validate what you’re getting from your standard monitoring solution with a second monitor that does not require that much configuration or tinkering. In such cases the Dockerfile below might come in handy:

FROM ubuntu:bionic
RUN apt-get update && apt-get install -y monit bind9-host netcat fping
# monit logs to a file; point that file at the container's stdout instead
RUN ln -f -s /dev/fd/1 /var/log/monit.log
COPY monitrc /etc/monit
RUN chmod 0600 /etc/monit/monitrc
EXPOSE 2812
ENTRYPOINT [ "/usr/bin/monit" ]
CMD [ "-I", "-c", "/etc/monit/monitrc" ]
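
To get this into a cluster, one possible route is to build and push the image and run it as a single-replica deployment; the registry, image name and namespace below are placeholders (the namespace matches the port-forward that follows):

docker build -t registry.example.com/monit-test:latest .
docker push registry.example.com/monit-test:latest
kubectl create namespace monit-test
kubectl -n monit-test create deployment monit --image=registry.example.com/monit-test:latest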

I connected to it via kubectl -n monit-test port-forward --address=0.0.0.0 pod/monit-XXXX-YYYY 2812:2812. Most people do not need --address=0.0.0.0, but I run kubectl inside a VM for some degree of compartmentalization. Stringent, I know…

Why would you need something like this, you ask? Well, imagine the case where you have multiple pods running, no restarts, everything fine, but you randomly get connection timeouts to the clusterIP address:port pair. If you have no way of reproducing this, don’t you want an alert the exact moment it happens? That was the case for me.
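
For that scenario, a fragment along these lines appended to the monitrc that the Dockerfile copies in would do; the service name, clusterIP and port are placeholders:

cat >> monitrc <<'EOF'
# poll the service VIP every cycle and alert on the first failed TCP connect
check host suspect-service with address 10.43.0.123
  if failed port 8080 type tcp then alert
EOF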

And also the fun of using a tool in an unforeseen way.

Rancher’s cattle-cluster-agent and error 404

It may be the case that when you deploy a new Rancher2 Kubernetes cluster, all pods are working fine, with the exception of cattle-cluster-agent (whose scope is to connect to the Kubernetes API of Rancher Launched Kubernetes clusters), which enters a CrashLoopBackOff state (red state in your UI under the System project).

One common error you will see from View Logs of the agent’s pod is a 404 due to an HTTP ping failing:

ERROR: https://rancher-ui.example.com/ping is not accessible (The requested URL returned error: 404)

It is a DNS problem

The issue here is that if you watch the network traffic on your Rancher2 UI server, you will never see pings coming from the pod, yet the pod is sending traffic somewhere. Where?

Observe the contents of your pod’s /etc/resolv.conf:

nameserver 10.43.0.10
search default.svc.cluster.local svc.cluster.local cluster.local example.com
options ndots:5

Now if you happen to have a wildcard DNS A record in example.com, the HTTP ping in question becomes https://rancher-ui.example.com.example.com/ping, which resolves to the wildcard’s A record (most likely not the A RR of the host where the Rancher UI runs). Hence, if that machine runs a web server, you are at the mercy of whatever that web server responds with.
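
You can watch the search list at work from any pod that has dig available; +search makes dig honour resolv.conf’s search and ndots settings, just like the resolver behind the agent’s HTTP ping does:

dig +search +short rancher-ui.example.com
# returns the wildcard A record, i.e. the answer for
# rancher-ui.example.com.example.com, not the Rancher UI host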

One quick hack is to edit your Rancher2 cluster’s YAML and instruct the kubelet to start with a different resolv.conf, one whose search path does not contain the wildcard-bearing domain. The kubelet appends that search path line to the pods’ default and in this particular case you do not want that. So you tell your Rancher2 cluster the following:

  kubelet:
    extra_args:
      resolv-conf: /host/etc/resolv.rancher

resolv.rancher contains only nameserver entries in my case. The path is /host/etc/resolv.rancher because you have to remember that in Rancher2 clusters the kubelet itself runs from within a container and accesses the host’s file system under /host.
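
Something along these lines on every node is enough; the resolver addresses are placeholders for whatever your environment actually uses:

# written on the node's filesystem; the containerized kubelet reads it
# as /host/etc/resolv.rancher
printf 'nameserver 1.1.1.1\nnameserver 9.9.9.9\n' | sudo tee /etc/resolv.rancher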

Now I am pretty certain this can be dealt with via some coredns configuration too, but I did not have the time to pursue it.

The Uprising (Revolution Book 1)

The Uprising by Konstantinos Christidis
My rating: 3 of 5 stars

This is the first book from the author and if I am not mistaken, it is self-published. It could use some more help with editing. That is why I did not give it more stars.

Think of this book as a prequel to the Expanse. The setting is similar: the Earth is governed by the UN, there is a terraforming project for Mars (and Venus), and at Callisto there is an installation that ships material and water (ice) to the terraforming projects.

Think also of the existence of the equivalent of the East India Company, with its own private army, monopoly status and control over the judicial system and the government. What can go wrong when some theoretical physicist backed by VC money (to put it in today’s terms) threatens the status quo with faster-than-light travel?

This is what the book is about.

Could it have been written better? Yes. Does it matter that at times the author does not manage to keep the pace and gets a bit boring? I don’t know. Maybe. It took me longer than I anticipated to finish it.

Did I have a good time ultimately reading it? Sure.

View all my reviews

once again bitten by the MTU

At work we use Rancher2 clusters a lot. The UI makes some things easier, I have to admit. Like sending logs from the cluster somewhere. I wanted to test sending such logs to an ElasticSearch instance and thus I set up a test installation with docker-compose:

version: "3.4"

services:
  elasticsearch:
    restart: always
    image: elasticsearch:7.5.1
    container_name: elasticsearch
    ports:
      - "9200:9200"
    environment:
      - ES_JAVA_OPTS=-Xmx16g
      - cluster.name=lala-cluster
      - bootstrap.memory_lock=true
      - discovery.type=single-node
      - node.name=lala-node
      - http.port=9200
      - xpack.security.enabled=true
      - xpack.monitoring.collection.enabled=true
    volumes:
      # ensure chown 1000:1000 /opt/elasticsearch/data please.
      - /opt/elasticsearch/data:/usr/share/elasticsearch/data

  kibana:
    restart: always
    image: kibana:7.5.1
    ports:
      - "5601:5601"
    container_name: kibana
    depends_on:
      - elasticsearch
    volumes:
      - /etc/docker/compose/kibana.yml:/usr/share/kibana/config/kibana.yml

Yes, this is a yellow cluster, but then again, it is a test cluster on a single machine.

This seemed to work for some days, and then it stopped. tcpdump showed packets arriving at the machine, but no real responses coming back after the three-way handshake. So the old mantra kicked in:

It is an MTU problem.
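
If you want to confirm the suspicion before changing anything, two quick probes from the Rancher cluster side towards the ElasticSearch box (the hostname is a placeholder) usually settle it:

# with the DF bit set, payloads larger than the usable path MTU must fail
ping -M do -s 1400 -c 3 elasticsearch.example.com
# tracepath reports the discovered path MTU directly
tracepath elasticsearch.example.com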

Editing /etc/docker/daemon.json to accommodate that assumption:

{
  "mtu": 1400
}

and logging was back to normal.
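
For the record, dockerd needs a restart to pick up daemon.json changes, and a throwaway container on the default bridge is a quick way to confirm the new value:

sudo systemctl restart docker
# eth0 inside a fresh container should now report mtu 1400
docker run --rm busybox ip link show eth0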

I really hate fixes like this, but sometimes, when pressed by other priorities, they are a handy tool to have in the arsenal.

coreDNS and nodesPerReplica

[ It is always a DNS problem; or systemd]

It is well established that one does not run a Kubernetes cluster that spans more than one region (for whatever the definition of a region is for your cloud provider). Except that sometimes one does do this, for reasons, and learns what leads to the rule stated above. Instabilities arise.

One such instability is the behavior of the internal DNS. It suffers. Latency is high and the internal services cannot communicate with one another, or things become very slow. Imagine, for example, your coreDNS resolvers not running in the same region as two pods that want to talk to each other. You may initially think it is the infamous ndots:5, which, while it may contribute, is not the issue here. The (geographical) location of the DNS service is.

When you are in a situation like that, it may come in handy to run a DNS resolver on each node (kind of like a DaemonSet). Is this possible? Yes it is, if you take the time to read Autoscale the DNS Service in a Cluster:

The actual number of backends is calculated using this equation:
replicas = max( ceil( cores × 1/coresPerReplica ) , ceil( nodes × 1/nodesPerReplica ) )

Armed with that information, we edit the coredns-autoscaler configMap:

$ kubectl -n kube-system edit cm coredns-autoscaler
:
linear: '{"coresPerReplica":128,"min":1,"nodesPerReplica":1,"preventSinglePointFailure":true}'

Usually the default value for nodesPerReplica is 4. By assigning it the value of 1, you ensure you have as many resolver instances as nodes, speeding up DNS resolution in the unfortunate case where your cluster spans more than one region.
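
To see what this means in practice, take a cluster of 40 nodes with 8 cores each (numbers picked purely for illustration):

replicas = max( ceil( 320 × 1/128 ) , ceil( 40 × 1/1 ) ) = max( 3, 40 ) = 40

i.e. one resolver per node, where the default nodesPerReplica of 4 would have given you only max( 3, 10 ) = 10.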

The things we do when we break the rules…

on brilliant assholes

[ yet another meaningless micropost ]

From time to time people get to read the autobiography, or the memoirs of the specific period when a highly successful individual reached their peak. Fascinated by that success and seeking to improve their own status, these followers* copy the behavior they read about. Interestingly, the copying is even easier when this behavior is assholic and abusive. Someone with a background in psychology would have more to say here, I’m sure. In my “communication radius” this is very easy to observe with people who want to copy successful sports coaches, and you can see this pattern crossing over to other work domains too.

It is not easy to understand that someone can be an asshole whose brilliance may make them somewhat tolerable to their immediate workplace, while the other way round does not stand: assholic behavior does not generate brilliance. Solutions do.

If you think you’re brilliant, just pick a hard problem and solve it. I know, it’s …hard.

[*] Leadership programs and books create followers, not leaders.

Work expands to fill the time available for its completion.

This is also known as Parkinson’s Law. But earlier today I dug up from the Internet Archive this gem of a post (you should read it all, really):

Parkinson inferred this effect from two central principles governing the behavior of bureaucrats:

1. Officials want to multiply subordinates, not rivals.
2. Officials make work for one another.

Just a note for when I seek to understand big-organization behavior.

Where do you see yourself in five years?

This must be the most annoying question for many people during an interview process. An interview is a stressful situation, even for the most experienced performer, and yet you’re asked to predict the future, your future, just like that, with no other factors in play. To be honest, I do not know what HR people want or think when they ask this, and I’m not sure they do either; I bet most ask it because it is on their checklist.

Parenthesis on prediction: five years is a long time. Often I am asked by friends and acquaintances about the (business) value of certain Engineering studies. I have learned not to answer this question, partly because my answer carries a lot of my biases, and partly because I tell them this instead: if someone told you in 2007 that they had begun studying Civil Engineering in Greece, you’d assume -and tell them- that if they like the field, they won’t be out of work. But Civil Engineering is five years of study (add one more for an MSc, a normal tradition in Greece), which lands you in 2013, right in the middle of the Greek economic crisis, competing with thousands of other Civil Engineers, both old and new, for virtually no work on offer and for pennies.

That is why I hate these questions. But I’ve learned why certain people ask them. And this may provide a guideline on how to think about them without hating the question or the interviewer. I am copying the relevant parts from an interview Brian Krzanich gave a few years ago:

I give the same advice to women and men about that. I tell them that there are three mistakes that people make in their careers, and that those three mistakes limit their potential growth.

The first mistake is not having a five-year plan. I meet so many people who say: I just want to contribute. But that doesn’t necessarily drive you to where you want to go in your career. You have to know: Where do you want to be? What do you want to accomplish? Who do you want to be in five years?

The second mistake is not telling somebody. If you don’t go talk to your boss, if you don’t go talk to your mentors, if you don’t go talk to people who can influence where you want to be, then they don’t know. And they’re not mind readers.

The third thing is you have to have a mentor. You have to have someone who’s watching out, helping you navigate the decision-making processes, how things get done, how you’re perceived from a third-party view.

After that you can now have a discussion. When you want a raise you’re not only going in saying: I want more money. You’re going in and saying: Here’s what I want out of my career. Here’s what I accomplished. Here’s what I said I was going to do. Here’s what I’ve done. Not only do I deserve more money but I want to get to here on my career.

Because what you really want is to build a career, not just get the raise. And if you do those things, whether you’re a man or a woman, you’ll be a lot better off.

If you’d asked me in 2013 where I would see myself in five years, there is no way I would have predicted that I would have worked for three startups, done a brief stint at Intel, and joined the software house of the biggest lottery on the planet. I was working in the Public Sector and doing relatively OK. But I knew I needed two things: a pay raise and to sharpen my skills. And mind you, I was not telling anybody, nor did I have a mentor. I did not even know how to interview. You do not have to wait to have coffee with someone to get your push like I did.

Plan ahead.

rkube: Rancher2 Kubernetes cluster on a single VM using RKE

There are many solutions for running a complete Kubernetes cluster in a VM on your machine: minikube, microk8s, or even kubeadm. So, embarking on what others have done before me, I wanted to do the same with RKE, mostly because I work with Rancher2 lately and I want to experiment on VirtualBox without remorse.

Enter rkube (the name directly inspired by minikube and rke). It does not do the many things that minikube does, but it is closer to my work environments.

We use vagrant to boot an Ubuntu Bionic box. It creates a 4G RAM / 2 CPU machine. We provision the machine using ansible_local and install docker from the Ubuntu archives. This is version 17 for Bionic. If you need a newer version, check the docker documentation and modify ansible.yml accordingly.
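
Getting it up is the usual vagrant affair; assuming VirtualBox and vagrant are already installed and you are inside the cloned repository directory:

vagrant up        # boots the Bionic box and runs the ansible_local provisioner
vagrant status    # should report the machine as "running" once provisioning is done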

Once the machine boots up and is provisioned, it is ready for use. You will find the kubectl configuration file named kube_cluster_config.yml installed in the cloned repository directory. You can now run a simple echo server with:

kubectl --kubeconfig kube_cluster_config.yml apply -f echo.yml

Check that the cluster is deployed with:

kubectl --kubeconfig kube_cluster_config.yml get pod
kubectl --kubeconfig kube_cluster_config.yml get deployment
kubectl --kubeconfig kube_cluster_config.yml get svc
kubectl --kubeconfig kube_cluster_config.yml get ingress

and you can visit the echo server at http://192.168.98.100/echo. Ignore the SSL error; we have not created a specific SSL certificate for the Ingress controller yet.

You can change the IP address used to connect to the RKE VM in the Vagrantfile.

Suppose you now want to upgrade the Kubernetes version. vagrant ssh into the VM, run rke config -l -s -a and pick the new version that you want to install (look for the images named hyperkube). You then edit /vagrant/cluster.yml and run rke up --config /vagrant/cluster.yml.
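
Assuming the version is driven by the standard kubernetes_version key in cluster.yml, the whole upgrade boils down to something like this; the exact version string depends on what the list gives you:

vagrant ssh
# then, inside the VM:
rke config -l -s -a         # pick a version from the listed system images
vi /vagrant/cluster.yml     # set kubernetes_version to the picked version
rke up --config /vagrant/cluster.yml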

Note that thanks to vagrant’s niceties, the /vagrant directory within the VM is the directory you cloned the repository into.

I developed the whole thing on Windows 10, so it should be able to run just about anywhere. I hope you like it, and that you will help me make it a bit better if you find it useful.

You can browse rkube here