coreDNS and nodesPerReplica

[ It is always a DNS problem; or systemd ]

It is well established that one does not run a Kubernetes cluster that spans more than one region (for whatever the definition of a region is for your cloud provider). Except that sometimes one does, for reasons, and learns what led to the rule stated above. Instabilities arise.

One such instability is the behavior of the internal DNS. It suffers. Latency is high, internal services cannot communicate with one another, or things become very slow. Imagine, for example, your coreDNS resolvers running in a region other than the one where two pods that want to talk to each other live. You may initially suspect the infamous ndots:5, which, while it may contribute, is not the issue here. The (geographical) location of the DNS service is.

When you are in a situation like that, it may come in handy to run a DNS resolver on each host (much like a DaemonSet). Is this possible? Yes it is, if you take the time to read Autoscale the DNS Service in a Cluster:

The actual number of backends is calculated using this equation:
replicas = max( ceil( cores × 1/coresPerReplica ) , ceil( nodes × 1/nodesPerReplica ) )
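
For example (hypothetical numbers): on a 20-node cluster with 2 cores per node and the values we set below (coresPerReplica 128, nodesPerReplica 1), the node term dominates and you get one replica per node:

replicas = max( ceil( 40 × 1/128 ) , ceil( 20 × 1/1 ) ) = max( 1 , 20 ) = 20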

Armed with that information, we edit the coredns-autoscaler configMap:

$ kubectl -n kube-system edit cm coredns-autoscaler
:
linear: '{"coresPerReplica":128,"min":1,"nodesPerReplica":1,"preventSinglePointFailure":true}'

Usually the default value for nodesPerReplica is 4. By setting it to 1, you ensure you get one resolver replica per node, speeding up DNS resolution in the unfortunate case where your cluster spans more than one region.
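
If you prefer not to open an editor, the same change can be applied in one shot with kubectl patch (the escaped JSON is exactly the linear value shown above):

kubectl -n kube-system patch configmap coredns-autoscaler --type merge -p '{"data":{"linear":"{\"coresPerReplica\":128,\"min\":1,\"nodesPerReplica\":1,\"preventSinglePointFailure\":true}"}}'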

The things we do when we break the rules…

on brilliant assholes

[ yet another meaningless micropost ]

From time to time people read the autobiography, or the memoirs, of a highly successful individual, covering the period when they reached their peak. Fascinated by that success and seeking to improve their own status, these followers* copy the behavior they read about. Interestingly, the copying comes even easier when the behavior is assholic and abusive. Someone with a background in psychology would have more to say here, I’m sure. In my “communication radius” this is very easy to observe with people who want to copy successful sports coaches, and you can see the pattern crossing over to other work domains too.

It is not easy to understand that someone can be an asshole whose brilliance makes them somewhat tolerable to their immediate workplace, while the other way around does not hold: assholic behavior does not generate brilliance. Solutions do.

If you think you’re brilliant, just pick a hard problem and solve it. I know, it’s …hard.

[*] Leadership programs and books create followers, not leaders.

Work expands to fill the time available for its completion.

This is also known as Parkinson’s Law. But earlier today I dug up this gem of a post from the Internet Archive (you should read it all, really):

Parkinson inferred this effect from two central principles governing the behavior of bureaucrats:

1. Officials want to multiply subordinates, not rivals.
2. Officials make work for one another.

Just a note for when I seek to understand the behavior of big organizations.

Where do you see yourself in five years?

This must be the most annoying question for many people during an interview process. An interview is a stressful situation, even for the most experienced performer, and yet you are asked to predict the future, your future, just like that, with no other factors considered. To be honest, I do not know what HR people want or think when they ask this, and I am not sure they know either; I bet most ask it because it is on their checklist.

A parenthesis on prediction: five years is a long time. I am often asked by friends and acquaintances about the (business) value of certain Engineering studies. I have learned not to answer this question, because it carries a lot of my biases, but also because of this: if someone told you in 2007 that they were starting Civil Engineering studies in Greece, you would assume, and tell them, that if they liked the field they would never be out of work. But Civil Engineering is five years of study (add one more for an MSc, a normal tradition in Greece), so now it is 2013, right in the middle of the Greek economic crisis, and they are competing with thousands of other Civil Engineers, both old and new, for virtually no work, offered for pennies.

That is why I hate these questions. But I have learned why certain people ask them, and this may provide a guideline for how to think about them without hating the question or the interviewer. I am copying the relevant parts from an interview Brian Krzanich gave a few years ago:

I give the same advice to women and men about that. I tell them that there are three mistakes that people make in their careers, and that those three mistakes limit their potential growth.

The first mistake is not having a five-year plan. I meet so many people who say: I just want to contribute. But that doesn’t necessarily drive you to where you want to go in your career. You have to know: Where do you want to be? What do you want to accomplish? Who do you want to be in five years?

The second mistake is not telling somebody. If you don’t go talk to your boss, if you don’t go talk to your mentors, if you don’t go talk to people who can influence where you want to be, then they don’t know. And they’re not mind readers.

The third thing is you have to have a mentor. You have to have someone who’s watching out, helping you navigate the decision-making processes, how things get done, how you’re perceived from a third-party view.

After that you can now have a discussion. When you want a raise you’re not only going in saying: I want more money. You’re going in and saying: Here’s what I want out of my career. Here’s what I accomplished. Here’s what I said I was going to do. Here’s what I’ve done. Not only do I deserve more money but I want to get to here on my career.

Because what you really want is to build a career, not just get the raise. And if you do those things, whether you’re a man or a woman, you’ll be a lot better off.

If you’d asked me in 2013 where I would see myself in five years, there is no way I would have predicted working for three startups, a brief stint at Intel, and the software house of the biggest lottery on the planet. I was working in the Public Sector and doing relatively OK. But I knew I needed two things: a pay raise, and to sharpen my skills. And mind you, I was not telling anybody, nor did I have a mentor. I did not even know how to interview. You do not have to wait to have coffee with someone to get your push like I did.

Plan ahead.

rkube: Rancher2 Kubernetes cluster on a single VM using RKE

There are many solutions for running a complete Kubernetes cluster in a VM on your machine: minikube, microk8s, or even kubeadm. So, embarking on what others have done before me, I wanted to do the same with RKE. Mostly because I work with Rancher2 lately and I want to experiment on VirtualBox without remorse.

Enter rkube (the name directly inspired by minikube and rke). It does not do the many things that minikube does, but it is closer to my work environments.

We use vagrant to boot an Ubuntu Bionic box, creating a machine with 4G RAM and 2 CPUs. We provision the machine using ansible_local and install docker from the Ubuntu archives; this is version 17 for Bionic. If you need a newer version, check the docker documentation and modify ansible.yml accordingly.
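
The docker part of the provisioning is nothing fancy; a minimal sketch of such an ansible.yml task (illustrative, not the repository’s actual file) would be:

- name: Install docker from the Ubuntu archives
  apt:
    name: docker.io
    state: present
    update_cache: yes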

Once the machine boots up and is provisioned, it is ready for use. You will find the kubectl configuration file named kube_cluster_config.yml installed in the cloned repository directory. You can now run a simple echo server with:

kubectl --kubeconfig kube_cluster_config.yml apply -f echo.yml
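
The repository ships its own echo.yml; as a sketch of what such a manifest boils down to (image, names, and ports here are illustrative, not the repository’s actual file), a Deployment, a Service, and an Ingress answering on /echo look like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: echo
  template:
    metadata:
      labels:
        app: echo
    spec:
      containers:
      - name: echo
        image: hashicorp/http-echo
        args: ["-text=hello from rkube"]
        ports:
        - containerPort: 5678
---
apiVersion: v1
kind: Service
metadata:
  name: echo
spec:
  selector:
    app: echo
  ports:
  - port: 80
    targetPort: 5678
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: echo
spec:
  rules:
  - http:
      paths:
      - path: /echo
        backend:
          serviceName: echo
          servicePort: 80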

Check that the cluster is deployed with:

kubectl --kubeconfig kube_cluster_config.yml get pod
kubectl --kubeconfig kube_cluster_config.yml get deployment
kubectl --kubeconfig kube_cluster_config.yml get svc
kubectl --kubeconfig kube_cluster_config.yml get ingress

and you can visit the echo server at http://192.168.98.100/echo. Ignore the SSL error; we have not created a specific SSL certificate for the Ingress controller yet.

You can change the IP address used to connect to the RKE VM in the Vagrantfile.

Suppose you now want to upgrade the Kubernetes version. vagrant ssh into the VM, run rke config -l -s -a, and pick the new version that you want to install; look for the images named hyperkube. Then edit /vagrant/cluster.yml and run rke up --config /vagrant/cluster.yml.
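
The version change itself is a single top-level line in cluster.yml; something like the following (the version string here is illustrative):

kubernetes_version: "v1.15.5-rancher1-1"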

Note that thanks to vagrant’s niceties, the /vagrant directory within the VM is the directory you cloned the repository into.

I developed the whole thing on Windows 10, so it should run just about anywhere. I hope you like it, and that you help me make it a bit better if you find it useful.

You can browse rkube here

org.elasticsearch.transport.ConnectTransportException, or the case of the yellow cluster

Some time ago we needed to add two datanodes to our ElasticSearch cluster, which we happily ordered from our cloud provider. The first one joined OK and shards started moving around nicely. A happy green cluster. However, upon adding the second node, the cluster kept accepting shards but remained in yellow state. Consistently. For hours. Even trying to empty the node in order to remove it was not working. Some shards would stay there forever.

Upon looking at the node logs, here is what caught our attention:

org.elasticsearch.transport.ConnectTransportException: [datanode-7][10.1.2.7:9300] connect_exception

A similar log entry was found in datanode-7's log file. What was going on here? Well, these two machines were assigned sequential IP addresses, 10.1.2.6 and 10.1.2.7. They could literally ping the whole internet, but not find each other. To which the cloud provider's support group replied:

in this case you need to configure a hostroute via the gw as our switch doesn't allow a direct communication.

Enter systemd territory then; not wanting to make this yet another service, I defaulted to the oldest boot solution: edit /etc/rc.local (in reality /etc/rc.d/rc.local) on both machines with the appropriate routes:

ip route add 10.1.2.7/32 via 10.1.2.1 dev enp0s31f6
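
and the mirror route on the other machine (assuming the same gateway and interface name there):

ip route add 10.1.2.6/32 via 10.1.2.1 dev enp0s31f6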

and enable it:

# chmod +x /etc/rc.d/rc.local
# systemctl enable rc-local
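
To verify without rebooting, start the unit and ask the kernel which route it would now pick:

# systemctl start rc-local
# ip route get 10.1.2.7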

rc.local will never die. It is that loyal friend that will wait to be called upon when you need them most.

resizing a vagrant box disk

[ I am about to do what others have done before me and blog about it one more time ]

While I do enjoy working with Windows 10, I am still not using WSL (waiting for WSL2) and work with either chocolatey or a vagrant Ubuntu box. It so happens that after pulling a few docker images the default 10G disk is full and you cannot work anymore. So, let’s resize the disk.

The disk on my ubuntu/bionic64 box is a VMDK one. Before resizing, we need to convert it to VDI first, since that is a format VirtualBox can resize in place:

VBoxManage clonehd .\ubuntu-bionic-18.04-cloudimg.vmdk .\ubuntu-bionic-18.04-cloudimg.vdi --format vdi

Now we can resize it to, say, 20G (modifymedium takes the new size in megabytes):

VBoxManage modifymedium disk .\ubuntu-bionic-18.04-cloudimg.vdi --resize 20000

We’re almost there. We need to tell vagrant to boot from the VDI disk now. Open VirtualBox and visit the storage settings of the vagrant VM: remove the VMDK disk(s) there and add the VDI on the SCSI0 port. That’s it. Close VirtualBox and vagrant up to boot from the VDI.

Now you have a 20G disk, but still a 10G partition. parted to the rescue:

$ sudo parted /dev/sda
(parted) resizepart 

It will ask you for the partition number; answer 1 (which is /dev/sda1). It will ask you for the end of the partition; answer -1 (which means up to the end of the disk). quit and you’re out.
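
If you prefer to script it, recent parted versions also take the same operation as arguments (depending on your version you may still be asked to confirm, since the partition is in use):

$ sudo parted /dev/sda resizepart 1 100%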

You have changed the partition size, but the filesystem still reports the old size. resize2fs (assuming a compatible filesystem) to the rescue:

$ sudo resize2fs /dev/sda1

Now you’re done. You may want to vagrant reload to check that everything works fine. Once you’re sure of that, you can delete the old VMDK disk.