
Kubernetes

Kaniko Setup for an Azure DevOps Linux build agent

NOTE: This document is still being tested so some parts may not quite work yet.

This document takes you through the process of setting up and using Kaniko to build containers on a Kubernetes-hosted Linux build agent without Docker being installed. This allows Docker to be removed from your worker nodes entirely when switching over to containerd or another runtime.

For the purposes of this demo, the assumption is that a namespace called build-agents will be used to host Kaniko jobs and the Azure DevOps build agents. There is also a Docker secret required to push the container to Docker Hub.
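If you don't already have that secret, it can be created along the following lines (placeholder credentials shown; run this once the build-agents namespace exists, which is covered in the Setup section below):

kubectl create secret docker-registry dockerhub-jabbermouth \
  --docker-username=your-username \
  --docker-password=your-password \
  -n build-agents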

Prerequisites

This process makes use of a ReadWriteMany (RWX) persistent volume and assumes you're running a build agent in the cluster as outlined in this Kubernetes Build Agent repo. The only change required is adding the following under the agent section (storageClass and size are optional):

  kaniko:
    enabled: true
    storageClass: longhorn
    size: 5Gi

Setup

Namespace

To set up your namespace for Kaniko (i.e. build-agents) run the following command:

kubectl create ns build-agents

Service Account

Next, create a file called kaniko-setup.sh and copy in the following script:

#!/bin/bash

namespace=build-agents
dockersecret=dockerhub-jabbermouth

while getopts ":n:d:" opt; do
  case $opt in
    n) namespace="$OPTARG"
    ;;
    d) dockersecret="$OPTARG"
    ;;
    \?)
    echo "Usage: bash kaniko-setup.sh [OPTIONS]"
    echo
    echo "Options"
    echo "  n = namespace to create kaniko account in (default: $namespace)"
    echo "  d = name of Docker Hub secret to use (default: $dockersecret)"
    exit 0
    ;;
  esac
done

echo
echo Removing existing file if present
rm -f kaniko-user.yaml

echo
echo Generating new user creating manifests
cat <<EOM >kaniko-user.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pod-runner
rules:
- apiGroups: [""]
  resources:
  - pods
  - pods/log
  verbs: ["get", "watch", "list", "create", "delete", "update", "patch"]

---

apiVersion: v1
kind: ServiceAccount
metadata:
  name: kaniko
  namespace: $namespace
imagePullSecrets:
- name: $dockersecret

---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kaniko-pod-runner
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: pod-runner
subjects:
- kind: ServiceAccount
  name: kaniko
  namespace: $namespace
EOM

echo
echo Applying user manifests
kubectl apply -f kaniko-user.yaml

echo
echo Tidying up manifests
rm kaniko-user.yaml

echo
echo Getting secret
echo
secret=$(kubectl get serviceaccount kaniko -n $namespace -o=jsonpath='{.secrets[*].name}')

token=$(kubectl get secret $secret -n $namespace -o=jsonpath='{.data.token}')

echo Paste the following token where needed:
echo
echo $token | base64 --decode
echo

This can then be executed using the following command:

bash kaniko-setup.sh 
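Note: on Kubernetes 1.24 and later, service accounts no longer get a long-lived token secret created automatically, so the secret lookup at the end of the script will come back empty. In that case, you can request a token for the service account directly (the API server may cap the duration you ask for):

kubectl create token kaniko -n build-agents --duration=87600h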

Token and Secure File

Create a new Azure DevOps library variable group called KanikoUserToken and add a variable to it named ServiceAccount_Kaniko. Paste the token output above as its value and mark it as secret.

Under “Pipeline permissions” for the group, click the three vertical dots next to the + and choose “Open access” so the pipeline below is able to read the group. The pipeline also downloads a secure file called build-agent-kaniko.config (a kubeconfig for the cluster), so upload that under Secure files too, although the kubectl calls shown authenticate using the token.

Azure Pipeline

The example below assumes the build is done using template files; this particular template handles the container build and push. Update the parameter defaults as required. The image is pushed to a Docker Hub repository named after a lowercase form of REPOSITORY_NAME with each . replaced by - to make the name compliant (e.g. My.Web.Api becomes my-web-api). This can be modified as required.

parameters:
  REPOSITORY_NAME: ''
  TAG: ''
  BRANCH_PREFIX: ''
  REGISTRY_SECRET: 'dockerhub-jabbermouth'
  DOCKER_HUB_IDENTIFIER: 'jabbermouth'

jobs:  
- job: Build${{ parameters.TAG }}
  displayName: 'Build ${{ parameters.TAG }}'
  condition: and(succeeded(), startsWith(variables['Build.SourceBranch'], 'refs/heads/${{ parameters.BRANCH_PREFIX }}'))
  pool: 'Docker (Linux)'
  variables:
  - group: KanikoUserToken
  steps:
  - task: KubectlInstaller@0
    inputs:
      kubectlVersion: 'latest'

# copy code to shared folder: kaniko/buildId
  - task: CopyFiles@2
    displayName: Copy files to shared Kaniko folder
    inputs:
      SourceFolder: ''
      Contents: '**'
      TargetFolder: '/kaniko/$(Build.BuildId)/'
      CleanTargetFolder: true

# download K8s config file
  - task: DownloadSecureFile@1
    name: fetchK8sConfig
    displayName: Download Kaniko config
    inputs:
      secureFile: 'build-agent-kaniko.config'

# create pod script with folder mapped to kaniko/buildId
  - task: Bash@3
    displayName: Execute pod and wait for result
    inputs:
      targetType: 'inline'
      script: |
        # create a manifest for the Kaniko pod
        cat > deploy.yaml <<EOF
        apiVersion: v1
        kind: Pod
        metadata:
          name: kaniko-$(Build.BuildId)
          namespace: build-agents
        spec:
          imagePullSecrets:
          - name: ${{ parameters.REGISTRY_SECRET }}
          containers:
          - name: kaniko
            image: gcr.io/kaniko-project/executor:latest
            args:
            - "--dockerfile=Dockerfile"
            - "--context=/src/$(Build.BuildId)"
            - "--destination=${{ parameters.DOCKER_HUB_IDENTIFIER }}/${{ replace(lower(parameters.REPOSITORY_NAME),'.','-') }}:${{ parameters.TAG }}"
            volumeMounts:
            - name: kaniko-secret
              mountPath: /kaniko/.docker
            - name: source-code
              mountPath: /src
          restartPolicy: Never
          volumes:
          - name: kaniko-secret
            secret:
              secretName: ${{ parameters.REGISTRY_SECRET }}
              items:
              - key: .dockerconfigjson
                path: config.json
          - name: source-code
            persistentVolumeClaim:
              claimName: $DEPLOYMENT_NAME-kaniko
        EOF

        echo Applying pod definition to server
        kubectl apply -f deploy.yaml -n build-agents --token=$(ServiceAccount_Kaniko)

        # wait for the pod to reach a terminal phase (Succeeded or Failed)
        phase=""
        while [[ "$phase" != "Succeeded" && "$phase" != "Failed" ]]; do
          phase=$(kubectl get pod kaniko-$(Build.BuildId) --token=$(ServiceAccount_Kaniko) -n build-agents -o jsonpath='{..status.phase}')
          echo "waiting for pod kaniko-$(Build.BuildId): $(kubectl logs kaniko-$(Build.BuildId) --token=$(ServiceAccount_Kaniko) -n build-agents 2>/dev/null | tail -1)"
          sleep 10
        done

        # exit the script with an error if the build failed
        if [ "$phase" == "Failed" ]; then
            echo Build or push failed - outputting log
            echo
            kubectl logs kaniko-$(Build.BuildId) --token=$(ServiceAccount_Kaniko) -n build-agents
            echo 
            echo Now deleting pod...
            kubectl delete -f deploy.yaml -n build-agents --token=$(ServiceAccount_Kaniko)

            echo Removing build source files
            rm -R -f /kaniko/$(Build.BuildId)

            exit 1;
        fi

        # if the pod succeeded, delete it
        echo Build and push succeeded - now deleting pod
        kubectl delete -f deploy.yaml -n build-agents --token=$(ServiceAccount_Kaniko)

        echo Removing build source files
        rm -R -f /kaniko/$(Build.BuildId)

This template is called using something like:

  - template: templates/job-build-container.yaml
    parameters:
      REPOSITORY_NAME: 'Your.Repository.Name'
      TAG: 'latest'
      BRANCH_PREFIX: 'main'

Longhorn Restart

If your Longhorn setup gets 'stuck', run this script to trigger a restart of all the Longhorn pods (see the note after the script about the engine-image daemonset name).

kubectl rollout restart daemonset engine-image-ei-d4c780c6 -n longhorn-system
kubectl rollout restart daemonset longhorn-csi-plugin -n longhorn-system
kubectl rollout restart daemonset longhorn-manager -n longhorn-system

kubectl rollout restart deploy csi-attacher -n longhorn-system
kubectl rollout restart deploy csi-provisioner -n longhorn-system
kubectl rollout restart deploy csi-resizer -n longhorn-system
kubectl rollout restart deploy csi-snapshotter -n longhorn-system
kubectl rollout restart deploy longhorn-driver-deployer -n longhorn-system
kubectl rollout restart deploy longhorn-ui -n longhorn-system
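The engine-image daemonset name includes a hash that's unique to each installation (engine-image-ei-d4c780c6 above is from my cluster), so list the daemonsets first to find yours:

kubectl get daemonsets -n longhorn-system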

If you wish to make this script executable, save it to a file (e.g. restart-longhorn.sh) and put the following as the first line of the script file:

#!/bin/bash

And then run the following command:

chmod +x restart-longhorn.sh

Removing a failed/no longer available control plane node from the etcd cluster

If a control plane node fails to the point where you can no longer connect to it, this is how to remove it, including its etcd membership, from your cluster.

The first step is to delete the node itself. For these examples, we'll assume the failed node is called kubernetes-cp-gone. Run the following command from a working control plane node:

kubectl delete node kubernetes-cp-gone

Next, we need to tidy up etcd. Firstly, we'll tidy up the Kubernetes-level configuration by running the following command:

kubectl -n kube-system edit cm kubeadm-config

Once in the editor, delete the three lines underneath apiEndpoints that correspond to the deleted server (press the Insert key to switch to insert mode). Once done, save your changes by pressing Escape, then typing :wq followed by Enter.
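For reference, the block being deleted looks something like this (the node name and address below are examples):

apiEndpoints:
  kubernetes-cp-gone:
    advertiseAddress: 10.10.4.21
    bindPort: 6443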

Next, you need to get the name of a working and available etcd pod. You can do this by typing the following:

kubectl get pods -n kube-system | grep etcd-

Next, enter the following command, replacing etcd-pod with the name of one of your working etcd pods:

kubectl exec -n kube-system etcd-pod -it -- etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key member list -w table

Take note of the ID of the failed member and then run the following command, again replacing etcd-pod with the name of your working etcd pod and failednodeid with the ID you noted:

kubectl exec -n kube-system etcd-pod -it -- etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key member remove failednodeid
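To confirm the member has been removed, run the member list command again and check the failed node no longer appears in the table:

kubectl exec -n kube-system etcd-pod -it -- etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key member list -w table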

Scripts for creating a kubeconfig for a new user in Kubernetes

This script creates a kubeconfig file for a user and then gives that user permission to read pod data, as well as exec into pods, in a particular namespace.

Once you've created the file, copy it to the machine you wish to connect to the cluster from, place it in a folder called .kube, and rename it to config (no extension).
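For example, for a user called john-smith (the script below writes the kubeconfig out as user-john-smith.config), you would run the following on the target machine:

mkdir -p ~/.kube
cp user-john-smith.config ~/.kube/config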

Create a file using Nano (or your preferred editor) named user-create.sh with the following content:

#!/bin/bash

if [ "$2" != "" ]; then
company=$2
else
company=your-company-name
fi

if [ "$3" != "" ]; then
namespace=$3
else
namespace=default
fi

openssl genrsa -out user-$1.key 2048
openssl req -new -key user-$1.key -out user-$1.csr -subj "/CN=$1/O=$company"
openssl x509 -req -in user-$1.csr -CA /etc/kubernetes/pki/ca.crt -CAkey /etc/kubernetes/pki/ca.key -CAcreateserial -out user-$1.crt -days 500

rm user-$1.csr

cacrt=$(cat /etc/kubernetes/pki/ca.crt | base64 | tr -d '\n')
crt=$(cat user-$1.crt | base64 | tr -d '\n')
key=$(cat user-$1.key | base64 | tr -d '\n')

cat <<EOM >user-$1.config
apiVersion: v1
kind: Config
users:
- name: $1
  user:
    client-certificate-data: $crt
    client-key-data: $key
clusters:
- cluster:
    certificate-authority-data: $cacrt
    server: https://10.10.4.20:6443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    namespace: $namespace
    user: $1
  name: $1-context@kubernetes
current-context: $1-context@kubernetes
EOM

if [ "$4" != "" ]; then
kubectl config set-credentials $1 --client-certificate=user-$1.crt  --client-key=user-$1.key
kubectl config set-context $1-context --cluster=kubernetes --namespace=$namespace --user=$1
else
rm user-$1.crt
rm user-$1.key
fi

cat <<EOM >user-$1.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: $namespace
  name: pod-reader
rules:
- apiGroups: [""] # "" indicates the core API group
  resources: ["pods","pods/log","services","configmaps"]
  verbs: ["get", "watch", "list"]
- apiGroups: [""]
  resources: ["pods/exec"]
  verbs: ["create"]
- apiGroups: ["networking.k8s.io"]
  resources: ["ingresses"]
  verbs: ["get", "watch", "list"]
---

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-reader-$1
  namespace: $namespace
subjects:
- kind: User
  name: $1
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
EOM

kubectl apply -f user-$1.yaml

rm user-$1.yaml

Save it and then you can pass between one and four parameters to configure the user created:

1: user-name (should be in lowercase and parts separated by hyphens e.g. john-smith)

2: company-name (any text name you like for your company – update line 6 to set the default)

3: namespace (Kubernetes namespace to grant access to – update line 12 to set the default you want to use)

4: any-value (if a value is present, a context is created in the kubeconfig of the machine the script is executed on, so you can use the following command to get a list of all pods: kubectl --context=john-smith-context get pods)

Update the role definition to determine what the user can access.

To create the user with the default company name and namespace (which must exist in the cluster), run the following:

sudo bash user-create.sh john-smith

To delete a user, create a new file called delete-user.sh and add the following content:

#!/bin/bash

if [ "$2" != "" ]; then
namespace=$2
else
namespace=default
fi

kubectl config delete-context $1-context
kubectl config delete-user $1

kubectl delete rolebinding -n $namespace pod-reader-$1

You then execute this, assuming you want the user deleted from the default namespace, with the following command:

bash delete-user.sh john-smith

Using NGINX Ingress Controller with MetalLB

This document describes getting the NGINX Ingress Controller (the community version rather than the one from NGINX Inc) to work easily with MetalLB.

For the purposes of this guide, it's assumed that MetalLB is already set up. In the examples, we'll assume it's using an IP range of 192.168.10.x.
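If your MetalLB is older than version 0.13, it's configured via a ConfigMap; a minimal layer 2 pool covering that range would look something like this (newer releases use IPAddressPool and L2Advertisement custom resources instead):

apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.10.1-192.168.10.254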

Firstly, we'll download the manifest so we can make a change:

curl -o ingress-nginx.yaml https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v0.43.0/deploy/static/provider/baremetal/deploy.yaml

Once downloaded, we need to edit the file:

nano ingress-nginx.yaml

We now need to find the service that handles the ingress traffic, so press Ctrl+W and search for:

name: ingress-nginx-controller

Next, replace the line type: NodePort with the following:

  type: LoadBalancer
  loadBalancerIP: 192.168.10.254

You can use any IP within the MetalLB range, but it's recommended you specify one explicitly to prevent it changing in the future.

Another change you can make is to update the deployment so it's a DaemonSet rather than a standard Deployment. To do this, replace the line:

kind: Deployment

With:

kind: DaemonSet

When you're finished, save the document using Ctrl+S and then quit using Ctrl+X.

Finally, we'll install everything by running the following command:

kubectl apply -f ingress-nginx.yaml

Once an ingress is configured, the ingress controller will be accessible on http(s)://192.168.10.254/.
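As a quick test, a minimal ingress for a hypothetical service called my-service listening on port 80 would look something like the following on Kubernetes 1.19 or later (point the host's DNS at 192.168.10.254):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-service
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
  - host: app.yourdomain.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-service
            port:
              number: 80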

Seq on Kubernetes

This guide will tell you how to set up an HTTPS-protected Seq instance. It uses Let's Encrypt with cert-manager to obtain an SSL certificate, so it assumes your Seq instance is public facing. It also uses Longhorn for storage.

First, create a ClusterIssuer:

mkdir seq
cd seq
nano certmanager.yaml

Then insert the following:

apiVersion: cert-manager.io/v1alpha3
kind: ClusterIssuer
metadata:
  name: letsencrypt-seq
spec:
  acme:
    # The ACME server URL
    server: https://acme-v02.api.letsencrypt.org/directory
    # Email address used for ACME registration
    email: your.email@yourdomain.com
    # Name of a secret used to store the ACME account private key
    privateKeySecretRef:
      name: letsencrypt-seq
    # Enable the HTTP-01 challenge provider
    solvers:
    - http01:
        ingress:
          class: nginx

Don't forget to update the email address, then press Ctrl+S and Ctrl+X to save and exit. Note that on current cert-manager releases the apiVersion should be cert-manager.io/v1; the v1alpha3 version shown only works on older releases.

Once that's done, we'll create another file called config.yaml. Populate it as follows, replacing seq.yourdomain.com with your own domain in the three places it appears:

# config.yaml
ingress:
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: "letsencrypt-seq"
  tls:
    - hosts:
        - seq.yourdomain.com
      secretName: letsencrypt-seq

baseURI: https://seq.yourdomain.com

ui:
  ingress:
    enabled: true
    path: /
    hosts:
      - seq.yourdomain.com

persistence:
  storageClass: longhorn

resources:
  limits:
    memory: 4Gi

Once saved, we'll install the instance using Helm. If you haven't already added the Datalust chart repository, add it first:
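helm repo add datalust https://helm.datalust.co
helm repo update

Then install the chart: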

helm install -f config.yaml seq datalust/seq

After about a minute, you should be able to access Seq. I highly recommend turning on user authentication immediately.