# Setting up Kubernetes Addons

{% hint style="info" %}
See the GitHub repo for this post in [kengz/k0s-cluster](https://github.com/kengz/k0s-cluster).
{% endhint %}

After creating a Kubernetes cluster, we typically want to add a set of standard cluster addons for DevOps using [Helm](https://helm.sh):

* [cert-manager](https://cert-manager.io/docs/installation/helm/): certificate management
* [cluster-autoscaler](https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler): dynamically scales the cluster by adding or removing nodes
* [metrics-server](https://github.com/kubernetes-sigs/metrics-server/tree/master/charts/metrics-server): for monitoring, and for HPA (HorizontalPodAutoscaler) to work
* [kubernetes-dashboard](https://github.com/kubernetes/dashboard#access): basic cluster monitoring (if [Lens](https://k8slens.dev/) is not available)
* [Loki (scalable)](https://github.com/grafana/loki/tree/main/production/helm/loki): to aggregate and index all logs in the cluster, with retention policy; the logs are searchable in Grafana. Additionally:
  * [promtail](https://grafana.com/docs/loki/latest/clients/promtail/#:~:text=Promtail%20is%20an%20agent%20which,Attaches%20labels%20to%20log%20streams) to aggregate logs
  * Note: [Elasticsearch charts](https://github.com/elastic/helm-charts) (hence ELK) have been deprecated in favor of their licensed ECK; plus Loki is much easier to run and maintain
* [kube-prometheus-stack](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack): for cluster monitoring with many useful preconfigured cluster Prometheus metrics in Grafana dashboards. Additionally:
  * [prometheus-adapter](https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus-adapter) for custom metrics API, e.g. for HPA to scale using custom-defined metrics.
  * [prometheus-pushgateway](https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus-pushgateway) to push application metrics
  * [prometheus-blackbox-exporter](https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus-blackbox-exporter) to probe endpoints for uptime monitoring

Additionally, install [Lens](https://k8slens.dev/) for GUI monitoring and access to the cluster. You will need to get a free license to use it.

## Installations

### cert-manager

[Helm chart here](https://cert-manager.io/docs/installation/helm/).

```bash
# cert manager
helm repo add jetstack https://charts.jetstack.io
helm upgrade -i cert-manager jetstack/cert-manager -n cert-manager --create-namespace --version 'v1.12.1' --set installCRDs=true
```
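
Before moving on, it can help to confirm that the cert-manager deployments actually became available. Below is a minimal sketch, assuming `kubectl` points at the new cluster; the helper name `wait_for_deployments` and the 120s default timeout are illustrative and not part of cert-manager:

```bash
# Hypothetical helper: block until every Deployment in a namespace is Available.
# Usage: wait_for_deployments <namespace> [timeout]
wait_for_deployments() {
  local namespace="$1"
  local timeout="${2:-120s}"
  kubectl wait --for=condition=Available deployment --all \
    -n "$namespace" --timeout="$timeout"
}
```

Calling `wait_for_deployments cert-manager` after the Helm command returns non-zero if any deployment is still unavailable when the timeout expires.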

### cluster-autoscaler

[Helm chart here](https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler). This component has some [custom settings](https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler) depending on which cloud provider is used, but the gist is:

```bash
helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm upgrade -i cluster-autoscaler autoscaler/cluster-autoscaler -n kube-system
```
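
As an illustration of those provider-specific settings, here is a sketch for AWS with node-group auto-discovery. It assumes the chart values `cloudProvider`, `awsRegion`, and `autoDiscovery.clusterName` as documented in the chart's README; the function name and the example cluster name and region are placeholders, and other providers need different values:

```bash
# Hypothetical wrapper around the install command above, for AWS only.
# Usage: install_cluster_autoscaler_aws <cluster-name> <aws-region>
install_cluster_autoscaler_aws() {
  local cluster_name="$1"
  local aws_region="$2"
  helm upgrade -i cluster-autoscaler autoscaler/cluster-autoscaler \
    -n kube-system \
    --set cloudProvider=aws \
    --set awsRegion="$aws_region" \
    --set autoDiscovery.clusterName="$cluster_name"
}
```

For example, `install_cluster_autoscaler_aws my-cluster us-west-2`. The autoscaler also needs cloud-side permissions and suitably tagged node groups, which the chart itself does not set up.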

### metrics-server

[Helm chart here](https://github.com/kubernetes-sigs/metrics-server/tree/master/charts/metrics-server). Some Kubernetes providers have this preinstalled, but the gist is:

```bash
helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/
helm upgrade -i metrics-server metrics-server/metrics-server -n kube-system --version '3.10.0'
```
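
Since some providers preinstall metrics-server, one way to avoid a duplicate install is to check whether the resource metrics API is already registered. The sketch below assumes the APIService is named `v1beta1.metrics.k8s.io` (the name metrics-server registers); both function names are hypothetical:

```bash
# Hypothetical helper: succeed if the resource metrics API is already registered.
metrics_api_registered() {
  kubectl get apiservice v1beta1.metrics.k8s.io >/dev/null 2>&1
}

# Install metrics-server only when the API is missing.
ensure_metrics_server() {
  if metrics_api_registered; then
    echo "metrics API already registered, skipping install"
  else
    helm upgrade -i metrics-server metrics-server/metrics-server \
      -n kube-system --version '3.10.0'
  fi
}
```

Once metrics-server has completed its first scrape, `kubectl top nodes` should report CPU and memory usage.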

### kubernetes-dashboard

[Manifest here](https://github.com/kubernetes/dashboard#access); this installs more reliably than its Helm chart. First, prepare a manifest to create a service account for dashboard access:

{% code title="./cluster/dashboard-admin-user.yaml" %}

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: admin-user
    namespace: kubernetes-dashboard
```

{% endcode %}

Then run:

```bash
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml
kubectl apply -f ./cluster/dashboard-admin-user.yaml
```

> Preferably, install [Lens](https://k8slens.dev/) for GUI monitoring and access to the cluster (it connects using kubeconfig). You will need to get a free license to use it.

### Loki

[Helm chart here](https://github.com/grafana/loki/tree/main/production/helm/loki). [Loki](https://github.com/grafana/loki) is a Grafana project for log aggregation: it is scalable (to TBs), cheap, and simple to maintain. The logs will show up in Grafana dashboards, which is a must-have for Kubernetes clusters.

{% hint style="info" %}
Elasticsearch has [archived their Helm charts](https://github.com/elastic/helm-charts) and [moved to a licensed model (ECK)](https://github.com/elastic/helm-charts/issues/1731), so I can no longer recommend it. It is also quite bloated and fragile for log aggregation alone.
{% endhint %}

First, prepare the Helm override file to:

* [configure storage](https://github.com/grafana/loki/blob/main/production/helm/loki/values.yaml#L253) for [scalable mode](https://grafana.com/docs/loki/latest/fundamentals/architecture/deployment-modes/#simple-scalable-deployment-mode) (S3, GCP, Azure, etc., or use MinIO to try it out first)
* [configure retention](https://grafana.com/docs/loki/latest/operations/storage/retention/)

{% code title="./cluster/loki-values.yaml" %}

```yaml
# for loki scalable mode https://grafana.com/docs/loki/latest/fundamentals/architecture/deployment-modes/#simple-scalable-deployment-mode
# need to configure storage https://github.com/grafana/loki/blob/main/production/helm/loki/values.yaml#L253
# or try with minio first
minio:
  enabled: true
loki:
  storage:
    type: s3
    s3:
      s3: null
      endpoint: null
      region: null
      secretAccessKey: null
      accessKeyId: null
      s3ForcePathStyle: false
      insecure: false
  # configure retention https://grafana.com/docs/loki/latest/operations/storage/retention/
  # fields: https://grafana.com/docs/loki/latest/configuration/
  compactor:
    shared_store: filesystem
    retention_enabled: true
  limits_config:
    retention_period: 744h
  auth_enabled: false
```

{% endcode %}

We'll also install [promtail](https://grafana.com/docs/loki/latest/clients/promtail/#:~:text=Promtail%20is%20an%20agent%20which,Attaches%20labels%20to%20log%20streams) as the agent that ships logs from the nodes to Loki.

Then run:

```bash
helm repo add grafana https://grafana.github.io/helm-charts
helm upgrade -i loki grafana/loki -n logging --create-namespace --version '5.6.4' -f ./cluster/loki-values.yaml
helm upgrade -i promtail grafana/promtail -n logging --version '6.11.3'
```

{% hint style="info" %}
Grafana is installed later as part of kube-prometheus-stack; we will add Loki to it as a data source for log search.
{% endhint %}
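
Before Grafana is available, Loki can be smoke-tested directly through its HTTP push API. The sketch below only builds the JSON payload; the function name is hypothetical, and it assumes the job label and message contain no quotes or backslashes, since no JSON escaping is done:

```bash
# Hypothetical helper: print a Loki push-API payload with one log line.
# Usage: loki_push_payload <job-label> <message> [unix-seconds]
loki_push_payload() {
  local job="$1"
  local message="$2"
  local seconds="${3:-$(date +%s)}"
  # Loki expects timestamps as nanoseconds since the epoch, encoded as strings.
  printf '{"streams":[{"stream":{"job":"%s"},"values":[["%s000000000","%s"]]}]}\n' \
    "$job" "$seconds" "$message"
}

# Example (not run here): with
#   kubectl port-forward -n logging svc/loki-gateway 3100:80
# active, push one line and read it back:
#   loki_push_payload smoke-test "hello" | \
#     curl -s -H 'Content-Type: application/json' --data-binary @- \
#       http://localhost:3100/loki/api/v1/push
#   curl -s -G http://localhost:3100/loki/api/v1/query_range \
#     --data-urlencode 'query={job="smoke-test"}'
```

The `loki-gateway` service name matches the data source URL used for Grafana in the next section; no tenant header is needed because `auth_enabled` is `false` in the override file.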

### kube-prometheus-stack

[Helm chart here](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack). This includes kube-state-metrics, node-exporter, and Grafana: they gather Kubernetes metrics from all the cluster nodes, and preconfigure many useful cluster metric dashboards.

Prepare the Helm override file to change the default admin password and add Loki as a data source for log search in Grafana:

{% code title="./cluster/prometheus-values.yaml" %}

```yaml
grafana:
  adminPassword: prom-operator
  # configure anonymous view-access
  grafana.ini:
    auth.anonymous:
      enabled: true
      org_name: Main Org.
      org_role: Viewer
    # auth:
    #   disable_login_form: true

  persistence:
    enabled: true

  ## Configure additional grafana datasources (passed through tpl)
  ## ref: http://docs.grafana.org/administration/provisioning/#datasources
  additionalDataSources:
    - name: Loki
      type: loki
      access: proxy
      url: http://loki-gateway.logging.svc.cluster.local
      version: 1
      isDefault: false
```

{% endcode %}
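
Since `prom-operator` is a widely known value, one option is to generate a password at install time and pass it with `--set`, which takes precedence over the values file. A minimal sketch; the helper name is hypothetical, and it is only intended for short lengths (a few dozen characters):

```bash
# Hypothetical helper: print a random alphanumeric password of the given length.
# Usage: random_password [length]   (default 24; intended for short lengths)
random_password() {
  local length="${1:-24}"
  local raw
  # Read a fixed chunk of random bytes and keep only letters and digits.
  raw="$(head -c 1024 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9')"
  printf '%s\n' "$raw" | cut -c "1-$length"
}

# Example (not run here):
#   helm upgrade -i prometheus prometheus-community/kube-prometheus-stack \
#     -n monitoring --create-namespace --version '46.8.0' \
#     -f ./cluster/prometheus-values.yaml \
#     --set grafana.adminPassword="$(random_password)"
```

Store the generated value somewhere safe, since it is not written back to the override file.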

We will also install 3 additional Helm charts. The first is [prometheus-adapter](https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus-adapter) for custom metrics API, e.g. for HPA to scale using custom-defined metrics.

The second is [Prometheus Pushgateway](https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus-pushgateway), e.g. to push application metrics. Specify a Helm override file to configure a ServiceMonitor with a matching label so that it is scraped by kube-prometheus-stack:

{% code title="./cluster/pushgateway-values.yaml" %}

```yaml
serviceMonitor:
  enabled: true
  additionalLabels:
    release: prometheus
```

{% endcode %}
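
Once the Pushgateway is installed (the install commands come at the end of this section), an application or batch job pushes metrics with a plain HTTP request to `/metrics/job/<job>`. The sketch below builds that URL; the function name is hypothetical, and the service name in the comment is an assumption derived from the release name `prom-pushgateway`, so confirm it with `kubectl get svc -n monitoring`:

```bash
# Hypothetical helper: build the Pushgateway URL for a job (and optional instance).
# Usage: pushgateway_url <base-url> <job> [instance]
pushgateway_url() {
  local base="${1%/}"   # drop a trailing slash, if any
  local job="$2"
  local instance="${3:-}"
  if [ -n "$instance" ]; then
    echo "$base/metrics/job/$job/instance/$instance"
  else
    echo "$base/metrics/job/$job"
  fi
}

# Example (not run here): with
#   kubectl port-forward -n monitoring svc/prom-pushgateway-prometheus-pushgateway 9091:9091
# active, push one sample in the Prometheus text format:
#   echo "demo_batch_duration_seconds 12.5" | \
#     curl -s --data-binary @- "$(pushgateway_url http://localhost:9091 demo_batch)"
```

The metric name `demo_batch_duration_seconds` and the job `demo_batch` are made up for illustration.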

The third is [Blackbox Exporter](https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus-blackbox-exporter) to probe endpoints for uptime monitoring. Specify a Helm override file to configure the targets and a ServiceMonitor with a matching label so that it is scraped by kube-prometheus-stack:

{% code title="./cluster/blackbox-values.yaml" %}

```yaml
config:
  modules:
    http_2xx:
      prober: http
      timeout: 5s
      http:
        valid_http_versions: ["HTTP/1.1", "HTTP/2.0"]
        follow_redirects: true
        preferred_ip_protocol: "ip4"
        valid_status_codes:
          - 200

serviceMonitor:
  enabled: true
  defaults:
    labels:
      # match kube-prometheus-stack scrape config
      release: prometheus
    interval: 30s
    scrapeTimeout: 30s
    module: http_2xx
  scheme: http

  targets: # lowercase only
    - name: github
      url: http://github.com/status
    - name: gitlab
      url: https://status.gitlab.com
```

{% endcode %}
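
After installation, a target can also be probed by hand through the exporter's `/probe` endpoint, which is useful for debugging a module before relying on the dashboards. The sketch below checks the `probe_success` metric in the response; the function name is hypothetical, and the service name in the comment is an assumption derived from the release name `blackbox`, so confirm it with `kubectl get svc -n monitoring`:

```bash
# Hypothetical helper: read probe metrics on stdin, succeed if the probe passed.
probe_succeeded() {
  grep -q '^probe_success 1$'
}

# Example (not run here): with
#   kubectl port-forward -n monitoring svc/blackbox-prometheus-blackbox-exporter 9115:9115
# active, probe one target through the http_2xx module:
#   curl -s -G http://localhost:9115/probe \
#     --data-urlencode 'target=https://status.gitlab.com' \
#     --data-urlencode 'module=http_2xx' | probe_succeeded && echo "probe ok"
```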

Now, install all of these by running:

```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm upgrade -i prometheus prometheus-community/kube-prometheus-stack -n monitoring --create-namespace --version '46.8.0' -f ./cluster/prometheus-values.yaml
# adapter for k8s HPA custom metrics
helm upgrade -i prom-adapter prometheus-community/prometheus-adapter -n monitoring --version '4.2.0'
# pushgateway for app metrics
helm upgrade -i prom-pushgateway prometheus-community/prometheus-pushgateway -n monitoring --version '2.2.0' -f ./cluster/pushgateway-values.yaml
# blackbox exporter for uptime monitoring
helm upgrade -i blackbox prometheus-community/prometheus-blackbox-exporter -n monitoring --version '7.10.0' -f ./cluster/blackbox-values.yaml
```
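
Because the extra charts are only scraped when their ServiceMonitors carry the matching `release` label, a quick check is to list the ServiceMonitors with that label. A minimal sketch, assuming the namespace `monitoring` and release name `prometheus` used in the commands above; the function name is hypothetical:

```bash
# Hypothetical helper: list ServiceMonitors carrying the release label that
# the kube-prometheus-stack Prometheus selects on.
# Usage: list_matching_servicemonitors [namespace] [release-label]
list_matching_servicemonitors() {
  local namespace="${1:-monitoring}"
  local release="${2:-prometheus}"
  kubectl get servicemonitors -n "$namespace" -l "release=$release"
}
```

If the Pushgateway or Blackbox Exporter entries are missing from the output, their `release` label likely does not match the name of the kube-prometheus-stack Helm release.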

> Grafana Dashboards can also be provisioned via ConfigMaps. See the repo linked above for more examples as the ConfigMap file is large.

## Accessing Dashboards

After installing the addons, access all the dashboards as follows:

* [Lens](https://k8slens.dev)
  * just open the app; it will use `~/.kube/config` to connect

<figure><img src="/files/e5pzQA6s3IPZc9fabZ3O" alt=""><figcaption><p>Lens dashboard</p></figcaption></figure>

* [Kubernetes Dashboard](https://github.com/kubernetes/dashboard#access)
  * get token: `kubectl -n kubernetes-dashboard create token admin-user`
  * run `kubectl proxy` and visit <http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/>

<figure><img src="/files/OsEBTrppSjwbmgjkwHN1" alt=""><figcaption><p>Kubernetes dashboard</p></figcaption></figure>

* [Grafana](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack) for cluster and log monitoring
  * data sources include kube-state-metrics, node-exporter, prometheus, and custom-added loki for logs
  * run `kubectl port-forward -n monitoring svc/prometheus-grafana 6060:80` and visit <http://localhost:6060> to find the preconfigured dashboards
  * (one-time) [import](https://grafana.com/docs/grafana/latest/dashboards/manage-dashboards/#import-a-dashboard) the [Loki Kubernetes Logs](https://grafana.com/grafana/dashboards/15141-kubernetes-service-logs/) and [Blackbox exporter](https://grafana.com/grafana/dashboards/7587-prometheus-blackbox-exporter/) dashboards

<figure><img src="/files/gcPzl8k0GxjDrGtd5lRd" alt=""><figcaption><p>One of the many Kubernetes cluster metrics dashboards</p></figcaption></figure>

<figure><img src="/files/H6PkA9ChJX5ByDlQXbyH" alt=""><figcaption><p>Loki Kubernetes Logs dashboard</p></figcaption></figure>

<figure><img src="/files/WNZTghlMWSeGbGRuUSHR" alt=""><figcaption><p>Grafana</p></figcaption></figure>

* [Prometheus](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack) for cluster monitoring
  * run `kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090` and visit <http://localhost:9090>

<figure><img src="/files/CXvavxS0zJcmJHh0iQM6" alt=""><figcaption><p>Prometheus targets</p></figcaption></figure>



