chore: Install CRDs before infrastructure

Tony Du 2025-02-10 13:52:48 -08:00
parent e13ef4bb60
commit 9974fecf31
66 changed files with 30272 additions and 111 deletions

122
README.md
View File

@ -10,7 +10,16 @@ Install dependencies (Arch):
pacman -Sy opentofu kubectl helm helmfile python fluxcd
```
### Promxox
Set up Ansible:
```sh
# Tested on Python 3.13.1
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
ansible-galaxy collection install -r proxmox/ansible/collections/requirements.yml
```
### Proxmox
We first need to configure a Proxmox user for terraform to act on behalf of and
a token for the user.
@ -40,64 +49,91 @@ proxmox_api_token = "terraform@pve!provider=<token from last step>"
```
Customize the other variables in `proxmox/tf/vars.auto.tfvars` and double check
the configuration.
the configuration, running `tofu plan` to get a sense of what will be created.
When ready, run `opentofu apply`. The command might fail the first time if
provisioning from scratch, but it seems to be fine when running it a second
time.
When ready, run `tofu apply`. The command occasionally fails with a broken pipe
error. If it does, run it again.
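Because a second run usually succeeds, the rerun can be scripted. The following is a minimal sketch and not part of this repository: `retry` and `RETRY_DELAY` are hypothetical names introduced here for illustration.

```sh
# Hypothetical helper: run a command up to N times until it succeeds.
retry() {
  attempts=$1   # maximum number of attempts
  shift         # the remaining arguments are the command to run
  n=1
  until "$@"; do
    # Give up once the attempt budget is spent.
    [ "$n" -ge "$attempts" ] && return 1
    n=$((n + 1))
    sleep "${RETRY_DELAY:-5}"   # pause between attempts, in seconds
  done
}
```

For example, `retry 2 tofu apply` would rerun the apply once if the first attempt fails. It retries on any failure, not only broken pipes, so read the error output before relying on it.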
### Creating a Docker swarm
After provisioning with Terraform, make sure the SSH keys are updated:
```sh
ansible all --list-hosts -i inventory/full |\
tail -n+2 |\
awk '{ print $1 }' |\
while read -r line; do
ssh-keygen -R "$line"
ssh-keyscan -H "$line" >> ~/.ssh/known_hosts
done
```
## Creating the DNS server
We currently create a [Technitium server](https://technitium.com/dns/) to allow
for service discovery outside of the homelab networks (e.g. on my PC). Services
within the homelab also depend on it, so **this step cannot be skipped**.
Run
```sh
ansible-playbook -i inventory/full dns.yml
```
To verify it's working correctly, log in to [Technitium](http://10.0.123.123:5380/),
create a DNS zone, add a record, point `/etc/resolv.conf` at the server, and
query the record with `dig`.
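The `dig` check can be wrapped in a small helper. This is a sketch under assumptions: `check_record` is a hypothetical function name, and it assumes the DNS service listens on the same address as the web console (`10.0.123.123`). It queries that server directly, so it also works before `/etc/resolv.conf` has been edited.

```sh
# Hypothetical helper: ask a specific DNS server for an A record.
check_record() {
  name=$1                      # record to look up
  server=${2:-10.0.123.123}    # assumed DNS server address; override with $2
  dig +short "@$server" "$name" A
}
```

Calling `check_record` with the record you created should print the address you configured. Empty output suggests the zone or record is not being served.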
**TODO**: Create a declarative DNS configuration to keep track of services for
repeatability.
## Creating a Docker hut
A _hut_ is the name I'm giving to a standalone virtual machine, as opposed to
a group or cluster of virtual machines.
The hut we're creating, _jumper_, acts to jumpstart the rest of our
infrastructure as well as run Docker workloads that are otherwise annoying
to run on Swarm, particularly those where relatively fast disk access is
required, making a network mount unreasonable. Most importantly, it provides
a non-essential Git server for the rest of the homelab.
Run the `jumper.yml` playbook:
```sh
ansible-playbook -i inventory/full jumper.yml
```
Create a context for the jumper host:
```sh
# Use IP address or add a DNS entry. Don't use mDNS, as that doesn't work.
docker context create jumper --docker "host=ssh://tony@jumper.mnke.org"
```
Deploy some compose stacks:
```sh
(cd docker/compose/traefik && docker compose up -d)
(cd docker/compose/portainer && docker compose up -d)
```
Preferably, also deploy the `gitea` compose file to allow for GitOps later.
## Creating a Docker swarm
The Docker swarm acts as a launchpad for the rest of the infrastructure. It
bootstraps a Portainer, Traefik, and Gitea deployment so that remaining
configuration can be done through Portainer and Git.
Run the playbook:
```sh
# Add SSH keys to known_hosts
ansible-inventory -i inventory/dolo --list |\
jq -r '._meta.hostvars | keys[]' |\
grep 'stingray' |\
while read -r line; do
ssh-keygen -R "$line"
ssh-keyscan -H "$line" >> ~/.ssh/known_hosts
done
# Initialize swarm
ansible-playbook -i inventory/stingray swarm.yml
```
Traefik will be listening on hosts:
- git.mnke.org
- git.stingray.mnke.org
- portainer.stingray.mnke.org
Set DNS records or edit your hosts file to point those domains to a swarm node.
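If you take the hosts-file route, the entries can be generated rather than typed by hand. A minimal sketch, assuming a hypothetical helper name `swarm_hosts_entries` and a swarm node IP that you supply yourself:

```sh
# Hypothetical helper: print one hosts-file line per Traefik hostname.
swarm_hosts_entries() {
  node_ip=$1   # IP address of any swarm node
  for host in git.mnke.org git.stingray.mnke.org portainer.stingray.mnke.org; do
    printf '%s %s\n' "$node_ip" "$host"
  done
}
```

Piping the output through `sudo tee -a /etc/hosts` would append the entries. Review the output first, since the function does not check for existing lines.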
### Creating a k3s cluster
Set up Ansible:
```sh
# Tested on Python 3.13.1
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
ansible-galaxy collection install -r proxmox/ansible/collections/requirements.yml
```
## Creating a k3s cluster
Set up the k3s cluster:
```sh
# Necessary because the hosts.yml file contains a relative path to the terraform
# project directory
cd proxmox/ansible
# Remove/scan keys
ansible-inventory -i inventory/dolo --list |\
jq -r '._meta.hostvars | keys[]' |\
while read -r line; do
ssh-keygen -R "$line"
ssh-keyscan -H "$line" >> ~/.ssh/known_hosts
done
ansible-playbook lvm.yml site.yml -i inventory/dolo
# You should be left with a kubeconfig. Move it to ~/.kube/config. If you
# already have a ~/.kube/config file, make sure to back it up first.
@ -118,9 +154,6 @@ curl http://[allocated-ip]
kubectl delete -f proxmox/k8s/examples/001-example.yml
```
Lastly, run `kubectl apply -f k8s/pre-infrastructure/dns-config-map.yaml` so
pods in the cluster automatically pick up on the DNS server.
### Set up GitOps
Prerequisites:
@ -129,7 +162,8 @@ Prerequisites:
Follow [the Infisical guide to get a client id and secret](https://infisical.com/docs/documentation/platform/identities/universal-auth).
Use it to apply [a manifest](https://external-secrets.io/latest/provider/infisical/)
**in the `external-secrets` namespace**.
**in the `external-secrets` namespace**. See `k8s/pre-infrastructure/universal-auth-credentials.yaml`
for reference.
Create a Gitea token with at least the privileges listed in [this guide](https://fluxcd.io/flux/installation/bootstrap/gitea/).

View File

@ -2,7 +2,7 @@
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: infrastructure-01
name: crds
namespace: flux-system
spec:
interval: 1h
@ -11,53 +11,25 @@ spec:
sourceRef:
kind: GitRepository
name: flux-system
path: ./k8s/infrastructure/01
path: ./k8s/infrastructure/crds
wait: true
prune: true
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: infrastructure-02
namespace: flux-system
spec:
interval: 1h
retryInterval: 30s
timeout: 5m
sourceRef:
kind: GitRepository
name: flux-system
path: ./k8s/infrastructure/02
wait: true
prune: true
dependsOn:
- name: infrastructure-01
---
# What I want is one single unit that the rest of my applications relying on
# general infrastructure stuff can use `dependsOn` for by creating a single
# logical unit around the infrastructure kustomizations. I'm not sure how
# to do this other than creating a dummy Kustomization that doesn't actually
# apply anything meaningful, but just depends on everything else on this file.
# Maybe [components](https://fluxcd.io/flux/components/kustomize/kustomizations/#components)
# would help with this, but I'm not sure how it works and there's currently a
# warning that this feature is experimental and might change soon.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
name: infrastructure
namespace: flux-system
spec:
interval: 1h
retryInterval: 10s
retryInterval: 30s
timeout: 5m
sourceRef:
kind: GitRepository
name: flux-system
path: ./k8s/infrastructure/dummy
path: ./k8s/infrastructure
wait: true
prune: false
dependsOn:
- name: infrastructure-01
- name: infrastructure-02
- name: crds

View File

@ -1,14 +0,0 @@
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: wildcard-mnke-org
namespace: traefik
spec:
secretName: wildcard-mnke-org-tls
dnsNames:
- "*.mnke.org"
- "*.dolo.mnke.org"
issuerRef:
name: le-cf-issuer
kind: ClusterIssuer

View File

@ -1,4 +1,17 @@
# k8s Infrastructure
Installing all of these manifests will get Cert Manager, an External Secrets
operator, and Traefik all set up.
These manifests set up:
- External Secrets: Allows us to pull secrets from a secrets provider
- Prometheus Stack: Cluster monitoring
- Loki + Promtail: Log aggregation, with Promtail shipping logs to Loki
- Longhorn and NFS: Storage providers
- cert-manager: Certificate provider
- Traefik: Ingress controller
## Notes
We must install the CRDs _before_ the controllers and the configs. We do this
with a separate Flux Kustomization (`crds`) that applies the CRDs first. The
`infrastructure` Kustomization depends on it via `dependsOn` and applies the
Kustomize overlay in this directory, which skips the CRDs.
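To check that this ordering holds on a live cluster, you can wait on each Flux Kustomization in turn. This is a sketch, assuming the Kustomizations are named `crds` and `infrastructure` and live in the `flux-system` namespace. `wait_for_kustomization` is a hypothetical helper name, and the 5-minute timeout is an assumption chosen to match the Kustomization `timeout`.

```sh
# Hypothetical helper: block until a Flux Kustomization reports Ready.
wait_for_kustomization() {
  kubectl wait "kustomization/$1" -n flux-system \
    --for=condition=Ready --timeout=5m
}
```

Running `wait_for_kustomization crds && wait_for_kustomization infrastructure` should return quickly once both have reconciled. If the first call times out, the CRD charts are the place to look.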

View File

@ -2,3 +2,4 @@ apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- wildcard-mnke-org.yaml
- wildcard-tonydu-me.yaml

View File

@ -0,0 +1,23 @@
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: wildcard-mnke-org
namespace: cert-manager
spec:
secretName: wildcard-mnke-org-tls
secretTemplate:
annotations:
reflector.v1.k8s.emberstack.com/reflection-allowed: "true"
reflector.v1.k8s.emberstack.com/reflection-auto-enabled: "true" # Auto create reflection for matching namespaces
# If we don't specify the allow and auto list, then it'll sync to all
# namespaces.
# Yes, this isn't a great idea. Yes, I'm also too lazy to care.
# reflector.v1.k8s.emberstack.com/reflection-allowed-namespaces: "dev,staging,prod" # Control destination namespaces
# reflector.v1.k8s.emberstack.com/reflection-auto-namespaces: "dev,staging,prod" # Control auto-reflection namespaces
dnsNames:
- "*.mnke.org"
- "*.dolo.mnke.org"
issuerRef:
name: le-cf-issuer
kind: ClusterIssuer

View File

@ -0,0 +1,19 @@
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: wildcard-tonydu-me
namespace: cert-manager
spec:
secretName: wildcard-tonydu-me-tls
secretTemplate:
annotations:
reflector.v1.k8s.emberstack.com/reflection-allowed: "true"
reflector.v1.k8s.emberstack.com/reflection-auto-enabled: "true" # Auto create reflection for matching namespaces
dnsNames:
- "*.tonydu.me"
- "*.local.tonydu.me"
- "*.home.tonydu.me"
issuerRef:
name: le-cf-issuer
kind: ClusterIssuer

View File

@ -48,3 +48,4 @@ spec:
- kind: Kustomization
name: '*'

View File

@ -2,4 +2,5 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- discord.yaml
- discord-alert.yaml
- webhook-ingress.yaml

View File

@ -0,0 +1,24 @@
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: webhook-receiver
namespace: flux-system
annotations:
cert-manager.io/cluster-issuer: le-cf-issuer
kubernetes.io/ingress.class: traefik
spec:
rules:
- host: flux-webhook.dolo.mnke.org
http:
paths:
- pathType: Prefix
path: /
backend:
service:
name: webhook-receiver
port:
number: 80
tls:
- hosts:
- flux-webhook.dolo.mnke.org
secretName: wildcard-mnke-org-tls

View File

@ -1,4 +1,6 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
# TODO: Turn into overlay
resources:
- le-cf-issuer.yaml
- le-cf-issuer-staging.yaml

View File

@ -0,0 +1,23 @@
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: le-cf-issuer-staging
spec:
acme:
server: https://acme-staging-v02.api.letsencrypt.org/directory
email: tonydu121@hotmail.com
privateKeySecretRef:
name: le-cf-issuer-staging-pk
solvers:
- dns01:
cloudflare:
email: tonydu121@hotmail.com
apiTokenSecretRef:
name: cloudflare
key: dns-api-token
selector:
dnsZones:
- mnke.org
- tonydu.me

View File

@ -5,8 +5,7 @@ metadata:
name: le-cf-issuer
spec:
acme:
# server: https://acme-v02.api.letsencrypt.org/directory
server: https://acme-staging-v02.api.letsencrypt.org/directory
server: https://acme-v02.api.letsencrypt.org/directory
email: tonydu121@hotmail.com
privateKeySecretRef:
name: le-cf-issuer-pk
@ -20,3 +19,4 @@ spec:
selector:
dnsZones:
- mnke.org
- tonydu.me

View File

@ -4,5 +4,5 @@ resources:
- secret-stores
- issuers
- secrets
- traefik
- alerts
- flux
- certificates

View File

@ -8,9 +8,12 @@ spec:
interval: 10m
releaseName: cert-manager
targetNamespace: cert-manager
install:
crds: Skip
chart:
spec:
chart: cert-manager
version: v1.17.0
sourceRef:
kind: HelmRepository
name: jetstack

View File

@ -8,8 +8,11 @@ spec:
interval: 10m
releaseName: external-secrets
targetNamespace: external-secrets
install:
crds: Skip
chart:
spec:
version: v0.14.1
chart: external-secrets
sourceRef:
kind: HelmRepository

View File

@ -8,6 +8,8 @@ spec:
interval: 10m
releaseName: kube-prometheus-stack
targetNamespace: monitor
install:
crds: Skip
chart:
spec:
chart: kube-prometheus-stack
@ -28,6 +30,10 @@ spec:
kubernetes.io/ingress.class: traefik
hosts:
- gf.dolo.mnke.org
tls:
- secretName: wildcard-mnke-org-tls
hosts:
- gf.dolo.mnke.org
persistence:
enabled: true
type : sts

View File

@ -10,3 +10,5 @@ resources:
- kube-prometheus-stack
- loki
- promtail
- reflector
- traefik

View File

@ -17,10 +17,11 @@ spec:
namespace: flux-system
interval: 10m
values:
# This is a forward declaration!
ingress:
enabled: true
annotations:
cert-manager.io/cluster-issuer: le-cf-issuer
kubernetes.io/ingress.class: traefik
host: longhorn.dolo.mnke.org
tls: true
tlsSecret: wildcard-mnke-org-tls

View File

@ -5,5 +5,4 @@ resources:
- namespace.yaml
- repository.yaml
- release.yaml
- certificates

View File

@ -0,0 +1,5 @@
---
apiVersion: v1
kind: Namespace
metadata:
name: reflector

View File

@ -0,0 +1,18 @@
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
name: reflector
namespace: flux-system
spec:
interval: 10m
releaseName: reflector
targetNamespace: reflector
chart:
spec:
chart: reflector
sourceRef:
kind: HelmRepository
name: emberstack
namespace: flux-system
interval: 10m

View File

@ -0,0 +1,9 @@
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
name: emberstack
namespace: flux-system
spec:
interval: 1m
url: https://emberstack.github.io/helm-charts

View File

@ -0,0 +1,7 @@
---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- namespace.yaml
- repository.yaml
- release.yaml

View File

@ -8,6 +8,8 @@ spec:
interval: 10m
releaseName: traefik
targetNamespace: traefik
install:
crds: Skip
chart:
spec:
chart: traefik
@ -31,9 +33,11 @@ spec:
access:
enabled: true
format: json
defaultMode: keep
headers:
defaultMode: keep
fields:
general:
defaultmode: keep
headers:
defaultmode: keep
deployment:
enabled: true
@ -98,10 +102,10 @@ spec:
loadBalancerSourceRanges: []
externalIPs: []
tlsStore:
default:
defaultCertificate:
secretName: wildcard-mnke-org-tls
# tlsStore:
# default:
# defaultCertificate:
# secretName: wildcard-mnke-org-tls
# Mostly from https://github.com/traefik/traefik-helm-chart/blob/master/EXAMPLES.md#use-prometheus-operator
metrics:

File diff suppressed because it is too large

File diff suppressed because it is too large

View File

@ -0,0 +1,26 @@
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
name: prometheus-community
namespace: flux-system
spec:
interval: 10m
url: https://prometheus-community.github.io/helm-charts
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
name: prometheus-operator-crds
namespace: flux-system
spec:
interval: 10m
chart:
spec:
chart: prometheus-operator-crds
sourceRef:
kind: HelmRepository
name: prometheus-community
namespace: flux-system
interval: 10m

View File

@ -0,0 +1,31 @@
# Traefik has their own chart for CRDs :D
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
name: traefik
namespace: flux-system
spec:
interval: 1m
url: https://helm.traefik.io/traefik
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
name: traefik-crds
namespace: flux-system
spec:
interval: 10m
chart:
spec:
chart: traefik-crds
sourceRef:
kind: HelmRepository
name: traefik
namespace: flux-system
interval: 10m
values:
traefik: true
gatewayAPI: true
deleteOnUninstall: false

View File

@ -1,3 +0,0 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources: []

View File

@ -0,0 +1,7 @@
---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
# crds are purposely not included here. See README.md for more info
resources:
- configs
- controllers

View File

@ -36,7 +36,7 @@ resource "proxmox_virtual_environment_vm" "jumper" {
# Don't forget to change the cloud init file if this is changed
name = "jumper"
description = "Managed by Terraform"
tags = ["terraform", "ubuntu", "outpost"]
tags = ["terraform", "ubuntu", "hut"]
node_name = "pve"
vm_id = 7001