chore: Install CRDs before infrastructure

Tony Du 2025-02-10 13:52:48 -08:00
parent e13ef4bb60
commit 9974fecf31
66 changed files with 30272 additions and 111 deletions

README.md

@ -10,7 +10,16 @@ Install dependencies (Arch):
pacman -Sy opentofu kubectl helm helmfile python fluxcd pacman -Sy opentofu kubectl helm helmfile python fluxcd
``` ```
### Promxox Set up Ansible:
```sh
# Tested on Python 3.13.1
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
ansible-galaxy collection install -r proxmox/ansible/collections/requirements.yml
```
### Proxmox
We first need to configure a Proxmox user for terraform to act on behalf of and We first need to configure a Proxmox user for terraform to act on behalf of and
a token for the user. a token for the user.
@ -40,64 +49,91 @@ proxmox_api_token = "terraform@pve!provider=<token from last step>"
``` ```
Customize the other variables in `proxmox/tf/vars.auto.tfvars` and double check Customize the other variables in `proxmox/tf/vars.auto.tfvars` and double check
the configuration. the configuration, running `tofu plan` to get a sense of what will be created.
When ready, run `opentofu apply`. The command might fail the first time if When ready, run `tofu apply`. The command might fail from a broken pipe, but
provisioning from scratch, but it seems to be fine when running it a second that just happens occasionally. Run it again if it fails.
time.
### Creating a Docker swarm After provisioning with Terraform, make sure the SSH keys are updated:
```sh
ansible all --list-hosts -i inventory/full |\
tail -n+2 |\
awk '{ print $1 }' |\
while read -r line; do
ssh-keygen -R "$line"
ssh-keyscan -H "$line" >> ~/.ssh/known_hosts
done
```
## Creating the DNS server
We currently create a [Technitium server](https://technitium.com/dns/) to allow
for service discovery outside of the homelab networks (i.e. on my PC). This is
also imperative for services within the homelab, so **this step cannot be
skipped**.
Run
```sh
ansible-playbook -i inventory/full dns.yml
```
Try logging onto [Technitium](http://10.0.123.123:5380/), creating a DNS zone,
adding a record, editing `/etc/resolv.conf`, and querying it with `dig` to
verify it's working correctly.
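The `dig` check can be done without touching `/etc/resolv.conf` by querying the server directly. This is a minimal sketch, assuming the server answers on the same address as the web console above; the record name and IP in the usage comment are hypothetical and should be replaced with a record you actually created.

```sh
# Address of the Technitium server (taken from the console URL above).
DNS_SERVER="10.0.123.123"

# Succeeds if NAME has an A record equal to EXPECTED_IP when asking
# DNS_SERVER directly, bypassing the system resolver.
resolves_to() {
  name="$1"
  expected_ip="$2"
  dig +short "@${DNS_SERVER}" "$name" A | grep -qxF "$expected_ip"
}

# Usage (hypothetical record, not run automatically):
#   resolves_to test.mnke.org 10.0.123.50 && echo "DNS OK"
```

Querying the server explicitly separates "the zone is wrong" from "my resolver is not pointed at the DNS server yet".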
**TODO**: Create a declarative DNS configuration to keep track of services for
repeatability.
## Creating a Docker hut
A _hut_ is the name I'm giving to a standalone virtual machine, as opposed to
a group or cluster of virtual machines.
The hut we're creating, _jumper_, acts to jumpstart the rest of our
infrastructure as well as run Docker workloads that are otherwise annoying
to run on Swarm, particularly those where relatively fast disk access is
required, making a network mount unreasonable. Most importantly, it provides
a non-essential Git server for the rest of the homelab.
Run the `jumper.yml` playbook:
```sh
ansible-playbook -i inventory/full jumper.yml
```
Create a context for the jumper host:
```sh
# Use IP address or add a DNS entry. Don't use mDNS, as that doesn't work.
docker context create jumper --docker "host=ssh://tony@jumper.mnke.org"
```
Deploy some compose stacks:
```sh
docker compose up -df docker/compose/traefik
docker compose up -df docker/compose/portainer
```
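If the `up -df <dir>` form is not accepted by your Docker Compose version, one alternative is to change into each project directory and target the `jumper` context explicitly. A minimal sketch, assuming each directory under `docker/compose/` holds a compose file with a default name:

```sh
# Deploy the compose project in directory $1 to the "jumper" Docker context.
# Runs in a subshell so the caller's working directory is left unchanged.
deploy_stack() {
  (cd "$1" && docker --context jumper compose up -d)
}

# Usage (not run automatically):
#   deploy_stack docker/compose/traefik
#   deploy_stack docker/compose/portainer
```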
Preferably, also deploy the `gitea` compose file to allow for GitOps later.
## Creating a Docker swarm
The Docker swarm acts as a launchpad for the rest of the infrastructure. It The Docker swarm acts as a launchpad for the rest of the infrastructure. It
bootstraps a Portainer, Traefik, and Gitea deployment so that remaining bootstraps a Portainer, Traefik, and Gitea deployment so that remaining
configuration can be done through Portainer and Git. configuration can be done through Portainer and Git.
Run the playbook:
```sh ```sh
# Add SSH keys to known_hosts
ansible-inventory -i inventory/dolo --list |\
jq -r '._meta.hostvars | keys[]' |\
grep 'stingray' |\
while read -r line; do
ssh-keygen -R "$line"
ssh-keyscan -H "$line" >> ~/.ssh/known_hosts
done
# Initialize swarm
ansible-playbook -i inventory/stingray swarm.yml ansible-playbook -i inventory/stingray swarm.yml
``` ```
Traefik will be listening on hosts: Traefik will be listening on hosts:
- git.mnke.org
- git.stingray.mnke.org
- portainer.stingray.mnke.org - portainer.stingray.mnke.org
Set DNS records or edit your hosts file to point those domains to a swarm node. Set DNS records or edit your hosts file to point those domains to a swarm node.
### Creating a k3s cluster ## Creating a k3s cluster
Set up Ansible:
```sh
# Tested on Python 3.13.1
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
ansible-galaxy collection install -r proxmox/ansible/collections/requirements.yml
```
Set up the k3s cluster: Set up the k3s cluster:
```sh ```sh
# Necessary because the hosts.yml file contains a relative path to the terraform
# project directory
cd proxmox/ansible
# Remove/scan keys
ansible-inventory -i inventory/dolo --list |\
jq -r '._meta.hostvars | keys[]' |\
while read -r line; do
ssh-keygen -R "$line"
ssh-keyscan -H "$line" >> ~/.ssh/known_hosts
done
ansible-playbook lvm.yml site.yml -i inventory/dolo ansible-playbook lvm.yml site.yml -i inventory/dolo
# You should be left with a kubeconfig. Move it to ~/.kube/config. If you # You should be left with a kubeconfig. Move it to ~/.kube/config. If you
# already have a ~/.kube/config file, make sure to back it up first. # already have a ~/.kube/config file, make sure to back it up first.
@ -118,9 +154,6 @@ curl http://[allocated-ip]
kubectl delete -f proxmox/k8s/examples/001-example.yml kubectl delete -f proxmox/k8s/examples/001-example.yml
``` ```
Lastly, run `kubectl apply -f k8s/pre-infrastructure/dns-config-map.yaml` so
pods in the cluster automatically pick up on the DNS server.
### Set up GitOps ### Set up GitOps
Prerequisites: Prerequisites:
@ -129,7 +162,8 @@ Prerequisites:
Follow [the Infisical guide to get a client id and secret](https://infisical.com/docs/documentation/platform/identities/universal-auth). Follow [the Infisical guide to get a client id and secret](https://infisical.com/docs/documentation/platform/identities/universal-auth).
Use it to apply [a manifest](https://external-secrets.io/latest/provider/infisical/) Use it to apply [a manifest](https://external-secrets.io/latest/provider/infisical/)
**in the `external-secrets` namespace**. **in the `external-secrets` namespace**. See `k8s/pre-infrastructure/universal-auth-credentials.yaml`
for reference.
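For orientation, the credentials manifest is an ordinary Secret. The sketch below follows the shape shown in the External Secrets Infisical provider guide; the Secret name and key names are assumptions and must match whatever the secret store references, and the values are placeholders.

```yaml
# Sketch only: name and keys are assumed, values are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: universal-auth-credentials
  namespace: external-secrets
type: Opaque
stringData:
  clientId: "<client id from Infisical>"
  clientSecret: "<client secret from Infisical>"
```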
Create a Gitea token with at least enough privileges mentioned in [this guide](https://fluxcd.io/flux/installation/bootstrap/gitea/). Create a Gitea token with at least enough privileges mentioned in [this guide](https://fluxcd.io/flux/installation/bootstrap/gitea/).


@ -2,7 +2,7 @@
apiVersion: kustomize.toolkit.fluxcd.io/v1 apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization kind: Kustomization
metadata: metadata:
name: infrastructure-01 name: crds
namespace: flux-system namespace: flux-system
spec: spec:
interval: 1h interval: 1h
@ -11,53 +11,25 @@ spec:
sourceRef: sourceRef:
kind: GitRepository kind: GitRepository
name: flux-system name: flux-system
path: ./k8s/infrastructure/01 path: ./k8s/infrastructure/crds
wait: true wait: true
prune: true prune: true
--- ---
apiVersion: kustomize.toolkit.fluxcd.io/v1 apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization kind: Kustomization
metadata:
name: infrastructure-02
namespace: flux-system
spec:
interval: 1h
retryInterval: 30s
timeout: 5m
sourceRef:
kind: GitRepository
name: flux-system
path: ./k8s/infrastructure/02
wait: true
prune: true
dependsOn:
- name: infrastructure-01
---
# What I want is one single unit that the rest of my applications relying on
# general infrastructure stuff can use `dependsOn` for by creating a single
# logical unit around the infrastructure kustomizations. I'm not sure how
# to do this other than creating a dummy Kustomization that doesn't actually
# apply anything meaningful, but just depends on everything else on this file.
# Maybe [components](https://fluxcd.io/flux/components/kustomize/kustomizations/#components)
# would help with this, but I'm not sure how it works and there's currently a
# warning that this feature is experimental and might change soon.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata: metadata:
name: infrastructure name: infrastructure
namespace: flux-system namespace: flux-system
spec: spec:
interval: 1h interval: 1h
retryInterval: 10s retryInterval: 30s
timeout: 5m timeout: 5m
sourceRef: sourceRef:
kind: GitRepository kind: GitRepository
name: flux-system name: flux-system
path: ./k8s/infrastructure/dummy path: ./k8s/infrastructure
wait: true wait: true
prune: false prune: false
dependsOn: dependsOn:
- name: infrastructure-01 - name: crds
- name: infrastructure-02


@ -1,14 +0,0 @@
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: wildcard-mnke-org
namespace: traefik
spec:
secretName: wildcard-mnke-org-tls
dnsNames:
- "*.mnke.org"
- "*.dolo.mnke.org"
issuerRef:
name: le-cf-issuer
kind: ClusterIssuer


@ -1,4 +1,17 @@
# k8s Infrastructure # k8s Infrastructure
Installing all of these manifests will get Cert Manager, an External Secrets These manifests set up:
operator, and Traefik all set up.
- External Secrets: Allow us to pull secrets from a secrets provider
- Prometheus Stack: Cluster monitoring
- Loki + Promtail: Log aggregation, sending to Prometheus
- Longhorn and NFS: Storage providers
- cert-manager: Certificate provider
- Traefik: Ingress controller
## Notes
We must install the CRDs _before_ the controllers and the configs. We do this
by creating a Flux Kustomization to apply the CRDs before applying the Kustomize
overlay in this directory, which skips the CRDs.
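
As a rough illustration of the ordering described above, a HelmRelease in the overlay can tell Flux's helm-controller not to touch CRDs, leaving them to the separate `crds` Kustomization. This is a minimal sketch, not one of the manifests in this repository: the release and repository names are placeholders, and `upgrade.crds` is shown only for symmetry (the releases here set `install.crds` alone).

```yaml
# Sketch only: `example` and `example-repo` are hypothetical names.
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: example
  namespace: flux-system
spec:
  interval: 10m
  install:
    crds: Skip # CRDs come from the `crds` Flux Kustomization instead
  upgrade:
    crds: Skip # also leave CRDs alone when the chart is upgraded
  chart:
    spec:
      chart: example
      sourceRef:
        kind: HelmRepository
        name: example-repo
        namespace: flux-system
```

Because the `infrastructure` Kustomization declares `dependsOn` on `crds`, a release like this is only applied after the `crds` Kustomization has reconciled.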


@ -2,3 +2,4 @@ apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization kind: Kustomization
resources: resources:
- wildcard-mnke-org.yaml - wildcard-mnke-org.yaml
- wildcard-tonydu-me.yaml


@ -0,0 +1,23 @@
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: wildcard-mnke-org
namespace: cert-manager
spec:
secretName: wildcard-mnke-org-tls
secretTemplate:
annotations:
reflector.v1.k8s.emberstack.com/reflection-allowed: "true"
reflector.v1.k8s.emberstack.com/reflection-auto-enabled: "true" # Auto create reflection for matching namespaces
# If we don't specify the allow and auto list, then it'll sync to all
# namespaces.
# Yes, this isn't a great idea. Yes, I'm also too lazy too care.
# reflector.v1.k8s.emberstack.com/reflection-allowed-namespaces: "dev,staging,prod" # Control destination namespaces
# reflector.v1.k8s.emberstack.com/reflection-auto-namespaces: "dev,staging,prod" # Control auto-reflection namespaces
dnsNames:
- "*.mnke.org"
- "*.dolo.mnke.org"
issuerRef:
name: le-cf-issuer
kind: ClusterIssuer


@ -0,0 +1,19 @@
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: wildcard-tonydu-me
namespace: cert-manager
spec:
secretName: wildcard-tonydu-me-tls
secretTemplate:
annotations:
reflector.v1.k8s.emberstack.com/reflection-allowed: "true"
reflector.v1.k8s.emberstack.com/reflection-auto-enabled: "true" # Auto create reflection for matching namespaces
dnsNames:
- "*.tonydu.me"
- "*.local.tonydu.me"
- "*.home.tonydu.me"
issuerRef:
name: le-cf-issuer
kind: ClusterIssuer


@ -48,3 +48,4 @@ spec:
- kind: Kustomization - kind: Kustomization
name: '*' name: '*'


@ -2,4 +2,5 @@
apiVersion: kustomize.config.k8s.io/v1beta1 apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization kind: Kustomization
resources: resources:
- discord.yaml - discord-alert.yaml
- webhook-ingress.yaml


@ -0,0 +1,24 @@
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: webhook-receiver
namespace: flux-system
annotations:
cert-manager.io/cluster-issuer: le-cf-issuer
kubernetes.io/ingress.class: traefik
spec:
rules:
- host: flux-webhook.dolo.mnke.org
http:
paths:
- pathType: Prefix
path: /
backend:
service:
name: webhook-receiver
port:
number: 80
tls:
- hosts:
- flux-webhook.dolo.mnke.org
secretName: wildcard-mnke-org-tls


@ -1,4 +1,6 @@
apiVersion: kustomize.config.k8s.io/v1beta1 apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization kind: Kustomization
# TODO: Turn into overlay
resources: resources:
- le-cf-issuer.yaml - le-cf-issuer.yaml
- le-cf-issuer-staging.yaml


@ -0,0 +1,23 @@
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: le-cf-issuer-staging
spec:
acme:
server: https://acme-staging-v02.api.letsencrypt.org/directory
email: tonydu121@hotmail.com
privateKeySecretRef:
name: le-cf-issuer-staging-pk
solvers:
- dns01:
cloudflare:
email: tonydu121@hotmail.com
apiTokenSecretRef:
name: cloudflare
key: dns-api-token
selector:
dnsZones:
- mnke.org
- tonydu.me


@ -5,8 +5,7 @@ metadata:
name: le-cf-issuer name: le-cf-issuer
spec: spec:
acme: acme:
# server: https://acme-v02.api.letsencrypt.org/directory server: https://acme-v02.api.letsencrypt.org/directory
server: https://acme-staging-v02.api.letsencrypt.org/directory
email: tonydu121@hotmail.com email: tonydu121@hotmail.com
privateKeySecretRef: privateKeySecretRef:
name: le-cf-issuer-pk name: le-cf-issuer-pk
@ -20,3 +19,4 @@ spec:
selector: selector:
dnsZones: dnsZones:
- mnke.org - mnke.org
- tonydu.me


@ -4,5 +4,5 @@ resources:
- secret-stores - secret-stores
- issuers - issuers
- secrets - secrets
- traefik - flux
- alerts - certificates


@ -8,9 +8,12 @@ spec:
interval: 10m interval: 10m
releaseName: cert-manager releaseName: cert-manager
targetNamespace: cert-manager targetNamespace: cert-manager
install:
crds: Skip
chart: chart:
spec: spec:
chart: cert-manager chart: cert-manager
version: v1.17.0
sourceRef: sourceRef:
kind: HelmRepository kind: HelmRepository
name: jetstack name: jetstack


@ -8,8 +8,11 @@ spec:
interval: 10m interval: 10m
releaseName: external-secrets releaseName: external-secrets
targetNamespace: external-secrets targetNamespace: external-secrets
install:
crds: Skip
chart: chart:
spec: spec:
version: v0.14.1
chart: external-secrets chart: external-secrets
sourceRef: sourceRef:
kind: HelmRepository kind: HelmRepository


@ -8,6 +8,8 @@ spec:
interval: 10m interval: 10m
releaseName: kube-prometheus-stack releaseName: kube-prometheus-stack
targetNamespace: monitor targetNamespace: monitor
install:
crds: Skip
chart: chart:
spec: spec:
chart: kube-prometheus-stack chart: kube-prometheus-stack
@ -28,6 +30,10 @@ spec:
kubernetes.io/ingress.class: traefik kubernetes.io/ingress.class: traefik
hosts: hosts:
- gf.dolo.mnke.org - gf.dolo.mnke.org
tls:
- secretName: wildcard-mnke-org-tls
hosts:
- gf.dolo.mnke.org
persistence: persistence:
enabled: true enabled: true
type : sts type : sts


@ -10,3 +10,5 @@ resources:
- kube-prometheus-stack - kube-prometheus-stack
- loki - loki
- promtail - promtail
- reflector
- traefik


@ -17,10 +17,11 @@ spec:
namespace: flux-system namespace: flux-system
interval: 10m interval: 10m
values: values:
# This is a forward declaration!
ingress: ingress:
enabled: true enabled: true
annotations: annotations:
cert-manager.io/cluster-issuer: le-cf-issuer cert-manager.io/cluster-issuer: le-cf-issuer
kubernetes.io/ingress.class: traefik kubernetes.io/ingress.class: traefik
host: longhorn.dolo.mnke.org host: longhorn.dolo.mnke.org
tls: true
tlsSecret: wildcard-mnke-org-tls


@ -5,5 +5,4 @@ resources:
- namespace.yaml - namespace.yaml
- repository.yaml - repository.yaml
- release.yaml - release.yaml
- certificates


@ -0,0 +1,5 @@
---
apiVersion: v1
kind: Namespace
metadata:
name: reflector


@ -0,0 +1,18 @@
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
name: reflector
namespace: flux-system
spec:
interval: 10m
releaseName: reflector
targetNamespace: reflector
chart:
spec:
chart: reflector
sourceRef:
kind: HelmRepository
name: emberstack
namespace: flux-system
interval: 10m


@ -0,0 +1,9 @@
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
name: emberstack
namespace: flux-system
spec:
interval: 1m
url: https://emberstack.github.io/helm-charts


@ -0,0 +1,7 @@
---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- namespace.yaml
- repository.yaml
- release.yaml

View File

@ -8,6 +8,8 @@ spec:
interval: 10m interval: 10m
releaseName: traefik releaseName: traefik
targetNamespace: traefik targetNamespace: traefik
install:
crds: Skip
chart: chart:
spec: spec:
chart: traefik chart: traefik
@ -31,9 +33,11 @@ spec:
access: access:
enabled: true enabled: true
format: json format: json
defaultMode: keep fields:
headers: general:
defaultMode: keep defaultmode: keep
headers:
defaultmode: keep
deployment: deployment:
enabled: true enabled: true
@ -98,10 +102,10 @@ spec:
loadBalancerSourceRanges: [] loadBalancerSourceRanges: []
externalIPs: [] externalIPs: []
tlsStore: # tlsStore:
default: # default:
defaultCertificate: # defaultCertificate:
secretName: wildcard-mnke-org-tls # secretName: wildcard-mnke-org-tls
# Mostly from https://github.com/traefik/traefik-helm-chart/blob/master/EXAMPLES.md#use-prometheus-operator # Mostly from https://github.com/traefik/traefik-helm-chart/blob/master/EXAMPLES.md#use-prometheus-operator
metrics: metrics:

File diff suppressed because it is too large

File diff suppressed because it is too large


@ -0,0 +1,26 @@
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
name: prometheus-community
namespace: flux-system
spec:
interval: 10m
url: https://prometheus-community.github.io/helm-charts
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
name: prometheus-operator-crds
namespace: flux-system
spec:
interval: 10m
chart:
spec:
chart: prometheus-operator-crds
sourceRef:
kind: HelmRepository
name: prometheus-community
namespace: flux-system
interval: 10m


@ -0,0 +1,31 @@
# Traefik has their own chart for CRDs :D
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
name: traefik
namespace: flux-system
spec:
interval: 1m
url: https://helm.traefik.io/traefik
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
name: traefik-crds
namespace: flux-system
spec:
interval: 10m
chart:
spec:
chart: traefik-crds
sourceRef:
kind: HelmRepository
name: traefik
namespace: flux-system
interval: 10m
values:
traefik: true
gatewayAPI: true
deleteOnUninstall: false


@ -1,3 +0,0 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources: []


@ -0,0 +1,7 @@
---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
# crds are purposely not included here. See README.md for more info
resources:
- configs
- controllers


@ -36,7 +36,7 @@ resource "proxmox_virtual_environment_vm" "jumper" {
# Don't forget to change the cloud init file if this is changed # Don't forget to change the cloud init file if this is changed
name = "jumper" name = "jumper"
description = "Managed by Terraform" description = "Managed by Terraform"
tags = ["terraform", "ubuntu", "outpost"] tags = ["terraform", "ubuntu", "hut"]
node_name = "pve" node_name = "pve"
vm_id = 7001 vm_id = 7001