# My Homelab Setup
## Getting started
### Dependencies
Install dependencies (Arch):
```sh
pacman -Sy opentofu kubectl helm helmfile python fluxcd
```
Set up Ansible:
```sh
# Tested on Python 3.13.1
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
ansible-galaxy collection install -r proxmox/ansible/collections/requirements.yml
```
### Proxmox
We first need to create a Proxmox user for Terraform to act on behalf of, along
with an API token for that user.
```sh
# Create the user
pveum user add terraform@pve
# Create a role for the user above
pveum role add Terraform -privs "Datastore.Allocate Datastore.AllocateSpace Datastore.AllocateTemplate Datastore.Audit Pool.Allocate Sys.Audit Sys.Console Sys.Modify SDN.Use VM.Allocate VM.Audit VM.Clone VM.Config.CDROM VM.Config.Cloudinit VM.Config.CPU VM.Config.Disk VM.Config.HWType VM.Config.Memory VM.Config.Network VM.Config.Options VM.Migrate VM.Monitor VM.PowerMgmt User.Modify Pool.Audit"
# Assign the terraform user to the above role
pveum aclmod / -user terraform@pve -role Terraform
# Create the token and save it for later
pveum user token add terraform@pve provider --privsep=0
```
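To sanity-check the token before wiring it into Terraform, you can query the
Proxmox API version endpoint directly. This is an optional check; fill in the
host and token placeholders with your own values:
```sh
# The API should return version JSON if the token is valid. Assumes the
# default Proxmox API port 8006; -k skips TLS verification for the
# self-signed certificate.
curl -k -H 'Authorization: PVEAPIToken=terraform@pve!provider=<token>' \
  "https://<domain or ip>:8006/api2/json/version"
```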
### Provisioning with OpenTofu/Terraform
Create a file `proxmox/tf/credentials.auto.tfvars` with the following content,
making sure to replace as necessary:
```hcl
proxmox_api_endpoint = "https://<domain or ip>"
proxmox_api_token = "terraform@pve!provider=<token from last step>"
```
Customize the other variables in `proxmox/tf/vars.auto.tfvars` and double check
the configuration, running `tofu plan` to get a sense of what will be created.
When ready, run `tofu apply`. The command occasionally fails with a broken pipe
error; if that happens, just run it again.
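For reference, the full OpenTofu workflow looks roughly like this, assuming you
run it from `proxmox/tf` where the tfvars files above live:
```sh
cd proxmox/tf
tofu init   # download the Proxmox provider
tofu plan   # review what will be created
tofu apply  # provision the VMs
```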
After provisioning with Terraform, make sure the SSH keys are updated:
```sh
ansible all --list-hosts -i inventory/full |
  tail -n +2 |
  awk '{ print $1 }' |
  while read -r host; do
    # Drop any stale key for the host, then record the current one
    ssh-keygen -R "$host"
    ssh-keyscan -H "$host" >> ~/.ssh/known_hosts
  done
```
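With `known_hosts` refreshed, a quick connectivity check with Ansible's builtin
`ping` module should succeed for every host (a sanity check, not a required
step):
```sh
ansible all -i inventory/full -m ping
```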
## Creating the DNS server
We currently create a [Technitium server](https://technitium.com/dns/) to allow
for service discovery outside of the homelab networks (i.e. on my PC). It is
also required for services within the homelab, so **this step cannot be
skipped**.
Run
```sh
ansible-playbook -i inventory/full dns.yml
```
Try logging onto [Technitium](http://10.0.123.123:5380/), creating a DNS zone,
adding a record, editing `/etc/resolv.conf`, and querying it with `dig` to
verify it's working correctly.
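For example, a query like the one below should return the record you just
created (the zone and record name here are placeholders for whatever you
added):
```sh
# Ask the Technitium server directly; <name> is the record you created
dig @10.0.123.123 <name>.mnke.org +short
```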
**TODO**: Create a declarative DNS configuration to keep track of services for
repeatability.
## Creating a Docker hut
A _hut_ is the name I'm giving to a standalone virtual machine, as opposed to
a group or cluster of virtual machines.
The hut we're creating, _jumper_, jumpstarts the rest of our infrastructure
and runs Docker workloads that are otherwise annoying to run on Swarm,
particularly those that need relatively fast disk access, which makes a
network mount unreasonable. Most importantly, it provides a non-essential Git
server for the rest of the homelab.
Run the `jumper.yml` playbook:
```sh
ansible-playbook -i inventory/full jumper.yml
```
Create a context for the jumper host:
```sh
# Use IP address or add a DNS entry. Don't use mDNS, as that doesn't work.
docker context create jumper --docker "host=ssh://tony@jumper.mnke.org"
```
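A quick way to confirm the context works is to run a command against it:
```sh
# Should list containers on jumper (none yet) rather than the local host
docker --context jumper ps
```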
Deploy some compose stacks:
```sh
docker compose --project-directory docker/compose/traefik up -d
docker compose --project-directory docker/compose/portainer up -d
```
Preferably, also deploy the `gitea` compose file to allow for GitOps later.
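Assuming the Gitea stack follows the same layout as the others, that would look
like:
```sh
# Path is an assumption based on the traefik/portainer layout above
docker compose --project-directory docker/compose/gitea up -d
```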
## Creating a Docker swarm
The Docker swarm acts as a launchpad for the rest of the infrastructure. It
bootstraps a Portainer, Traefik, and Gitea deployment so that remaining
configuration can be done through Portainer and Git.
Run the playbook:
```sh
ansible-playbook -i inventory/stingray swarm.yml
```
Traefik will listen on the following hostnames:
- portainer.stingray.mnke.org
Set DNS records or edit your hosts file to point those domains to a swarm node.
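For a quick test without touching DNS, a hosts-file entry pointing at any swarm
node works (the IP below is a placeholder for one of your stingray nodes):
```sh
# Placeholder IP: substitute the address of a swarm node
echo '<swarm-node-ip> portainer.stingray.mnke.org' | sudo tee -a /etc/hosts
```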
## Creating a k3s cluster
Set up the k3s cluster:
```sh
ansible-playbook lvm.yml site.yml -i inventory/dolo
# You should be left with a kubeconfig. Move it to ~/.kube/config. If you
# already have a ~/.kube/config file, make sure to back it up first.
mv kubeconfig ~/.kube/config
# Verify that you can connect to the cluster
kubectl get nodes
# Back to root repo directory
cd -
# Verify deployment and service
kubectl apply -f proxmox/k8s/examples/001-example.yml
# This should succeed, and an IP should have been allocated by metallb. Check
# with the following command:
kubectl describe service nginx
# Now try checking that the deployment works:
curl http://[allocated-ip]
# Clean it up
kubectl delete -f proxmox/k8s/examples/001-example.yml
```
### Set up GitOps
Prerequisites:
- Gitea is set up
- Infisical or some other secrets provider is set up (if not Infisical, change
  the ClusterSecretStore manifest)
Follow [the Infisical guide to get a client id and secret](https://infisical.com/docs/documentation/platform/identities/universal-auth).
Use it to apply [a manifest](https://external-secrets.io/latest/provider/infisical/)
**in the `external-secrets` namespace**. See `k8s/pre-infrastructure/universal-auth-credentials.yaml`
for reference.
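If you prefer creating the credentials secret imperatively instead of applying
a manifest, a minimal sketch looks like this; the secret and key names are
illustrative and must match whatever the ClusterSecretStore in this repo
expects:
```sh
# Illustrative names; align with k8s/pre-infrastructure/universal-auth-credentials.yaml
kubectl -n external-secrets create secret generic universal-auth-credentials \
  --from-literal=clientId='<infisical-client-id>' \
  --from-literal=clientSecret='<infisical-client-secret>'
```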
Create a Gitea token with at least the privileges mentioned in [this guide](https://fluxcd.io/flux/installation/bootstrap/gitea/).
Run the commands below.
```sh
export GITEA_TOKEN=<token>
flux bootstrap gitea \
  --owner=tony \
  --repository=homelab \
  --hostname=https://git.mnke.org \
  --token-auth \
  --path=k8s/clusters/dolo \
  --personal \
  --branch=master
```
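Once the bootstrap finishes, verify that Flux is healthy and reconciling:
```sh
flux check                  # controllers installed and ready
flux get kustomizations -A  # cluster state synced from Git
```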
## Credits
- Some inspiration and guidance was taken from [Andreas Marqvardsen's blog post](https://blog.andreasm.io/2024/01/15/proxmox-with-opentofu-kubespray-and-kubernetes)
- An automated setup of a k3s cluster from [Techno Tim's Ansible roles](https://github.com/techno-tim/k3s-ansible)