Nerdz

Life, Tech, Linux, Kubernetes.

Hardware Homelab

Three 990 PROs, One Batch, All Dying — Part 2: The Replacement

Enterprise SSDs arrived, so I migrated a live Talos control plane onto them. First I had to fix the backups, then learn that swapping a boot disk on Talos isn't a swap at all — it's a rebuild. Plus the canary node that taught me five things I only half-believed.

Hardware Homelab

Three 990 PROs, One Batch, All Dying — Part 3: The Part Where the Canary Lied

The canary migration went perfectly, so I ran the same playbook on the last two nodes. They found five new ways to make me earn it — node-local data that vaporises on reinstall, an OSD that booted faster than its network, a password bug I'd only half-fixed, a restore that raced itself, and a serial number I wrongly swore I couldn't read.

Hardware Homelab

The Slow Death of Three Samsung 990 PROs

What happens when you put consumer NVMe under an etcd + Ceph mon workload. Part 1 of the series.

Deploying Open Source LLMs in a Homelab - Part 4

Ditching Ollama for LocalAI, battling P2P federation that doesn't work in Kubernetes, and building a self-hosted AI stack with persistent memory.

Cloud Homelab Android

Cloud Provider Roulette: Finding a Home for Redroid

A journey through TrueNAS, Oracle Cloud, and Hetzner before finally landing on AWS Graviton for running Android containers with acceptable latency from New Zealand.

Running Game Servers from a NAS: Pterodactyl + TrueNAS

Deploying Pterodactyl Panel on Kubernetes, with Wings running on TrueNAS, for self-hosted game server management.

Storage Homelab Backups

CephFS Sparse File Corruption: A Data Recovery Story

How a CephFS sparse file handling quirk silently corrupted my app configs during VolSync restores, and the multi-day recovery effort across qBittorrent, SABnzbd, Sonarr, Radarr, and File Browser using a mix of Kopia snapshots and old Restic backups.

Storage Homelab

Upgrading Ceph from Reef to Tentacle in a Rook-Managed Cluster

A real-world walkthrough of upgrading Ceph from v18 (Reef) through v19 (Squid) to v20 (Tentacle) via GitOps—including the correction of my wrong assumptions about Rook version constraints.

Networking Homelab

When BGP Doesn't Fix Hairpin: Cilium DSR and the Same-Node Problem

BGP was supposed to fix my hairpin routing issues. It didn't. Here's how CoreDNS rewriting saved the day when pods couldn't reach LoadBalancer VIPs on the same node.

Database Backups Homelab

pgBackRest: Multi-Destination PostgreSQL Backups in CloudNativePG

How I replaced Barman Cloud Plugin with pgBackRest to get true dual-destination full backups to both Backblaze B2 and Cloudflare R2, then migrated my entire PostgreSQL infrastructure to PostgreSQL 18.