Add known issues, roadmap, and conditional Go toolchain patch

- Document SetVariableRT upgrade failure, 16K page size implications,
  serial console issue, and SBC install disk behavior
- Add production roadmap (4K pages, GRUB boot, serial fix, NVMe)
- Make overlay Go patch conditional: apply only on Go 1.24.x,
  skip on 1.25+ where CVEs are already fixed upstream

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Mathias Beaulieu-Duncan 2026-02-13 18:05:51 -05:00
parent d933444fbc
commit 8178ba195e
2 changed files with 64 additions and 2 deletions


@@ -115,8 +115,15 @@ patches-talos:
 	git am "$(PATCHES_DIRECTORY)/siderolabs/talos/"*.patch
 
 patches-overlay:
-	cd "$(CHECKOUTS_DIRECTORY)/sbc-raspberrypi5" && \
-	git am "$(PATCHES_DIRECTORY)/talos-rpi5/sbc-raspberrypi5/"*.patch
+	@cd "$(CHECKOUTS_DIRECTORY)/sbc-raspberrypi5" && \
+	GO_VER=$$(sed -n 's/^go //p' go.work | head -1) && \
+	GO_MINOR=$$(echo "$$GO_VER" | cut -d. -f1,2) && \
+	if [ "$$GO_MINOR" = "1.24" ]; then \
+	echo "Overlay Go $$GO_VER — applying Go toolchain patch (CVE fix)"; \
+	git am "$(PATCHES_DIRECTORY)/talos-rpi5/sbc-raspberrypi5/"*.patch; \
+	else \
+	echo "Overlay Go $$GO_VER — skipping Go toolchain patch (CVEs fixed upstream)"; \
+	fi
 
 patches: patches-pkgs patches-talos patches-overlay
 
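
The version gate in the `patches-overlay` recipe can be tried outside of Make. The following is a minimal sketch, not the project's tooling: the function names `read_go_version`, `go_minor`, and `should_patch` are hypothetical, and it assumes the `go X.Y.Z` directive layout in `go.work` that the recipe's `sed` expression expects.

```shell
#!/bin/sh
# Hypothetical helpers mirroring the gate in the patches-overlay recipe.

# Print the first `go X.Y.Z` directive found in a go.work-style file.
read_go_version() {
    sed -n 's/^go //p' "$1" | head -1
}

# Reduce a version such as 1.24.6 to its first two dot-separated fields.
go_minor() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# Succeed only when the version belongs to the 1.24 series,
# which is the only series the recipe patches.
should_patch() {
    [ "$(go_minor "$1")" = "1.24" ]
}
```

A caller could then write `should_patch "$(read_go_version go.work)" && echo "apply patch"`. Because the comparison is a plain string match on the first two fields, any version string that does not reduce to exactly `1.24` falls into the skip branch.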


@@ -46,16 +46,71 @@ zstd -d metal-arm64.raw.zst -o metal-arm64.raw
 
 ### Upgrade an existing node
 
+> **Warning:** In-place upgrades via `talosctl upgrade` may fail on RPi5/CM5 hardware with a `SetVariableRT` EFI firmware error. See [Known issues](#known-issues) below. For now, the recommended upgrade path is to re-flash the disk image.
+
 ```bash
+# Re-flash method (reliable)
+zstd -d metal-arm64.raw.zst -o metal-arm64.raw
+# Flash to eMMC/SD via your preferred tool
+
+# In-place method (experimental — may fail, see known issues)
 talosctl upgrade --image docker.io/svrnty/talos-rpi5:v1.12.3-k6.12.47-2
 ```
 
 ### What's included
 
 - RPi downstream kernel with CM5/RP1 support
+- 16K page size (RPi Foundation default — see [known issues](#known-issues) for implications)
 - Overclock: 2.6GHz (`arm_freq=2600`, `over_voltage_delta=50000`, `arm_boost=1`)
 - Extensions: `iscsi-tools`, `util-linux-tools`
 
+## Known issues
+
+### In-place upgrade fails (SetVariableRT)
+
+`talosctl upgrade` may fail during the bootloader installation step with:
+
+```
+Firmware does not support SetVariableRT. Can not remount with rw
+```
+
+The RPi5/CM5 EFI firmware does not support runtime EFI variable writes, which the Talos bootloader update requires. **Re-flashing the disk image is the reliable upgrade path for now.** We are investigating GRUB-based boot as a fix (see [Roadmap](#roadmap)).
+
+*Upstream: <a href="https://github.com/talos-rpi5/talos-builder/issues/21" target="_blank">talos-builder#21</a>*
+
+### 16K memory pages
+
+The RPi downstream kernel defaults to 16K page size instead of upstream Talos's 4K. This means:
+
+- **Higher per-page memory overhead** — workloads that allocate many small buffers (e.g. Longhorn v2 data engine) consume significantly more RAM
+- **Potential OOM on control-plane nodes** — systems running etcd + kube-apiserver + workloads may hit memory pressure, especially on 4GB/8GB boards
+- **Incompatibility with some software** that assumes 4K pages
+
+We plan to switch to 4K pages for production readiness (see [Roadmap](#roadmap)).
+
+*Upstream: <a href="https://github.com/talos-rpi5/talos-builder/issues/3" target="_blank">talos-builder#3</a>, <a href="https://github.com/talos-rpi5/talos-builder/issues/11" target="_blank">talos-builder#11</a>*
+
+### No serial console output after boot
+
+Serial output goes silent after the EFI stub decompresses the kernel and exits boot services. This affects headless debugging on CM5 boards where serial is the primary console.
+
+*Upstream: <a href="https://github.com/talos-rpi5/talos-builder/issues/4" target="_blank">talos-builder#4</a>*
+
+### Install disk config ignored on SBCs
+
+Talos ignores the `machine.install.disk` config field on SBC platforms. You **must flash the disk image directly** to your target disk (eMMC, SD, NVMe). Booting from USB or NVMe also requires flashing directly to that disk — the image targets SD (`mmcblk0`) by default.
+
+*Upstream: <a href="https://github.com/talos-rpi5/talos-builder/issues/22" target="_blank">talos-builder#22</a>*
+
+## Roadmap
+
+This project targets production-ready Talos clusters on RPi5/CM5 hardware. Key milestones:
+
+- [ ] **Switch to 4K page size** — Align with upstream Talos kernel config to reduce memory overhead and improve workload compatibility. Requires testing RPi peripheral drivers with 4K pages.
+- [ ] **Reliable in-place upgrades** — Investigate GRUB-based boot or alternative bootloader strategies to work around the `SetVariableRT` firmware limitation, enabling `talosctl upgrade` on RPi5/CM5.
+- [ ] **Serial console fix** — Debug U-Boot/kernel handoff to restore serial output after EFI stub exit.
+- [ ] **NVMe boot support** — Produce images that target NVMe directly, or document a supported NVMe boot flow.
+
 ## Building
 
 For local builds, CI/CD setup, runner configuration, and project structure, see [TECHNICAL.md](TECHNICAL.md).
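
The "16K memory pages" entry in this diff describes a kernel-level setting that can be confirmed from any process sharing the node's kernel. Since Talos itself ships no shell, the check would have to run in a container scheduled on the node. This is a minimal sketch under that assumption, and it also assumes `getconf` is present in the container image. The helper names `page_size_kib` and `describe_page_size` are hypothetical and not part of this repository.

```shell
#!/bin/sh
# Hypothetical helpers for checking the kernel page size from inside a container.

# Print the page size of the running kernel in KiB.
# `getconf PAGESIZE` reports the value in bytes.
page_size_kib() {
    echo $(( $(getconf PAGESIZE) / 1024 ))
}

# Map a page size in KiB to the wording used in the README's known-issues section.
describe_page_size() {
    case "$1" in
        4)  echo "4K pages (upstream Talos default)" ;;
        16) echo "16K pages (RPi downstream kernel default)" ;;
        *)  echo "${1}K pages" ;;
    esac
}
```

Running `describe_page_size "$(page_size_kib)"` in such a container reports which of the two configurations the node booted with.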