12 changes: 11 additions & 1 deletion README.md
@@ -61,7 +61,17 @@ talm init -p cozystack -N myawesomecluster --image factory.talos.dev/installer/<

`--image` rewrites the top-level `image:` field in the preset's `values.yaml` before write. The flag is honored on initial `init` only — for an existing project, edit `values.yaml` directly. The `cozystack` preset declares `image:`; the `generic` preset does not, so `--image --preset generic` is rejected up front.

Edit `values.yaml` to set your cluster's control-plane endpoint. This is the URL every node's kubelet and kube-proxy will dial. The chart leaves it empty on purpose so a missed override fails loudly instead of silently embedding a placeholder.
To set the Kubernetes control-plane URL at init time, pass `--cluster-endpoint`:

```bash
talm init -p cozystack -N myawesomecluster --cluster-endpoint https://vip.example.test:6443
```

`--cluster-endpoint` writes the URL into `values.yaml::endpoint`, which the chart renders into `cluster.controlPlane.endpoint` of every node's MachineConfig (the URL kubelet and kube-proxy dial). The flag is honored on initial `init` only — for an existing project, edit `values.yaml` directly.

`--endpoints` and `--cluster-endpoint` address different concepts: `--endpoints` (plural, a list) populates the `talosconfig` context used by the talosctl client, while `--cluster-endpoint` (singular, a full URL) populates the Kubernetes control-plane address inside the chart. When `--endpoints` is given a single value, init auto-derives `values.yaml::endpoint` as `https://<host>:6443`, because the single-target case is unambiguous. Multi-endpoint inputs never auto-derive, since picking one node would silently couple cluster availability to it; the operator must pass `--cluster-endpoint` explicitly or fill in `values.yaml::endpoint` later. The init flow prints a hint at the end when the field is left empty.

Edit `values.yaml` to set your cluster's control-plane endpoint if neither flag set it. This is the URL every node's kubelet and kube-proxy will dial. The chart leaves it empty by default so a missed override fails loudly instead of silently embedding a placeholder.

Endpoint / floatingIP combinations:

59 changes: 59 additions & 0 deletions docs/manual-test-plan.md
@@ -98,6 +98,65 @@ Expected: clear error mentioning the missing key path.
```bash
rm -rf /tmp/talm-init-test
```

### A6a. `init --decrypt` without `talm.key` surfaces recovery hint

```bash
mkdir -p /tmp/talm-decrypt-test && cd /tmp/talm-decrypt-test
/tmp/talm-safety init --preset cozystack --name test --endpoints 192.0.2.1
mv talm.key /tmp/talm.key.backup
/tmp/talm-safety init --decrypt
mv /tmp/talm.key.backup talm.key # restore for next run
rm -rf /tmp/talm-decrypt-test
```

Expected: error `failed to decrypt secrets: load key: read key file: open <path>/talm.key: no such file or directory` followed by hint `talm.key is required to decrypt secrets.encrypted.yaml. Restore your backed-up key, or re-run \`talm init\` to regenerate (this writes new secrets — the old secrets.encrypted.yaml will not be decryptable without the original key).`

Regression anchor: the hint must name BOTH recovery paths (restore from backup, re-run init to regenerate) AND include the warning that regeneration writes new secrets making the old encrypted secrets undecryptable. A regression that drops either path or the warning silently invites operators to "just run init again" without understanding the data-loss tradeoff.

### A9. `init --cluster-endpoint` populates values.yaml::endpoint

```bash
mkdir -p /tmp/talm-cluster-ep-test && cd /tmp/talm-cluster-ep-test
/tmp/talm-safety init --preset cozystack --name test \
--endpoints 10.0.0.1,10.0.0.2,10.0.0.3 \
--cluster-endpoint https://vip.example.test:6443
grep "^endpoint:" values.yaml
grep -A 3 "endpoints:" talosconfig | head -5
rm -rf /tmp/talm-cluster-ep-test
```

Expected: `values.yaml` has `endpoint: "https://vip.example.test:6443"` (the operator's explicit VIP), and `talosconfig` lists all three addresses (`10.0.0.1`, `10.0.0.2`, `10.0.0.3`) under the context's endpoints array. The two flags address different concepts (`--cluster-endpoint` is the kube-apiserver URL, `--endpoints` is the talosctl-client list), and both round-trip independently.

Regression anchor: removing `--cluster-endpoint` and passing only `--endpoints 10.0.0.1,10.0.0.2,10.0.0.3` MUST leave `values.yaml::endpoint` empty (the multi-endpoint case never auto-derives — picking one node would silently couple cluster availability to it). The init flow MUST then print a hint at the end pointing the operator at `values.yaml::endpoint` with examples for VIP / LB shapes.

### A10. `init --endpoints` with single entry auto-derives values.yaml::endpoint

```bash
mkdir -p /tmp/talm-single-ep-test && cd /tmp/talm-single-ep-test
/tmp/talm-safety init --preset cozystack --name test --endpoints 192.0.2.10
grep "^endpoint:" values.yaml
rm -rf /tmp/talm-single-ep-test
```

Expected: `endpoint: "https://192.0.2.10:6443"` — the single-endpoint case is unambiguously "this is also the cluster URL", so init derives the canonical `https://<host>:6443` form. No hint printed at end of init for this case.

Regression anchor: this auto-derive ONLY fires when `len(--endpoints) == 1`. Multi-endpoint inputs MUST leave the field empty (see A9).
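
The derivation rule exercised by A9 and A10 can be sketched in a few lines of Go. This is an illustration only: `deriveClusterEndpoint` is a hypothetical name, not talm's actual function, and two details are assumptions rather than documented behavior: that an explicit `--cluster-endpoint` takes precedence over auto-derivation, and that the single `--endpoints` entry is a bare host or IP without a port.

```go
package main

import (
	"fmt"
	"net"
)

// deriveClusterEndpoint is a hypothetical helper that mirrors the rule in
// A9/A10. explicit is the --cluster-endpoint value (may be empty) and
// endpoints is the parsed --endpoints list.
func deriveClusterEndpoint(explicit string, endpoints []string) string {
	// Assumption: an explicit URL always wins over auto-derivation.
	if explicit != "" {
		return explicit
	}

	// Only the single-endpoint case is unambiguous enough to derive from.
	if len(endpoints) == 1 {
		// JoinHostPort brackets IPv6 literals; the sketch assumes the
		// entry is a bare host or IP with no port of its own.
		return "https://" + net.JoinHostPort(endpoints[0], "6443")
	}

	// Zero or several endpoints: leave the field empty so the operator
	// has to choose a VIP or load balancer explicitly.
	return ""
}

func main() {
	// Single endpoint: derived.
	fmt.Printf("%q\n", deriveClusterEndpoint("", []string{"192.0.2.10"}))
	// Several endpoints, no explicit URL: left empty.
	fmt.Printf("%q\n", deriveClusterEndpoint("", []string{"10.0.0.1", "10.0.0.2", "10.0.0.3"}))
	// Several endpoints plus an explicit URL: the explicit URL is used.
	fmt.Printf("%q\n", deriveClusterEndpoint("https://vip.example.test:6443", []string{"10.0.0.1", "10.0.0.2"}))
}
```

Returning an empty string for the multi-endpoint case keeps the "fails loudly" property: the caller can detect the empty value and print the end-of-init hint.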

### A11. `init --cluster-endpoint` rejects malformed URL before any files land on disk

```bash
mkdir -p /tmp/talm-bad-ep-test && cd /tmp/talm-bad-ep-test
/tmp/talm-safety init --preset cozystack --name test \
--endpoints 192.0.2.10 \
--cluster-endpoint "not-a-url"
ls -la # should be empty
rm -rf /tmp/talm-bad-ep-test
```

Expected: error `cluster-endpoint "not-a-url" is missing scheme, host, or port` with hint pointing at the canonical form. Directory remains empty — NO `talosconfig`, NO `.gitignore`, NO `secrets.yaml` written. Validation happens in PreRunE before any file writes so a malformed flag never produces a half-initialised project tree.

Regression anchor: a regression that defers validation to RunE (i.e. after the secret-bundle generation) would leave `talosconfig` and `.gitignore` behind. Verify by running `ls -la` after the failing command and confirming that the directory is empty.
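
The validate-before-write ordering that A11 pins can be sketched with the standard library alone. Everything here is illustrative: `validateClusterEndpoint` and `initProject` are hypothetical names, the check is one plausible reading of "missing scheme, host, or port", and the single `values.yaml` write stands in for the real init flow, which this diff does not show.

```go
package main

import (
	"fmt"
	"net/url"
	"os"
	"path/filepath"
)

// validateClusterEndpoint is a hypothetical stand-in for the check A11
// describes: the value must parse as a URL with scheme, host, and port.
func validateClusterEndpoint(raw string) error {
	u, err := url.Parse(raw)
	if err != nil || u.Scheme == "" || u.Hostname() == "" || u.Port() == "" {
		return fmt.Errorf("cluster-endpoint %q is missing scheme, host, or port", raw)
	}

	return nil
}

// initProject is a hypothetical, heavily reduced init flow. The point is
// the ordering: validation runs before the first write, so a bad flag
// leaves the directory untouched.
func initProject(dir, clusterEndpoint string) error {
	if err := validateClusterEndpoint(clusterEndpoint); err != nil {
		return err
	}

	content := fmt.Sprintf("endpoint: %q\n", clusterEndpoint)

	return os.WriteFile(filepath.Join(dir, "values.yaml"), []byte(content), 0o644)
}

func main() {
	dir, err := os.MkdirTemp("", "init-sketch")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	// A malformed value is rejected and nothing is written.
	fmt.Println("error:", initProject(dir, "not-a-url"))

	entries, err := os.ReadDir(dir)
	if err != nil {
		panic(err)
	}
	fmt.Println("files after failed init:", len(entries))
}
```

In a Cobra command the same ordering would come from placing the validation call in `PreRunE` and the writes in `RunE`, which is the split the expected result above describes.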

### A8. `init --update` without `--preset` and without preset dep in Chart.yaml

14 changes: 14 additions & 0 deletions pkg/age/age.go
@@ -104,6 +104,20 @@ func LoadKey(rootDir string) (*age.X25519Identity, error) {

	keyData, err := os.ReadFile(keyFile)
	if err != nil {
		if errors.Is(err, os.ErrNotExist) {
			// Talm.key absence is a real operational scenario
			// (laptop swap, lost backup, fresh checkout) and the
			// raw "open ...: no such file" surfaces as if it's an
			// internal bug. Attach the recovery hint naming both
			// paths the operator can take so the error reads as
			// "what to do next", not "talm is broken".
			//nolint:wrapcheck // cockroachdb/errors.WithHint is the project's wrapping/hinting idiom at boundaries.
			return nil, errors.WithHint(
				errors.Wrap(err, "read key file"),
				"talm.key is required to decrypt secrets.encrypted.yaml. Restore your backed-up key, or re-run `talm init` to regenerate (this writes new secrets — the old secrets.encrypted.yaml will not be decryptable without the original key).",
			)
		}

		return nil, errors.Wrap(err, "read key file")
	}

26 changes: 26 additions & 0 deletions pkg/age/contract_test.go
@@ -36,6 +36,7 @@ import (
	"strings"
	"testing"

	cerrors "github.com/cockroachdb/errors"
	"github.com/cozystack/talm/pkg/age"
	"gopkg.in/yaml.v3"
)
@@ -181,6 +182,31 @@ func TestContract_Age_LoadKey_MissingFileErrors(t *testing.T) {
	}
}

// TestContract_Age_LoadKey_MissingFileSurfacesRecoveryHint pins the
// operator-facing recovery message attached to a missing talm.key.
// Without the hint, operators see a raw "open talm.key: no such
// file" stack-style line and assume it's a bug in talm. The hint
// must name BOTH recovery paths: restore from backup, or re-run
// `talm init` to regenerate (which writes new secrets — old
// secrets.encrypted.yaml becomes undecryptable without the
// original key).
func TestContract_Age_LoadKey_MissingFileSurfacesRecoveryHint(t *testing.T) {
	dir := t.TempDir() // no talm.key inside

	_, err := age.LoadKey(dir)
	if err == nil {
		t.Fatal("expected error for missing talm.key")
	}

	hints := cerrors.GetAllHints(err)
	joinedLower := strings.ToLower(strings.Join(hints, "\n"))
	for _, want := range []string{"talm.key", "restore", "talm init"} {
		if !strings.Contains(joinedLower, want) {
			t.Errorf("missing-key hint must mention %q so operator knows the recovery path; got hints:\n%s", want, strings.Join(hints, "\n"))
		}
	}
}
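
For readers without the `cockroachdb/errors` package at hand, the pattern used by `LoadKey` and pinned by the test above (wrap the cause, attach an operator-facing hint only for the missing-file case, then collect hints from the chain) can be approximated with the standard library. This is a stand-in, not the project's code: `hintedError`, `allHints`, and `loadKey` are hypothetical names, the hint text is abbreviated, and joining `rootDir` with `talm.key` is an assumption about how the key path is built.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// hintedError is a stdlib-only stand-in for an error carrying a hint.
// The hint stays out of Error() so the message and the advice can be
// rendered separately.
type hintedError struct {
	cause error
	hint  string
}

func (e *hintedError) Error() string { return e.cause.Error() }
func (e *hintedError) Unwrap() error { return e.cause }

// allHints walks the wrap chain and collects every attached hint.
func allHints(err error) []string {
	var hints []string
	for ; err != nil; err = errors.Unwrap(err) {
		if he, ok := err.(*hintedError); ok {
			hints = append(hints, he.hint)
		}
	}

	return hints
}

// Abbreviated, illustrative hint text (not the exact string from the diff).
const missingKeyHint = "talm.key is required to decrypt secrets.encrypted.yaml. Restore your backed-up key, or re-run `talm init` to regenerate."

// loadKey is a hypothetical reduction of LoadKey: it only reads the file.
func loadKey(rootDir string) ([]byte, error) {
	data, err := os.ReadFile(filepath.Join(rootDir, "talm.key"))
	if err != nil {
		wrapped := fmt.Errorf("read key file: %w", err)
		// Only the missing-file case gets the recovery hint; other
		// failures (permissions, I/O) keep the plain wrapped error.
		if errors.Is(err, fs.ErrNotExist) {
			return nil, &hintedError{cause: wrapped, hint: missingKeyHint}
		}

		return nil, wrapped
	}

	return data, nil
}

func main() {
	dir, err := os.MkdirTemp("", "loadkey-sketch")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	_, err = loadKey(dir) // the directory has no talm.key
	fmt.Println("error:", err)
	for _, h := range allHints(err) {
		fmt.Println("hint:", h)
	}
}
```

Keeping the hint out of the error string is what lets a test assert on the hint content independently of the wrapped OS error, as the contract test does with `cerrors.GetAllHints`.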

// === GetPublicKey / GetPublicKeyFromFile ===

// Contract: GetPublicKey returns the recipient string from an