Kubernetes Operator & CRD
Run Frontier on Kubernetes with a single FrontierCluster resource. The operator handles deployments, TLS secrets, services, probes, security context, and graceful shutdown.
1. Overview
The Frontier operator manages a two-tier deployment: Frontier (data plane, stateless edge gateway) and Frontlas (control plane, Redis-backed coordinator). Both are reconciled from a single namespaced custom resource FrontierCluster in the API group frontier.singchia.io/v1alpha1.
With one CR you get: two Deployments, three Services (servicebound, edgebound, controlplane), TLS material copied into operator-managed Secrets, sane production defaults (probes, preStop, non-root, drop-all caps, preferred anti-affinity), and a status that surfaces ready replicas and Conditions.
2. Installation
2.1 Install the CRD and operator
# Apply the CRD + operator deployment + RBAC in one shot
kubectl apply -f https://raw.githubusercontent.com/singchia/frontier/main/pkg/operator/dist/install.yaml
# Or from a local checkout
git clone https://github.com/singchia/frontier.git
kubectl apply -f frontier/pkg/operator/dist/install.yaml
Verify the CRD and operator pod:
kubectl get crd frontierclusters.frontier.singchia.io
kubectl get all -n frontier-operator-system
2.2 RBAC
The bundled install.yaml creates a ClusterRole granting the operator:
- frontier.singchia.io/frontierclusters: full CRUD + status
- apps/deployments: full CRUD (manages frontier & frontlas Deployments)
- core/services, secrets, pods: full CRUD (services, TLS material, pod inspection)
- core/events: create + patch (used by EventRecorder)
If you tighten this further, keep at least get;list;watch on those resources or reconcile will fail.
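After tightening the ClusterRole, one way to confirm that the operator still has the read access reconcile depends on is `kubectl auth can-i` with impersonation. The sketch below is illustrative rather than part of the project: the function names are made up here, and the operator's service account name is not stated on this page, so the caller has to supply it (look it up in install.yaml).

```shell
# Sketch: check that a service account keeps get/list/watch on the resources
# the operator reconciles. Function names are hypothetical helpers.

# Print one "verb resource" pair per line for the minimum read access.
rbac_read_checks() {
  for resource in frontierclusters.frontier.singchia.io deployments.apps services secrets pods; do
    for verb in get list watch; do
      printf '%s %s\n' "$verb" "$resource"
    done
  done
}

# Run every check while impersonating the service account; kubectl prints
# "yes" or "no" next to each pair.
# Usage: check_operator_rbac <namespace> <serviceaccount>
check_operator_rbac() {
  rbac_read_checks | while read -r verb resource; do
    printf '%-6s %-45s ' "$verb" "$resource"
    kubectl auth can-i "$verb" "$resource" --all-namespaces \
      --as="system:serviceaccount:$1:$2" </dev/null
  done
}
```

Calling `check_operator_rbac frontier-operator-system <serviceaccount>` should print `yes` on every line; any `no` points at a rule that was tightened too far.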
2.3 Alternative: install with Helm
Prefer Helm? The chart at dist/helm/ deploys both frontier and frontlas in one shot, with an optional bundled bitnami/redis subchart. Defaults track the operator's production-grade settings (non-root UID 65532, drop-all capabilities, preferred host anti-affinity, preStop sleep, configurable drain window, observability endpoints on 9091/9092). Pick this path if your platform standardizes on Helm/ArgoCD/Flux and you don't need the operator's reconcile-driven self-healing for TLS Secrets and Status conditions.
Quick install from the official chart repo (recommended):
helm repo add frontier https://singchia.github.io/frontier
helm repo update
helm install frontier frontier/frontier \
  --namespace frontier --create-namespace
The repo lives on the gh-pages branch and is served by GitHub Pages: same chart, same digest as the source tree under dist/helm/ in the repository.
Alternative install paths:
# A) Single .tgz from a GitHub Release (pinned to a specific PR)
helm install frontier \
https://github.com/singchia/frontier/releases/download/helm-chart-v1.2.5-rc1/frontier-1.2.5.tgz \
-n frontier --create-namespace
# B) From a local checkout (for chart development)
cd frontier/dist/helm
helm dependency update
helm install frontier . -n frontier --create-namespace
Bring your own Redis (set redis.enabled: false and point Frontlas at it):
# my-values.yaml
redis:
  enabled: false
frontlas:
  externalRedis:
    addrs:
      - redis.shared:6379
    redisType: standalone
    passwordSecret:
      name: redis-creds # must already exist in the release namespace
      key: password

helm install frontier frontier/frontier -n frontier -f my-values.yaml
Common knobs in values.yaml (full listing: helm show values dist/helm/):
| Path | Default | Notes |
|---|---|---|
| frontier.replicaCount / frontlas.replicaCount | 1 / 1 | Independent scaling per component |
| frontier.image.tag / frontlas.image.tag | "" → Chart.AppVersion | Override to pin a specific binary version |
| global.registry | singchia | Mirror to your private registry |
| global.imagePullSecrets | [] | Private registry credentials |
| frontier.service.edgebound.type | NodePort | Switch to LoadBalancer for cloud edge ingress |
| frontier.podSecurityContext / containerSecurityContext | nonRoot UID 65532, drop ALL caps | Set to {} if your custom image needs root |
| frontier.terminationGracePeriodSeconds / frontier.drainSeconds | 60 / 50 | Long-lived edge connections; drain < grace |
| frontier.autoscaling.enabled | false | HPA on the frontier Deployment |
| observability.frontier.enabled / observability.frontlas.enabled | true / true | Toggle the /healthz, /readyz, /metrics endpoints (ports 9091 / 9092) |
| serviceMonitor.enabled | false | Opt in if prometheus-operator is installed |
| redis.enabled | true | Set false to use external Redis (configure under frontlas.externalRedis) |
Operator vs Helm — pick one:
| Concern | Operator | Helm |
|---|---|---|
| Custom Resource (declarative) | ✅ FrontierCluster CR | ❌ values.yaml + Deployments directly |
| Status & Conditions per cluster | ✅ Available / Progressing / Degraded | ❌ Inspect underlying Deployments |
| Self-healing on TLS Secret rotation | ✅ Reconciler watches Secrets | ❌ Manual helm upgrade |
| Multiple clusters in one namespace | ✅ Each CR is isolated | ⚠️ Need separate releases |
| Bundled Redis | ❌ BYO | ✅ bitnami/redis subchart |
| ArgoCD / Flux GitOps | ✅ (commit the CR) | ✅ (commit values.yaml) |
| Initial install footprint | Operator pod + CRD + RBAC | No long-running operator |
You can also publish the chart for downstream consumers: helm package dist/helm/ -d /path/to/repo produces frontier-1.2.5.tgz; serve the directory with any HTTP server or helm push to OCI.
3. Quick start
Minimum viable cluster: 2 frontier replicas, 1 frontlas replica, external Redis. Save as frontiercluster.yaml:
apiVersion: frontier.singchia.io/v1alpha1
kind: FrontierCluster
metadata:
  name: prod
  namespace: frontier
spec:
  frontier:
    replicas: 2
  frontlas:
    replicas: 1
    redis:
      addrs:
        - redis.frontier:6379
      passwordSecret:
        name: redis-creds
        key: password
      redisType: standalone

kubectl create namespace frontier
kubectl -n frontier create secret generic redis-creds --from-literal=password=...
kubectl apply -f frontiercluster.yaml
Wait for it to come up, then check:
kubectl -n frontier get fc # short name 'fc' is registered
NAME PHASE FRONTIER FRONTLAS AGE
prod Running 2 1 47s
kubectl -n frontier describe fc prod # see Conditions + Events
kubectl -n frontier get pods # frontier + frontlas pods
kubectl -n frontier get svc # 3 services rendered
4. CRD field reference
Everything below lives under spec. All fields outside frontier.servicebound, frontier.edgebound, frontlas.controlplane, and frontlas.redis are optional — sensible defaults apply.
4.1 spec.frontier
| Field | Type | Default | Notes |
|---|---|---|---|
| replicas | int | 1 | Frontier pod count |
| image | string | singchia/frontier:1.1.0 | Override to pin a specific tag |
| servicebound.port | int | 30011 | Service-side TCP/gRPC port |
| servicebound.service | string | <name>-servicebound-svc | Service name override |
| servicebound.serviceType | string | ClusterIP | ClusterIP / NodePort / LoadBalancer |
| edgebound.port | int | 30012 | Edge-side port (typically external) |
| edgebound.serviceName | string | <name>-edgebound-svc | Service name override |
| edgebound.serviceType | string | NodePort | Use LoadBalancer for cloud edge ingress |
| edgebound.tls.enabled | bool | false | Enable TLS on edgebound |
| edgebound.tls.optional | bool | false | If true, both TLS and plain connections are accepted |
| edgebound.tls.mtls | bool | false | Enable client cert verification |
| edgebound.tls.certificateKeySecretRef.name | string | - | User Secret with tls.crt, tls.key |
| edgebound.tls.caCertificateSecretRef.name | string | - | User Secret with ca.crt (mTLS only) |
| nodeAffinity | NodeAffinity | nil | Legacy: use pod.affinity instead |
| pod | PodOverrides | see §4.3 | Production-grade overrides for the frontier pod |
4.2 spec.frontlas
| Field | Type | Default | Notes |
|---|---|---|---|
| replicas | int | 1 | Frontlas pod count |
| image | string | singchia/frontlas:1.1.0 | Image override |
| controlplane.port | int | 40011 | Service-side control plane port |
| controlplane.frontierPlanePort | int | 40012 | Port used by frontier nodes to talk to frontlas |
| controlplane.service | string | <name>-frontlas-svc | Service name override |
| controlplane.serviceType | string | ClusterIP | Internal only by default |
| redis.addrs | []string | required | One or more Redis addrs |
| redis.redisType | string | required | standalone / sentinel / cluster |
| redis.db | int | 0 | DB index (standalone only) |
| redis.user | string | "" | For Redis ACL |
| redis.password | string | "" | Deprecated: use passwordSecret |
| redis.passwordSecret | SecretKeySelector | nil | Recommended. Wins over password; injected via valueFrom.secretKeyRef |
| redis.masterName | string | "" | Sentinel only |
| nodeAffinity | NodeAffinity | nil | Legacy: use pod.affinity |
| pod | PodOverrides | see §4.3 | Production-grade overrides for the frontlas pod |
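The Redis fields combine differently per redisType. As an illustration of the Sentinel case, the sketch below prints a frontlas fragment from its arguments. It assumes that addrs lists the Sentinel endpoints, which the table does not spell out; the function name, master name, and addresses are hypothetical.

```shell
# Sketch: print a spec.frontlas fragment for a Sentinel-backed Redis.
# Assumption: redis.addrs holds the Sentinel endpoints in sentinel mode.
# Usage: render_sentinel_redis <masterName> <secretName> <addr> [<addr>...]
render_sentinel_redis() {
  master_name=$1
  secret_name=$2
  shift 2
  printf 'frontlas:\n'
  printf '  redis:\n'
  printf '    redisType: sentinel\n'
  printf '    masterName: %s\n' "$master_name"
  printf '    addrs:\n'
  for addr in "$@"; do
    printf '      - %s\n' "$addr"
  done
  printf '    passwordSecret:\n'
  printf '      name: %s\n' "$secret_name"
  printf '      key: password\n'
}
```

For example, `render_sentinel_redis mymaster redis-creds sentinel-0:26379 sentinel-1:26379` emits a block that can be pasted under spec in the FrontierCluster manifest.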
4.3 spec.frontier.pod / spec.frontlas.pod (PodOverrides)
Every override is optional. When unset, the operator applies a production-grade default.
| Field | Type | Operator default | Use case |
|---|---|---|---|
| resources | ResourceRequirements | nil (BestEffort QoS) | Set CPU/memory requests + limits for production |
| nodeSelector | map[string]string | nil | Pin to nodes by label |
| tolerations | []Toleration | nil | Run on tainted nodes |
| topologySpreadConstraints | []TopologySpreadConstraint | nil | Cross-zone / cross-node spread |
| affinity | Affinity | only PodAntiAffinity (preferred host spread) | Setting this fully replaces the default and the legacy nodeAffinity field |
| priorityClassName | string | "" | Critical workload priority |
| serviceAccountName | string | default | Bind workload identity |
| imagePullSecrets | []LocalObjectReference | nil | Private registry credentials |
| imagePullPolicy | string | IfNotPresent | Use Always in dev when pinning latest |
| annotations | map[string]string | nil | Pod annotations: cert-manager, Prometheus scrape config, sidecar opt-in |
| labels | map[string]string | app=… | Extra labels (merged with selector labels) |
| podSecurityContext | PodSecurityContext | runAsNonRoot=true, UID/GID/FSGroup=65532, RuntimeDefault seccomp | Override only when an image needs root or a different UID |
| containerSecurityContext | SecurityContext | drop ALL caps, AllowPrivilegeEscalation=false, runAsNonRoot=true | Override to add a specific capability back |
| terminationGracePeriodSeconds | int64 | frontier=60, frontlas=30 | Long-lived edge connections need at least 60 |
| livenessProbe | Probe | TCP socket on edge port (frontier) / control port (frontlas) | Replace with HTTP probe in M3+ |
| readinessProbe | Probe | TCP socket on service port (frontier) / HTTP /cluster/v1/health (frontlas) | HTTP /readyz available since M3 |
| lifecycle | Lifecycle | preStop: sleep 10 (frontier) / sleep 5 (frontlas) | Lets kube-proxy remove pod from Service Endpoints before SIGTERM |
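The probe rows above suggest moving to HTTP probes once the image exposes the M3 observability endpoints. A minimal sketch of that override follows. It assumes the frontier observability port is 9091 and that setting a probe replaces the operator default outright; the function name and the period values are illustrative, not project defaults.

```shell
# Sketch: print a spec fragment that swaps frontier's default TCP probes for
# HTTP probes on the observability port. periodSeconds values are made up.
# Usage: render_frontier_http_probes [port]
render_frontier_http_probes() {
  port=${1:-9091}
  cat <<EOF
spec:
  frontier:
    pod:
      livenessProbe:
        httpGet:
          path: /healthz
          port: $port
        periodSeconds: 10
      readinessProbe:
        httpGet:
          path: /readyz
          port: $port
        periodSeconds: 5
EOF
}
```

The output uses the standard Kubernetes Probe shape, so it can be merged into an existing FrontierCluster manifest under the same keys.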
5. Common scenarios
5.1 Edge mTLS
Provide both a server cert/key and a CA. The operator copies them into namespace-scoped Secrets and mounts them into the frontier pod at /app/conf/edgebound/tls/secret and /app/conf/edgebound/tls/ca.
apiVersion: v1
kind: Secret
metadata:
  name: edge-server-cert
type: kubernetes.io/tls
data:
  tls.crt: ... # base64-encoded PEM cert
  tls.key: ... # base64-encoded PEM key
---
apiVersion: v1
kind: Secret
metadata:
  name: edge-ca
data:
  ca.crt: ... # base64-encoded PEM CA
---
apiVersion: frontier.singchia.io/v1alpha1
kind: FrontierCluster
metadata:
  name: prod
spec:
  frontier:
    edgebound:
      port: 8443
      serviceType: LoadBalancer
      tls:
        enabled: true
        mtls: true
        certificateKeySecretRef:
          name: edge-server-cert
        caCertificateSecretRef:
          name: edge-ca
  frontlas: { ... }
5.2 Production resources + scheduling
spec:
  frontier:
    replicas: 6
    pod:
      resources:
        requests: { cpu: "500m", memory: "512Mi" }
        limits: { cpu: "2", memory: "2Gi" }
      tolerations:
        - key: workload
          operator: Equal
          value: edge-gateway
          effect: NoSchedule
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels: { app: prod-frontier }
      priorityClassName: frontend-critical
      terminationGracePeriodSeconds: 120
5.3 Private image registry
spec:
  frontier:
    image: my-registry.example.com/frontier:1.2.4
    pod:
      imagePullSecrets:
        - name: my-registry-creds
      imagePullPolicy: IfNotPresent
  frontlas:
    image: my-registry.example.com/frontlas:1.2.4
    pod:
      imagePullSecrets:
        - name: my-registry-creds
5.4 Annotations for Prometheus + cert-manager
spec:
  frontier:
    pod:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9091"
        prometheus.io/path: "/metrics"
  frontlas:
    pod:
      annotations:
        cert-manager.io/inject-ca-from: frontier/frontier-ca
5.5 Override SecurityContext for legacy images
If your custom frontier image needs root or a non-65532 UID, opt out of the default explicitly:
spec:
  frontier:
    pod:
      podSecurityContext: {} # drop the default nonRoot/UID
      containerSecurityContext:
        runAsNonRoot: false
        capabilities:
          drop: [] # keep capabilities
6. Status & Conditions
The CRD has a status subresource (read-only for users):
status:
  phase: Running # Pending | Running | Failed (deprecated, kept for printcolumn)
  message: "Good to go!"
  observedGeneration: 7 # metadata.generation that this status reflects
  frontierReadyReplicas: 6
  frontlasReadyReplicas: 2
  conditions:
    - type: Available
      status: "True"
      reason: AllComponentsReady
      lastTransitionTime: "2026-05-02T01:23:45Z"
      observedGeneration: 7
    - type: Progressing
      status: "False"
      reason: ReconcileSucceeded
    - type: Degraded
      status: "False"
Three conditions are maintained:
- Available: True when both Deployments report all replicas ready.
- Progressing: True while the operator is still reconciling toward desired state.
- Degraded: True when reconcile failed (TLS Secret missing, deployment error, etc.). Inspect kubectl describe fc for the Events stream.
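Because these are ordinary status conditions, generic tooling such as `kubectl wait` can gate on them in scripts and CI. The following is a sketch under that assumption; the function names are hypothetical, and the health rule (Available is True and Degraded is not True) is one reasonable reading of the list above, not an official definition.

```shell
# Sketch: gate a script on FrontierCluster conditions.

# Block until the cluster reports Available (or the timeout expires).
# Usage: wait_fc_available <namespace> <name> [timeout]
wait_fc_available() {
  kubectl -n "$1" wait "frontierclusters/$2" \
    --for=condition=Available --timeout="${3:-120s}"
}

# Print one "Type=Status" line per condition.
# Usage: fc_conditions <namespace> <name>
fc_conditions() {
  kubectl -n "$1" get frontierclusters "$2" \
    -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'
}

# Read "Type=Status" lines on stdin; succeed only when Available=True is
# present and Degraded=True is absent.
fc_is_healthy() {
  lines=$(cat)
  printf '%s\n' "$lines" | grep -qx 'Available=True' || return 1
  if printf '%s\n' "$lines" | grep -qx 'Degraded=True'; then
    return 1
  fi
  return 0
}
```

A typical use would be `wait_fc_available frontier prod && fc_conditions frontier prod | fc_is_healthy`.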
7. Observability endpoints (since M3)
Both frontier and frontlas expose three HTTP endpoints on a separate port:
| Endpoint | Frontier port | Frontlas port | Semantics |
|---|---|---|---|
| /healthz | 9091 | 9092 | Liveness: 200 if process responds |
| /readyz | 9091 | 9092 | Readiness: 503 with details when not ready (e.g. Redis unreachable for frontlas) |
| /metrics | 9091 | 9092 | Prometheus default registry: Go runtime + process metrics |
Configure via the observability block in frontier.yaml / frontlas.yaml:
observability:
  enable: true
  addr: 0.0.0.0:9091
The endpoints are on by default; set enable: false to disable them.
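For a quick manual check, the endpoints can be reached through a port-forward. The sketch below is only an illustration: the function names are hypothetical, it assumes /readyz answers 200 when ready (the table only documents the 503 case), and it uses a fixed sleep instead of properly waiting for the tunnel.

```shell
# Sketch: probe a pod's /readyz endpoint through kubectl port-forward.

# Map an HTTP status code from /readyz to a verdict.
# Assumption: 200 means ready; 503 is the documented not-ready code.
readyz_verdict() {
  case "$1" in
    200) echo ready ;;
    503) echo not-ready ;;
    *)   echo unknown ;;
  esac
}

# Usage: probe_observability <namespace> <pod> [port]
# Use 9091 for frontier pods and 9092 for frontlas pods.
probe_observability() {
  port=${3:-9091}
  kubectl -n "$1" port-forward "pod/$2" "$port:$port" >/dev/null 2>&1 &
  pf_pid=$!
  sleep 2 # crude wait for the tunnel; fine for a manual check
  code=$(curl -s -o /dev/null -w '%{http_code}' "http://127.0.0.1:$port/readyz")
  kill "$pf_pid" 2>/dev/null
  echo "/readyz -> $code ($(readyz_verdict "$code"))"
}
```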
8. Common operations
# CRUD with the short name
kubectl get fc
kubectl describe fc prod
kubectl edit fc prod
kubectl delete fc prod
# Inspect Conditions
kubectl get fc prod -o jsonpath='{.status.conditions}' | jq
# Watch reconcile events
kubectl describe fc prod | tail -20
# Patch the replica count without an editor
kubectl patch fc prod --type=merge -p '{"spec":{"frontier":{"replicas":4}}}'
9. Operator behavior
- Reconcile order. Service → TLS Secrets → Frontlas Deployment → (wait until ready) → Frontier Deployment.
- Owner references. Deployments + Services + operator-managed Secrets all carry the FrontierCluster as owner; deleting the CR cascades to all of them.
- Graceful shutdown. Frontier honors FRONTIER_DRAIN_SECONDS (operator injects terminationGracePeriodSeconds - 10): on SIGTERM it waits this many seconds before tearing connections down, letting kube-proxy fully drop the pod from Service Endpoints first.
- Events. Each meaningful state transition emits a Kubernetes Event: ServiceEnsureFailed, TLSEnsureFailed, DeploymentEnsureFailed, Available (one-shot when the cluster first becomes ready).
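The drain-window arithmetic from the graceful-shutdown item can be written down as a small helper, which is handy when choosing a terminationGracePeriodSeconds override. This is a sketch, not operator code: the function name is hypothetical, and clamping at zero for grace periods under 10 seconds is an assumption, since the page does not say what the operator does in that case.

```shell
# Sketch: compute the FRONTIER_DRAIN_SECONDS value described above, i.e.
# terminationGracePeriodSeconds minus 10.
# Assumption: the result is clamped at 0 for very short grace periods.
# Usage: drain_seconds <terminationGracePeriodSeconds>
drain_seconds() {
  drain=$(( $1 - 10 ))
  if [ "$drain" -lt 0 ]; then
    drain=0
  fi
  echo "$drain"
}
```

With the frontier default grace period of 60, `drain_seconds 60` prints 50, which lines up with the Helm chart's 60 / 50 defaults; the 120-second override from section 5.2 gives 110.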
10. Troubleshooting
| Symptom | Likely cause | Where to look |
|---|---|---|
| Frontier pod CrashLoopBackOff with connect: connection refused on the frontier-plane port | Frontlas not yet ready, or Redis unreachable from frontlas | kubectl describe fc Conditions; kubectl logs deploy/<name>-frontlas |
| Frontier pod fails to start: container can't run as nonRoot | Custom image without a non-root USER directive | Override spec.frontier.pod.podSecurityContext + containerSecurityContext, or use singchia/frontier:1.2.4+ which ships with USER 65532 |
| Status stays Pending for minutes | One of the Deployments not converging on ready replicas | kubectl describe fc + kubectl get pods + pod Events |
| TLS-enabled cluster can't serve mTLS | Missing ca.crt in the user CA Secret, or the operator-managed Secret was deleted manually | Operator log: Error ensuring tls secret; check user Secret keys exactly match tls.crt, tls.key, ca.crt |
| Cluster keeps re-reconciling but never settles | Some required spec field changed (e.g. ServiceType) and K8s rejects the update | Operator log + kubectl get events |
| Redis password is visible in kubectl describe pod | Using deprecated spec.frontlas.redis.password instead of passwordSecret | Move to passwordSecret, which is injected via valueFrom.secretKeyRef with no plaintext leak |
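For the last row, the move from the inline password to passwordSecret can be done with a single merge patch. The sketch below assumes the Secret already exists in the cluster's namespace and relies on JSON merge patch semantics, where a null value removes the field; the function names are hypothetical.

```shell
# Sketch: switch a FrontierCluster from redis.password to redis.passwordSecret.

# Build the JSON merge patch: null drops the plaintext field, and
# passwordSecret points at an existing Secret key.
# Usage: redis_secret_patch <secretName> <key>
redis_secret_patch() {
  printf '{"spec":{"frontlas":{"redis":{"password":null,"passwordSecret":{"name":"%s","key":"%s"}}}}}\n' "$1" "$2"
}

# Apply the patch to a cluster.
# Usage: migrate_redis_password <namespace> <cluster> <secretName> <key>
migrate_redis_password() {
  kubectl -n "$1" patch fc "$2" --type=merge -p "$(redis_secret_patch "$3" "$4")"
}
```

For the quick-start cluster this would be `migrate_redis_password frontier prod redis-creds password`.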
11. Known limitations
- No kubectl scale: the spec has two replica fields (frontier & frontlas) so the scale subresource isn't enabled. Patch spec.frontier.replicas directly. HPA targets the underlying Deployments instead.
- v1alpha1: no compatibility guarantees between alpha versions. The next bump goes to v1beta1 alongside conversion machinery.
- Helm chart only ships frontier templates: the operator path (this page) is the recommended deployment route. Helm-only users should bring their own frontlas + Redis manifests until the chart catches up.
- No webhook validation: bad input (e.g. negative replicas, invalid redisType) is caught at reconcile time, not at kubectl apply.
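Until admission validation exists, a small client-side check before kubectl apply can catch the two bad inputs named in the last item. This is a sketch of such a pre-flight check with a hypothetical function name; it only covers those two fields and does not replace schema validation.

```shell
# Sketch: reject a negative or non-numeric replica count and an unknown
# redisType before the manifest reaches the cluster.
# Usage: validate_fc_inputs <replicas> <redisType>
validate_fc_inputs() {
  case "$1" in
    ''|*[!0-9]*)
      echo "replicas must be a non-negative integer: '$1'" >&2
      return 1 ;;
  esac
  case "$2" in
    standalone|sentinel|cluster) ;;
    *)
      echo "redisType must be standalone, sentinel, or cluster: '$2'" >&2
      return 1 ;;
  esac
}
```

A wrapper script could run `validate_fc_inputs 2 standalone && kubectl apply -f frontiercluster.yaml`, so that a typo fails fast instead of surfacing later as a Degraded condition.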
12. Roadmap
This page reflects RFC-001 “Cloud-native optimization” through M3 (observability) and M4 (Status conditions + EventRecorder + CRD ergonomics). Open RFC content lives at docs/rfc/RFC-001-cloud-native-optimization.md in the repository.