Kubernetes Operator & CRD

Run Frontier on Kubernetes with a single FrontierCluster resource. The operator handles deployments, TLS secrets, services, probes, security context, and graceful shutdown.

1. Overview

The Frontier operator manages a two-tier deployment: Frontier (data plane, stateless edge gateway) and Frontlas (control plane, Redis-backed coordinator). Both are reconciled from a single namespaced custom resource, FrontierCluster, in API group frontier.singchia.io, version v1alpha1.

With one CR you get: two Deployments, three Services (servicebound, edgebound, controlplane), TLS material copied into operator-managed Secrets, sane production defaults (probes, preStop, non-root, drop-all caps, preferred anti-affinity), and a status that surfaces ready replicas and Conditions.

2. Installation

2.1 Install the CRD and operator

# Apply the CRD + operator deployment + RBAC in one shot
kubectl apply -f https://raw.githubusercontent.com/singchia/frontier/main/pkg/operator/dist/install.yaml

# Or from a local checkout
git clone https://github.com/singchia/frontier.git
kubectl apply -f frontier/pkg/operator/dist/install.yaml

Verify the CRD and operator pod:

kubectl get crd frontierclusters.frontier.singchia.io
kubectl get all -n frontier-operator-system
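
For scripted installs, the two checks above can be turned into blocking waits. A minimal sketch, assuming the operator runs as one or more Deployments in the frontier-operator-system namespace; the function name and the default timeout are illustrative and not part of the project:

```shell
# wait_for_operator [timeout]
# Blocks until the FrontierCluster CRD is Established and every Deployment in
# the operator namespace reports Available. The function name and the default
# timeout are illustrative choices for this sketch.
wait_for_operator() {
  wait_timeout="${1:-120s}"
  kubectl wait --for=condition=Established \
    crd/frontierclusters.frontier.singchia.io --timeout="$wait_timeout" &&
  kubectl -n frontier-operator-system wait --for=condition=Available \
    deployment --all --timeout="$wait_timeout"
}
```

Call it as wait_for_operator 60s; a non-zero exit status means one of the waits timed out or failed.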

2.2 RBAC

The bundled install.yaml creates a ClusterRole granting the operator:

frontier.singchia.io/frontierclusters — full CRUD + status
apps/deployments — full CRUD (manages frontier & frontlas Deployments)
core/services, secrets, pods — full CRUD (services, TLS material, pod inspection)
core/events — create + patch (used by EventRecorder)

If you tighten this further, keep at least get;list;watch on those resources or reconcile will fail.
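
Before rolling out a tightened ClusterRole, the minimum verbs can be checked with kubectl auth can-i and impersonation. This is a sketch under assumptions: your own user must be allowed to impersonate ServiceAccounts, the operator ServiceAccount name is not stated on this page (so it is a required argument), and the function name is hypothetical.

```shell
# check_operator_rbac <serviceaccount> [namespace]
# Prints every verb/resource pair from the list above that the ServiceAccount
# is NOT allowed to perform, and returns non-zero if any is missing.
# The ServiceAccount name must be supplied; it is not documented here.
check_operator_rbac() {
  sa="system:serviceaccount:${2:-frontier-operator-system}:$1"
  missing=0
  for resource in frontierclusters.frontier.singchia.io deployments.apps services secrets pods; do
    for verb in get list watch; do
      if ! kubectl auth can-i "$verb" "$resource" --as="$sa" --all-namespaces >/dev/null 2>&1; then
        echo "missing: $verb $resource"
        missing=1
      fi
    done
  done
  return "$missing"
}
```

The check covers only the get;list;watch floor mentioned above; write verbs can be added to the inner loop if you also want to confirm create, update, and delete.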

2.3 Alternative: install with Helm

Prefer Helm? The chart at dist/helm/ deploys both frontier and frontlas in one shot, with an optional bundled bitnami/redis subchart. Defaults track the operator's production-grade settings (non-root UID 65532, drop-all capabilities, preferred host anti-affinity, preStop sleep, configurable drain window, observability endpoints on 9091/9092). Pick this path if your platform standardizes on Helm/ArgoCD/Flux and you don't need the operator's reconcile-driven self-healing for TLS Secrets and Status conditions.

Quick install from the official chart repo (recommended):

helm repo add frontier https://singchia.github.io/frontier
helm repo update
helm install frontier frontier/frontier \
  --namespace frontier --create-namespace

The repo lives on the gh-pages branch and is served by GitHub Pages — same chart, same digest as the source tree under dist/helm/ in the repository.

Alternative install paths:

# A) Single .tgz from a GitHub Release (pinned to a specific chart version)
helm install frontier \
  https://github.com/singchia/frontier/releases/download/helm-chart-v1.2.5-rc1/frontier-1.2.5.tgz \
  -n frontier --create-namespace

# B) From a local checkout (for chart development)
cd frontier/dist/helm
helm dependency update
helm install frontier . -n frontier --create-namespace

Bring your own Redis (set redis.enabled: false and point Frontlas at it):

# my-values.yaml
redis:
  enabled: false

frontlas:
  externalRedis:
    addrs:
      - redis.shared:6379
    redisType: standalone
    passwordSecret:
      name: redis-creds      # must already exist in the release namespace
      key:  password

helm install frontier frontier/frontier -n frontier -f my-values.yaml

Common knobs in values.yaml (full listing: helm show values dist/helm/):

Path	Default	Notes
`frontier.replicaCount` / `frontlas.replicaCount`	1 / 1	Independent scaling per component
`frontier.image.tag` / `frontlas.image.tag`	`""` → `Chart.AppVersion`	Override to pin a specific binary version
`global.registry`	`singchia`	Mirror to your private registry
`global.imagePullSecrets`	`[]`	Private registry credentials
`frontier.service.edgebound.type`	NodePort	Switch to LoadBalancer for cloud edge ingress
`frontier.podSecurityContext` / `containerSecurityContext`	nonRoot UID 65532, drop ALL caps	Set to `{}` if your custom image needs root
`frontier.terminationGracePeriodSeconds` / `frontier.drainSeconds`	60 / 50	Long-lived edge connections; drain < grace
`frontier.autoscaling.enabled`	false	HPA on the frontier Deployment
`observability.frontier.enabled` / `observability.frontlas.enabled`	true / true	Toggle the `/healthz` `/readyz` `/metrics` endpoints (ports 9091 / 9092)
`serviceMonitor.enabled`	false	Opt in if prometheus-operator is installed
`redis.enabled`	true	Set false to use external Redis (configure under `frontlas.externalRedis`)
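
Individual knobs can also be set on the command line instead of a values file. The sketch below uses only paths listed in the table; the values themselves (3 replicas, LoadBalancer, a 120 s grace period with a 110 s drain) are examples, and the wrapper function name is hypothetical.

```shell
# frontier_helm_upgrade [extra helm args...]
# Installs or upgrades the release with a few overrides from the table above.
# The override values are illustrative; keep drainSeconds below the grace period.
frontier_helm_upgrade() {
  helm upgrade --install frontier frontier/frontier \
    --namespace frontier --create-namespace \
    --set frontier.replicaCount=3 \
    --set frontier.service.edgebound.type=LoadBalancer \
    --set frontier.terminationGracePeriodSeconds=120 \
    --set frontier.drainSeconds=110 \
    "$@"
}
```

Running frontier_helm_upgrade --dry-run renders the release without applying it, which is a cheap way to review the effect of the overrides first.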

Operator vs Helm — pick one:

Concern	Operator	Helm
Custom Resource (declarative)	✅ FrontierCluster CR	❌ values.yaml + Deployments directly
Status & Conditions per cluster	✅ Available / Progressing / Degraded	❌ Inspect underlying Deployments
Self-healing on TLS Secret rotation	✅ Reconciler watches Secrets	❌ Manual `helm upgrade`
Multiple clusters in one namespace	✅ Each CR is isolated	⚠️ Need separate releases
Bundled Redis	❌ BYO	✅ `bitnami/redis` subchart
ArgoCD / Flux GitOps	✅ (commit the CR)	✅ (commit values.yaml)
Initial install footprint	Operator pod + CRD + RBAC	No long-running operator

You can also publish the chart for downstream consumers: helm package dist/helm/ -d /path/to/repo produces frontier-1.2.5.tgz; serve the directory with any HTTP server or helm push to OCI.
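
A plain HTTP chart repository also needs an index.yaml next to the packaged archives, which helm repo index generates. A minimal sketch; the function name and all three arguments (chart directory, output directory, public base URL) are placeholders:

```shell
# publish_chart <chart_dir> <repo_dir> <base_url>
# Packages the chart into repo_dir, then (re)generates repo_dir/index.yaml so
# the directory can be served as a Helm repository at base_url.
publish_chart() {
  helm package "$1" -d "$2" &&
  helm repo index "$2" --url "$3"
}
```

For example, publish_chart dist/helm/ /path/to/repo https://charts.example.com, where the URL is a stand-in for wherever the directory will be served.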

3. Quick start

Minimum viable cluster: 2 frontier replicas, 1 frontlas replica, external Redis. Save as frontiercluster.yaml:

apiVersion: frontier.singchia.io/v1alpha1
kind: FrontierCluster
metadata:
  name: prod
  namespace: frontier
spec:
  frontier:
    replicas: 2
  frontlas:
    replicas: 1
    redis:
      addrs:
        - redis.frontier:6379
      passwordSecret:
        name: redis-creds
        key: password
      redisType: standalone

kubectl create namespace frontier
kubectl -n frontier create secret generic redis-creds --from-literal=password=...
kubectl apply -f frontiercluster.yaml

Wait for it to come up, then check:

kubectl -n frontier get fc                  # short name 'fc' is registered
NAME   PHASE     FRONTIER   FRONTLAS   AGE
prod   Running   2          1          47s

kubectl -n frontier describe fc prod         # see Conditions + Events
kubectl -n frontier get pods                 # frontier + frontlas pods
kubectl -n frontier get svc                  # 3 services rendered
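
In CI, the manual check can be replaced by a blocking wait on the Available condition described in section 6. A sketch, assuming the default namespace from this quick start; the function name and the default timeout are illustrative:

```shell
# wait_for_cluster <name> [namespace] [timeout]
# Blocks until the FrontierCluster reports condition Available=True,
# or fails when the timeout elapses.
wait_for_cluster() {
  kubectl -n "${2:-frontier}" wait \
    "frontierclusters.frontier.singchia.io/$1" \
    --for=condition=Available --timeout="${3:-180s}"
}
```

wait_for_cluster prod waits for the cluster created above; the fully qualified resource name is used so the call does not depend on short-name resolution.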

4. CRD field reference

Everything below lives under spec. In the tables that follow, only frontlas.redis.addrs and frontlas.redis.redisType are marked required; every other field is optional and falls back to the default shown.

4.1 `spec.frontier`

Field	Type	Default	Notes
`replicas`	int	1	Frontier pod count
`image`	string	`singchia/frontier:1.1.0`	Override to pin a specific tag
`servicebound.port`	int	30011	Service-side TCP/gRPC port
`servicebound.service`	string	`<name>-servicebound-svc`	Service name override
`servicebound.serviceType`	string	ClusterIP	ClusterIP / NodePort / LoadBalancer
`edgebound.port`	int	30012	Edge-side port (typically external)
`edgebound.serviceName`	string	`<name>-edgebound-svc`	Service name override
`edgebound.serviceType`	string	NodePort	Use LoadBalancer for cloud edge ingress
`edgebound.tls.enabled`	bool	false	Enable TLS on edgebound
`edgebound.tls.optional`	bool	false	If true, both TLS and plaintext connections are accepted
`edgebound.tls.mtls`	bool	false	Enable client cert verification
`edgebound.tls.certificateKeySecretRef.name`	string	—	User Secret with `tls.crt`, `tls.key`
`edgebound.tls.caCertificateSecretRef.name`	string	—	User Secret with `ca.crt` (mTLS only)
`nodeAffinity`	NodeAffinity	nil	Legacy — use `pod.affinity` instead
`pod`	PodOverrides	see §4.3	Production-grade overrides for the frontier pod

4.2 `spec.frontlas`

Field	Type	Default	Notes
`replicas`	int	1	Frontlas pod count
`image`	string	`singchia/frontlas:1.1.0`	Image override
`controlplane.port`	int	40011	Service-side control plane port
`controlplane.frontierPlanePort`	int	40012	Port used by frontier nodes to talk to frontlas
`controlplane.service`	string	`<name>-frontlas-svc`	Service name override
`controlplane.serviceType`	string	ClusterIP	Internal only by default
`redis.addrs`	[]string	required	One or more Redis addrs
`redis.redisType`	string	required	`standalone` / `sentinel` / `cluster`
`redis.db`	int	0	DB index (standalone only)
`redis.user`	string	""	For Redis ACL
`redis.password`	string	""	Deprecated — use `passwordSecret`
`redis.passwordSecret`	SecretKeySelector	nil	Recommended. Wins over `password`; injected via `valueFrom.secretKeyRef`
`redis.masterName`	string	""	Sentinel only
`nodeAffinity`	NodeAffinity	nil	Legacy — use `pod.affinity`
`pod`	PodOverrides	see §4.3	Production-grade overrides for the frontlas pod

4.3 `spec.frontier.pod` / `spec.frontlas.pod` (PodOverrides)

Every override is optional. When unset, the operator applies a production-grade default.

Field	Type	Operator default	Use case
`resources`	ResourceRequirements	nil (BestEffort QoS)	Set CPU/memory requests + limits for production
`nodeSelector`	map[string]string	nil	Pin to nodes by label
`tolerations`	[]Toleration	nil	Run on tainted nodes
`topologySpreadConstraints`	[]TopologySpreadConstraint	nil	Cross-zone / cross-node spread
`affinity`	Affinity	only PodAntiAffinity (preferred host spread)	Setting this fully replaces the default and the legacy `nodeAffinity` field
`priorityClassName`	string	""	Critical workload priority
`serviceAccountName`	string	default	Bind workload identity
`imagePullSecrets`	[]LocalObjectReference	nil	Private registry credentials
`imagePullPolicy`	string	`IfNotPresent`	Use `Always` in dev when tracking a mutable tag such as `latest`
`annotations`	map[string]string	nil	Pod annotations — cert-manager, Prometheus scrape config, sidecar opt-in
`labels`	map[string]string	app=…	Extra labels (merged with selector labels)
`podSecurityContext`	PodSecurityContext	runAsNonRoot=true, UID/GID/FSGroup=65532, RuntimeDefault seccomp	Override only when an image needs root or a different UID
`containerSecurityContext`	SecurityContext	drop ALL caps, AllowPrivilegeEscalation=false, runAsNonRoot=true	Override to add a specific capability back
`terminationGracePeriodSeconds`	int64	frontier=60, frontlas=30	Long-lived edge connections need at least 60
`livenessProbe`	Probe	TCP socket on edge port (frontier) / control port (frontlas)	Replace with HTTP probe in M3+
`readinessProbe`	Probe	TCP socket on service port (frontier) / HTTP `/cluster/v1/health` (frontlas)	HTTP `/readyz` available since M3
`lifecycle`	Lifecycle	preStop: `sleep 10` (frontier) / `sleep 5` (frontlas)	Lets kube-proxy remove pod from Service Endpoints before SIGTERM
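
As an example of overriding a single field, the frontier readiness probe can be switched from the default TCP check to the HTTP /readyz endpoint. This sketch assumes the readinessProbe override accepts a standard Kubernetes Probe (the table lists its type as Probe) and that the observability endpoint is enabled on port 9091; the function name is hypothetical.

```shell
# use_http_readiness <name> [namespace]
# Merge-patches spec.frontier.pod.readinessProbe to an HTTP GET on /readyz.
# Port 9091 is the frontier observability port listed in section 7.
use_http_readiness() {
  kubectl -n "${2:-frontier}" patch fc "$1" --type=merge -p \
    '{"spec":{"frontier":{"pod":{"readinessProbe":{"httpGet":{"path":"/readyz","port":9091}}}}}}'
}
```

Because this is a merge patch, the other PodOverrides fields on the resource are left as they are.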

5. Common scenarios

5.1 Edge mTLS

Provide both a server cert/key and a CA. The operator copies them into namespace-scoped Secrets and mounts them into the frontier pod at /app/conf/edgebound/tls/secret and /app/conf/edgebound/tls/ca.

apiVersion: v1
kind: Secret
metadata:
  name: edge-server-cert
type: kubernetes.io/tls
data:
  tls.crt: ...    # base64-encoded PEM cert
  tls.key: ...    # base64-encoded PEM key
---
apiVersion: v1
kind: Secret
metadata:
  name: edge-ca
data:
  ca.crt: ...     # base64-encoded PEM CA
---
apiVersion: frontier.singchia.io/v1alpha1
kind: FrontierCluster
metadata:
  name: prod
spec:
  frontier:
    edgebound:
      port: 8443
      serviceType: LoadBalancer
      tls:
        enabled: true
        mtls: true
        certificateKeySecretRef:
          name: edge-server-cert
        caCertificateSecretRef:
          name: edge-ca
  frontlas: { ... }
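
The two user Secrets can also be created straight from PEM files, which avoids base64-encoding by hand. A sketch using the Secret names from the manifest above; the function name and the file paths passed to it are placeholders:

```shell
# create_edge_tls_secrets <namespace> <server.crt> <server.key> <ca.crt>
# Creates edge-server-cert (type kubernetes.io/tls, keys tls.crt and tls.key)
# and edge-ca (key ca.crt) from local PEM files.
create_edge_tls_secrets() {
  kubectl -n "$1" create secret tls edge-server-cert --cert="$2" --key="$3" &&
  kubectl -n "$1" create secret generic edge-ca --from-file=ca.crt="$4"
}
```

For example, create_edge_tls_secrets frontier server.crt server.key ca.crt; the Secrets must live in the same namespace as the FrontierCluster that references them.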

5.2 Production resources + scheduling

spec:
  frontier:
    replicas: 6
    pod:
      resources:
        requests: { cpu: "500m", memory: "512Mi" }
        limits:   { cpu: "2",    memory: "2Gi" }
      tolerations:
        - key: workload
          operator: Equal
          value: edge-gateway
          effect: NoSchedule
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels: { app: prod-frontier }
      priorityClassName: frontend-critical
      terminationGracePeriodSeconds: 120

5.3 Private image registry

spec:
  frontier:
    image: my-registry.example.com/frontier:1.2.4
    pod:
      imagePullSecrets:
        - name: my-registry-creds
      imagePullPolicy: IfNotPresent
  frontlas:
    image: my-registry.example.com/frontlas:1.2.4
    pod:
      imagePullSecrets:
        - name: my-registry-creds

5.4 Annotations for Prometheus + cert-manager

spec:
  frontier:
    pod:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9091"
        prometheus.io/path: "/metrics"
  frontlas:
    pod:
      annotations:
        cert-manager.io/inject-ca-from: frontier/frontier-ca

5.5 Override SecurityContext for legacy images

If your custom frontier image needs root or a non-65532 UID, opt out of the default explicitly:

spec:
  frontier:
    pod:
      podSecurityContext: {}                    # drop the default nonRoot/UID
      containerSecurityContext:
        runAsNonRoot: false
        capabilities:
          drop: []                              # keep capabilities

6. Status & Conditions

The CRD has a status subresource (read-only for users):

status:
  phase: Running                # Pending | Running | Failed (deprecated, kept for printcolumn)
  message: "Good to go!"
  observedGeneration: 7         # metadata.generation that this status reflects
  frontierReadyReplicas: 6
  frontlasReadyReplicas: 2
  conditions:
    - type: Available
      status: "True"
      reason: AllComponentsReady
      lastTransitionTime: "2026-05-02T01:23:45Z"
      observedGeneration: 7
    - type: Progressing
      status: "False"
      reason: ReconcileSucceeded
    - type: Degraded
      status: "False"

Three conditions are maintained:

Available — True when both Deployments report all replicas ready.
Progressing — True while the operator is still reconciling toward desired state.
Degraded — True when reconcile failed (TLS Secret missing, deployment error, etc.). Inspect kubectl describe fc for the Events stream.
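
For scripts and alerting hooks, a single condition can be read with a JSONPath filter. A minimal sketch built on the three condition types above; both function names are hypothetical, and the health rule (Available is True and Degraded is not True) is one reasonable reading rather than an official definition:

```shell
# fc_condition <name> <type> [namespace]
# Prints the status ("True", "False", ...) of one condition type.
fc_condition() {
  kubectl -n "${3:-frontier}" get fc "$1" \
    -o jsonpath="{.status.conditions[?(@.type==\"$2\")].status}"
}

# fc_is_healthy <name> [namespace]
# Succeeds only when Available is True and Degraded is not True.
fc_is_healthy() {
  [ "$(fc_condition "$1" Available "$2")" = "True" ] &&
  [ "$(fc_condition "$1" Degraded "$2")" != "True" ]
}
```

A typical use is fc_is_healthy prod || kubectl -n frontier describe fc prod, which prints the Events stream only when something is wrong.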

7. Observability endpoints (since M3)

Both frontier and frontlas expose three HTTP endpoints on a separate port:

Endpoint	Frontier port	Frontlas port	Semantics
`/healthz`	9091	9092	Liveness — 200 if process responds
`/readyz`	9091	9092	Readiness — 503 with details when not ready (e.g. Redis unreachable for frontlas)
`/metrics`	9091	9092	Prometheus default registry — Go runtime + process metrics

Configure via the observability block in frontier.yaml / frontlas.yaml (the frontier values are shown; frontlas uses port 9092):

observability:
  enable: true
  addr: 0.0.0.0:9091

Observability is enabled by default; set enable: false to turn the endpoints off.
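
The endpoints can be probed from a workstation through a port-forward. This is a rough sketch: the function name is hypothetical, the fixed two-second pause is a crude stand-in for a proper retry loop, and the Deployment name must be supplied by you (the troubleshooting table refers to the frontlas Deployment as <name>-frontlas).

```shell
# check_observability <deployment> <port> [namespace]
# Forwards the observability port locally and prints the HTTP status code
# returned by /healthz and /readyz.
check_observability() {
  kubectl -n "${3:-frontier}" port-forward "deploy/$1" "$2:$2" >/dev/null 2>&1 &
  pf_pid=$!
  sleep 2   # crude wait for the tunnel; a retry loop would be more robust
  for endpoint in healthz readyz; do
    code="$(curl -s -o /dev/null -w '%{http_code}' "http://127.0.0.1:$2/$endpoint")"
    echo "/$endpoint -> $code"
  done
  kill "$pf_pid" 2>/dev/null || true
}
```

For example, check_observability prod-frontlas 9092. Per the table above, a 503 on /readyz means the process is alive but not ready, for instance because Redis is unreachable.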

8. Common operations

# CRUD with the short name
kubectl get fc
kubectl describe fc prod
kubectl edit fc prod
kubectl delete fc prod

# Inspect Conditions
kubectl get fc prod -o jsonpath='{.status.conditions}' | jq

# Watch reconcile events
kubectl describe fc prod | tail -20

# Patch the replica count without an editor
kubectl patch fc prod --type=merge -p '{"spec":{"frontier":{"replicas":4}}}'

9. Operator behavior

Reconcile order. Service → TLS Secrets → Frontlas Deployment → (wait until ready) → Frontier Deployment.
Owner references. Deployments + Services + operator-managed Secrets all carry the FrontierCluster as owner; deleting the CR cascades to all of them.
Graceful shutdown. Frontier honors FRONTIER_DRAIN_SECONDS (operator injects terminationGracePeriodSeconds - 10): on SIGTERM it waits this many seconds before tearing connections down, letting kube-proxy fully drop the pod from Service Endpoints first.
Events. Each meaningful state transition emits a Kubernetes Event: ServiceEnsureFailed, TLSEnsureFailed, DeploymentEnsureFailed, Available (one-shot when the cluster first becomes ready).
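
The drain arithmetic from the graceful-shutdown rule can be sketched as a small helper. Kubernetes counts preStop time against the grace period, so with the frontier defaults the preStop sleep (10 s) plus the drain window (50 s) exactly fills the 60 s grace period. The function name is hypothetical, and clamping at zero is an assumption of this sketch rather than documented operator behavior.

```shell
# drain_seconds <terminationGracePeriodSeconds>
# Mirrors the rule stated above: FRONTIER_DRAIN_SECONDS = grace period - 10.
# Clamping at 0 for grace periods below 10 is an assumption of this sketch,
# not documented operator behavior.
drain_seconds() {
  d=$(( $1 - 10 ))
  if [ "$d" -lt 0 ]; then
    d=0
  fi
  echo "$d"
}
```

drain_seconds 60 prints 50, matching the frontier default, and drain_seconds 120 prints 110 for the grace period used in section 5.2.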

10. Troubleshooting

Symptom	Likely cause	Where to look
Frontier pod CrashLoopBackOff with `connect: connection refused` on the frontier-plane port	Frontlas not yet ready, or Redis unreachable from frontlas	`kubectl describe fc` Conditions; `kubectl logs deploy/<name>-frontlas`
Frontier pod fails to start: container can't run as nonRoot	Custom image without a non-root USER directive	Override `spec.frontier.pod.podSecurityContext` + `containerSecurityContext`, or use `singchia/frontier:1.2.4+` which ships with USER 65532
Status stays Pending for minutes	One of the Deployments not converging on ready replicas	`kubectl describe fc` + `kubectl get pods` + pod Events
TLS-enabled cluster can't serve mTLS	Missing `ca.crt` in the user CA Secret, or the operator-managed Secret was deleted manually	Operator log: `Error ensuring tls secret`; check user Secret keys exactly match `tls.crt`, `tls.key`, `ca.crt`
Cluster keeps re-reconciling but never settles	Some required spec field changed (e.g. ServiceType) and K8s rejects the update	Operator log + `kubectl get events`
Redis password is visible in `kubectl describe pod`	Using deprecated `spec.frontlas.redis.password` instead of `passwordSecret`	Move to `passwordSecret` — injected via `valueFrom.secretKeyRef` with no plaintext leak

11. Known limitations

No kubectl scale — the spec has two replica fields (frontier & frontlas) so the scale subresource isn't enabled. Patch spec.frontier.replicas directly. HPA targets the underlying Deployments instead.
v1alpha1 — no compatibility guarantees between alpha versions. The next bump goes to v1beta1 alongside conversion machinery.
Helm chart only ships frontier templates — the operator path (this page) is the recommended deployment route. Helm-only users should bring their own frontlas + Redis manifests until the chart catches up.
No webhook validation — bad input (e.g. negative replicas, invalid redisType) is caught at reconcile time, not at kubectl apply.
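
Until admission validation exists, the two examples of bad input named above can be caught client-side before kubectl apply. A minimal pre-flight sketch; the function name is hypothetical and it checks only what this page documents (non-negative integer replica counts and the three redisType values):

```shell
# validate_fc_input <frontier_replicas> <frontlas_replicas> <redisType>
# Returns non-zero, with a message on stderr, for a replica count that is not
# a non-negative integer or a redisType outside standalone/sentinel/cluster.
validate_fc_input() {
  for n in "$1" "$2"; do
    case "$n" in
      ''|*[!0-9]*) echo "invalid replicas: $n" >&2; return 1 ;;
    esac
  done
  case "$3" in
    standalone|sentinel|cluster) ;;
    *) echo "invalid redisType: $3" >&2; return 1 ;;
  esac
}
```

A pipeline step such as validate_fc_input 2 1 standalone && kubectl apply -f frontiercluster.yaml stops early instead of waiting for a reconcile failure.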

12. Roadmap

This page reflects RFC-001 “Cloud-native optimization” through M3 (observability) and M4 (Status conditions + EventRecorder + CRD ergonomics). Open RFC content lives at docs/rfc/RFC-001-cloud-native-optimization.md in the repository.