Kubernetes Operator & CRD

Run Frontier on Kubernetes with a single FrontierCluster resource. The operator handles deployments, TLS secrets, services, probes, security context, and graceful shutdown.

1. Overview

The Frontier operator manages a two-tier deployment: Frontier (data plane, stateless edge gateway) and Frontlas (control plane, Redis-backed coordinator). Both are reconciled from a single namespaced custom resource, FrontierCluster, in API group frontier.singchia.io at version v1alpha1.

With one CR you get: two Deployments, three Services (servicebound, edgebound, controlplane), TLS material copied into operator-managed Secrets, sane production defaults (probes, preStop, non-root, drop-all caps, preferred anti-affinity), and a status that surfaces ready replicas and Conditions.

2. Installation

2.1 Install the CRD and operator

# Apply the CRD + operator deployment + RBAC in one shot
kubectl apply -f https://raw.githubusercontent.com/singchia/frontier/main/pkg/operator/dist/install.yaml

# Or from a local checkout
git clone https://github.com/singchia/frontier.git
kubectl apply -f frontier/pkg/operator/dist/install.yaml

Verify the CRD and operator pod:

kubectl get crd frontierclusters.frontier.singchia.io
kubectl get all -n frontier-operator-system

2.2 RBAC

The bundled install.yaml creates a ClusterRole granting the operator:

  • frontier.singchia.io/frontierclusters — full CRUD + status
  • apps/deployments — full CRUD (manages frontier & frontlas Deployments)
  • core/services, secrets, pods — full CRUD (services, TLS material, pod inspection)
  • core/events — create + patch (used by EventRecorder)

If you tighten this further, keep at least get;list;watch on those resources or reconcile will fail.
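
To confirm that a tightened ClusterRole still grants the read verbs, a check along these lines could be used. It is a sketch under assumptions: the service account name is hypothetical (look up the real one in install.yaml), and rbac_check and the KUBECTL override are illustration-only names. Only the standard kubectl auth can-i subcommand with the --as and --all-namespaces flags is relied on, and --as requires that your own user may impersonate service accounts.

```shell
#!/bin/sh
# Sketch: check the read verbs the reconciler needs via `kubectl auth can-i`.
# OPERATOR_SA is a hypothetical name; take the real one from install.yaml.
# KUBECTL can be overridden (e.g. KUBECTL=echo) to print the arguments only.
KUBECTL="${KUBECTL:-kubectl}"
OPERATOR_SA="${OPERATOR_SA:-system:serviceaccount:frontier-operator-system:frontier-operator}"

rbac_check() {
  for res in frontierclusters.frontier.singchia.io deployments.apps services secrets pods; do
    for verb in get list watch; do
      # Prefix each answer with the verb/resource pair being checked.
      printf '%s %s: ' "$verb" "$res"
      $KUBECTL auth can-i "$verb" "$res" --as="$OPERATOR_SA" --all-namespaces
    done
  done
}
```

Calling rbac_check prints one line per verb and resource; any line ending in "no" points at a permission the reconcile loop would miss.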

2.3 Alternative: install with Helm

Prefer Helm? The chart at dist/helm/ deploys both frontier and frontlas in one shot, with an optional bundled bitnami/redis subchart. Defaults track the operator's production-grade settings (non-root UID 65532, drop-all capabilities, preferred host anti-affinity, preStop sleep, configurable drain window, observability endpoints on 9091/9092). Pick this path if your platform standardizes on Helm/ArgoCD/Flux and you don't need the operator's reconcile-driven self-healing for TLS Secrets and Status conditions.

Quick install from the official chart repo (recommended):

helm repo add frontier https://singchia.github.io/frontier
helm repo update
helm install frontier frontier/frontier \
  --namespace frontier --create-namespace

The repo lives on the gh-pages branch and is served by GitHub Pages — same chart, same digest as the source tree under dist/helm/ in the repository.

Alternative install paths:

# A) Single .tgz from a GitHub Release (pinned to a specific release)
helm install frontier \
  https://github.com/singchia/frontier/releases/download/helm-chart-v1.2.5-rc1/frontier-1.2.5.tgz \
  -n frontier --create-namespace

# B) From a local checkout (for chart development)
cd frontier/dist/helm
helm dependency update
helm install frontier . -n frontier --create-namespace

Bring your own Redis (set redis.enabled: false and point Frontlas at it):

# my-values.yaml
redis:
  enabled: false

frontlas:
  externalRedis:
    addrs:
      - redis.shared:6379
    redisType: standalone
    passwordSecret:
      name: redis-creds      # must already exist in the release namespace
      key: password

helm install frontier frontier/frontier -n frontier -f my-values.yaml

Common knobs in values.yaml (full listing: helm show values dist/helm/):

| Path | Default | Notes |
| --- | --- | --- |
| frontier.replicaCount / frontlas.replicaCount | 1 / 1 | Independent scaling per component |
| frontier.image.tag / frontlas.image.tag | "" (falls back to Chart.AppVersion) | Override to pin a specific binary version |
| global.registry | singchia | Mirror to your private registry |
| global.imagePullSecrets | [] | Private registry credentials |
| frontier.service.edgebound.type | NodePort | Switch to LoadBalancer for cloud edge ingress |
| frontier.podSecurityContext / containerSecurityContext | nonRoot UID 65532, drop ALL caps | Set to {} if your custom image needs root |
| frontier.terminationGracePeriodSeconds / frontier.drainSeconds | 60 / 50 | Long-lived edge connections; drain < grace |
| frontier.autoscaling.enabled | false | HPA on the frontier Deployment |
| observability.frontier.enabled / observability.frontlas.enabled | true / true | Toggle the /healthz /readyz /metrics endpoints (ports 9091 / 9092) |
| serviceMonitor.enabled | false | Opt in if prometheus-operator is installed |
| redis.enabled | true | Set false to use external Redis (configure under frontlas.externalRedis) |

Operator vs Helm — pick one:

| Concern | Operator | Helm |
| --- | --- | --- |
| Custom Resource (declarative) | ✅ FrontierCluster CR | ❌ values.yaml + Deployments directly |
| Status & Conditions per cluster | ✅ Available / Progressing / Degraded | ❌ Inspect underlying Deployments |
| Self-healing on TLS Secret rotation | ✅ Reconciler watches Secrets | ❌ Manual helm upgrade |
| Multiple clusters in one namespace | ✅ Each CR is isolated | ⚠️ Need separate releases |
| Bundled Redis | ❌ BYO | bitnami/redis subchart |
| ArgoCD / Flux GitOps | ✅ (commit the CR) | ✅ (commit values.yaml) |
| Initial install footprint | Operator pod + CRD + RBAC | No long-running operator |

You can also publish the chart for downstream consumers: helm package dist/helm/ -d /path/to/repo produces frontier-1.2.5.tgz; serve the directory with any HTTP server or helm push to OCI.

3. Quick start

Minimum viable cluster: 2 frontier replicas, 1 frontlas replica, external Redis. Save as frontiercluster.yaml:

apiVersion: frontier.singchia.io/v1alpha1
kind: FrontierCluster
metadata:
  name: prod
  namespace: frontier
spec:
  frontier:
    replicas: 2
  frontlas:
    replicas: 1
    redis:
      addrs:
        - redis.frontier:6379
      passwordSecret:
        name: redis-creds
        key: password
      redisType: standalone

kubectl create namespace frontier
kubectl -n frontier create secret generic redis-creds --from-literal=password=...
kubectl apply -f frontiercluster.yaml

Wait for it to come up, then check:

kubectl -n frontier get fc                  # short name 'fc' is registered
NAME   PHASE     FRONTIER   FRONTLAS   AGE
prod   Running   2          1          47s

kubectl -n frontier describe fc prod         # see Conditions + Events
kubectl -n frontier get pods                 # frontier + frontlas pods
kubectl -n frontier get svc                  # 3 services rendered
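
The wait step above can be scripted. The following is a minimal sketch, not part of the operator: wait_until and phase_is_running are hypothetical helper names, and the parser assumes the column layout shown in the sample kubectl get fc output (NAME first, PHASE second).

```shell
#!/bin/sh
# Sketch: retry a check until it succeeds or the attempts run out.
# Usage: wait_until TRIES DELAY_SECONDS CMD [ARGS...]
wait_until() {
  tries=$1 delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$tries" ]; do
    if "$@"; then return 0; fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Reads `kubectl get fc` table output on stdin and succeeds when the row
# for NAME ($1) shows "Running" in the PHASE column.
phase_is_running() {
  awk -v name="$1" '$1 == name && $2 == "Running" { ok = 1 } END { if (ok) exit 0; exit 1 }'
}

# Example wiring (needs a cluster):
#   check() { kubectl -n frontier get fc | phase_is_running prod; }
#   wait_until 30 2 check && echo "prod is running"
```

Because the status also carries Conditions, kubectl wait --for=condition=Available fc/prod -n frontier --timeout=120s may be a simpler alternative where it fits.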

4. CRD field reference

Everything below lives under spec. All fields outside frontier.servicebound, frontier.edgebound, frontlas.controlplane, and frontlas.redis are optional — sensible defaults apply.

4.1 spec.frontier

| Field | Type | Default | Notes |
| --- | --- | --- | --- |
| replicas | int | 1 | Frontier pod count |
| image | string | singchia/frontier:1.1.0 | Override to pin a specific tag |
| servicebound.port | int | 30011 | Service-side TCP/gRPC port |
| servicebound.service | string | <name>-servicebound-svc | Service name override |
| servicebound.serviceType | string | ClusterIP | ClusterIP / NodePort / LoadBalancer |
| edgebound.port | int | 30012 | Edge-side port (typically external) |
| edgebound.serviceName | string | <name>-edgebound-svc | Service name override |
| edgebound.serviceType | string | NodePort | Use LoadBalancer for cloud ingress |
| edgebound.tls.enabled | bool | false | Enable TLS on edgebound |
| edgebound.tls.optional | bool | false | If true, both TLS and plain connections are accepted |
| edgebound.tls.mtls | bool | false | Enable client cert verification |
| edgebound.tls.certificateKeySecretRef.name | string | | User Secret with tls.crt, tls.key |
| edgebound.tls.caCertificateSecretRef.name | string | | User Secret with ca.crt (mTLS only) |
| nodeAffinity | NodeAffinity | nil | Legacy; use pod.affinity instead |
| pod | PodOverrides | see §4.3 | Production-grade overrides for the frontier pod |

4.2 spec.frontlas

| Field | Type | Default | Notes |
| --- | --- | --- | --- |
| replicas | int | 1 | Frontlas pod count |
| image | string | singchia/frontlas:1.1.0 | Image override |
| controlplane.port | int | 40011 | Service-side control plane port |
| controlplane.frontierPlanePort | int | 40012 | Port used by frontier nodes to talk to frontlas |
| controlplane.service | string | <name>-frontlas-svc | Service name override |
| controlplane.serviceType | string | ClusterIP | Internal only by default |
| redis.addrs | []string | required | One or more Redis addrs |
| redis.redisType | string | required | standalone / sentinel / cluster |
| redis.db | int | 0 | DB index (standalone only) |
| redis.user | string | "" | For Redis ACL |
| redis.password | string | "" | Deprecated; use passwordSecret |
| redis.passwordSecret | SecretKeySelector | nil | Recommended. Wins over password; injected via valueFrom.secretKeyRef |
| redis.masterName | string | "" | Sentinel only |
| nodeAffinity | NodeAffinity | nil | Legacy; use pod.affinity |
| pod | PodOverrides | see §4.3 | Production-grade overrides for the frontlas pod |

4.3 spec.frontier.pod / spec.frontlas.pod (PodOverrides)

Every override is optional. When unset, the operator applies a production-grade default.

| Field | Type | Operator default | Use case |
| --- | --- | --- | --- |
| resources | ResourceRequirements | nil (BestEffort QoS) | Set CPU/memory requests + limits for production |
| nodeSelector | map[string]string | nil | Pin to nodes by label |
| tolerations | []Toleration | nil | Run on tainted nodes |
| topologySpreadConstraints | []TopologySpreadConstraint | nil | Cross-zone / cross-node spread |
| affinity | Affinity | only PodAntiAffinity (preferred host spread) | Setting this fully replaces the default and the legacy nodeAffinity field |
| priorityClassName | string | "" | Critical workload priority |
| serviceAccountName | string | default | Bind workload identity |
| imagePullSecrets | []LocalObjectReference | nil | Private registry credentials |
| imagePullPolicy | string | IfNotPresent | Use Always in dev when pinning latest |
| annotations | map[string]string | nil | Pod annotations: cert-manager, Prometheus scrape config, sidecar opt-in |
| labels | map[string]string | app=… | Extra labels (merged with selector labels) |
| podSecurityContext | PodSecurityContext | runAsNonRoot=true, UID/GID/FSGroup=65532, RuntimeDefault seccomp | Override only when an image needs root or a different UID |
| containerSecurityContext | SecurityContext | drop ALL caps, AllowPrivilegeEscalation=false, runAsNonRoot=true | Override to add a specific capability back |
| terminationGracePeriodSeconds | int64 | frontier=60, frontlas=30 | Long-lived edge connections need at least 60 |
| livenessProbe | Probe | TCP socket on edge port (frontier) / control port (frontlas) | Replace with HTTP probe in M3+ |
| readinessProbe | Probe | TCP socket on service port (frontier) / HTTP /cluster/v1/health (frontlas) | HTTP /readyz available since M3 |
| lifecycle | Lifecycle | preStop: sleep 10 (frontier) / sleep 5 (frontlas) | Lets kube-proxy remove pod from Service Endpoints before SIGTERM |

5. Common scenarios

5.1 Edge mTLS

Provide both a server cert/key and a CA. The operator copies them into namespace-scoped Secrets and mounts them into the frontier pod at /app/conf/edgebound/tls/secret and /app/conf/edgebound/tls/ca.

apiVersion: v1
kind: Secret
metadata:
  name: edge-server-cert
type: kubernetes.io/tls
data:
  tls.crt: ...    # PEM cert
  tls.key: ...    # PEM key
---
apiVersion: v1
kind: Secret
metadata:
  name: edge-ca
data:
  ca.crt: ...     # PEM CA
---
apiVersion: frontier.singchia.io/v1alpha1
kind: FrontierCluster
metadata:
  name: prod
spec:
  frontier:
    edgebound:
      port: 8443
      serviceType: LoadBalancer
      tls:
        enabled: true
        mtls: true
        certificateKeySecretRef:
          name: edge-server-cert
        caCertificateSecretRef:
          name: edge-ca
  frontlas: { ... }
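
Before applying the manifests above, it can help to verify that each user Secret carries exactly the key names the operator expects. The helper below is a sketch: missing_secret_keys is a hypothetical name, and the required keys are taken from the field reference (tls.crt and tls.key for the server certificate, ca.crt for the CA).

```shell
#!/bin/sh
# Sketch: print the required keys that are absent from a Secret.
# Usage: missing_secret_keys <cert|ca> [KEY...]
#   cert -> needs tls.crt and tls.key
#   ca   -> needs ca.crt
missing_secret_keys() {
  role=$1
  shift
  case "$role" in
    cert) required="tls.crt tls.key" ;;
    ca)   required="ca.crt" ;;
    *)    echo "unknown role: $role" >&2; return 2 ;;
  esac
  missing=""
  for want in $required; do
    found=0
    for have in "$@"; do
      if [ "$have" = "$want" ]; then found=1; fi
    done
    if [ "$found" -eq 0 ]; then missing="$missing $want"; fi
  done
  # Strip the leading space; prints an empty line when nothing is missing.
  echo "${missing# }"
}

# Example wiring (needs a cluster); the go-template lists the Secret's keys:
#   keys=$(kubectl get secret edge-ca -o go-template='{{range $k, $v := .data}}{{$k}} {{end}}')
#   missing_secret_keys ca $keys
```

An empty result means the Secret has every key for its role; anything else names the keys to add.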

5.2 Production resources + scheduling

spec:
  frontier:
    replicas: 6
    pod:
      resources:
        requests: { cpu: "500m", memory: "512Mi" }
        limits:   { cpu: "2",    memory: "2Gi" }
      tolerations:
        - key: workload
          operator: Equal
          value: edge-gateway
          effect: NoSchedule
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels: { app: prod-frontier }
      priorityClassName: frontend-critical
      terminationGracePeriodSeconds: 120

5.3 Private image registry

spec:
  frontier:
    image: my-registry.example.com/frontier:1.2.4
    pod:
      imagePullSecrets:
        - name: my-registry-creds
      imagePullPolicy: IfNotPresent
  frontlas:
    image: my-registry.example.com/frontlas:1.2.4
    pod:
      imagePullSecrets:
        - name: my-registry-creds

5.4 Annotations for Prometheus + cert-manager

spec:
  frontier:
    pod:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9091"
        prometheus.io/path: "/metrics"
  frontlas:
    pod:
      annotations:
        cert-manager.io/inject-ca-from: frontier/frontier-ca

5.5 Override SecurityContext for legacy images

If your custom frontier image needs root or a non-65532 UID, opt out of the default explicitly:

spec:
  frontier:
    pod:
      podSecurityContext: {}                    # drop the default nonRoot/UID
      containerSecurityContext:
        runAsNonRoot: false
        capabilities:
          drop: []                              # keep capabilities

6. Status & Conditions

The CRD has a status subresource (read-only for users):

status:
  phase: Running                # Pending | Running | Failed (deprecated, kept for printcolumn)
  message: "Good to go!"
  observedGeneration: 7         # metadata.generation that this status reflects
  frontierReadyReplicas: 6
  frontlasReadyReplicas: 2
  conditions:
    - type: Available
      status: "True"
      reason: AllComponentsReady
      lastTransitionTime: "2026-05-02T01:23:45Z"
      observedGeneration: 7
    - type: Progressing
      status: "False"
      reason: ReconcileSucceeded
    - type: Degraded
      status: "False"

Three conditions are maintained:

  • Available — True when both Deployments report all replicas ready.
  • Progressing — True while the operator is still reconciling toward desired state.
  • Degraded — True when reconcile failed (TLS Secret missing, deployment error, etc.). Inspect kubectl describe fc for the Events stream.
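
The three conditions can be folded into a single pass/fail check for scripts or CI. This is a sketch: fc_healthy is a hypothetical helper, and treating "Available=True and Degraded not True" as healthy is an assumption about how you want to read the conditions, not a rule defined by the operator.

```shell
#!/bin/sh
# Sketch: read "Type=Status" lines on stdin (one condition per line) and
# succeed only when Available is True and Degraded is not True.
fc_healthy() {
  awk -F= '
    $1 == "Available" && $2 == "True" { available = 1 }
    $1 == "Degraded"  && $2 == "True" { degraded = 1 }
    END { if (available && !degraded) exit 0; exit 1 }
  '
}

# Example wiring (needs a cluster); the JSONPath prints one condition per line:
#   kubectl -n frontier get fc prod \
#     -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}' \
#     | fc_healthy && echo healthy
```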

7. Observability endpoints (since M3)

Both frontier and frontlas expose three HTTP endpoints on a separate port:

| Endpoint | Frontier port | Frontlas port | Semantics |
| --- | --- | --- | --- |
| /healthz | 9091 | 9092 | Liveness: 200 if process responds |
| /readyz | 9091 | 9092 | Readiness: 503 with details when not ready (e.g. Redis unreachable for frontlas) |
| /metrics | 9091 | 9092 | Prometheus default registry: Go runtime + process metrics |

Configure via the observability block in frontier.yaml / frontlas.yaml:

observability:
  enable: true
  addr: 0.0.0.0:9091

Observability is enabled by default; set enable: false to disable the endpoints.
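
For a quick manual check of these endpoints, the HTTP status code can be mapped to a verdict. The snippet is a sketch: classify_probe is a hypothetical name, and it assumes that /readyz answers 200 when ready, which the semantics listed above imply but do not state.

```shell
#!/bin/sh
# Sketch: turn an HTTP status code from /healthz or /readyz into a verdict.
classify_probe() {
  case "$1" in
    200) echo "ok" ;;
    503) echo "not-ready" ;;
    *)   echo "unexpected:$1" ;;
  esac
}

# Example wiring (needs a reachable pod, e.g. through kubectl port-forward):
#   code=$(curl -s -o /dev/null -w '%{http_code}' http://127.0.0.1:9091/readyz)
#   classify_probe "$code"
```

curl reports 000 when it cannot connect at all, which this helper surfaces as unexpected:000.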

8. Common operations

# CRUD with the short name
kubectl get fc
kubectl describe fc prod
kubectl edit fc prod
kubectl delete fc prod

# Inspect Conditions
kubectl get fc prod -o jsonpath='{.status.conditions}' | jq

# Watch reconcile events
kubectl describe fc prod | tail -20

# Patch the replica count without an editor
kubectl patch fc prod --type=merge -p '{"spec":{"frontier":{"replicas":4}}}'
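
The same merge patch works for either component. A small builder, sketched here with the hypothetical name replicas_patch, keeps the JSON quoting out of interactive commands and rejects obviously bad input before it reaches the API server.

```shell
#!/bin/sh
# Sketch: print a JSON merge patch that sets spec.<component>.replicas.
# Usage: replicas_patch <frontier|frontlas> COUNT
replicas_patch() {
  case "$1" in
    frontier|frontlas) ;;
    *) echo "unknown component: $1" >&2; return 1 ;;
  esac
  case "$2" in
    ''|*[!0-9]*) echo "replicas must be a non-negative integer" >&2; return 1 ;;
  esac
  printf '{"spec":{"%s":{"replicas":%s}}}\n' "$1" "$2"
}

# Example wiring (needs a cluster):
#   kubectl patch fc prod --type=merge -p "$(replicas_patch frontlas 2)"
```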

9. Operator behavior

  • Reconcile order. Service → TLS Secrets → Frontlas Deployment → (wait until ready) → Frontier Deployment.
  • Owner references. Deployments + Services + operator-managed Secrets all carry the FrontierCluster as owner; deleting the CR cascades to all of them.
  • Graceful shutdown. Frontier honors FRONTIER_DRAIN_SECONDS (operator injects terminationGracePeriodSeconds - 10): on SIGTERM it waits this many seconds before tearing connections down, letting kube-proxy fully drop the pod from Service Endpoints first.
  • Events. Each meaningful state transition emits a Kubernetes Event: ServiceEnsureFailed, TLSEnsureFailed, DeploymentEnsureFailed, Available (one-shot when the cluster first becomes ready).
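
The graceful-shutdown arithmetic described above can be sketched in shell. This is an illustration, not the operator's code: both function names are hypothetical, clamping the drain window at zero is an assumption, and the sample values come from the defaults listed on this page (grace period 60, preStop sleep 10).

```shell
#!/bin/sh
# Sketch of the "terminationGracePeriodSeconds - 10" rule for FRONTIER_DRAIN_SECONDS.
# Usage: drain_seconds GRACE
drain_seconds() {
  d=$(( $1 - 10 ))
  # Assumption: never report a negative drain window.
  if [ "$d" -lt 0 ]; then d=0; fi
  echo "$d"
}

# Seconds left before SIGKILL once the preStop sleep and the drain wait have
# both elapsed. Kubernetes counts the preStop hook against the grace period,
# so the two durations add up.
# Usage: shutdown_slack GRACE PRESTOP DRAIN
shutdown_slack() {
  echo $(( $1 - $2 - $3 ))
}
```

With the frontier defaults, drain_seconds 60 prints 50 and shutdown_slack 60 10 50 prints 0. If you override lifecycle with a longer preStop sleep, it is worth checking that the slack stays non-negative.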

10. Troubleshooting

| Symptom | Likely cause | Where to look |
| --- | --- | --- |
| Frontier pod CrashLoopBackOff with connect: connection refused on the frontier-plane port | Frontlas not yet ready, or Redis unreachable from frontlas | kubectl describe fc Conditions; kubectl logs deploy/<name>-frontlas |
| Frontier pod fails to start: container can't run as nonRoot | Custom image without a non-root USER directive | Override spec.frontier.pod.podSecurityContext + containerSecurityContext, or use singchia/frontier:1.2.4+ which ships with USER 65532 |
| Status stays Pending for minutes | One of the Deployments not converging on ready replicas | kubectl describe fc + kubectl get pods + pod Events |
| TLS-enabled cluster can't serve mTLS | Missing ca.crt in the user CA Secret, or the operator-managed Secret was deleted manually | Operator log: Error ensuring tls secret; check user Secret keys exactly match tls.crt, tls.key, ca.crt |
| Cluster keeps re-reconciling but never settles | Some required spec field changed (e.g. ServiceType) and K8s rejects the update | Operator log + kubectl get events |
| Redis password is visible in kubectl describe pod | Using deprecated spec.frontlas.redis.password instead of passwordSecret | Move to passwordSecret, which is injected via valueFrom.secretKeyRef with no plaintext leak |

11. Known limitations

  • No kubectl scale — the spec has two replica fields (frontier & frontlas) so the scale subresource isn't enabled. Patch spec.frontier.replicas directly. HPA targets the underlying Deployments instead.
  • v1alpha1 — no compatibility guarantees between alpha versions. The next bump goes to v1beta1 alongside conversion machinery.
  • Helm chart only ships frontier templates — the operator path (this page) is the recommended deployment route. Helm-only users should bring their own frontlas + Redis manifests until the chart catches up.
  • No webhook validation — bad input (e.g. negative replicas, invalid redisType) is caught at reconcile time, not at kubectl apply.
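
Until webhook validation exists, a client-side pre-flight check can catch the two examples named above before kubectl apply. This is a sketch with hypothetical function names; the accepted redisType values are the ones listed in the field reference, and how you extract the values from your manifest is left open.

```shell
#!/bin/sh
# Sketch: succeed only for a non-negative integer replica count.
valid_replicas() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty, negative, or non-numeric
    *) return 0 ;;
  esac
}

# Sketch: succeed only for a redisType the field reference lists.
valid_redis_type() {
  case "$1" in
    standalone|sentinel|cluster) return 0 ;;
    *) return 1 ;;
  esac
}

# Example:
#   valid_replicas "$REPLICAS" && valid_redis_type "$REDIS_TYPE" \
#     && kubectl apply -f frontiercluster.yaml
```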

12. Roadmap

This page reflects RFC-001 “Cloud-native optimization” through M3 (observability) and M4 (Status conditions + EventRecorder + CRD ergonomics). Open RFC content lives at docs/rfc/RFC-001-cloud-native-optimization.md in the repository.