
Kubernetes Deployment

Loom — Lightweight Orchestrated Operational Mesh


Overview

Loom ships with Kubernetes manifests in k8s/ that are ready for Minikube. The manifests deploy NATS, Valkey, the router, an orchestrator, and worker pods into a dedicated loom namespace.


Minikube Deployment

Start Minikube

minikube start --cpus=4 --memory=8192 --driver=docker
eval $(minikube docker-env)

Build Container Images

Build images inside Minikube's Docker daemon so they're available to pods without a registry:

docker build -f docker/Dockerfile.worker -t loom-worker:latest .
docker build -f docker/Dockerfile.router -t loom-router:latest .
docker build -f docker/Dockerfile.orchestrator -t loom-orchestrator:latest .
docker build -f docker/Dockerfile.workshop -t loom-workshop:latest .

Create Namespace and Secrets

kubectl create namespace loom
kubectl create secret generic loom-secrets \
  --namespace loom \
  --from-literal=anthropic-api-key="$ANTHROPIC_API_KEY"

Deploy

kubectl apply -k k8s/
kubectl get pods -n loom -w

Access Workshop

The Workshop is exposed via NodePort on port 30080:

# Minikube
minikube service loom-workshop -n loom

# Or access directly
open http://$(minikube ip):30080
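The Service backing this is a NodePort Service pinned to 30080. A sketch of what it presumably looks like (the Service name matches the `minikube service` call above, but the selector label and internal ports are assumptions; check k8s/workshop-deployment.yaml for the actual manifest):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: loom-workshop
  namespace: loom
spec:
  type: NodePort
  selector:
    app: loom-workshop      # assumed pod label
  ports:
    - port: 80              # assumed cluster-internal port
      targetPort: 8080      # assumed container port
      nodePort: 30080       # fixed NodePort from this guide
```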

Manifest Structure

k8s/
├── namespace.yaml              # loom namespace
├── nats-deployment.yaml        # NATS server
├── redis-deployment.yaml       # Valkey server
├── router-deployment.yaml      # Loom router
├── orchestrator-deployment.yaml # Loom orchestrator
├── worker-deployment.yaml      # Loom worker(s)
├── workshop-deployment.yaml    # Loom Workshop web UI (NodePort 30080)
└── kustomization.yaml          # Kustomize overlay
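The kustomization.yaml presumably just aggregates these manifests. A minimal sketch, with the resource list inferred from the tree above:

```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: loom
resources:
  - namespace.yaml
  - nats-deployment.yaml
  - redis-deployment.yaml
  - router-deployment.yaml
  - orchestrator-deployment.yaml
  - worker-deployment.yaml
  - workshop-deployment.yaml
```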

Ollama on Mac with Minikube

For local LLM inference, run Ollama natively on the host and point workers to the host address:

# On host
ollama serve &

# In worker config or environment
OLLAMA_URL=http://host.minikube.internal:11434
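In the worker Deployment this would appear as a container env entry (the container name is an assumption; `imagePullPolicy: Never` ensures the image built into Minikube's Docker daemon is used rather than pulled from a registry):

```yaml
containers:
  - name: worker
    image: loom-worker:latest
    imagePullPolicy: Never
    env:
      - name: OLLAMA_URL
        value: "http://host.minikube.internal:11434"
```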

Environment Variables

Workers, router, and orchestrator containers use the following environment variables:

| Variable | Required by | Description |
|---|---|---|
| WORKER_CONFIG | Workers | Path to worker YAML config |
| MODEL_TIER | Workers | Model tier (local, standard, frontier) |
| NATS_URL | All | NATS server URL |
| OLLAMA_URL | Optional | Ollama API endpoint |
| ANTHROPIC_API_KEY | Optional | Anthropic API key (from secret) |
| FRONTIER_MODEL | Optional | Model name for frontier tier |
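Secret-backed variables can be wired in with a secretKeyRef. For example, pulling the Anthropic key from the loom-secrets Secret created earlier:

```yaml
env:
  - name: ANTHROPIC_API_KEY
    valueFrom:
      secretKeyRef:
        name: loom-secrets
        key: anthropic-api-key
```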

Resource Requests and Limits

Configure resource requests and limits for each component type:

| Component | CPU Request | CPU Limit | Memory Request | Memory Limit |
|---|---|---|---|---|
| Router | 100m | 500m | 128Mi | 256Mi |
| Orchestrator | 200m | 1000m | 256Mi | 512Mi |
| Worker (local) | 200m | 1000m | 256Mi | 512Mi |
| Worker (standard) | 100m | 500m | 128Mi | 256Mi |
| NATS | 100m | 500m | 128Mi | 256Mi |
| Valkey | 100m | 500m | 128Mi | 256Mi |

Workers on the local tier (Ollama) are given more headroom. Workers that call remote APIs (Anthropic) mostly proxy API calls and wait on network I/O, so they run lighter.

Example in a deployment spec:

resources:
  requests:
    cpu: "200m"
    memory: "256Mi"
  limits:
    cpu: "1000m"
    memory: "512Mi"

Health Checks

Loom actors are long-running async processes. Use liveness and readiness probes to detect stuck or unresponsive actors. Note that the exec commands below are placeholders that only verify the Python interpreter starts; substitute a real check for production:

livenessProbe:
  exec:
    # Placeholder: always exits 0; replace with a real health check.
    command: ["python", "-c", "import sys; sys.exit(0)"]
  initialDelaySeconds: 10
  periodSeconds: 30
  failureThreshold: 3
readinessProbe:
  exec:
    # Placeholder: always exits 0; replace with a real readiness check.
    command: ["python", "-c", "import sys; sys.exit(0)"]
  initialDelaySeconds: 5
  periodSeconds: 10

For the router and orchestrator, consider adding NATS connectivity checks as part of the liveness probe.
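One lightweight option is a TCP connect to the NATS port from the probe command. In this sketch the `nats` Service name is an assumption and 4222 is the NATS default client port:

```yaml
livenessProbe:
  exec:
    command:
      - python
      - -c
      - |
        import socket, sys
        # Fail the probe if the NATS port is unreachable within 2 seconds.
        try:
            socket.create_connection(("nats", 4222), timeout=2).close()
        except OSError:
            sys.exit(1)
  initialDelaySeconds: 10
  periodSeconds: 30
  failureThreshold: 3
```

This only proves the NATS port is reachable, not that the actor's subscription is healthy; a fuller check would query the actor itself.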


Horizontal Scaling

Loom actors scale horizontally via NATS queue groups with zero code changes. Multiple replicas of the same actor type automatically load-balance.

# Scale workers manually
kubectl scale deployment/loom-worker --replicas=5 -n loom

HPA Auto-Scaling

Use a Horizontal Pod Autoscaler for CPU-based scaling. On Minikube, enable the metrics-server addon first (minikube addons enable metrics-server) so the HPA has CPU metrics to act on:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: loom-worker-hpa
  namespace: loom
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: loom-worker
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

Pipeline orchestrators also support concurrent goal processing via max_concurrent_goals in config, which can complement horizontal scaling.
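Assuming the orchestrator reads a YAML config, the setting might look like the following sketch (only the max_concurrent_goals key comes from this guide; the surrounding structure is illustrative):

```yaml
orchestrator:
  max_concurrent_goals: 4
```

Each replica would then process up to that many goals concurrently, on top of whatever replica count the HPA maintains.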


Persistent Volumes

Valkey requires persistent storage for checkpoint data:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redis-data
  namespace: loom
spec:
  accessModes: [ReadWriteOnce]
  resources:
    requests:
      storage: 1Gi

Mount the PVC in the Valkey deployment's pod spec:

volumes:
  - name: redis-data
    persistentVolumeClaim:
      claimName: redis-data
containers:
  - name: redis
    volumeMounts:
      - name: redis-data
        mountPath: /data
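For the volume to matter, Valkey must actually be configured to write to it. Valkey accepts the same server flags as Redis, so append-only persistence into /data can be enabled via container args (a sketch, not necessarily the shipped manifest; the image tag is an assumption):

```yaml
containers:
  - name: redis
    image: valkey/valkey:latest
    args: ["--appendonly", "yes", "--dir", "/data"]
    volumeMounts:
      - name: redis-data
        mountPath: /data
```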

For local development setup, see Getting Started. For architecture details, see Architecture.