Development Guide
Prerequisites
- Go 1.23+ (for controller development)
- Bun 1.0+ (for Web UI development)
- Access to a Kubernetes cluster
- Helm CLI (for provider installation)
- kubectl configured with cluster access
Quick Start
Web UI Development
# Install dependencies
bun install
# Start development servers (frontend + backend)
bun run dev
# Development mode:
# Frontend: http://localhost:5173 (Vite dev server, proxies API to backend)
# Backend: http://localhost:3001
#
# Production mode (compiled binary):
# Single server: http://localhost:3001 (frontend embedded in backend)
Controller Development
# Build the controller binary
make controller-build
# Run controller tests
make controller-test
# Run controller locally (uses your kubeconfig)
make controller-run
# Regenerate CRDs and deepcopy code after editing *_types.go files
make controller-generate
# Build the docker container image
make controller-docker-build CONTROLLER_IMG=<YOUR IMAGE>
# Defaults: PUSH=false and PLATFORM=linux/amd64
# Optional: push instead of load, or target a different platform
make controller-docker-build CONTROLLER_IMG=<YOUR IMAGE> PUSH=true PLATFORM=linux/amd64,linux/arm64
# Install CRDs into the cluster
make controller-install
# Deploy controller to cluster
make controller-deploy CONTROLLER_IMG=<YOUR IMAGE>
Important: After editing controller/api/v1alpha1/*_types.go files, always run:
cd controller && make manifests generate
Building a Single Binary
The project can be compiled to a standalone executable that includes both the backend API and embedded frontend assets:
# Compile to single binary (includes frontend)
bun run compile
# Run the binary (serves both API and frontend on port 3001)
./dist/airunway
# Check version info
curl http://localhost:3001/api/health/version
The compile process:
- Builds the frontend with Vite
- Generates native Bun file imports in backend/src/embedded-assets.ts
- Injects build-time constants (version, git commit, build time) via --define
- Compiles everything into a single executable using bun build --compile --minify --sourcemap
The binary is self-contained and serves the embedded frontend assets with zero-copy file serving. The backend runs Hono on Bun.
Cross-Compilation
Build for multiple platforms:
# Build for all platforms
make compile-all
# Or individual targets
make compile-linux # linux-x64, linux-arm64
make compile-darwin # darwin-x64, darwin-arm64
make compile-windows # windows-x64
# With explicit version
VERSION=v1.0.0 bun run compile
Supported targets:
- linux-x64, linux-arm64
- darwin-x64, darwin-arm64
- windows-x64
Controller Development
The controller is a Go-based Kubernetes operator built with Kubebuilder.
Project Structure
controller/
├── api/v1alpha1/ # CRD type definitions
│ ├── modeldeployment_types.go
│ └── inferenceproviderconfig_types.go
├── cmd/ # Main entrypoint
├── config/ # Kustomize manifests
│ ├── crd/ # Generated CRD YAMLs
│ ├── rbac/ # RBAC manifests
│ └── manager/ # Controller deployment
├── internal/
│ ├── controller/ # Reconciliation logic
│ └── webhook/ # Validation webhooks
└── Makefile # Build commands
CRDs
AI Runway defines two CRDs:
- ModelDeployment (namespaced) - User-facing API for deploying models
- InferenceProviderConfig (cluster-scoped) - Provider registration
After editing *_types.go files, regenerate code:
cd controller && make manifests generate
Reconciliation Flow
Core controller reconciliation steps:
- Receive ModelDeployment event
- Check for pause annotation (airunway.ai/reconcile-paused: "true"): skip if paused
- Select engine: use explicit spec.engine.type or auto-select from provider capabilities (filtered by GPU/CPU, serving mode, and engine GPU requirements)
- Validate spec (engine/resource compatibility, required fields)
- Select provider: use explicit spec.provider.name or run auto-selection algorithm (CEL rules now see the resolved engine)
- Set status: status.engine, status.provider, conditions
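The first two steps above can be sketched in plain Go. This is a minimal illustration, not the controller's actual code: isPaused and selectEngine are hypothetical helpers, and picking the first offered engine stands in for the real capability filtering, which is not shown in this guide.

```go
package main

import "fmt"

// pausedAnnotation is the annotation key named in the reconciliation steps.
const pausedAnnotation = "airunway.ai/reconcile-paused"

// isPaused reports whether reconciliation should be skipped for an object.
// Hypothetical helper: only the literal value "true" pauses reconciliation.
func isPaused(annotations map[string]string) bool {
	return annotations[pausedAnnotation] == "true"
}

// selectEngine returns the explicit engine when one is set. Otherwise it
// returns the first offered engine, a stand-in (assumption) for filtering
// provider capabilities by GPU/CPU, serving mode, and GPU requirements.
func selectEngine(explicit string, offered []string) (string, error) {
	if explicit != "" {
		return explicit, nil
	}
	if len(offered) == 0 {
		return "", fmt.Errorf("no engine available for auto-selection")
	}
	return offered[0], nil
}

func main() {
	annotations := map[string]string{pausedAnnotation: "true"}
	if isPaused(annotations) {
		fmt.Println("reconciliation paused, skipping")
	}

	engine, err := selectEngine("", []string{"vllm", "sglang"})
	if err != nil {
		fmt.Println("engine selection failed:", err)
		return
	}
	fmt.Println("selected engine:", engine)
}
```

Keeping the pause check first means a paused object is never mutated, even when its spec would otherwise fail validation.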
The core controller stops here. Provider controllers then take over with their own sequence:
- Filter: only reconcile ModelDeployments where status.provider.name matches
- Validate compatibility: check engine/mode support for this provider
- Transform: convert ModelDeployment spec to provider-specific resource
- Create/Update: apply provider resource with owner references
- Sync status: map provider resource status back to ModelDeployment (phase, replicas, endpoint)
- Handle deletion: clean up provider resources via finalizers (5-minute timeout)
Observability
Controller metrics:
airunway_modeldeployment_total{namespace, phase}
airunway_reconciliation_duration_seconds{provider}
airunway_reconciliation_errors_total{provider, error_type}
airunway_provider_selection{provider, reason}
airunway_deployment_replicas{name, namespace, state}
airunway_deployment_phase{name, namespace, phase}
Events emitted:
Normal ProviderSelected Selected provider 'dynamo': default → dynamo (GPU inference default)
Normal ResourceCreated Created DynamoGraphDeployment 'my-llm'
Warning SecretNotFound Secret 'hf-token-secret' not found in namespace 'default'
Warning ProviderError Provider resource in error state: insufficient GPUs
Warning DriftDetected Provider resource was modified directly, reconciling
Warning FinalizerTimeout Finalizer removed after timeout, provider resource may be orphaned
Running Locally
# Install CRDs first
make controller-install
# Run controller (uses your kubeconfig)
make controller-run
Testing
# Run unit tests
make controller-test
# Run with verbose output
cd controller && go test -v ./...
Test categories:
- Unit tests — manifest transformation per provider, status mapping, provider selection algorithm, schema validation
- Integration tests — controller reconciliation with mock K8s API, owner references, finalizer behavior, drift detection, webhook validation
- E2E tests — full deployment lifecycle per provider, error recovery, controller restart resilience
Version Compatibility Matrix
| AI Runway Controller | Kubernetes | KAITO Operator | Dynamo Operator | KubeRay Operator |
|---|---|---|---|---|
| v0.1.x | 1.26-1.30 | v0.3.x | v1.0.x | v1.1.x |
| Provider | Minimum Version | CRD API Version | Notes |
|---|---|---|---|
| KAITO | v0.3.0 | kaito.sh/v1beta1 | Requires GPU operator for GPU workloads |
| Dynamo | v1.0.0 | nvidia.com/v1alpha1 | Requires NVIDIA GPU operator; CRDs are bundled in the platform chart |
| KubeRay | v1.1.0 | ray.io/v1 | Optional: KubeRay autoscaler for scaling |
Finalizer Handling
The controller uses finalizers to ensure provider resource cleanup on deletion:
- Controller attempts cleanup for 5 minutes
- After timeout, removes finalizer with warning event
- Orphaned provider resources may remain (logged for manual cleanup)
Manual escape (immediate — use when deletion is stuck):
kubectl patch modeldeployment my-llm --type=merge \
-p '{"metadata":{"finalizers":[]}}'
Provider Development
Provider controllers are independent operators in providers/<name>/:
# Build a provider binary (from provider directory)
cd providers/kaito && make build
cd providers/dynamo && make build
cd providers/kuberay && make build
cd providers/llmd && make build
# Build provider Docker image
cd providers/kaito && make docker-build IMG=<YOUR IMAGE>
cd providers/llmd && make docker-build IMG=<YOUR IMAGE>
# Defaults: PUSH=false and PLATFORM=linux/amd64
# Optional: push instead of load, or target a different platform
cd providers/llmd && make docker-build IMG=<YOUR IMAGE> PUSH=true PLATFORM=linux/amd64,linux/arm64
# Deploy provider to cluster
cd providers/kaito && make deploy IMG=<YOUR IMAGE>
cd providers/llmd && make deploy IMG=<YOUR IMAGE>
# Generate deploy manifest
cd providers/kaito && make generate-deploy-manifests
Environment Variables
Frontend (.env)
VITE_API_URL=http://localhost:3001
VITE_DEFAULT_NAMESPACE=airunway-system
VITE_DEFAULT_HF_SECRET=hf-token-secret
Backend (.env)
PORT=3001
DEFAULT_NAMESPACE=airunway-system
CORS_ORIGIN=http://localhost:5173
AUTH_ENABLED=false
Authentication
AI Runway supports optional authentication using Kubernetes OIDC tokens from your kubeconfig.
Enabling Authentication
Set the AUTH_ENABLED environment variable:
AUTH_ENABLED=true ./dist/airunway
Login Flow
- Run the login command:
  airunway login
  This extracts your OIDC token from kubeconfig and opens the browser with a magic link.
- Alternative: specify the server URL:
  airunway login --server https://airunway.example.com
- Use a specific kubeconfig context:
  airunway login --context my-cluster
How It Works
- The CLI extracts the OIDC id-token from your kubeconfig
- It opens your browser with a URL containing the token in the fragment (#token=...)
- The frontend saves the token to localStorage
- All API requests include the token in the Authorization: Bearer header
- The backend validates tokens using the Kubernetes TokenReview API
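To make the token handoff concrete, here is a minimal Go sketch of reading a token from a URL fragment and attaching it as a Bearer credential. The real frontend is browser code, so this only mirrors the flow; tokenFromFragment, newAuthedRequest, the /login path, and the sample token are all hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// tokenFromFragment extracts the token from a login URL whose fragment has
// the form "#token=...". Hypothetical helper for illustration only.
func tokenFromFragment(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	// The fragment is formatted like a query string, so reuse ParseQuery.
	values, err := url.ParseQuery(u.Fragment)
	if err != nil {
		return "", err
	}
	token := values.Get("token")
	if token == "" {
		return "", fmt.Errorf("no token in URL fragment")
	}
	return token, nil
}

// newAuthedRequest builds a GET request that carries the token in the
// Authorization header using the Bearer scheme.
func newAuthedRequest(apiURL, token string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, apiURL, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	return req, nil
}

func main() {
	// The /login path and token value are made up for this example.
	token, err := tokenFromFragment("http://localhost:3001/login#token=example-token")
	if err != nil {
		fmt.Println("login failed:", err)
		return
	}
	req, err := newAuthedRequest("http://localhost:3001/api/deployments", token)
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println(req.Header.Get("Authorization"))
}
```

Browsers do not send the fragment to the server, so a token carried there stays out of server access logs during the initial page load.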
Public Routes (No Auth Required)
These routes are accessible without authentication:
- GET /api/health - Health check
- GET /api/cluster/status - Cluster connection status
- GET /api/settings - Settings (includes auth.enabled for frontend)
CLI Commands
airunway # Start server (default)
airunway serve # Start server
airunway login # Login with kubeconfig credentials
airunway login --server URL # Login to specific server
airunway login --context X # Use specific kubeconfig context
airunway logout # Clear stored credentials
airunway version # Show version
airunway help # Show help
Project Commands
Root
bun run dev # Start both frontend and backend
bun run build # Build all packages
bun run compile # Build single binary (frontend + backend) to dist/airunway
bun run lint # Lint all packages
Controller (Go)
make controller-build # Build Go controller binary
make controller-test # Run controller tests
make controller-run # Run controller locally
make controller-generate # Regenerate CRDs and deepcopy code
make controller-install # Install CRDs into cluster
make controller-deploy # Deploy controller to cluster
Frontend
bun run dev:frontend # Start Vite dev server
bun run build:frontend # Build for production
Backend
bun run dev:backend # Start with watch mode
bun run build:backend # Compile TypeScript
bun run compile # Build single binary executable
The backend pins TypeScript to 5.3.3 to keep Bun/import-meta compilation behavior stable.
Do not widen that version without validating bun run build:backend and bun run compile.
Backend Testing
cd backend
bun test # Run all backend tests
bun test src/routes/autoscaler.test.ts # Run a specific test file
bun test --watch # Watch mode
bun test --coverage # With coverage report
Test organization:
- src/routes/*.test.ts: Route-level tests using Hono's app.request() (exercises full middleware stack)
- src/services/*.test.ts: Service unit tests with mocked dependencies
- src/lib/*.test.ts: Utility/library unit tests
- src/test/helpers.ts: Shared test utilities (mockServiceMethod, withTimeout)
- src/test/fixtures.ts: Reusable mock data for K8s resources
How mocking works: Tests import the Hono app directly and use app.request() to invoke routes in-process (no HTTP server needed). K8s-dependent services are mocked via property replacement on singleton instances. Tests that may hit K8s use withTimeout to gracefully skip when no cluster is available.
CI pipelines: The test.yml workflow runs all tests in an environment without a Kubernetes cluster (K8s-dependent tests gracefully skip via timeout). The e2e-backend.yml workflow runs the same tests against a real Kind cluster with KAITO and the controller deployed, where K8s-dependent tests execute fully.
Headlamp Plugin
cd plugins/headlamp
bun install # Install plugin dependencies
bun run build # Build plugin
bun run start # Development mode with auto-rebuild
bun run test # Run tests
bun run test:watch # Watch mode for tests
bun run lint # Lint code
bun run tsc # Type check only
Makefile Commands
make setup # Install deps, build, and deploy to Headlamp
make dev # Build and deploy for development
make build # Build only
make deploy # Deploy to Headlamp plugins directory
make clean # Remove build artifacts
Prerequisites for Headlamp Plugin
- Headlamp Desktop (v0.20+) or Headlamp running in-cluster
- AI Runway backend deployed or running locally
Configuring Backend URL
The plugin discovers the backend in this order:
- Plugin Settings: Configure in Headlamp → Settings → Plugins → AIRunway
- In-Cluster: Auto-discovers airunway.<namespace>.svc
- Default: Falls back to http://localhost:3001
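The discovery order above amounts to a first-match fallback chain. The plugin itself is not written in Go, so the sketch below only illustrates the ordering; resolveBackendURL is a hypothetical name, and the http:// scheme on the in-cluster address is an assumption not stated in this guide.

```go
package main

import "fmt"

// defaultBackendURL is the documented local fallback.
const defaultBackendURL = "http://localhost:3001"

// resolveBackendURL applies the discovery order: explicit plugin setting,
// then the in-cluster service name, then the local default.
// Assumption: the in-cluster service is reached over plain http.
func resolveBackendURL(pluginSetting string, inCluster bool, namespace string) string {
	if pluginSetting != "" {
		return pluginSetting
	}
	if inCluster && namespace != "" {
		return "http://airunway." + namespace + ".svc"
	}
	return defaultBackendURL
}

func main() {
	fmt.Println(resolveBackendURL("https://airunway.example.com", true, "airunway-system"))
	fmt.Println(resolveBackendURL("", true, "airunway-system"))
	fmt.Println(resolveBackendURL("", false, ""))
}
```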
Testing with Headlamp Desktop
- Build and deploy the plugin:
  cd plugins/headlamp
  make setup
- Start AI Runway backend:
  cd ../..
  bun run dev:backend
- Open Headlamp Desktop; the plugin should appear in the sidebar
Kubernetes Setup
Create HuggingFace Token Secret
kubectl create secret generic hf-token-secret \
--from-literal=HF_TOKEN="your-token" \
-n airunway
Install NVIDIA Dynamo (via Helm)
export NAMESPACE=dynamo-system
export RELEASE_VERSION=1.1.1
# The Dynamo platform chart bundles its CRDs
helm upgrade --install dynamo-platform \
https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-platform-${RELEASE_VERSION}.tgz \
--namespace ${NAMESPACE} \
--create-namespace \
--set-json global.grove.install=true
Adding a New Provider
Providers are independent out-of-tree Go operators in providers/<name>/. Each provider watches ModelDeployment resources and creates provider-specific resources.
There are two provider patterns:
Shim Providers (Adapter Pattern)
Use this when wrapping an existing inference operator that has its own CRD (e.g., KAITO Workspace, DynamoGraphDeployment, RayService). The provider translates ModelDeployment → upstream CRD and syncs status back.
ModelDeployment → Provider Controller → Upstream CRD → Upstream Operator → Pods/Services
↑ status sync
- Create provider directory:
  providers/<name>/
  ├── cmd/main.go # Provider entrypoint
  ├── controller.go # Reconciliation logic
  ├── transformer.go # ModelDeployment → upstream CRD conversion
  ├── status.go # Upstream CRD → ModelDeployment status mapping
  ├── config.go # InferenceProviderConfig self-registration
  ├── config/ # Kustomize deployment manifests
  ├── Dockerfile # Container image
  ├── go.mod # Independent Go module
  └── go.sum
- Implement the provider controller (see existing providers for examples):
  - controller.go: Reconcile ModelDeployment resources where status.provider.name matches
  - transformer.go: Convert ModelDeployment spec to upstream CRD resources
  - status.go: Map upstream CRD status back to ModelDeployment status
  - config.go: Define InferenceProviderConfigSpec with capabilities and desired-state selectionRules. Emit provider display metadata as annotations such as airunway.ai/display-name, airunway.ai/description, airunway.ai/default-namespace, and airunway.ai/documentation-url; providers may also mirror capabilities in airunway.ai/capabilities for compatibility. Set airunway.ai/installation annotations for Helm/manual installation metadata; airunway.ai/documentation can remain as a backward-compatible documentation fallback (see CRD Reference).
Native Providers (No Upstream CRD)
Use this when there is no upstream operator — the provider directly manages Kubernetes resources (Deployments, Services) from the ModelDeployment spec. No transformer or intermediate CRD is needed.
ModelDeployment → Provider Controller → Deployments/Services → Pods
↑ status sync
This works because the status.provider.resourceKind and resourceName fields are free-form strings — they can point at a Deployment just as easily as a Workspace. The core controller never inspects what the provider creates.
When to use this pattern:
- Building a new inference runtime with no pre-existing CRD
- A lightweight provider that runs vLLM/SGLang containers directly via Deployments
- A "generic" provider where an upstream CRD adds no value
Directory structure (no transformer.go needed):
providers/<name>/
├── cmd/main.go # Provider entrypoint
├── controller.go # Reconciliation logic (creates Deployments/Services directly)
├── status.go # Deployment/Pod → ModelDeployment status mapping
├── config.go # InferenceProviderConfig self-registration
├── config/ # Kustomize deployment manifests
├── Dockerfile
├── go.mod
└── go.sum
Example reconciliation (simplified):
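func (r *Reconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	md := &v1alpha1.ModelDeployment{}
	if err := r.Get(ctx, req.NamespacedName, md); err != nil {
		// The object may have been deleted after the event was queued
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}
	// Build Deployment directly from ModelDeployment spec; no intermediate CRD
	deploy := r.buildDeployment(md) // vllm container with model args
	svc := r.buildService(md)
	// Simplified: a real mutate function should set the desired spec, because
	// CreateOrUpdate loads the existing object before calling it
	if _, err := controllerutil.CreateOrUpdate(ctx, r.Client, deploy, func() error { return nil }); err != nil {
		return ctrl.Result{}, err
	}
	if _, err := controllerutil.CreateOrUpdate(ctx, r.Client, svc, func() error { return nil }); err != nil {
		return ctrl.Result{}, err
	}
	// Sync status from Deployment
	md.Status.Phase = phaseFromDeployment(deploy)
	md.Status.Provider.ResourceName = deploy.Name
	md.Status.Provider.ResourceKind = "Deployment"
	md.Status.Replicas = replicasFromDeployment(deploy)
	md.Status.Endpoint = endpointFromService(svc)
	if err := r.Status().Update(ctx, md); err != nil {
		return ctrl.Result{}, err
	}
	// Requeue periodically to keep status in sync
	return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
}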
The config.go for a native provider should define InferenceProviderConfigSpec.capabilities plus any selectionRules, and set display/default/documentation metadata through annotations such as airunway.ai/display-name, airunway.ai/description, airunway.ai/documentation-url, and airunway.ai/default-namespace. It can omit the airunway.ai/installation annotation when there is no upstream operator to install, or include it if the provider itself is installed via Helm. See CRD Reference and installation metadata for the annotation schemas.
Common Steps (Both Patterns)
- Add Makefile targets in the root Makefile:
  make <name>-provider-build # Build provider binary
  make <name>-provider-docker-build # Build Docker image
  make <name>-provider-deploy # Deploy to cluster
Adding a New Model
Edit backend/src/data/models.json:
{
"models": [
{
"id": "org/model-name",
"name": "Model Display Name",
"description": "Brief description",
"size": "7B",
"task": "chat",
"contextLength": 32768,
"supportedEngines": ["vllm", "sglang"],
"minGpuMemory": "16GB"
}
]
}
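Before committing a new entry, it can help to check that the file still parses and that each entry carries the fields shown above. The following Go sketch is a hypothetical standalone check, not part of the project: the struct mirrors the JSON keys in the example, while parseCatalog and its validation rules (non-empty id, unique ids, at least one engine) are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Model mirrors the JSON keys of one entry in the example above.
type Model struct {
	ID               string   `json:"id"`
	Name             string   `json:"name"`
	Description      string   `json:"description"`
	Size             string   `json:"size"`
	Task             string   `json:"task"`
	ContextLength    int      `json:"contextLength"`
	SupportedEngines []string `json:"supportedEngines"`
	MinGPUMemory     string   `json:"minGpuMemory"`
}

// catalog is the top-level document with its "models" array.
type catalog struct {
	Models []Model `json:"models"`
}

// parseCatalog decodes the document and applies a few sanity checks.
// The checks are illustrative assumptions, not rules enforced by the backend.
func parseCatalog(data []byte) ([]Model, error) {
	var c catalog
	if err := json.Unmarshal(data, &c); err != nil {
		return nil, fmt.Errorf("invalid JSON: %w", err)
	}
	seen := make(map[string]bool)
	for _, m := range c.Models {
		if m.ID == "" {
			return nil, fmt.Errorf("model %q has no id", m.Name)
		}
		if seen[m.ID] {
			return nil, fmt.Errorf("duplicate model id %q", m.ID)
		}
		if len(m.SupportedEngines) == 0 {
			return nil, fmt.Errorf("model %q lists no supported engines", m.ID)
		}
		seen[m.ID] = true
	}
	return c.Models, nil
}

func main() {
	sample := []byte(`{"models":[{"id":"org/model-name","name":"Model Display Name",
		"size":"7B","task":"chat","contextLength":32768,
		"supportedEngines":["vllm","sglang"],"minGpuMemory":"16GB"}]}`)
	models, err := parseCatalog(sample)
	if err != nil {
		fmt.Println("catalog error:", err)
		return
	}
	fmt.Println(len(models), "model(s) parsed; first id:", models[0].ID)
}
```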
Testing API Endpoints
# Health check
curl http://localhost:3001/api/health
# Cluster status
curl http://localhost:3001/api/cluster/status
# List models
curl http://localhost:3001/api/models
# List deployments
curl http://localhost:3001/api/deployments
# Create deployment (Dynamo/KubeRay)
curl -X POST http://localhost:3001/api/deployments \
-H "Content-Type: application/json" \
-d '{
"name": "test-deployment",
"namespace": "airunway-system",
"provider": "dynamo",
"modelId": "Qwen/Qwen3-0.6B",
"engine": "vllm",
"mode": "aggregated",
"replicas": 1,
"hfTokenSecret": "hf-token-secret",
"enforceEager": true
}'
# Create deployment (KAITO with premade model)
curl -X POST http://localhost:3001/api/deployments \
-H "Content-Type: application/json" \
-d '{
"name": "kaito-deployment",
"namespace": "kaito-workspace",
"provider": "kaito",
"modelSource": "premade",
"premadeModel": "llama3.2-1b",
"computeType": "cpu"
}'
# Create deployment (KAITO with HuggingFace GGUF - direct mode)
curl -X POST http://localhost:3001/api/deployments \
-H "Content-Type: application/json" \
-d '{
"name": "gemma-deployment",
"namespace": "kaito-workspace",
"provider": "kaito",
"modelSource": "huggingface",
"modelId": "bartowski/gemma-3-1b-it-GGUF",
"ggufFile": "gemma-3-1b-it-Q8_0.gguf",
"ggufRunMode": "direct",
"computeType": "cpu"
}'
# Create deployment (KAITO with vLLM for GPU inference)
curl -X POST http://localhost:3001/api/deployments \
-H "Content-Type: application/json" \
-d '{
"name": "vllm-deployment",
"namespace": "kaito-workspace",
"provider": "kaito",
"modelSource": "vllm",
"modelId": "Qwen/Qwen3-0.6B",
"hfTokenSecret": "hf-token-secret",
"resources": { "gpu": 1 }
}'
Accessing Deployed Models
After deployment is running:
# Port-forward to the service (check deployment details for exact service name)
# Dynamo/KubeRay deployments expose port 8000
kubectl port-forward svc/<deployment>-frontend 8000:8000 -n airunway-system
# KAITO deployments with vLLM expose port 8000
kubectl port-forward svc/<deployment-name> 8000:8000 -n kaito-workspace
# KAITO deployments with llama.cpp (premade/GGUF) expose port 5000
kubectl port-forward svc/<deployment-name> 5000:5000 -n kaito-workspace
# Test the model (OpenAI-compatible API)
# For vLLM (port 8000):
curl http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "Qwen/Qwen3-0.6B",
"messages": [{"role": "user", "content": "Hello!"}]
}'
# For llama.cpp (port 5000):
curl http://localhost:5000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "llama3.2-1b",
"messages": [{"role": "user", "content": "Hello!"}]
}'
GPU End-to-End Testing
make gpu-e2e runs a real-GPU end-to-end suite that deploys each inference
provider through a ModelDeployment, drives it to Running, and asserts that
inference actually serves through the inference gateway. Unlike the CPU/mocker
e2e lanes, it requires real GPU hardware and an already-provisioned cluster — it
never creates or deletes the cluster.
The harness (scripts/gpu-e2e.sh) builds and pushes the controller and provider
images, installs any missing upstream operator, deploys everything, then runs
the Go suite under test/e2e/gpu/. Providers covered: Dynamo, vLLM, KAITO
(KubeRay is not yet supported).
Cluster preconditions
The harness installs none of these (except a missing operator via setup-<p>):
- GPU nodes with the NVIDIA GPU Operator and NFD enabled, so nodes advertise nvidia.com/gpu and the nvidia.com/gpu.present=true label.
- An RWX-capable StorageClass. The Dynamo model-cache PVC defaults to ReadWriteMany; Azure Disk classes are ReadWriteOnce and will leave the PVC Pending. The default is azurefile-premium; override with --storage-class.
- The inference gateway (Gateway API CRDs + GAIE + Istio + BBR + a Gateway named inference-gateway). On a fresh cluster make -C providers/dynamo setup-dynamo installs it; otherwise it must already be present and Programmed. The suite fails fast if it is missing.
- Pull access to the pushed images. The manager manifests carry no imagePullSecret, so the images must be public or the nodes must have pull access. New registry repositories often default to private, so make them public once.
Running it
# All three providers, building+pushing images to your registry:
make gpu-e2e GPU_E2E_ARGS="--provider all --registry <your-registry>"
# A single provider:
make gpu-e2e GPU_E2E_ARGS="--provider vllm --registry <your-registry>"
# Re-test without rebuilding (requires an explicit, already-pushed tag):
make gpu-e2e GPU_E2E_ARGS="--provider dynamo --skip-build \
--registry <your-registry> --img-tag <tag>"
# Run the Go suite directly against an already-deployed cluster (no rebuild):
go test -C test/e2e/gpu -tags=e2e -v -run 'TestGPUProviders/vllm' ./
Flags are passed to the script via GPU_E2E_ARGS; pass them inside the quotes,
not as bare make arguments. See scripts/gpu-e2e.sh --help for the full list.
Key flags: --provider, --registry (required when building), --img-tag,
--storage-class, --skip-install, --skip-build, --keep.
Environment knobs
The script forwards these to the Go suite; you can also set them directly when
running go test:
| Variable | Meaning |
|---|---|
| GPU_E2E_STORAGE_CLASS | RWX StorageClass injected into the Dynamo fixture and asserted on (default azurefile-premium). Set by --storage-class. |
| GPU_E2E_KEEP | When true, leave ModelDeployments running after the test for inspection. Set by --keep. |
| GPU_E2E_RESULTS_DIR | Optional override for where per-case result bundles are written (default test/e2e/gpu/gpu-e2e-results/<timestamp>/). |
| GPU_E2E_RUN_TS | Optional fixed timestamp for the results directory name. |
Outcomes
Each case ends as PASS, FAIL, or SKIP. A SKIP means the cluster
lacks the capacity to schedule that case (more GPUs requested than any node has,
or no GPU free before the scheduling deadline) — it does not fail the run. Only a
genuine error (a broken deployment, failed inference, or orphaned resources after
delete) is a FAIL. Per-case logs and a result marker are written under the
results directory.
Troubleshooting
Controller not reconciling
- Check controller logs: kubectl logs -n airunway-system deploy/airunway-controller-manager
- Verify CRDs are installed: kubectl get crd modeldeployments.airunway.ai
- Check RBAC permissions for the controller service account
ModelDeployment stuck in Pending
- Check if any InferenceProviderConfig resources exist: kubectl get inferenceproviderconfigs
- Verify at least one provider has status.ready: true
- Check controller logs for provider selection errors
Backend can't connect to cluster
- Verify kubectl is configured: kubectl cluster-info
- Check KUBECONFIG environment variable
- Ensure proper RBAC permissions
Provider not detected as installed
- Check CRD exists:
  - Dynamo: kubectl get crd dynamographdeployments.nvidia.com
  - KubeRay: kubectl get crd rayservices.ray.io
  - KAITO: kubectl get crd workspaces.kaito.sh
- Check operator deployment:
  - Dynamo: kubectl get deployments -n dynamo-system
  - KubeRay: kubectl get deployments -n ray-system
  - KAITO: kubectl get deployments -n kaito-workspace
KAITO deployment stuck in Pending
- Check KAITO workspace status: kubectl describe workspace <name> -n kaito-workspace
- Verify node labels match labelSelector (default: kubernetes.io/os: linux)
- For vLLM mode, ensure GPU nodes are available
- Check events: kubectl get events -n kaito-workspace --sort-by=.lastTimestamp
Metrics not available
- Metrics require AI Runway to run in-cluster
- Check deployment pods are running: kubectl get pods -n <namespace>
- Verify metrics endpoint is exposed (port 8000 for vLLM, port 5000 for llama.cpp)
Frontend can't reach backend
- Check CORS_ORIGIN matches frontend URL
- Verify backend is running on correct port
- Check browser console for errors
Headlamp Plugin Issues
Plugin not appearing in Headlamp
- Verify plugin was built: cd plugins/headlamp && bun run build
- Check plugin deployment location:
  - macOS: ~/.config/Headlamp/plugins/airunway-headlamp-plugin
  - Linux: ~/.config/Headlamp/plugins/airunway-headlamp-plugin
  - Windows: %APPDATA%/Headlamp/plugins/airunway-headlamp-plugin
- Restart Headlamp after deploying the plugin
Plugin can't connect to backend
- Check backend URL in Headlamp → Settings → Plugins → AIRunway
- Verify backend is running: curl http://localhost:3001/api/health
- For in-cluster deployments, ensure the service is accessible
- Check browser dev tools (Network tab) for connection errors
Plugin shows "Connection Failed" banner
- The plugin auto-discovers the backend; ensure it's running
- In-cluster: Deploy AI Runway backend to the airunway-system namespace
- Local development: Start backend with bun run dev:backend
Type errors after shared package changes
- Rebuild the shared package: cd shared && bun run build
- Rebuild the plugin: cd plugins/headlamp && bun run build
- Clear TypeScript cache: rm -rf plugins/headlamp/node_modules/.cache