
AI Runway
Deploy and manage large language models on Kubernetes — no YAML required.
A web UI and unified ModelDeployment CRD on top of the leading Kubernetes inference providers. Browse HuggingFace, pick a model, click deploy.
Quick Start
Pick the flavor that matches how you work — both take less than a minute.
Run locally
Download the latest release and launch the dashboard against your current kubeconfig.
./airunway
# open http://localhost:3001
Deploy to Kubernetes
Apply the controller and optional dashboard manifests directly into your cluster.
kubectl apply -f https://raw.githubusercontent.com/kaito-project/airunway/main/deploy/controller.yaml
kubectl apply -f https://raw.githubusercontent.com/kaito-project/airunway/main/deploy/dashboard.yaml
kubectl port-forward -n airunway-system svc/airunway 3001:80
Highlights
One-Click Deploy
Browse models, check GPU fit, and deploy from the web UI — no YAML required.
Unified CRD
A single ModelDeployment API works across every supported provider and engine.
Multiple Engines
vLLM, SGLang, TensorRT-LLM, and llama.cpp — picked automatically per workload.
Live Monitoring
Real-time status, log streaming, and Prometheus metrics built into the dashboard.
Cost Estimation
Surface GPU pricing and capacity guidance before you commit to a deployment.
Gateway Integration
Auto-detected Gateway API Inference Extension setup with a unified inference endpoint.
Supported Providers
Bring the inference stack you already know — AI Runway gives them a common API.
See it in action
A two-minute tour of deploying a model end to end.
Join the community
Issues, ideas, and pull requests are all welcome.