Licensed to be used in conjunction with basebox, only.
basebox.ai Helm
Overview
The basebox umbrella chart is a Helm chart that bundles all basebox AI platform services into a single deployment. It orchestrates the deployment of the frontend, backend services, databases, identity provider, and inference services.
Architecture
The umbrella chart deploys the following services:
| Service | Purpose | Port | Dependencies |
|---|---|---|---|
| Frontend | Vue 3 web application | 3000 | AISRV, IDP |
| IDP | Keycloak identity provider | 8080 | PostgreSQL (idp-db) |
| AISRV | Main API server (GraphQL) | 8888 | PostgreSQL (aisrv-db), IDP, Inference, RAGSRV |
| STORESRV | Store server (GraphQL) | 8889 | PostgreSQL (storesrv-db), IDP |
| RAGSRV | RAG vector database service | 3001 | PostgreSQL (ragsrv-db), RAGSRV-Support, GPU |
| RAGSRV-Support | RAG model inference service | 8000 | GPU |
| Inference | LLM inference server | 8080 | GPU |
| CNPG Operator | CloudNativePG database operator | N/A | - |
Prerequisites
Kubernetes Cluster Requirements
- Kubernetes Version: 1.23 or later
- Helm Version: 3.x
- GPU Support: NVIDIA GPU Operator or similar (for inference and RAG services)
- Ingress Controller: Any
- Storage: Dynamic volume provisioning with a default storage class
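The Kubernetes version floor listed above can be checked in a script. The following is a minimal sketch, not part of the chart: `version_ge` is a hypothetical helper name, and it assumes `sort -V` (version sort) is available, as it is in GNU coreutils and BusyBox.

```shell
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Assumption: `sort -V` is available on this system.
version_ge() {
  # Strip a leading "v" so inputs like "v1.27.3" are accepted.
  a="${1#v}"
  b="${2#v}"
  # The smaller version sorts first, so A >= B exactly when B is the first line.
  [ "$(printf '%s\n%s\n' "$a" "$b" | sort -V | head -n 1)" = "$b" ]
}

# Example with literal version strings (no cluster access needed).
if version_ge "v1.27.3" "1.23"; then
  echo "Kubernetes version OK"
else
  echo "Kubernetes version too old"
fi
```

For a live cluster, the server version reported by `kubectl version -o json` (field `serverVersion.gitVersion`) could be passed as the first argument.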
Resource Requirements
Minimum Production Deployment:
- CPU: 5+ cores
- Memory: 12GB+ RAM
- GPU: 1+ NVIDIA GPUs (1 for inference)
- Storage: 200GB+ for databases and model caching
Recommended Production Deployment:
- CPU: 10+ cores
- Memory: 32GB+ RAM
- GPU: 2+ NVIDIA GPUs (1 for inference, 1 for RAGSRV)
- Storage: 500GB+ SSD/NVMe
External Dependencies
- Image Registry: Access to gitea.basebox.health/basebox.distribution/
- Domain/DNS: Configured DNS for the application domain
- TLS Certificates: Valid certificates for HTTPS (issued via cert-manager, pre-created as secrets, or provided by other means)
Global Configuration
Service Configuration
CloudNativePG Operator
| Parameter | Default | Description |
|---|---|---|
| cnpg-operator.fullnameOverride | cnpg | Name override for CNPG operator |
IDP (Keycloak)
| Parameter | Default | Description |
|---|---|---|
| idp.ingress.enabled | true | Enable ingress for IDP |
| idp.ingress.className | nginx | Ingress class name |
| idp.ingress.hosts[].host | Required | Hostname for IDP |
| idp.ingress.hosts[].paths[].path | /auth | Base path for Keycloak |
| idp.ingress.tls | [] | TLS configuration |
Key Annotations:
- Proxy buffer sizes configured for large headers
- CORS enabled for cross-origin requests
- Cache disabled for authentication flows
- Long timeouts for SSO operations (86400s)
AISRV
| Parameter | Default | Description |
|---|---|---|
| aisrv.database.enabled | true | Enable PostgreSQL database |
| aisrv.database.host | aisrv-db-rw | Database host |
| aisrv.database.port | 5432 | Database port |
| aisrv.database.user | aisrv | Database username |
| aisrv.database.password | Required | Database password (use secure value) |
| aisrv.database.dbname | aisrv | Database name |
| aisrv.ingress.enabled | true | Enable ingress |
Exposed Paths:
- /rest - REST API endpoints
- /graphql - GraphQL API endpoint
- /subscriptions - GraphQL subscriptions (WebSocket)
- /media - Media file serving
STORESRV
| Parameter | Default | Description |
|---|---|---|
| storesrv.logLevel | trace | Logging level |
| storesrv.host | 0.0.0.0 | Server bind address |
| storesrv.port | 8889 | Server port |
| storesrv.oauth.IdpUrl | http://idp:8080/realms/master | OAuth IDP URL |
| storesrv.oauth.IdpAud | account | OAuth audience |
| storesrv.database.enabled | true | Enable PostgreSQL database |
| storesrv.graphql.allowIntrospection | true | Allow GraphQL introspection |
| storesrv.graphql.graphiql | true | Enable GraphiQL interface |
RAGSRV
| Parameter | Default | Description |
|---|---|---|
| ragsrv.enabled | true | Enable RAGSRV service |
| ragsrv.resources.requests.nvidia.com/gpu | 1 | Number of GPUs to request |
| ragsrv.resources.limits.nvidia.com/gpu | 1 | Maximum GPUs |
| ragsrv.env.COMPUTE | gpu | Compute mode (gpu or cpu) |
Note: RAGSRV-Support is automatically enabled when RAGSRV is enabled.
RAGSRV-Support
This service is deployed automatically with RAGSRV.
Inference
| Parameter | Default | Description |
|---|---|---|
| inference.enabled | true | Enable inference service |
| inference.resources.requests.nvidia.com/gpu | 1 | Number of GPUs to request |
| inference.resources.limits.nvidia.com/gpu | 1 | Maximum GPUs |
Frontend
| Parameter | Default | Description |
|---|---|---|
| frontend.ingress.enabled | true | Enable ingress |
| frontend.ingress.hosts[].path | / | Serve frontend at root path |
The frontend is configured with the same proxy and CORS settings as other services.
Ingress Configuration
Common Ingress Annotations
All services use consistent nginx ingress annotations:
Proxy Configuration
nginx.ingress.kubernetes.io/proxy-body-size: "20m"
nginx.ingress.kubernetes.io/proxy-buffer-size: "128k"
nginx.ingress.kubernetes.io/proxy-buffers-number: "4"
nginx.ingress.kubernetes.io/proxy-busy-buffers-size: "256k"
nginx.ingress.kubernetes.io/client-header-buffer-size: "16k"
nginx.ingress.kubernetes.io/large-client-header-buffers: "4 16k"
nginx.ingress.kubernetes.io/proxy-read-timeout: "86400"
nginx.ingress.kubernetes.io/proxy-send-timeout: "86400"
nginx.ingress.kubernetes.io/proxy-buffering: "off"
Purpose:
- Large body size (20MB) for file uploads
- Large buffer sizes for authentication headers (JWT tokens)
- Long timeouts (24h) for WebSocket connections
- Buffering disabled for streaming responses
Cache Control
nginx.ingress.kubernetes.io/Cache-Control: "no-store, no-cache, must-revalidate, proxy-revalidate"
nginx.ingress.kubernetes.io/Pragma: "no-cache"
nginx.ingress.kubernetes.io/Expires: "0"
nginx.ingress.kubernetes.io/Surrogate-Control: "no-store"
Purpose: Prevent caching of API responses and authentication flows
CORS Configuration
nginx.ingress.kubernetes.io/enable-cors: "true"
nginx.ingress.kubernetes.io/cors-allow-origin: "*"
nginx.ingress.kubernetes.io/cors-allow-methods: "GET, POST, OPTIONS"
nginx.ingress.kubernetes.io/cors-allow-headers: "Authorization, Content-Type, X-Realm"
nginx.ingress.kubernetes.io/cors-allow-credentials: "true"
nginx.ingress.kubernetes.io/cors-max-age: "86400"
Purpose: Enable cross-origin requests for SPA frontend
Configuration Examples
Minimal Production Configuration
cnpg-operator:
  fullnameOverride: cnpg
idp:
  ingress:
    enabled: true
    className: "nginx"
    annotations:
      cert-manager.io/cluster-issuer: "letsencrypt-prod"
    hosts:
      - host: auth.company.com
        paths:
          - path: /auth
            pathType: Prefix
    tls:
      - hosts:
          - company.com
        secretName: company-tls
aisrv:
  database:
    enabled: true
    password: "<generate-secure-password>"
  ingress:
    enabled: true
    className: "nginx"
    annotations:
      cert-manager.io/cluster-issuer: "letsencrypt-prod"
    hosts:
      - host: api.company.com
        paths:
          - path: /rest
            pathType: Prefix
          - path: /graphql
            pathType: Exact
          - path: /subscriptions
            pathType: Exact
          - path: /media
            pathType: Prefix
    tls:
      - hosts:
          - company.com
        secretName: company-tls
storesrv:
  logLevel: info
  database:
    enabled: true
    password: "<generate-secure-password>"
  graphql:
    allowIntrospection: false
    graphiql: false
ragsrv:
  enabled: true
  resources:
    requests:
      nvidia.com/gpu: 1
    limits:
      nvidia.com/gpu: 1
  env:
    COMPUTE: gpu
ragsrv-support:
inference:
  enabled: true
  resources:
    requests:
      nvidia.com/gpu: 1
    limits:
      nvidia.com/gpu: 1
frontend:
  ingress:
    enabled: true
    className: "nginx"
    annotations:
      cert-manager.io/cluster-issuer: "letsencrypt-prod"
    hosts:
      - host: app.company.com
        paths:
          - path: /
    tls:
      - hosts:
          - company.com
        secretName: company-tls
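The example configuration contains `<generate-secure-password>` placeholders that must be replaced before installing. As a rough safeguard, a values file can be scanned for leftover placeholders. This is a sketch only: `find_placeholders` is a hypothetical helper, and the pattern assumes placeholders are written as lowercase words and hyphens inside angle brackets.

```shell
# find_placeholders FILE: print the lines (with line numbers) that still contain
# an unfilled "<...>" placeholder such as <generate-secure-password>.
# Exit status is 0 when at least one placeholder is found, non-zero otherwise.
find_placeholders() {
  grep -n '<[a-z-]*>' "$1"
}

# Demonstration on a throwaway file (hypothetical content, not the real values file).
tmp="$(mktemp)"
printf 'aisrv:\n  database:\n    password: "<generate-secure-password>"\n' > "$tmp"
if find_placeholders "$tmp"; then
  echo "placeholders remain; replace them before installing"
fi
rm -f "$tmp"
```

Against a real values file this would be run as `find_placeholders values-production.yaml`.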
Installation
Prepare Values File
Critical values to change:
- All database passwords
- Domain names
- TLS secret names
- Image pull secrets
- GPU resource allocations
- Storage class names
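For the database passwords, one possible way to produce random values is sketched below. It is an illustration under assumptions: `gen_password` is a hypothetical helper, `/dev/urandom` must exist, and the alphanumeric-only alphabet is a choice made here to avoid YAML and shell quoting problems, not a requirement of the chart.

```shell
# gen_password [LENGTH]: print a random alphanumeric password (default 32 characters).
gen_password() {
  len="${1:-32}"
  # Take 1024 random bytes, keep only letters and digits, then cut to LENGTH.
  # 1024 bytes yield roughly 250 usable characters, ample for typical lengths.
  head -c 1024 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9' | cut -c1-"$len"
}

pw="$(gen_password 32)"
echo "generated a password of ${#pw} characters"
```

The generated value can then be written into the values file, or passed at install time with Helm's `--set-string`, for example `--set-string aisrv.database.password="$pw"`.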
Install the Chart
# Install basebox umbrella chart
helm install basebox ./basebox.ai \
--values values-production.yaml \
--namespace basebox \
--create-namespace \
--timeout 30m
# Watch the deployment
kubectl get pods -n basebox -w
Verify Installation
# Check all pods are running
kubectl get pods -n basebox
# Check database clusters
kubectl get cluster -n basebox
# Check ingress
kubectl get ingress -n basebox
# Check GPU allocation
kubectl describe nodes | grep -A5 "Allocated resources"
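When the pod list is long, it can help to filter it down to the pods that still need attention. The sketch below assumes the default column layout of `kubectl get pods --no-headers` (NAME, READY, STATUS, ...); `not_ready_pods` is a hypothetical helper and the pod names in the demonstration are made up.

```shell
# not_ready_pods: read `kubectl get pods --no-headers` output on stdin and print
# the names of pods that are not fully ready. Completed pods are ignored.
not_ready_pods() {
  awk '{
    split($2, ready, "/")            # READY column, e.g. "0/1"
    if ($3 == "Completed") next      # finished jobs are fine
    if ($3 != "Running" || ready[1] != ready[2]) print $1
  }'
}

# Demonstration with sample output (hypothetical pod names).
printf '%s\n' \
  'aisrv-7d9f 1/1 Running 0 5m' \
  'inference-5c4b 0/1 Pending 0 5m' \
  'idp-0 0/1 Running 0 1m' | not_ready_pods
```

On a cluster, the input would come from `kubectl get pods -n basebox --no-headers | not_ready_pods`; empty output means every pod is running and ready.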
Upgrade
Standard Upgrade
# Update values file
vim values-production.yaml
# Upgrade chart
helm upgrade --install basebox basebox/basebox.ai \
--values values-production.yaml \
--namespace basebox \
--timeout 30m
Rolling Upgrade Strategy
For zero-downtime upgrades:
- Update databases first (if schema changes)
- Update backend services (AISRV, STORESRV, RAGSRV)
- Update IDP (authentication service)
- Update inference services
- Update frontend last
Uninstall
Complete Removal
# Uninstall chart
helm uninstall basebox --namespace basebox
# Delete PVCs (databases - DATA LOSS!)
kubectl delete pvc -n basebox --all
# Delete namespace
kubectl delete namespace basebox
Preserve Data
To preserve databases when uninstalling:
# Uninstall chart
helm uninstall basebox --namespace basebox
# PVCs remain - do not delete them
# To reinstall with existing data:
helm install basebox ./basebox.ai \
--values values-production.yaml \
--namespace basebox
Database Management
Backup All Databases
# IDP database
kubectl exec -n basebox idp-db-1 -- \
pg_dump -U idp idp > idp-backup.sql
# AISRV database
kubectl exec -n basebox aisrv-db-1 -- \
pg_dump -U aisrv aisrv > aisrv-backup.sql
# STORESRV database
kubectl exec -n basebox storesrv-db-1 -- \
pg_dump -U storesrv storesrv > storesrv-backup.sql
# RAGSRV database
kubectl exec -n basebox ragsrv-db-1 -- \
pg_dump -U ragsrv ragsrv > ragsrv-backup.sql
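The four backup commands differ only in the service name, so they can be generated in a loop. The sketch below prints the commands instead of running them, which makes it easy to review them first. It assumes, as the commands above do, that each database's first instance pod is named `<service>-db-1` and that the user and database name equal the service name; `backup_command` and `BASEBOX_NAMESPACE` are hypothetical names.

```shell
# Namespace used in the generated commands (assumed default: basebox).
BASEBOX_NAMESPACE="${BASEBOX_NAMESPACE:-basebox}"

# backup_command SERVICE: print the pg_dump command for one service's database.
backup_command() {
  printf 'kubectl exec -n %s %s-db-1 -- pg_dump -U %s %s > %s-backup.sql\n' \
    "$BASEBOX_NAMESPACE" "$1" "$1" "$1" "$1"
}

# Print one command per database (dry run; nothing is executed here).
for svc in idp aisrv storesrv ragsrv; do
  backup_command "$svc"
done
```

Once the printed commands look right, the loop output can be piped to `sh` to execute them.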
CloudNativePG Backups
Configure automated backups in values:
aisrv:
  aisrv-db:
    cluster:
      backup:
        barmanObjectStore:
          destinationPath: s3://backups/aisrv
          s3Credentials:
            accessKeyId:
              name: backup-s3-creds
              key: ACCESS_KEY_ID
            secretAccessKey:
              name: backup-s3-creds
              key: SECRET_ACCESS_KEY
        retentionPolicy: "30d"
Monitoring
Health Checks
# Check pod status and node placement
kubectl get pods -n basebox -o wide
# Port-forward and test endpoints
# (port-forward runs in the foreground; run each curl from a second terminal)
kubectl port-forward -n basebox svc/aisrv 8888:8888
curl http://localhost:8888/health
kubectl port-forward -n basebox svc/inference 8080:8080
curl http://localhost:8080/health
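Services may need some time after startup before their health endpoint responds, so a single `curl` can fail spuriously. A small retry wrapper is one way to handle that. This is a generic sketch: `wait_for` is a hypothetical helper, and the attempt count and delay are arbitrary choices.

```shell
# wait_for ATTEMPTS DELAY CMD...: run CMD until it succeeds, at most ATTEMPTS
# times, sleeping DELAY seconds between tries. Returns 0 on success, 1 otherwise.
wait_for() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    # Only sleep when another attempt will follow.
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Demonstration with a command that always succeeds.
if wait_for 3 0 true; then
  echo "healthy"
fi
```

With a port-forward active, a health check could look like `wait_for 30 2 curl -fsS http://localhost:8888/health`, where `-f` makes `curl` return a non-zero status on HTTP errors.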
Metrics
Enable Prometheus monitoring:
Logs
# View logs from all AISRV pods
kubectl logs -n basebox -l app.kubernetes.io/name=aisrv --tail=100 -f
# View logs from inference service
kubectl logs -n basebox -l app.kubernetes.io/name=inference --tail=100 -f
# View logs from specific pod
kubectl logs -n basebox <pod-name> -f
Troubleshooting
Common Issues
Pods Stuck in Pending:
- Check GPU availability: kubectl describe nodes | grep nvidia.com/gpu
- Check storage: kubectl get pvc -n basebox
- Check resource requests vs cluster capacity
Database Connection Failures:
- Verify database pods are running: kubectl get pods -n basebox | grep db
- Check database secrets: kubectl get secrets -n basebox
- Review database logs: kubectl logs -n basebox <db-pod-name>
Ingress Not Working:
- Verify ingress controller is running
- Check ingress resources: kubectl get ingress -n basebox
- Verify DNS points to ingress controller
- Check TLS certificates: kubectl get certificates -n basebox
GPU Not Available:
- Verify GPU operator: kubectl get pods -n gpu-operator
- Check node labels: kubectl get nodes -L nvidia.com/gpu
- Verify GPU scheduling: kubectl describe nodes | grep -A5 "nvidia.com/gpu"
Debug Commands
# Get all resources in namespace
kubectl get all -n basebox
# Describe pod for events and status
kubectl describe pod -n basebox <pod-name>
# Check resource usage
kubectl top pods -n basebox
kubectl top nodes
# View events
kubectl get events -n basebox --sort-by='.lastTimestamp'
# Test internal connectivity
kubectl run -it --rm debug --image=busybox --restart=Never -n basebox -- sh
# Inside pod:
wget -O- http://aisrv:8888/health
wget -O- http://inference:8080/health
Security Considerations
Resource Allocation
Backend Services (AISRV, STORESRV):
GPU Services (Inference, RAGSRV):
resources:
  requests:
    cpu: 4000m
    memory: 32Gi
    nvidia.com/gpu: 1
  limits:
    cpu: 8000m
    memory: 64Gi
    nvidia.com/gpu: 1
Databases:
Database Performance
- Use read replicas (increase instances)
- Use fast storage (SSD/NVMe)
- Tune PostgreSQL parameters
- Regular VACUUM and ANALYZE