basebox.ai Helm

Overview

The basebox umbrella chart is a Helm chart that bundles all basebox AI platform services into a single deployment. It orchestrates the deployment of frontend, backend services, databases, identity provider, and inference services.

Architecture

The umbrella chart deploys the following services:

| Service | Purpose | Port | Dependencies |
|---------|---------|------|--------------|
| Frontend | Vue 3 web application | 3000 | AISRV, IDP |
| IDP | Keycloak identity provider | 8080 | PostgreSQL (idp-db) |
| AISRV | Main API server (GraphQL) | 8888 | PostgreSQL (aisrv-db), IDP, Inference, RAGSRV |
| STORESRV | Store server (GraphQL) | 8889 | PostgreSQL (storesrv-db), IDP |
| RAGSRV | RAG vector database service | 3001 | PostgreSQL (ragsrv-db), RAGSRV-Support, GPU |
| RAGSRV-Support | RAG model inference service | 8000 | GPU |
| Inference | LLM inference server | 8080 | GPU |
| CNPG Operator | CloudNativePG database operator | N/A | - |

Prerequisites

Kubernetes Cluster Requirements

  • Kubernetes Version: 1.23 or later
  • Helm Version: 3.x
  • GPU Support: NVIDIA GPU Operator or similar (for inference and RAG services)
  • Ingress Controller: Any, though the chart's bundled ingress annotations target ingress-nginx
  • Storage: Dynamic volume provisioning with a default storage class

Resource Requirements

Minimum Production Deployment:

  • CPU: 5+ cores
  • Memory: 12GB+ RAM
  • GPU: 1+ NVIDIA GPUs (1 for inference)
  • Storage: 200GB+ for databases and model caching

Recommended Production Deployment:

  • CPU: 10+ cores
  • Memory: 32GB+ RAM
  • GPU: 2+ NVIDIA GPUs (1 for inference, 1 for RAGSRV)
  • Storage: 500GB+ SSD/NVMe

External Dependencies

  • Image Registry: Access to gitea.basebox.health/basebox.distribution/
  • Domain/DNS: Configured DNS for the application domain
  • TLS Certificates: Valid certificates for HTTPS (issued by cert-manager, pre-created as secrets, or provisioned by other means)
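Registry credentials are usually supplied to the cluster as an image pull secret. A minimal sketch, assuming the secret name basebox-registry and placeholder credentials (both the name and the credentials are assumptions, not values defined by the chart):

```yaml
# Hypothetical pull secret; replace the placeholder credentials with your
# registry account and reference the secret from the chart's pull-secret values.
apiVersion: v1
kind: Secret
metadata:
  name: basebox-registry   # assumed name
  namespace: basebox
type: kubernetes.io/dockerconfigjson
stringData:
  .dockerconfigjson: |
    {"auths": {"gitea.basebox.health": {"username": "<user>", "password": "<token>"}}}
```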

Global Configuration

Service Configuration

CloudNativePG Operator

| Parameter | Default | Description |
|-----------|---------|-------------|
| `cnpg-operator.fullnameOverride` | `cnpg` | Name override for the CNPG operator |

IDP (Keycloak)

| Parameter | Default | Description |
|-----------|---------|-------------|
| `idp.ingress.enabled` | `true` | Enable ingress for IDP |
| `idp.ingress.className` | `nginx` | Ingress class name |
| `idp.ingress.hosts[].host` | required | Hostname for IDP |
| `idp.ingress.hosts[].paths[].path` | `/auth` | Base path for Keycloak |
| `idp.ingress.tls` | `[]` | TLS configuration |

Key Annotations:

  • Proxy buffer sizes configured for large headers
  • CORS enabled for cross-origin requests
  • Caching disabled for authentication flows
  • Long timeouts for SSO operations (86400s)

AISRV

| Parameter | Default | Description |
|-----------|---------|-------------|
| `aisrv.database.enabled` | `true` | Enable PostgreSQL database |
| `aisrv.database.host` | `aisrv-db-rw` | Database host |
| `aisrv.database.port` | `5432` | Database port |
| `aisrv.database.user` | `aisrv` | Database username |
| `aisrv.database.password` | required | Database password (use a secure value) |
| `aisrv.database.dbname` | `aisrv` | Database name |
| `aisrv.ingress.enabled` | `true` | Enable ingress |

Exposed Paths:

  • /rest - REST API endpoints
  • /graphql - GraphQL API endpoint
  • /subscriptions - GraphQL subscriptions (WebSocket)
  • /media - Media file serving

STORESRV

| Parameter | Default | Description |
|-----------|---------|-------------|
| `storesrv.logLevel` | `trace` | Logging level |
| `storesrv.host` | `0.0.0.0` | Server bind address |
| `storesrv.port` | `8889` | Server port |
| `storesrv.oauth.IdpUrl` | `http://idp:8080/realms/master` | OAuth IDP URL |
| `storesrv.oauth.IdpAud` | `account` | OAuth audience |
| `storesrv.database.enabled` | `true` | Enable PostgreSQL database |
| `storesrv.graphql.allowIntrospection` | `true` | Allow GraphQL introspection |
| `storesrv.graphql.graphiql` | `true` | Enable GraphiQL interface |

RAGSRV

| Parameter | Default | Description |
|-----------|---------|-------------|
| `ragsrv.enabled` | `true` | Enable RAGSRV service |
| `ragsrv.resources.requests.nvidia.com/gpu` | `1` | Number of GPUs to request |
| `ragsrv.resources.limits.nvidia.com/gpu` | `1` | Maximum GPUs |
| `ragsrv.env.COMPUTE` | `gpu` | Compute mode (`gpu` or `cpu`) |

Note: RAGSRV-Support is automatically enabled when RAGSRV is enabled.
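For clusters without a free GPU, the COMPUTE flag suggests a CPU fallback. A hedged sketch of such an override (whether RAGSRV needs any further changes in CPU mode is chart-specific and not documented here):

```yaml
# CPU-only sketch: switch the compute mode and drop the GPU request/limit.
ragsrv:
  enabled: true
  env:
    COMPUTE: cpu
  resources:
    requests: {}   # no nvidia.com/gpu entry in CPU mode
    limits: {}
```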

RAGSRV-Support

This service is deployed automatically with RAGSRV and exposes no configurable parameters of its own.

Inference

| Parameter | Default | Description |
|-----------|---------|-------------|
| `inference.enabled` | `true` | Enable inference service |
| `inference.resources.requests.nvidia.com/gpu` | `1` | Number of GPUs to request |
| `inference.resources.limits.nvidia.com/gpu` | `1` | Maximum GPUs |

Frontend

| Parameter | Default | Description |
|-----------|---------|-------------|
| `frontend.ingress.enabled` | `true` | Enable ingress |
| `frontend.ingress.hosts[].paths[].path` | `/` | Serve frontend at root path |

The frontend is configured with the same proxy and CORS settings as other services.

Ingress Configuration

Common Ingress Annotations

All services use consistent nginx ingress annotations:

Proxy Configuration

nginx.ingress.kubernetes.io/proxy-body-size: "20m"
nginx.ingress.kubernetes.io/proxy-buffer-size: "128k"
nginx.ingress.kubernetes.io/proxy-buffers-number: "4"
nginx.ingress.kubernetes.io/proxy-busy-buffers-size: "256k"
nginx.ingress.kubernetes.io/client-header-buffer-size: "16k"
nginx.ingress.kubernetes.io/large-client-header-buffers: "4 16k"
nginx.ingress.kubernetes.io/proxy-read-timeout: "86400"
nginx.ingress.kubernetes.io/proxy-send-timeout: "86400"
nginx.ingress.kubernetes.io/proxy-buffering: "off"

Purpose:

  • Large body size (20MB) for file uploads
  • Large buffer sizes for authentication headers (JWT tokens)
  • Long timeouts (24h) for WebSocket connections
  • Buffering disabled for streaming responses

Cache Control

nginx.ingress.kubernetes.io/Cache-Control: "no-store, no-cache, must-revalidate, proxy-revalidate"
nginx.ingress.kubernetes.io/Pragma: "no-cache"
nginx.ingress.kubernetes.io/Expires: "0"
nginx.ingress.kubernetes.io/Surrogate-Control: "no-store"

Purpose: Prevent caching of API responses and authentication flows

CORS Configuration

nginx.ingress.kubernetes.io/enable-cors: "true"
nginx.ingress.kubernetes.io/cors-allow-origin: "*"
nginx.ingress.kubernetes.io/cors-allow-methods: "GET, POST, OPTIONS"
nginx.ingress.kubernetes.io/cors-allow-headers: "Authorization, Content-Type, X-Realm"
nginx.ingress.kubernetes.io/cors-allow-credentials: "true"
nginx.ingress.kubernetes.io/cors-max-age: "86400"
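One caveat with the defaults above: browsers ignore Access-Control-Allow-Credentials when the allowed origin is the wildcard "*", so credentialed requests from the SPA will fail under that combination. Production deployments typically pin the origin instead; a sketch (the hostname is a placeholder):

```yaml
# Production sketch: pin CORS to the frontend's origin instead of "*",
# so that credentialed cross-origin requests are accepted by browsers.
nginx.ingress.kubernetes.io/enable-cors: "true"
nginx.ingress.kubernetes.io/cors-allow-origin: "https://app.company.com"
nginx.ingress.kubernetes.io/cors-allow-credentials: "true"
```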

Purpose: Enable cross-origin requests for SPA frontend

Configuration Examples

Minimal Production Configuration

cnpg-operator:
  fullnameOverride: cnpg

idp:
  ingress:
    enabled: true
    className: "nginx"
    annotations:
      cert-manager.io/cluster-issuer: "letsencrypt-prod"
    hosts:
      - host: auth.company.com
        paths:
          - path: /auth
            pathType: Prefix
    tls:
      - hosts:
          - company.com
        secretName: company-tls

aisrv:
  database:
    enabled: true
    password: "<generate-secure-password>"
  ingress:
    enabled: true
    className: "nginx"
    annotations:
      cert-manager.io/cluster-issuer: "letsencrypt-prod"
    hosts:
      - host: api.company.com
        paths:
          - path: /rest
            pathType: Prefix
          - path: /graphql
            pathType: Exact
          - path: /subscriptions
            pathType: Exact
          - path: /media
            pathType: Prefix
    tls:
      - hosts:
          - company.com
        secretName: company-tls

storesrv:
  logLevel: info
  database:
    enabled: true
    password: "<generate-secure-password>"
  graphql:
    allowIntrospection: false
    graphiql: false

ragsrv:
  enabled: true
  resources:
    requests:
      nvidia.com/gpu: 1
    limits:
      nvidia.com/gpu: 1
  env:
    COMPUTE: gpu

# ragsrv-support is deployed automatically alongside ragsrv; no overrides required
ragsrv-support: {}

inference:
  enabled: true
  resources:
    requests:
      nvidia.com/gpu: 1
    limits:
      nvidia.com/gpu: 1

frontend:
  ingress:
    enabled: true
    className: "nginx"
    annotations:
      cert-manager.io/cluster-issuer: "letsencrypt-prod"
    hosts:
      - host: app.company.com
        paths:
          - path: /
            pathType: Prefix
    tls:
      - hosts:
          - company.com
        secretName: company-tls

Installation

Prepare Values File

Critical values to change:

  • All database passwords
  • Domain names
  • TLS secret names
  • Image pull secrets
  • GPU resource allocations
  • Storage class names
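The full example above does not show pull secrets or storage classes. A hedged sketch of where they might live; the key names are assumptions, so verify them against each subchart's values.yaml:

```yaml
# Assumed keys; the cluster.storage layout follows the database examples later
# in this document, while global.imagePullSecrets is a common but unverified convention.
global:
  imagePullSecrets:
    - name: basebox-registry
aisrv:
  aisrv-db:
    cluster:
      storage:
        size: 50Gi
        storageClass: fast-ssd
```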

Install the Chart

# Install basebox umbrella chart
helm install basebox ./basebox.ai \
  --values values-production.yaml \
  --namespace basebox \
  --create-namespace \
  --timeout 30m

# Watch the deployment
kubectl get pods -n basebox -w

Verify Installation

# Check all pods are running
kubectl get pods -n basebox

# Check database clusters
kubectl get cluster -n basebox

# Check ingress
kubectl get ingress -n basebox

# Check GPU allocation
kubectl describe nodes | grep -A5 "Allocated resources"

Upgrade

Standard Upgrade

# Update values file
vim values-production.yaml

# Upgrade chart
helm upgrade --install basebox ./basebox.ai \
  --values values-production.yaml \
  --namespace basebox \
  --timeout 30m

Rolling Upgrade Strategy

For zero-downtime upgrades:

  1. Update databases first (if schema changes)
  2. Update backend services (AISRV, STORESRV, RAGSRV)
  3. Update IDP (authentication service)
  4. Update inference services
  5. Update frontend last

Uninstall

Complete Removal

# Uninstall chart
helm uninstall basebox --namespace basebox

# Delete PVCs (databases - DATA LOSS!)
kubectl delete pvc -n basebox --all

# Delete namespace
kubectl delete namespace basebox

Preserve Data

To preserve databases when uninstalling:

# Uninstall chart
helm uninstall basebox --namespace basebox

# PVCs remain - do not delete them
# To reinstall with existing data:
helm install basebox ./basebox.ai \
  --values values-production.yaml \
  --namespace basebox

Database Management

Backup All Databases

# IDP database
kubectl exec -n basebox idp-db-1 -- \
  pg_dump -U idp idp > idp-backup.sql

# AISRV database
kubectl exec -n basebox aisrv-db-1 -- \
  pg_dump -U aisrv aisrv > aisrv-backup.sql

# STORESRV database
kubectl exec -n basebox storesrv-db-1 -- \
  pg_dump -U storesrv storesrv > storesrv-backup.sql

# RAGSRV database
kubectl exec -n basebox ragsrv-db-1 -- \
  pg_dump -U ragsrv ragsrv > ragsrv-backup.sql

CloudNativePG Backups

Configure automated backups in values:

aisrv:
  aisrv-db:
    cluster:
      backup:
        barmanObjectStore:
          destinationPath: s3://backups/aisrv
          s3Credentials:
            accessKeyId:
              name: backup-s3-creds
              key: ACCESS_KEY_ID
            secretAccessKey:
              name: backup-s3-creds
              key: SECRET_ACCESS_KEY
        retentionPolicy: "30d"
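Restoring from such a backup is done by bootstrapping a fresh cluster from the object store. A sketch following CloudNativePG's recovery bootstrap; the exact nesting under the subchart's values is an assumption:

```yaml
# Recovery sketch: bootstrap a new aisrv-db cluster from the barman backup.
aisrv:
  aisrv-db:
    cluster:
      bootstrap:
        recovery:
          source: aisrv-backup
      externalClusters:
        - name: aisrv-backup
          barmanObjectStore:
            destinationPath: s3://backups/aisrv
            s3Credentials:
              accessKeyId:
                name: backup-s3-creds
                key: ACCESS_KEY_ID
              secretAccessKey:
                name: backup-s3-creds
                key: SECRET_ACCESS_KEY
```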

Monitoring

Health Checks

# Check all service health endpoints
kubectl get pods -n basebox -o wide

# Port-forward and test endpoints
kubectl port-forward -n basebox svc/aisrv 8888:8888
curl http://localhost:8888/health

kubectl port-forward -n basebox svc/inference 8080:8080
curl http://localhost:8080/health

Metrics

Enable Prometheus monitoring:

aisrv:
  metrics:
    enabled: true
    port: 9090

idp:
  idp-db:
    cluster:
      monitoring:
        enablePodMonitor: true

Logs

# View logs from all AISRV pods
kubectl logs -n basebox -l app.kubernetes.io/name=aisrv --tail=100 -f

# View logs from inference service
kubectl logs -n basebox -l app.kubernetes.io/name=inference --tail=100 -f

# View logs from specific pod
kubectl logs -n basebox <pod-name> -f

Troubleshooting

Common Issues

Pods Stuck in Pending:

  • Check GPU availability: kubectl describe nodes | grep nvidia.com/gpu
  • Check storage: kubectl get pvc -n basebox
  • Compare resource requests with cluster capacity

Database Connection Failures:

  • Verify database pods are running: kubectl get pods -n basebox | grep db
  • Check database secrets: kubectl get secrets -n basebox
  • Review database logs: kubectl logs -n basebox <db-pod-name>

Ingress Not Working:

  • Verify the ingress controller is running
  • Check ingress resources: kubectl get ingress -n basebox
  • Verify DNS points to the ingress controller
  • Check TLS certificates: kubectl get certificates -n basebox

GPU Not Available:

  • Verify the GPU operator: kubectl get pods -n gpu-operator
  • Check node labels: kubectl get nodes -L nvidia.com/gpu
  • Verify GPU scheduling: kubectl describe nodes | grep -A5 "nvidia.com/gpu"

Debug Commands

# Get all resources in namespace
kubectl get all -n basebox

# Describe pod for events and status
kubectl describe pod -n basebox <pod-name>

# Check resource usage
kubectl top pods -n basebox
kubectl top nodes

# View events
kubectl get events -n basebox --sort-by='.lastTimestamp'

# Test internal connectivity
kubectl run -it --rm debug --image=busybox --restart=Never -n basebox -- sh
# Inside pod:
wget -O- http://aisrv:8888/health
wget -O- http://inference:8080/health

Performance Tuning

Resource Allocation

Backend Services (AISRV, STORESRV):

resources:
  requests:
    cpu: 1000m
    memory: 2Gi
  limits:
    cpu: 2000m
    memory: 4Gi

GPU Services (Inference, RAGSRV):

resources:
  requests:
    cpu: 4000m
    memory: 32Gi
    nvidia.com/gpu: 1
  limits:
    cpu: 8000m
    memory: 64Gi
    nvidia.com/gpu: 1

Databases:

cluster:
  instances: 3
  storage:
    size: 50Gi
    storageClass: fast-ssd

Database Performance

  • Use read replicas (increase instances)
  • Use fast storage (SSD/NVMe)
  • Tune PostgreSQL parameters
  • Regular VACUUM and ANALYZE
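CloudNativePG exposes PostgreSQL settings under the cluster's postgresql.parameters map, which is one place to apply such tuning. A sketch; the values are illustrative starting points for a mid-sized database pod, not benchmarked recommendations:

```yaml
# Illustrative tuning for a database pod with a few GiB of memory;
# adjust to your workload and measure before adopting.
cluster:
  postgresql:
    parameters:
      shared_buffers: "1GB"
      effective_cache_size: "3GB"
      work_mem: "32MB"
      autovacuum_vacuum_scale_factor: "0.05"
```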