AISRV

Overview

AISRV is the main AI server that provides a GraphQL API for the basebox AI application. It handles user authentication via OpenID Connect (OIDC/Keycloak), connects to a PostgreSQL database, and integrates with LLM providers, RAG services, Whisper transcription, and other external services.

Important: The server is designed to run behind a reverse proxy (e.g., nginx) and does not provide TLS support on its own. Configure your reverse proxy or ingress controller to handle TLS/SSL termination.

Deployment

AISRV is deployed via Helm chart to Kubernetes clusters with an integrated PostgreSQL database managed by CloudNativePG.

Helm Chart Configuration

Basic Settings

| Parameter | Default | Description |
| --- | --- | --- |
| replicaCount | 1 | Number of AISRV pod replicas |
| image.repository | gitea.basebox.health/basebox-distribution/aisrv | Container image repository |
| image.pullPolicy | IfNotPresent | Image pull policy |
| image.tag | latest | Image tag to deploy |
| fullnameOverride | aisrv | Override the full name of the deployment |

Service Configuration

| Parameter | Default | Description |
| --- | --- | --- |
| service.type | ClusterIP | Kubernetes service type |
| service.port | 8888 | Service port |

Resource Management

| Parameter | Description |
| --- | --- |
| resources.requests | CPU/memory resource requests |
| resources.limits | CPU/memory resource limits |
| autoscaling.enabled | Enable horizontal pod autoscaling |
| autoscaling.minReplicas | Minimum number of replicas |
| autoscaling.maxReplicas | Maximum number of replicas |
| autoscaling.targetCPUUtilizationPercentage | Target CPU utilization (percent) for scaling |

Health Checks

| Parameter | Description |
| --- | --- |
| livenessProbe | Liveness probe configuration |
| readinessProbe | Readiness probe configuration |

Database Configuration

Database Settings

| Parameter | Default | Description |
| --- | --- | --- |
| database.enabled | true | Enable database creation |
| database.imageName | ghcr.io/cloudnative-pg/postgresql:16-standard-bookworm | PostgreSQL image |
| database.host | aisrv-db-rw | Database host (read-write service) |
| database.port | 5432 | Database port |
| database.user | aisrv | Database username |
| database.password | <secure-password> | Database password |
| database.name | aisrv | Database name |
| database.sslMode | disable | Database SSL mode (disable, require, verify-full) |

CloudNativePG Cluster Settings

| Parameter | Default | Description |
| --- | --- | --- |
| aisrv-db.cluster.instances | 1 | Number of PostgreSQL instances |
| aisrv-db.cluster.storage.size | 10Gi | Storage size for database |
| aisrv-db.cluster.storage.storageClass | default | Storage class to use |
| aisrv-db.cluster.monitoring.enablePodMonitor | false | Enable Prometheus monitoring |

Migration Settings

| Variable | Default | Description |
| --- | --- | --- |
| AISRV_DB_MIGRATE | false | Run database migrations automatically on startup |
| AISRV_DB_MIGRATE_BACKUP | false | Back up the database before running migrations |
| AISRV_DB_MIGRATE_BACKUP_DIR | migrations-backups | Backup directory path |
| AISRV_DB_MIGRATE_RUN_ONLY | false | Run migrations only, then exit without starting the server |

Environment Variables

Server Configuration

| Variable | Default | Description |
| --- | --- | --- |
| AISRV_HOST | 0.0.0.0 | Host or IP address to listen on |
| AISRV_PORT | 8888 | Port to listen on |
| AISRV_LOG_LEVEL | info | Log level (trace, debug, info, warn, error) |
| AISRV_SET_CORS_HEADERS | true | Set CORS headers (disable when using reverse proxy) |
| AISRV_DEBUG_MODE | false | Debug mode: send errors verbatim to client |
| AISRV_ON_PREMISE | false | On-premise deployment mode |
| AISRV_METRICS_PORT | None | Metrics port (Prometheus) |

Database Connection (from Secrets)

| Variable | Source | Description |
| --- | --- | --- |
| AISRV_DB_HOST | aisrv-database secret | Database hostname |
| AISRV_DB_PORT | aisrv-database secret | Database port |
| AISRV_DB_USER | aisrv-database secret | Database username |
| AISRV_DB_PASSWORD | aisrv-database secret | Database password |
| AISRV_DB_NAME | aisrv-database secret | Database name |
| AISRV_DB_SSL_MODE | Configuration | SSL mode for database connection |
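
The table above refers to a Kubernetes Secret named aisrv-database. If that Secret is not already created for you (an assumption; the chart may well manage it), the following sketch shows one way to create it by hand. The key names (host, port, username, password, name) are taken from the secretKeyRef entries in the production example in this document, and the basebox namespace from the install commands; the helper name create_db_secret is hypothetical.

```shell
# Sketch: create the aisrv-database Secret manually.
# Values mirror the chart defaults listed under "Database Settings".
create_db_secret() {
  # $1 = database password (pass it from a variable or vault, not a literal)
  kubectl create secret generic aisrv-database \
    --namespace basebox \
    --from-literal=host=aisrv-db-rw \
    --from-literal=port=5432 \
    --from-literal=username=aisrv \
    --from-literal=password="$1" \
    --from-literal=name=aisrv
}

# Usage: create_db_secret "$DB_PASSWORD"
```

Passing the password as a variable keeps it out of the values file; it still appears in the process arguments, so a secrets manager may be preferable in production.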

OIDC/Authentication Configuration

| Variable | Description |
| --- | --- |
| AISRV_OIDC_IDP_URL | Base URL of IdP (Keycloak) server (without realm) |
| AISRV_OIDC_ISSUER_URL | Optional issuer URL for token validation |
| AISRV_OIDC_AUD | OIDC audience field contents |
| AISRV_OIDC_JWKS_FILEPATH | Path to OIDC JWKS file |
| AISRV_OIDC_SUPER_ADMIN_USER | Super admin username |
| AISRV_OIDC_SUPER_ADMIN_PASSWORD | Super admin password |
| AISRV_OIDC_SUPER_ADMIN_CLIENT_ID | Super admin client ID |
| AISRV_OIDC_SUPER_ADMIN_CLIENT_SECRET | Super admin client secret |
| AISRV_BASE_DOMAIN | Base domain for organizations (e.g., basebox.ai) |
| AISRV_USE_TLS | Assume HTTPS URLs (true for production) |
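
If you use AISRV_OIDC_JWKS_FILEPATH, the JWKS file has to come from somewhere. The sketch below downloads a realm's key set from Keycloak. It assumes a recent Keycloak that serves keys under /realms/<realm>/protocol/openid-connect/certs (older versions prefix the path with /auth), and it assumes AISRV accepts the file exactly as Keycloak returns it, which this document does not confirm. The helper name fetch_jwks and the realm name are hypothetical.

```shell
# Sketch: save a Keycloak realm's JWKS to a local file.
fetch_jwks() {
  # $1 = IdP base URL without realm (same value as AISRV_OIDC_IDP_URL)
  # $2 = realm name
  # $3 = output file path
  curl -sS --fail "$1/realms/$2/protocol/openid-connect/certs" -o "$3"
}

# Usage: fetch_jwks "http://idp:8080" "my-realm" ./jwks.json
```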

GraphQL Configuration

| Variable | Default | Description |
| --- | --- | --- |
| AISRV_QUERY_DEPTH_LIMIT | 6 | GraphQL query depth limit |
| AISRV_QUERY_COMPLEXITY_LIMIT | 20 | GraphQL query complexity limit |
| AISRV_GRAPHQL_ALLOW_INTROSPECTION | false | Allow introspection queries |
| AISRV_GRAPHQL_APOLLO_TRACING | false | Enable Apollo tracing |
| AISRV_GRAPHQL_GRAPHIQL | false | Enable GraphiQL interface |
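
To confirm that introspection is really disabled in a given environment, you can send a minimal introspection query and inspect the response. This is a sketch only: the helper name check_introspection is hypothetical, the exact error returned by the server is not specified here, and the endpoint may additionally require a bearer token.

```shell
# Sketch: send a small introspection query and print the raw response.
# With AISRV_GRAPHQL_ALLOW_INTROSPECTION=false the server is expected to
# refuse it rather than return schema data.
check_introspection() {
  # $1 = GraphQL endpoint, e.g. http://localhost:8888/graphql
  curl -sS -X POST "$1" \
    -H "Content-Type: application/json" \
    -d '{"query": "{ __schema { queryType { name } } }"}'
}

# Usage: check_introspection http://localhost:8888/graphql
```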

LLM Configuration

| Variable | Description |
| --- | --- |
| AISRV_LLM_URL | URL of the LLM server |
| AISRV_LLM_CHAT_ENDPOINT | Chat completions endpoint path |
| AISRV_LLM_API_KEY | API key for LLM service |
| AISRV_LLM_MODEL | LLM model identifier |
| AISRV_LLM_PROVIDER | LLM provider (determines auth scheme) |
| AISRV_LLM_CONTEXT_SIZE | Maximum context size in tokens |
| AISRV_LLM_WORD_LIMIT | Estimated word limit per request |
| AISRV_LLM_MAX_TOKENS | Maximum output tokens (default: 8000) |
| AISRV_LLM_TEMPERATURE | Sampling temperature (0-1) |
| AISRV_LLM_TOP_P | Nucleus sampling parameter (0-1) |
| AISRV_LLM_REPETITION_PENALTY | Repetition penalty (>1.0 discourages repetition) |
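
Before pointing AISRV at an LLM server, it can help to check the URL, endpoint path, API key, and model directly. The sketch below assumes an OpenAI-compatible chat completions API with bearer-token authentication; other providers may use a different auth scheme or payload. The helper name probe_llm is hypothetical.

```shell
# Sketch: send a tiny chat request to an OpenAI-compatible endpoint.
probe_llm() {
  # $1 = base URL        (AISRV_LLM_URL)
  # $2 = endpoint path   (AISRV_LLM_CHAT_ENDPOINT)
  # $3 = API key         (AISRV_LLM_API_KEY)
  # $4 = model identifier (AISRV_LLM_MODEL)
  curl -sS -X POST "$1$2" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $3" \
    -d "{\"model\": \"$4\", \"messages\": [{\"role\": \"user\", \"content\": \"ping\"}], \"max_tokens\": 8}"
}

# Usage:
# probe_llm "$AISRV_LLM_URL" "$AISRV_LLM_CHAT_ENDPOINT" "$AISRV_LLM_API_KEY" "$AISRV_LLM_MODEL"
```

In-cluster service names such as http://inference:8080 only resolve from inside the cluster, so run this from a pod or through a port-forward.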

RAG Configuration

| Variable | Description |
| --- | --- |
| AISRV_ENABLE_RAG_API | Enable RAG endpoints |
| AISRV_RAG_URL | Base URL of RAG server API |
| AISRV_RAG_DOWNLOAD_URL | URL for RAG server to download files |
| AISRV_RAG_API_KEY | API key for RAG service |
| AISRV_MAX_RAG_FILE_SIZE | Maximum file size for uploads |
| AISRV_MAX_RAG_FILES_PER_APP | Maximum files per application |

Whisper Configuration

| Variable | Default | Description |
| --- | --- | --- |
| AISRV_WHISPER_URL | http://localhost:6000/inference | Whisper service endpoint |
| AISRV_WHISPER_API_KEY | None | API key for Whisper service |

SMTP Configuration

| Variable | Description |
| --- | --- |
| AISRV_SMTP_HOST | SMTP server hostname |
| AISRV_SMTP_PORT | SMTP server port |
| AISRV_SMTP_USERNAME | SMTP username |
| AISRV_SMTP_PASSWORD | SMTP password |
| AISRV_SMTP_TLS_TYPE | SMTP connection type (tls, starttls) |
| AISRV_SENDER_EMAIL | From-email address |

Media Storage Configuration

| Variable | Description |
| --- | --- |
| AISRV_STORE_URL | URL for store server |
| AISRV_MEDIA_ROOT | Root directory for uploaded files |
| AISRV_MEDIA_URL | URL to media server root |
| AISRV_ENABLE_STATIC_FILES | Enable static file server |

API Configuration

| Variable | Default | Description |
| --- | --- | --- |
| AISRV_ENABLE_REST_API | false | Enable REST API and token management |

Configuration Examples

Production Configuration

# values-production.yaml
replicaCount: 3

image:
  repository: gitea.basebox.health/basebox-distribution/aisrv
  tag: "v1.2.3"
  pullPolicy: IfNotPresent

service:
  type: ClusterIP
  port: 8888

ingress:
  enabled: true
  className: "nginx"
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
  hosts:
    - host: company.com
      paths:
        - path: /graphql
          pathType: Prefix
  tls:
    - secretName: aisrv-tls
      hosts:
        - company.com

resources:
  requests:
    cpu: 1000m
    memory: 2Gi
  limits:
    cpu: 2000m
    memory: 4Gi

autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70

livenessProbe:
  httpGet:
    path: /health
    port: http
  initialDelaySeconds: 30
  periodSeconds: 10

readinessProbe:
  httpGet:
    path: /health
    port: http
  initialDelaySeconds: 15
  periodSeconds: 5

database:
  enabled: true
  host: aisrv-db-rw
  port: 5432
  user: aisrv
  password: "<generate-secure-password>"
  name: aisrv
  sslMode: "require"

aisrv-db:
  cluster:
    instances: 3
    storage:
      size: 50Gi
      storageClass: fast-ssd
    monitoring:
      enablePodMonitor: true

env:
  # Server
  AISRV_HOST: "0.0.0.0"
  AISRV_PORT: "8888"
  AISRV_LOG_LEVEL: "info"
  AISRV_SET_CORS_HEADERS: "false"  # Using ingress
  AISRV_DEBUG_MODE: "false"
  AISRV_METRICS_PORT: "9090"

  # Database (from secrets)
  AISRV_DB_HOST:
    valueFrom:
      secretKeyRef:
        name: aisrv-database
        key: host
  AISRV_DB_PORT:
    valueFrom:
      secretKeyRef:
        name: aisrv-database
        key: port
  AISRV_DB_USER:
    valueFrom:
      secretKeyRef:
        name: aisrv-database
        key: username
  AISRV_DB_PASSWORD:
    valueFrom:
      secretKeyRef:
        name: aisrv-database
        key: password
  AISRV_DB_NAME:
    valueFrom:
      secretKeyRef:
        name: aisrv-database
        key: name
  AISRV_DB_SSL_MODE: "require"

  # Migrations
  AISRV_DB_MIGRATE: "true"
  AISRV_DB_MIGRATE_BACKUP: "true"
  AISRV_DB_MIGRATE_BACKUP_DIR: "/backups"

  # OIDC
  AISRV_OIDC_IDP_URL: "http://idp:8080"
  AISRV_OIDC_ISSUER_URL: "https://auth.company.com"
  AISRV_OIDC_AUD: "aisrv"
  AISRV_OIDC_SUPER_ADMIN_USER:
    valueFrom:
      secretKeyRef:
        name: aisrv-admin
        key: username
  AISRV_OIDC_SUPER_ADMIN_PASSWORD:
    valueFrom:
      secretKeyRef:
        name: aisrv-admin
        key: password
  AISRV_BASE_DOMAIN: "company.com"
  AISRV_USE_TLS: "true"

  # GraphQL
  AISRV_QUERY_DEPTH_LIMIT: "8"
  AISRV_QUERY_COMPLEXITY_LIMIT: "50"
  AISRV_GRAPHQL_ALLOW_INTROSPECTION: "false"
  AISRV_GRAPHQL_GRAPHIQL: "false"

  # LLM
  AISRV_LLM_URL: "http://inference:8080"
  AISRV_LLM_CHAT_ENDPOINT: "/v1/chat/completions"
  AISRV_LLM_API_KEY:
    valueFrom:
      secretKeyRef:
        name: llm-credentials
        key: api-key
  AISRV_LLM_MODEL: "meta-llama/Llama-3.1-8B-Instruct"
  AISRV_LLM_PROVIDER: "openai-compatible"
  AISRV_LLM_CONTEXT_SIZE: "16000"
  AISRV_LLM_MAX_TOKENS: "4096"
  AISRV_LLM_TEMPERATURE: "0.7"

  # RAG
  AISRV_ENABLE_RAG_API: "true"
  AISRV_RAG_URL: "http://ragsrv:3001"
  AISRV_RAG_API_KEY:
    valueFrom:
      secretKeyRef:
        name: rag-credentials
        key: api-key

  # Whisper
  AISRV_WHISPER_URL: "http://whisper:6000/inference"

  # SMTP
  AISRV_SMTP_HOST: "smtp.company.com"
  AISRV_SMTP_PORT: "587"
  AISRV_SMTP_USERNAME:
    valueFrom:
      secretKeyRef:
        name: smtp-credentials
        key: username
  AISRV_SMTP_PASSWORD:
    valueFrom:
      secretKeyRef:
        name: smtp-credentials
        key: password
  AISRV_SMTP_TLS_TYPE: "starttls"
  AISRV_SENDER_EMAIL: "noreply@company.com"

Installation

Prerequisites

  • Kubernetes cluster (1.23+)
  • Helm 3.x
  • CloudNativePG operator installed
  • Storage provisioner
  • OIDC/Keycloak instance configured

Install CloudNativePG Operator

helm repo add cnpg https://cloudnative-pg.github.io/charts
helm upgrade --install cnpg \
  --namespace cnpg-system \
  --create-namespace \
  cnpg/cloudnative-pg

Install AISRV

# Install with custom values
helm install aisrv oci://hub.basebox.ai/helm/aisrv \
  --values values-production.yaml \
  --namespace basebox \
  --create-namespace

# Verify installation
kubectl get pods -n basebox -l app.kubernetes.io/name=aisrv
kubectl get cluster -n basebox aisrv-db

Upgrade

helm upgrade aisrv oci://hub.basebox.ai/helm/aisrv \
  --values values-production.yaml \
  --namespace basebox
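
If an upgrade misbehaves, Helm can return the release to an earlier revision. The helpers below are a sketch (the names show_history and rollback_aisrv are hypothetical). Note that a Helm rollback only reverts Kubernetes manifests; it does not undo database migrations that have already been applied, which is one reason to enable AISRV_DB_MIGRATE_BACKUP before upgrading.

```shell
# Sketch: inspect release history and roll back to a chosen revision.
show_history() {
  helm history aisrv --namespace basebox --max 5
}

rollback_aisrv() {
  # $1 = target revision number taken from the history output
  helm rollback aisrv "$1" --namespace basebox --wait
}

# Usage:
# show_history
# rollback_aisrv 3
```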

Uninstall

helm uninstall aisrv --namespace basebox

# Delete the database PVCs if needed (this removes the database volumes; back up first)
kubectl delete pvc -n basebox -l cnpg.io/cluster=aisrv-db

Migrations

How Migrations Work

  • Migration Tracking: The _migrations_history table records which migrations have been applied
  • Numbered Format: Migrations are named Vnnn__<migration_name> (note the double underscore)
  • Sequential Execution: Migrations run in numerical order
  • Checksum Validation: Checksums detect changes to migration files that were already applied
  • Embedded: All migrations are embedded in the application binary
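
The naming and ordering rules above can be illustrated with a small shell sketch. It is not AISRV's implementation: the helper names and example migration names are hypothetical, and the pattern assumes that "nnn" means one or more digits and that names consist of letters, digits, and underscores.

```shell
# Sketch: validate the Vnnn__<migration_name> format and sort by number.
is_migration_name() {
  # Succeeds only for names like V001__create_users (double underscore).
  printf '%s\n' "$1" | grep -Eq '^V[0-9]+__[A-Za-z0-9_]+$'
}

sort_migrations() {
  # Reads one name per line on stdin, prefixes each with its number,
  # sorts numerically, then strips the prefix again.
  sed 's/^V\([0-9]*\)__/\1 &/' | sort -n | cut -d' ' -f2-
}

# Usage:
# is_migration_name "V001__create_users" && echo "valid"
# printf 'V010__add_index\nV002__add_orgs\n' | sort_migrations
```

Sorting on the numeric part rather than the raw string matters once version numbers differ in width, since a plain lexical sort would place V10 before V2.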

Running Migrations

Automatic on Startup:

env:
  AISRV_DB_MIGRATE: "true"
  AISRV_DB_MIGRATE_BACKUP: "true"

Migrations Only (No Server Start):

env:
  AISRV_DB_MIGRATE: "true"
  AISRV_DB_MIGRATE_RUN_ONLY: "true"

Verification

Check Deployment

# Check pods
kubectl get pods -n basebox -l app.kubernetes.io/name=aisrv

# Check database
kubectl get cluster -n basebox aisrv-db

# View logs
kubectl logs -n basebox -l app.kubernetes.io/name=aisrv --tail=100
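
For scripted checks (for example in CI), blocking until the rollout finishes is often more useful than listing pods. The sketch below assumes the chart creates a Deployment named aisrv (matching the fullnameOverride default) and that the CloudNativePG Cluster resource reports a Ready condition; the helper name wait_for_aisrv and the 300-second timeout are arbitrary choices.

```shell
# Sketch: wait for the AISRV Deployment and the database cluster.
wait_for_aisrv() {
  kubectl rollout status deployment/aisrv --namespace basebox --timeout=300s &&
    kubectl wait --for=condition=Ready cluster/aisrv-db \
      --namespace basebox --timeout=300s
}

# Usage: wait_for_aisrv && echo "AISRV is up"
```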

Test GraphQL API

# Port forward
kubectl port-forward -n basebox svc/aisrv 8888:8888

# Test query
curl -X POST http://localhost:8888/graphql \
  -H "Content-Type: application/json" \
  -d '{"query": "{ __typename }"}'

Integration with Other Services

IDP (Keycloak)

env:
  AISRV_OIDC_IDP_URL: "http://idp:8080"

Inference Server

env:
  AISRV_LLM_URL: "http://inference:8080"
  AISRV_LLM_CHAT_ENDPOINT: "/v1/chat/completions"

RAGSRV

env:
  AISRV_ENABLE_RAG_API: "true"
  AISRV_RAG_URL: "http://ragsrv:3001"

STORESRV

env:
  AISRV_STORE_URL: "http://storesrv:8889"
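
The service URLs above are in-cluster names, so they cannot be tested from a workstation directly. One way to check reachability is a throwaway pod, sketched below. The helper name check_service and the curlimages/curl image are choices made for this example; any HTTP status code in the output shows that the service answered, while a timeout or DNS error points to a networking problem.

```shell
# Sketch: probe an in-cluster URL from a temporary pod and print the
# HTTP status code. The pod is removed when the command finishes.
check_service() {
  # $1 = in-cluster URL, e.g. http://idp:8080
  kubectl run "net-check-$$" --namespace basebox \
    --rm -i --restart=Never \
    --image=curlimages/curl --command -- \
    curl -sS -o /dev/null -w '%{http_code}\n' --max-time 5 "$1"
}

# Usage:
# check_service http://idp:8080
# check_service http://ragsrv:3001
```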

Performance Tuning

Resource Allocation

resources:
  requests:
    cpu: 1000m
    memory: 2Gi
  limits:
    cpu: 2000m
    memory: 4Gi

Database Performance

  • Add read replicas by increasing aisrv-db.cluster.instances
  • Use fast storage (SSD/NVMe) for the database volumes
  • Connection pooling is configured in the application
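
Changing the number of PostgreSQL instances is an ordinary Helm upgrade. The sketch below reuses the values file and chart reference from the install commands in this document; the helper name scale_db is hypothetical. Because database.host defaults to the read-write service (aisrv-db-rw), extra instances primarily add failover capacity; whether AISRV sends read traffic to replicas is not stated here.

```shell
# Sketch: set the number of PostgreSQL instances via Helm.
scale_db() {
  # $1 = desired number of instances, e.g. 3
  helm upgrade aisrv oci://hub.basebox.ai/helm/aisrv \
    --namespace basebox \
    --values values-production.yaml \
    --set "aisrv-db.cluster.instances=$1"
}

# Usage: scale_db 3
```

A value passed with --set overrides the same key in the values file, so update values-production.yaml as well to keep later upgrades consistent.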

Monitoring

Metrics

Enable Prometheus metrics:

env:
  AISRV_METRICS_PORT: "9090"
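
To check that metrics are being exposed, forward the metrics port and fetch a few lines. This sketch assumes the conventional Prometheus path /metrics, which this document does not confirm, and a Deployment named aisrv; the helper name fetch_metrics is hypothetical.

```shell
# In a separate terminal, forward the metrics port first:
#   kubectl port-forward -n basebox deployment/aisrv 9090:9090

# Sketch: print the first lines of the metrics output.
fetch_metrics() {
  # $1 = metrics base URL, e.g. http://localhost:9090
  curl -sS "$1/metrics" | head -n 20
}

# Usage: fetch_metrics http://localhost:9090
```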

Health Checks

livenessProbe:
  httpGet:
    path: /health
    port: http
  initialDelaySeconds: 30
  periodSeconds: 10

readinessProbe:
  httpGet:
    path: /health
    port: http
  initialDelaySeconds: 15
  periodSeconds: 5
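
The same /health endpoint the probes use can be queried by hand, for example through the port-forward shown under "Test GraphQL API". The helper below is a sketch (check_health is a hypothetical name) that prints only the HTTP status code; which code indicates a healthy server is not specified in this document, although Kubernetes HTTP probes treat any status from 200 to 399 as success.

```shell
# Sketch: print the HTTP status code returned by the health endpoint.
check_health() {
  # $1 = base URL, e.g. http://localhost:8888
  curl -sS -o /dev/null -w '%{http_code}\n' "$1/health"
}

# Usage: check_health http://localhost:8888
```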

Database Monitoring

Enable CloudNativePG monitoring:

aisrv-db:
  cluster:
    monitoring:
      enablePodMonitor: true