Why Containerize Your Streaming Infrastructure?

Live streaming software has traditionally been installed directly on bare-metal servers or VMs, manually configured, manually updated, and difficult to reproduce. Containerization changes this by packaging Vajra Cast with all its dependencies into a portable, versioned image that runs identically everywhere.

Benefits for streaming workloads:

  • Reproducible deployments. The same image runs in development, staging, and production.
  • Fast rollbacks. Bad update? Roll back to the previous image tag in seconds.
  • Infrastructure as code. Your entire streaming setup is defined in version-controlled manifests.
  • Scaling. Spin up additional instances to handle more streams.
  • Isolation. Each Vajra Cast instance runs in its own container with defined resource limits.

Docker Image

Vajra Cast is distributed as a Docker image through a private registry. After signing up for a free trial, you receive registry credentials to pull the image:

docker login registry.vajracast.com
docker pull registry.vajracast.com/vajracast:latest

The image is based on a minimal Linux base and includes all required dependencies: FFmpeg libraries, SRT library, PostgreSQL client, and the Vajra Cast binary.

Image Tags

Tag       Description
latest    Most recent stable release
x.y.z     Specific version (e.g., 1.5.0)
x.y       Latest patch for a minor version (e.g., 1.5)

For production, always pin to a specific version tag. Using latest risks pulling an unintended new release whenever the container is recreated.

Docker Compose

For single-server deployments, Docker Compose is the simplest way to run Vajra Cast with PostgreSQL:

# docker-compose.yml
version: "3.8"

services:
  vajracast:
    image: registry.vajracast.com/vajracast:1.5.0
    ports:
      - "8080:8080"       # Web UI and API
      - "9000-9010:9000-9010/udp"  # SRT ports
      - "1935:1935"       # RTMP
    devices:
      - /dev/dri:/dev/dri  # GPU for hardware transcoding
    group_add:
      - video
      - render
    environment:
      DATABASE_URL: postgres://vajracast:changeme@postgres:5432/vajracast
    depends_on:
      postgres:
        condition: service_healthy
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "wget", "-q", "--spider", "http://localhost:8080/api/health"]
      interval: 10s
      timeout: 5s
      retries: 3

  postgres:
    image: postgres:16-alpine
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      POSTGRES_USER: vajracast
      POSTGRES_PASSWORD: changeme
      POSTGRES_DB: vajracast
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "vajracast"]
      interval: 5s
      timeout: 3s
      retries: 5
    restart: unless-stopped

volumes:
  pgdata:

Start everything with:

docker compose up -d

Port Ranges

SRT uses UDP ports, and you need one port per SRT listener. The example above exposes ports 9000-9010, giving you 11 SRT listener ports. Adjust the range based on the number of concurrent SRT ingests you need.

RTMP uses TCP port 1935 by default. If you only use SRT, you can omit this port.

GPU Access

The devices entry passes the Intel GPU device nodes (/dev/dri) through to the container, and group_add adds the container user to the video and render groups required to access them. If you are not using hardware transcoding, you can omit these lines.

On systems without Intel integrated graphics, remove the /dev/dri device mapping to avoid errors.
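Before enabling the mapping, you can run a quick host-side check for the device nodes. This is a sketch assuming the standard Linux DRM device path:

```shell
# Check for GPU device nodes on the Docker host.
# /dev/dri is the standard Linux DRM device directory.
if [ -d /dev/dri ]; then
  echo "DRM device nodes found:"
  ls /dev/dri    # typically card0 and renderD128
else
  echo "No /dev/dri present; omit the device mapping and GPU groups"
fi
```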

Kubernetes Deployment

For multi-server, high-availability deployments, Kubernetes provides orchestration, scaling, and self-healing.

Deployment Manifest

# vajracast-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vajracast
  labels:
    app: vajracast
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vajracast
  template:
    metadata:
      labels:
        app: vajracast
    spec:
      containers:
        - name: vajracast
          image: registry.vajracast.com/vajracast:1.5.0
          ports:
            - containerPort: 8080
              name: http
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: vajracast-secrets
                  key: database-url
          resources:
            requests:
              cpu: "1"
              memory: "2Gi"
            limits:
              cpu: "4"
              memory: "8Gi"
              gpu.intel.com/i915: "1"  # Intel GPU plugin
          livenessProbe:
            httpGet:
              path: /api/health
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /api/ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5

Service for SRT (UDP)

SRT ingest arrives over UDP from outside the cluster, so it needs a NodePort or LoadBalancer Service; standard HTTP Ingress controllers handle only TCP:

# vajracast-srt-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: vajracast-srt
spec:
  type: NodePort
  selector:
    app: vajracast
  ports:
    - name: srt-9000
      protocol: UDP
      port: 9000
      targetPort: 9000
      nodePort: 30900

For cloud deployments, use a UDP-capable load balancer (AWS NLB, GCP Network LB) instead of NodePort.
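Kubernetes Services cannot express port ranges, so each SRT listener port needs its own entry. A sketch for three listeners (the names and nodePort values are illustrative):

```yaml
# vajracast-srt-service.yaml (multiple SRT listeners)
apiVersion: v1
kind: Service
metadata:
  name: vajracast-srt
spec:
  type: NodePort
  selector:
    app: vajracast
  ports:
    # One entry per SRT listener port; ranges are not supported.
    - { name: srt-9000, protocol: UDP, port: 9000, targetPort: 9000, nodePort: 30900 }
    - { name: srt-9001, protocol: UDP, port: 9001, targetPort: 9001, nodePort: 30901 }
    - { name: srt-9002, protocol: UDP, port: 9002, targetPort: 9002, nodePort: 30902 }
```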

Service for HTTP (Web UI, API, Metrics)

# vajracast-http-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: vajracast-http
spec:
  type: ClusterIP
  selector:
    app: vajracast
  ports:
    - name: http
      port: 8080
      targetPort: 8080

Expose via an Ingress controller for external access to the web UI and API.
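As a sketch, an Ingress for the web UI and API might look like the following (the hostname and ingress class are placeholders, and TLS configuration is omitted):

```yaml
# vajracast-ingress.yaml (illustrative hostname and class)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: vajracast
spec:
  ingressClassName: nginx          # assumes an nginx ingress controller
  rules:
    - host: vajracast.example.com  # placeholder hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: vajracast-http
                port:
                  number: 8080
```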

Secrets

Store database credentials in a Kubernetes Secret:

kubectl create secret generic vajracast-secrets \
  --from-literal=database-url='postgres://vajracast:changeme@postgres-service:5432/vajracast'

PostgreSQL in Kubernetes

For production Kubernetes deployments, use a managed database service (AWS RDS, Google Cloud SQL, Azure Database for PostgreSQL) rather than running PostgreSQL inside the cluster. This gives you automated backups, replication, and maintenance.

If you must run PostgreSQL in the cluster, use a Kubernetes operator like CloudNativePG or Zalando’s postgres-operator for proper lifecycle management.

Terraform

For infrastructure-as-code deployments on cloud providers, Terraform can provision the underlying infrastructure (VMs, networks, load balancers) and then deploy Vajra Cast containers.

A simplified example for AWS:

resource "aws_ecs_task_definition" "vajracast" {
  family                   = "vajracast"
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu                      = "2048"
  memory                   = "4096"

  container_definitions = jsonencode([
    {
      name  = "vajracast"
      image = "registry.vajracast.com/vajracast:1.5.0"
      portMappings = [
        { containerPort = 8080, protocol = "tcp" },
        { containerPort = 9000, protocol = "udp" }
      ]
      environment = [
        { name = "DATABASE_URL", value = var.database_url }
      ]
    }
  ])
}
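A task definition alone does not run anything; an ECS service keeps the desired number of tasks running. A minimal sketch, assuming the cluster, subnets, and security group are defined elsewhere in your configuration:

```hcl
resource "aws_ecs_service" "vajracast" {
  name            = "vajracast"
  cluster         = aws_ecs_cluster.main.id               # assumed cluster resource
  task_definition = aws_ecs_task_definition.vajracast.arn
  desired_count   = 1
  launch_type     = "FARGATE"

  network_configuration {
    subnets          = var.private_subnet_ids             # assumed variable
    security_groups  = [aws_security_group.vajracast.id]  # assumed resource
    assign_public_ip = false
  }
}
```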

Note: For hardware transcoding on cloud infrastructure, you need instances with Intel GPUs (e.g., AWS instances with Intel Flex GPUs, or bare-metal instances with integrated graphics). Standard cloud VMs typically do not have Intel integrated graphics available.

Health Checks

Vajra Cast exposes two health check endpoints, each serving a different purpose:

Liveness: /api/health

Returns HTTP 200 if the application process is running and can reach the database. Use this for:

  • Docker healthcheck
  • Kubernetes livenessProbe
  • Load balancer health checks

If this endpoint fails, the process is unhealthy and should be restarted.

Readiness: /api/ready

Returns HTTP 200 only after the application has fully initialized: database migrations complete, all routes loaded, all listeners bound. Use this for:

  • Kubernetes readinessProbe
  • Load balancer backend health (to avoid routing traffic to a starting instance)

During startup, /api/ready returns HTTP 503 until initialization completes. This prevents premature traffic routing.

Scaling Considerations

Live streaming workloads are stateful. Each Vajra Cast instance manages its own set of routes and SRT connections. Scaling is not as simple as increasing the replica count. Consider these patterns:

  • Vertical scaling: Give each instance more CPU and memory to handle more streams.
  • Sharded horizontal scaling: Run multiple instances, each handling a subset of routes, with a load balancer routing SRT traffic to the correct instance based on port.
  • Active/standby: Run a secondary instance with the same configuration (pointing to the same database) that takes over if the primary fails.
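As an illustration of the sharded pattern, a Compose sketch running two instances with disjoint SRT port ranges (service names and ranges are arbitrary, and database and route configuration are omitted for brevity; each shard would still need its own routes):

```yaml
# Sketch: two sharded instances, each owning a disjoint SRT port range.
services:
  vajracast-a:
    image: registry.vajracast.com/vajracast:1.5.0
    ports:
      - "9000-9010:9000-9010/udp"   # shard A handles listeners 9000-9010
  vajracast-b:
    image: registry.vajracast.com/vajracast:1.5.0
    ports:
      - "9011-9020:9011-9020/udp"   # shard B handles listeners 9011-9020
```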

Next Steps