Streaming Server Requirements: Hardware Guide for Broadcast Infrastructure

Sizing Your Streaming Server

The most common question when building streaming infrastructure is: what hardware do I need? The answer depends entirely on what you are asking the server to do. A server that passthrough-routes 20 SRT streams has very different requirements than one that transcodes 5 streams with hardware acceleration.

This guide breaks down the hardware requirements for each component of a streaming workflow, from minimal setups to enterprise-scale deployments. We will cover CPU, RAM, GPU, network, and storage, with specific recommendations for running Vajra Cast.

Understanding Your Workload

Before choosing hardware, classify your workload. Streaming servers do three fundamentally different things, each with different resource profiles:

1. Passthrough Routing

The server receives a stream and retransmits it to one or more destinations without modifying the video or audio. The data flows through: network in, memory, network out.

Resource profile:

  • CPU: Very low (protocol handling only)
  • RAM: Low (buffer space per stream)
  • GPU: Not used
  • Network: The bottleneck
  • Disk: Not used (unless recording)

This is what happens with Vajra Cast’s zero-copy distribution. The CPU cost is near-zero per additional output because the stream data is shared in memory.

2. Transcoding (Software)

The server decodes the incoming video, re-encodes it at a different resolution, bitrate, or codec, and sends the result to outputs. This is computationally expensive.

Resource profile:

  • CPU: Very high (the bottleneck)
  • RAM: Moderate
  • GPU: Not used
  • Network: Moderate
  • Disk: Low (unless recording)

Software transcoding (x264, x265) is flexible but CPU-intensive. A single 1080p60 H.264 transcode at good quality can saturate 8+ CPU cores.

3. Transcoding (Hardware-Accelerated)

The server offloads encoding and decoding to dedicated hardware (Intel QSV, NVIDIA NVENC, AMD VCE). The GPU handles the computationally expensive work.

Resource profile:

  • CPU: Low to moderate (managing the pipeline)
  • RAM: Moderate
  • GPU: The bottleneck
  • Network: Moderate
  • Disk: Low (unless recording)

Hardware transcoding is 5-10x faster and far more power-efficient than software transcoding, at the cost of slightly lower quality at the same bitrate (the gap has narrowed significantly with modern hardware).

CPU Requirements

For Passthrough Routing

Passthrough routing is not CPU-bound. The CPU handles protocol parsing (SRT, RTMP, HLS packetization), encryption/decryption (AES), and system overhead. Modern CPUs handle this trivially.

| Streams (passthrough) | CPU Recommendation | Notes |
| --- | --- | --- |
| 1-10 | 2 cores, any modern CPU | Mac Mini M2 handles this easily |
| 10-30 | 4 cores | Intel i5 / AMD Ryzen 5 class |
| 30-50 | 4-8 cores | Server-class Intel Xeon / AMD EPYC |
| 50-100 | 8 cores | Network becomes the bottleneck first |

For Software Transcoding

Software transcoding is entirely CPU-bound. The x264 encoder uses all available cores. The quality preset determines how many cores are needed per stream:

| x264 Preset | Cores per 1080p60 Stream | Quality | Latency |
| --- | --- | --- | --- |
| ultrafast | 2-3 | Low | Lowest |
| superfast | 3-4 | Below average | Low |
| veryfast | 4-6 | Good | Low |
| faster | 6-8 | Good | Moderate |
| medium | 8-12 | High | Moderate |
| slow | 12-16 | Very high | High |

For live streaming, veryfast or faster presets are the norm: they balance quality and CPU load at acceptable latency. The medium preset produces noticeably better quality, but its core cost per stream means a single server can usually handle only one or two such transcodes.

Practical sizing for software transcoding:

| Workload | CPU Recommendation |
| --- | --- |
| 1 stream, 1080p60 | Intel i7 / Ryzen 7 (8 cores) |
| 2-3 streams, 1080p60 | Intel i9 / Ryzen 9 (16 cores) |
| 4-8 streams, 1080p60 | Dual Xeon / EPYC (32+ cores) |
| More than 8 simultaneous transcodes | Use hardware transcoding instead |
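As a rough sizing sketch, the preset table translates into a core-count estimate like this. The preset names are real x264 presets; the cores-per-stream figures are the upper bounds of the ranges quoted above, so treat the result as a worst-case estimate, not a benchmark:

```python
# Sizing sketch for software (x264) transcoding. Core figures are the
# upper bounds of the guide's per-preset ranges (estimates, not
# measured values).

CORES_PER_1080P60 = {
    "ultrafast": 3,
    "superfast": 4,
    "veryfast": 6,
    "faster": 8,
    "medium": 12,
    "slow": 16,
}

def cores_needed(streams: int, preset: str = "veryfast") -> int:
    """Worst-case core estimate for simultaneous 1080p60 transcodes."""
    return streams * CORES_PER_1080P60[preset]

print(cores_needed(3, "veryfast"))  # → 18
```

Three veryfast streams already land near the limit of a 16-core desktop part, which is why the table above jumps to dual-socket hardware beyond a handful of streams.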

For Hardware Transcoding

When using Intel QSV or NVIDIA NVENC, the CPU requirement drops dramatically because the encode/decode work happens on the GPU. The CPU manages the pipeline, handles audio processing, and runs the gateway application:

| Workload | CPU Recommendation |
| --- | --- |
| 1-5 hardware transcodes | 4 cores (Intel i5 class) |
| 5-15 hardware transcodes | 4-8 cores (Intel i7/Xeon class) |
| 15+ hardware transcodes | 8+ cores (server class) |

The GPU becomes the bottleneck before the CPU does.

GPU Requirements

Intel Quick Sync Video (QSV)

Vajra Cast supports Intel QSV for hardware-accelerated H.264 and HEVC encoding and decoding. QSV is integrated into Intel CPUs (no discrete GPU needed) and is available on most Intel processors from 6th generation (Skylake) onward.

| Intel Generation | QSV Capability | Simultaneous 1080p60 Transcodes | Notes |
| --- | --- | --- | --- |
| 6th-7th Gen (Skylake/Kaby Lake) | H.264 encode/decode | 3-5 | Adequate for small deployments |
| 8th-10th Gen (Coffee Lake - Ice Lake) | H.264 + HEVC | 5-10 | Good general-purpose choice |
| 11th-12th Gen (Rocket Lake - Alder Lake) | H.264 + HEVC + AV1 (decode) | 8-15 | Recommended for new deployments |
| 13th-14th Gen (Raptor Lake) | H.264 + HEVC + AV1 (decode) | 10-20 | High throughput |
| Intel Arc (discrete) | H.264 + HEVC + AV1 (encode + decode) | 15-30+ | Dedicated media engine |

For Linux deployments, Vajra Cast also supports VAAPI (Video Acceleration API), which works with Intel and AMD GPUs. The auto-fallback path is QSV first, then VAAPI, ensuring hardware acceleration is used whenever available.

NVIDIA GPUs

While Vajra Cast’s primary hardware acceleration path is Intel QSV/VAAPI, NVIDIA NVENC is supported through FFmpeg’s NVENC integration. NVIDIA GPUs offer high-density transcoding:

| GPU | NVENC Sessions | Notes |
| --- | --- | --- |
| GTX 1650-1660 | 3 (consumer limit) | NVIDIA limits consumer cards |
| RTX 3060-4060 | 5 (consumer limit) | Better quality engine |
| Tesla T4 | Unlimited | Datacenter GPU, no display |
| A100/H100 | Unlimited | Overkill for streaming |

Important: NVIDIA consumer GPUs (GeForce) are limited to 3-5 simultaneous NVENC sessions by driver policy. For production use requiring more sessions, use datacenter GPUs (Tesla, A-series) or use Intel QSV instead, which has no artificial session limits.

Apple Silicon (M-Series)

On macOS, Vajra Cast leverages Apple’s VideoToolbox for hardware acceleration. Apple Silicon provides excellent media performance:

| Chip | Capability | Simultaneous 1080p60 Transcodes |
| --- | --- | --- |
| M1 | H.264 + HEVC | 3-5 |
| M2 | H.264 + HEVC | 5-8 |
| M2 Pro | H.264 + HEVC | 8-12 |
| M2 Max | H.264 + HEVC + ProRes | 12-20 |
| M3/M4 series | H.264 + HEVC + AV1 (decode) | Similar to M2 equivalents |

The Mac Mini M2 with 16GB RAM is an excellent cost-effective streaming server for small to medium deployments. It is silent, draws 10-15W at idle, and handles 10-20 passthrough streams or 3-5 hardware transcodes.

RAM Requirements

RAM usage in a streaming server scales with the number of concurrent streams and the buffer size per stream.

Per-Stream Memory Usage

| Component | Memory per Stream | Notes |
| --- | --- | --- |
| SRT input buffer | 2-8 MB | Depends on latency setting |
| SRT output buffer | 2-8 MB | Per output destination |
| Transcode decode buffer | 50-100 MB | Frame buffers for decode |
| Transcode encode buffer | 50-150 MB | Frame buffers for encode |
| HLS segment buffer | 20-50 MB | In-memory segments |
| Recording buffer | 10-20 MB | Write buffer |

Total RAM Recommendations

| Workload | RAM | Notes |
| --- | --- | --- |
| 1-10 passthrough streams | 4 GB | Minimal overhead |
| 10-30 passthrough streams | 8 GB | Comfortable headroom |
| 1-5 transcoding streams | 8 GB | Decode + encode buffers |
| 5-15 transcoding streams | 16 GB | Recommended minimum for production |
| 15-30 transcoding streams | 32 GB | Server-class deployment |
| 30+ transcoding streams | 64 GB | Enterprise scale |

Best practice: Allocate at least 2x your calculated minimum. Streaming workloads are bursty, and running close to memory limits risks OOM (Out of Memory) kills, which crash your streams.
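The per-stream table and the 2x headroom rule combine into a simple sizing sketch. The per-stream figures below are the upper bounds of the ranges quoted above; OS and application overhead are not included, which is why the recommendations table lands higher than this arithmetic:

```python
# RAM sizing sketch using the guide's upper-bound per-stream buffer
# figures (in MB) plus the 2x headroom rule. Estimates only; excludes
# OS and application overhead.

PER_STREAM_MB = {
    "srt_in": 8,     # SRT input buffer
    "srt_out": 8,    # SRT output buffer, per destination
    "decode": 100,   # transcode decode frame buffers
    "encode": 150,   # transcode encode frame buffers
}

def ram_minimum_mb(transcodes: int, outputs_per_stream: int) -> int:
    """Buffer memory for N transcoded streams with M outputs each."""
    per_stream = (PER_STREAM_MB["srt_in"]
                  + outputs_per_stream * PER_STREAM_MB["srt_out"]
                  + PER_STREAM_MB["decode"]
                  + PER_STREAM_MB["encode"])
    return transcodes * per_stream

def ram_with_headroom_gb(transcodes: int, outputs_per_stream: int) -> float:
    """Apply the 2x headroom rule and convert to GB (binary)."""
    return ram_minimum_mb(transcodes, outputs_per_stream) * 2 / 1024

print(ram_minimum_mb(10, 3))  # → 2820 (MB of stream buffers)
```

Ten transcodes with three outputs each works out to roughly 2.8 GB of buffers, or about 5.5 GB with headroom, comfortably inside the 16 GB the table recommends once the OS and application are accounted for.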

ECC vs. Non-ECC

For production broadcast servers running 24/7, ECC (Error-Correcting Code) memory is strongly recommended. ECC detects and corrects single-bit memory errors that can corrupt stream data or crash the application. The cost premium for ECC is small relative to the cost of an on-air failure.

Intel Xeon and AMD EPYC processors support ECC natively. Consumer Intel Core and AMD Ryzen processors generally do not (with some exceptions in AMD’s lineup).

Network Requirements

Network is the most commonly underestimated resource in streaming server planning.

Bandwidth Calculation

Calculate your total bandwidth requirement:

Total bandwidth = (Sum of input bitrates) + (Sum of output bitrates) + (SRT overhead)

Example: 5 SRT inputs at 10 Mbps each, distributed to 3 RTMP outputs and 2 SRT outputs each:

Inputs: 5 × 10 Mbps = 50 Mbps
Outputs: 5 × 5 × 10 Mbps = 250 Mbps (5 streams × 5 outputs each)
SRT overhead: ~25% on input = 12.5 Mbps
Total: ~312.5 Mbps

This requires at least a 1 Gbps network interface with comfortable headroom.
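The worked example above can be expressed as a small calculator. The 25% SRT overhead is the guide's figure and, matching the example, is applied to the SRT inputs only; SRT outputs would add their own overhead on top:

```python
# Bandwidth calculator for the example above. Overhead is applied to
# inputs only, as in the worked example; real deployments should add
# overhead for SRT outputs as well.

def total_bandwidth_mbps(inputs: int, bitrate_mbps: float,
                         outputs_per_input: int,
                         srt_overhead: float = 0.25) -> float:
    in_bw = inputs * bitrate_mbps
    out_bw = inputs * outputs_per_input * bitrate_mbps
    return in_bw + out_bw + in_bw * srt_overhead

# 5 inputs at 10 Mbps, 5 outputs each
print(total_bandwidth_mbps(5, 10, 5))  # → 312.5
```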

Network Interface Recommendations

| Total Throughput | Interface | Notes |
| --- | --- | --- |
| < 100 Mbps | 1 Gbps | Ample for small setups |
| 100-500 Mbps | 1 Gbps | Watch utilization, plan upgrade |
| 500 Mbps - 2 Gbps | 2.5 or 10 Gbps | Production minimum for mid-scale |
| 2-10 Gbps | 10 Gbps | Standard for broadcast facilities |
| > 10 Gbps | 25/100 Gbps or bonded 10 Gbps | Enterprise CDN ingest |

Network Considerations

Jumbo frames: Enable jumbo frames (9000 MTU) on your local network for SRT traffic. This reduces CPU overhead for packet processing and improves throughput on 10 Gbps links.

Separate management and media traffic: Use dedicated network interfaces for streaming data and management/monitoring traffic. This prevents monitoring API calls from competing with live stream data.

Symmetric bandwidth: Streaming servers need upload bandwidth equal to or greater than download. Many consumer internet connections are asymmetric (high download, low upload). Use business-class or datacenter connectivity with symmetric bandwidth.

Storage Requirements

Storage is only needed if you are recording streams or writing HLS segments to disk.

Recording

Calculate storage for recording:

Storage per hour (MB) = (Bitrate in Mbps × 3600) / 8

| Bitrate | Per Hour | Per 8 Hours | Notes |
| --- | --- | --- | --- |
| 5 Mbps | 2.25 GB | 18 GB | Low-quality web stream |
| 10 Mbps | 4.5 GB | 36 GB | Standard HD |
| 15 Mbps | 6.75 GB | 54 GB | High-quality HD |
| 25 Mbps | 11.25 GB | 90 GB | 4K or high-bitrate HD |

For ISO recording of 6 cameras at 10 Mbps each plus program output:

7 streams × 4.5 GB/hour = 31.5 GB/hour
8-hour event = 252 GB
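The same arithmetic as a function (decimal units, 1 GB = 1000 MB, to match the table above):

```python
# Recording storage: MB per stream-hour = (bitrate_mbps * 3600) / 8,
# scaled by stream count and event length. Decimal GB to match the
# guide's tables.

def recording_gb(bitrate_mbps: float, streams: int, hours: float) -> float:
    mb_per_hour = bitrate_mbps * 3600 / 8   # MB written per stream-hour
    return mb_per_hour * streams * hours / 1000

# 6 ISO cameras + program output at 10 Mbps, 8-hour event
print(recording_gb(10, 7, 8))  # → 252.0
```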

Use NVMe SSDs for recording when writing multiple simultaneous streams. Spinning disks can handle 2-3 simultaneous streams but struggle with 5+ due to seek latency. NVMe eliminates this bottleneck entirely.

HLS Segments

If writing HLS segments to disk (rather than serving from memory), the storage requirement is small but the I/O requirement is high. HLS segments are small files written frequently (one per segment duration per quality level):

4-second segments, 4 quality levels = 1 file write per second sustained

Use an SSD for the HLS working directory. The total space is modest (a few GB for the live window), but the I/O pattern demands fast random write performance.
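The sustained write rate follows directly from the segment duration and ladder size:

```python
# Sustained HLS segment-write rate: one file per segment duration per
# quality level, as described above.

def hls_writes_per_second(segment_seconds: float, quality_levels: int) -> float:
    return quality_levels / segment_seconds

print(hls_writes_per_second(4, 4))  # → 1.0
```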

Vajra Cast System Requirements

Here are the specific requirements for running Vajra Cast:

Minimum Requirements

| Component | Specification |
| --- | --- |
| OS | macOS 12+ (Apple Silicon or Intel) or Linux (Ubuntu 22.04+, Debian 12+) |
| CPU | 4 cores |
| RAM | 8 GB |
| Disk | 10 GB free (plus recording space) |
| Network | 1 Gbps Ethernet |
| GPU | Intel QSV (for hardware transcode on Linux) or Apple VideoToolbox (macOS) |
Recommended Requirements

| Component | Specification |
| --- | --- |
| OS | Linux (Ubuntu 22.04 LTS) or macOS (Sonoma+) |
| CPU | 8+ cores (Intel 12th+ Gen for QSV, or Apple M2+) |
| RAM | 16 GB (ECC preferred on Linux servers) |
| Disk | NVMe SSD, 500 GB+ for recording |
| Network | 10 Gbps Ethernet |
| GPU | Intel integrated QSV (Linux) or Apple Silicon (macOS) |

Docker Deployment

For containerized deployments, Vajra Cast runs in Docker with these considerations:

# docker-compose.yml excerpt
services:
  vajracast:
    image: vajracast/vajracast:latest
    network_mode: host  # Required for SRT/UDP
    devices:
      - /dev/dri:/dev/dri  # Intel QSV access
    volumes:
      - ./data:/data
      - ./recordings:/recordings
    deploy:
      resources:
        limits:
          memory: 8G
        reservations:
          memory: 4G

Key Docker considerations:

  • network_mode: host is required for SRT (UDP) performance. Docker’s bridge networking adds latency and complicates port mapping for SRT listeners.
  • /dev/dri device pass-through is required for Intel QSV hardware acceleration in Linux containers.
  • Memory limits: Set the container memory limit to at least 2x your expected working set to prevent OOM kills during traffic bursts.

Kubernetes Deployment

For Kubernetes orchestration, Vajra Cast supports multi-instance deployment with PostgreSQL as the shared database:

  • Use hostNetwork: true for pods handling SRT traffic
  • Request GPU resources through the Intel device plugin for QSV
  • Deploy PostgreSQL as a separate StatefulSet or use a managed database service
  • Use persistent volumes for recording storage

Scaling Guidelines

Vertical Scaling (Bigger Server)

Add more CPU, RAM, or a better GPU to handle more streams on a single server:

| Upgrade | Effect |
| --- | --- |
| More CPU cores | More simultaneous software transcodes |
| Better Intel GPU | More simultaneous hardware transcodes |
| More RAM | More concurrent streams with large buffers |
| Faster NIC | More total throughput for outputs |
| NVMe storage | More simultaneous recordings |

Vertical scaling is simpler to manage but has a ceiling. A single server maxes out at around 50-100 streams depending on workload.

Horizontal Scaling (More Servers)

Deploy multiple Vajra Cast instances behind a load distribution strategy:

Source → Gateway 1 (streams 1-20)  → Outputs
      → Gateway 2 (streams 21-40) → Outputs
      → Gateway 3 (streams 41-60) → Outputs

Each gateway operates independently with its own set of streams. A central orchestrator (Kubernetes, Terraform, or custom automation) manages deployment and configuration.

Horizontal scaling has no theoretical ceiling and provides natural fault isolation: if Gateway 2 fails, Gateways 1 and 3 continue operating.
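A minimal sketch of the static stream-to-gateway layout above. The gateway names and the per-gateway capacity of 20 streams are hypothetical, chosen to match the diagram; a real orchestrator would also track gateway health and reassign streams on failure:

```python
# Static stream-to-gateway assignment sketch. Gateway names and the
# capacity value are illustrative; failover handling is omitted.

def assign(stream_ids: list[str], gateways: list[str],
           capacity: int = 20) -> dict[str, str]:
    """Fill each gateway with up to `capacity` streams, in order."""
    if len(stream_ids) > capacity * len(gateways):
        raise ValueError("not enough gateway capacity")
    return {sid: gateways[i // capacity] for i, sid in enumerate(stream_ids)}

plan = assign([f"stream{n}" for n in range(1, 41)], ["gw1", "gw2", "gw3"])
print(plan["stream1"], plan["stream21"])  # → gw1 gw2
```

Because each gateway's stream set is disjoint, losing one gateway affects only its own block of streams, which is the fault-isolation property described above.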

Reference Builds

Build 1: Budget Streaming Server

Use case: Small production, 5-10 passthrough streams, 1-2 transcodes

| Component | Specification | Approx. Cost |
| --- | --- | --- |
| Platform | Mac Mini M2 | $600 |
| RAM | 16 GB (included) | |
| Storage | 256 GB SSD + 1 TB external | $80 |
| Network | Built-in Gigabit | |
| Total | | ~$680 |

Build 2: Mid-Range Production Server

Use case: 15-30 streams, 5-10 transcodes, ISO recording

| Component | Specification | Approx. Cost |
| --- | --- | --- |
| CPU | Intel Core i7-13700 (16 cores, QSV) | $350 |
| Motherboard | B760 or Z790 (server-grade if ECC needed) | $150-250 |
| RAM | 32 GB DDR5 | $100 |
| Storage | 1 TB NVMe (OS + HLS) + 4 TB NVMe (recording) | $250 |
| Network | Intel X710 10 Gbps NIC | $80 |
| Case + PSU | Quiet tower, 500W 80+ Gold | $150 |
| Total | | ~$1,100-1,200 |

Build 3: Enterprise Broadcast Server

Use case: 50+ streams, heavy transcoding, 24/7 operation

| Component | Specification | Approx. Cost |
| --- | --- | --- |
| CPU | Intel Xeon w5-2465X (16 cores, QSV, ECC) | $1,000 |
| Motherboard | W790 workstation board | $500 |
| RAM | 64 GB DDR5 ECC | $300 |
| Storage | 2 TB NVMe (OS) + RAID array for recording | $800 |
| Network | Dual 10 Gbps NIC | $150 |
| GPU | Intel Arc A380 (additional QSV capacity) | $130 |
| Case + PSU | Rackmount 2U, redundant PSU | $400 |
| Total | | ~$3,300 |

Bottom Line

Streaming server hardware selection comes down to understanding your workload: passthrough routing needs network bandwidth, software transcoding needs CPU cores, and hardware transcoding needs the right GPU. Start by classifying your workload, calculate your requirements using the tables above, and build or procure with at least 50% headroom for growth and peak loads.

For most deployments, a mid-range Intel system with QSV (or a Mac Mini M2 for smaller operations) provides the best balance of performance, cost, and power efficiency. Scale horizontally with Docker or Kubernetes when a single server is no longer sufficient.

Explore the SRT Streaming Gateway guide for the software architecture that runs on this hardware, and see video failover best practices for ensuring your infrastructure remains online when components fail.