HLS Adaptive Streaming: Complete Setup Guide for Live and VOD

What is HLS and Why Does It Dominate?

HLS (HTTP Live Streaming) is the most widely deployed streaming protocol in use today. Created by Apple in 2009, it delivers video over standard HTTP, so it works with ordinary web servers, CDNs, proxies, and firewalls without special configuration.

The protocol is simple in concept: the encoder chops the video into small files (segments), writes a playlist file (manifest) that lists those segments, and the player downloads segments sequentially while playing them back. Because everything is standard HTTP, the entire global CDN infrastructure built for web content delivery works for HLS out of the box.

For live streaming and video on demand, HLS provides three capabilities that matter most:

  1. Adaptive bitrate (ABR): Multiple quality levels, with the player automatically selecting the best one for the viewer’s connection
  2. Massive scalability: HTTP caching at every layer (CDN edge, ISP cache, browser) enables millions of concurrent viewers
  3. Universal compatibility: iOS, Android, macOS, Windows, Linux, Smart TVs, game consoles. HLS plays everywhere

This guide covers how to set up HLS for both live streaming and VOD, with specific configuration for adaptive bitrate ladders, segment tuning, CDN integration, and low-latency modes.

How HLS Works Under the Hood

Understanding the mechanics helps you make better configuration decisions.

The Master Playlist

The master playlist (often called the multivariant playlist) is the entry point. It lists all available quality levels (renditions):

#EXTM3U
#EXT-X-VERSION:6

#EXT-X-STREAM-INF:BANDWIDTH=6000000,RESOLUTION=1920x1080,FRAME-RATE=60.000,CODECS="avc1.64002a,mp4a.40.2"
1080p60/index.m3u8

#EXT-X-STREAM-INF:BANDWIDTH=3000000,RESOLUTION=1280x720,FRAME-RATE=30.000,CODECS="avc1.4d401f,mp4a.40.2"
720p30/index.m3u8

#EXT-X-STREAM-INF:BANDWIDTH=1500000,RESOLUTION=854x480,FRAME-RATE=30.000,CODECS="avc1.4d401e,mp4a.40.2"
480p30/index.m3u8

#EXT-X-STREAM-INF:BANDWIDTH=600000,RESOLUTION=640x360,FRAME-RATE=30.000,CODECS="avc1.42c015,mp4a.40.2"
360p30/index.m3u8

The player reads this playlist, estimates available bandwidth, and selects the appropriate rendition. As network conditions change, the player switches between renditions automatically.
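As a rough illustration of that selection step, the sketch below parses the #EXT-X-STREAM-INF lines and picks the highest-bandwidth rendition that fits under a measured throughput. The function names (parse_master, pick_rendition) and the 0.8 safety factor are assumptions made for this example; real players such as hls.js use more elaborate bandwidth estimators.

```python
import re

# Matches KEY=value pairs; quoted values (CODECS="a,b") are kept whole.
_ATTR_RE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')

def parse_master(text):
    """Return a list of (bandwidth_bps, uri) tuples from a master playlist."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    renditions = []
    for i, line in enumerate(lines):
        if line.startswith("#EXT-X-STREAM-INF:") and i + 1 < len(lines):
            attrs = dict(_ATTR_RE.findall(line.split(":", 1)[1]))
            # The rendition URI is the next non-blank line after the tag.
            renditions.append((int(attrs["BANDWIDTH"]), lines[i + 1]))
    return renditions

def pick_rendition(renditions, throughput_bps, safety=0.8):
    """Pick the highest rendition under throughput * safety (hypothetical rule)."""
    budget = throughput_bps * safety
    fitting = [r for r in renditions if r[0] <= budget]
    # Fall back to the lowest rendition when nothing fits the budget.
    return max(fitting) if fitting else min(renditions)
```

With the playlist above, a measured throughput of 4 Mbps gives a 3.2 Mbps budget, so this rule would select the 720p30 rendition.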

Media Playlists

Each rendition has its own media playlist listing the actual video segments:

#EXTM3U
#EXT-X-VERSION:6
#EXT-X-TARGETDURATION:4
#EXT-X-MEDIA-SEQUENCE:1547

#EXTINF:4.000,
segment-1547.ts
#EXTINF:4.000,
segment-1548.ts
#EXTINF:4.000,
segment-1549.ts

For live streams, the playlist is a sliding window: old segments are removed as new ones are added. The player stays near the “live edge” (the most recent segments).

For VOD, the playlist contains all segments and includes the #EXT-X-ENDLIST tag to indicate the content is complete.
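To make the live/VOD difference concrete, here is a minimal sketch of how an origin might render a media playlist and slide the live window. The helper names (render_media_playlist, slide_window) and the segment-N.ts naming are assumptions for illustration, not the behavior of any particular server.

```python
import math

def render_media_playlist(first_sequence, durations, vod=False):
    """Render a media playlist for consecutive segments starting at first_sequence."""
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:6",
        # Rounding up keeps the target duration >= every segment duration.
        f"#EXT-X-TARGETDURATION:{math.ceil(max(durations))}",
        f"#EXT-X-MEDIA-SEQUENCE:{first_sequence}",
    ]
    if vod:
        lines.append("#EXT-X-PLAYLIST-TYPE:VOD")
    for offset, duration in enumerate(durations):
        lines.append(f"#EXTINF:{duration:.3f},")
        lines.append(f"segment-{first_sequence + offset}.ts")
    if vod:
        # ENDLIST tells the player that no more segments will be added.
        lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"

def slide_window(first_sequence, durations, new_duration, depth=3):
    """Append one segment and drop the oldest ones beyond `depth`."""
    durations = durations + [new_duration]
    dropped = max(0, len(durations) - depth)
    # MEDIA-SEQUENCE advances by the number of segments removed from the front.
    return first_sequence + dropped, durations[dropped:]
```

Each time the encoder finishes a segment, a live origin would call slide_window and re-render the playlist; a VOD playlist is rendered once with vod=True.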

Segment Files

Segments are the actual media data, typically in MPEG-TS (.ts) or fragmented MP4 (.m4s) container format. Each segment contains a self-contained chunk of video and audio:

  • MPEG-TS segments: The traditional format. Each segment starts with a PAT/PMT and can be played independently. Widely compatible.
  • fMP4 segments: The modern format. More efficient (less overhead per segment) and compatible with CMAF packaging. Required for HEVC in HLS and the usual choice for low-latency HLS.

Designing Your Adaptive Bitrate Ladder

The bitrate ladder defines which quality levels you offer viewers. A well-designed ladder provides good quality for fast connections and usable quality for slow ones, without wasting bandwidth on renditions nobody will use.

For general-purpose live streaming (sports, events, entertainment):

Rendition | Resolution | Frame Rate | Video Bitrate | Audio Bitrate | Total Bitrate
1080p60   | 1920x1080  | 60 fps     | 5,500 Kbps    | 192 Kbps      | ~5,700 Kbps
1080p30   | 1920x1080  | 30 fps     | 3,500 Kbps    | 128 Kbps      | ~3,650 Kbps
720p30    | 1280x720   | 30 fps     | 2,000 Kbps    | 128 Kbps      | ~2,150 Kbps
480p30    | 854x480    | 30 fps     | 1,000 Kbps    | 96 Kbps       | ~1,100 Kbps
360p30    | 640x360    | 30 fps     | 500 Kbps      | 64 Kbps       | ~570 Kbps
240p30    | 426x240    | 30 fps     | 250 Kbps      | 64 Kbps       | ~320 Kbps

Ladder Design Principles

1. Minimum 1.5x bitrate gap between renditions. If two renditions are too close in bitrate, the player will oscillate between them on marginal connections, causing visible quality switches. A 1.5x gap gives the ABR algorithm a clear decision boundary.

2. Always include a low-bitrate rendition. The 240p or 360p rendition serves viewers on very slow connections (weak cellular coverage, congested Wi-Fi). Without it, those viewers get buffering instead of low-quality playback.

3. Match the top rendition to your source quality. There is no benefit to offering a 4K rendition if your source is 1080p. The top rendition should match your source resolution and frame rate.

4. Consider your encoding budget. Each rendition requires a separate encode. Six renditions means six simultaneous encodes. With hardware transcoding (Intel QSV), this is manageable. With software encoding, it may require a powerful server. See the streaming server requirements guide for hardware sizing.
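The 1.5x rule from principle 1 is easy to check mechanically. The sketch below is one way to do it; the function name check_ladder is hypothetical, and the bitrates are the video bitrates from the ladder above.

```python
def check_ladder(bitrates_kbps, min_ratio=1.5):
    """Return adjacent (higher, lower) pairs whose bitrate ratio is below min_ratio."""
    ordered = sorted(bitrates_kbps, reverse=True)
    return [
        (high, low)
        for high, low in zip(ordered, ordered[1:])
        if high / low < min_ratio
    ]

# Video bitrates from the example ladder: every adjacent gap is at least 1.5x,
# so the result is an empty list.
violations = check_ladder([5500, 3500, 2000, 1000, 500, 250])
```

A ladder such as 3,000 / 2,500 / 1,000 Kbps would be flagged, because 3,000 and 2,500 are only 1.2x apart.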

Encoding Settings per Rendition

For H.264 encoding of each rendition:

# 1080p60 rendition
-c:v h264_qsv -b:v 5500k -maxrate 6000k -bufsize 11000k
-profile:v high -level 4.2 -g 120 -keyint_min 120
-c:a aac -b:a 192k -ar 48000

# 720p30 rendition
-c:v h264_qsv -b:v 2000k -maxrate 2200k -bufsize 4000k
-profile:v main -level 3.1 -g 60 -keyint_min 60
-c:a aac -b:a 128k -ar 48000

# 480p30 rendition
-c:v h264_qsv -b:v 1000k -maxrate 1100k -bufsize 2000k
-profile:v main -level 3.0 -g 60 -keyint_min 60
-c:a aac -b:a 96k -ar 48000

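Critical: Keyframe alignment. All renditions must produce keyframes at exactly the same timestamps (every 2 or 4 seconds), and the segment duration must be a whole multiple of that keyframe interval. Misaligned keyframes cause playback glitches when the player switches between renditions. Set -g (GOP size) to the frame rate multiplied by the keyframe interval in seconds: the settings above use a 2-second interval, giving -g 120 at 60 fps and -g 60 at 30 fps.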
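The -g values in the settings above follow from frame rate times keyframe interval (2 seconds in those examples). A small sketch of that arithmetic, with hypothetical helper names and assuming integer frame rates (fractional rates such as 29.97 fps need separate handling):

```python
def gop_size(frame_rate, keyframe_interval_s):
    """Frames per GOP so that a keyframe lands every keyframe_interval_s seconds."""
    frames = frame_rate * keyframe_interval_s
    if frames != int(frames):
        raise ValueError("keyframe interval does not fall on a whole frame")
    return int(frames)

def segments_align(segment_duration_s, keyframe_interval_s):
    """True when every segment boundary coincides with a keyframe."""
    return segment_duration_s % keyframe_interval_s == 0

# 60 fps and 30 fps renditions sharing a 2-second keyframe interval
gop_60 = gop_size(60, 2)   # value passed to -g for the 1080p60 rendition
gop_30 = gop_size(30, 2)   # value passed to -g for the 30 fps renditions
```

Both renditions then place keyframes at the same timestamps, and a 4-second segment contains exactly two GOPs in each.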

Segment Duration: The Latency vs. Reliability Trade-off

Segment duration is the single most impactful HLS configuration parameter. It directly controls both latency and reliability.

How Segment Duration Affects Latency

The minimum theoretical latency for standard HLS is:

Minimum latency = (segment_duration × playlist_depth) + network_delay + player_buffer

With 6-second segments and a playlist depth of 3:

Minimum latency = (6 × 3) + 1 + 2 = ~21 seconds

With 2-second segments and a playlist depth of 3:

Minimum latency = (2 × 3) + 1 + 2 = ~9 seconds

Shorter segments mean lower latency.
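The same formula as a small function, for trying other combinations. The 1-second network delay and 2-second player buffer defaults are the figures used in the two examples above, not measured values.

```python
def min_latency_s(segment_duration_s, playlist_depth,
                  network_delay_s=1.0, player_buffer_s=2.0):
    """Approximate floor on glass-to-glass latency for standard HLS, in seconds."""
    return segment_duration_s * playlist_depth + network_delay_s + player_buffer_s

latency_6s = min_latency_s(6, 3)  # 6-second segments, depth 3
latency_2s = min_latency_s(2, 3)  # 2-second segments, depth 3
latency_4s = min_latency_s(4, 3)  # 4-second segments, depth 3
```

For 4-second segments at depth 3 the formula gives 15 seconds.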

How Segment Duration Affects Reliability

Shorter segments have drawbacks:

  • More HTTP requests per second: The player and CDN must handle more requests, which increases overhead and the chance of a missed segment
  • Less efficient encoding: Shorter segments mean more keyframes relative to content, reducing compression efficiency
  • More sensitive to network jitter: A brief network hiccup has a higher chance of delaying a short segment past its deadline

Recommended settings by use case:

Use Case                      | Segment Duration | Playlist Depth | Expected Latency
Low-latency live              | 2 seconds        | 3              | 8-12 seconds
Standard live                 | 4 seconds        | 3              | 15-20 seconds
Reliable live (poor networks) | 6 seconds        | 5              | 25-35 seconds
VOD                           | 6 seconds        | Full           | N/A (not live)

For most live streaming, 4-second segments with a playlist depth of 3-5 segments provides a good balance. For sports and other latency-sensitive content, use 2-second segments and accept the trade-offs.

Configuring HLS Output in Vajra Cast

Vajra Cast includes a built-in HLS server that can generate adaptive streams directly, without requiring a separate media server.

Basic HLS Output

To create an HLS output from an existing ingest:

  1. Create a new output and select HLS as the protocol
  2. Configure the segment duration (default: 4 seconds)
  3. Set the playlist depth (default: 3 segments)
  4. Choose passthrough (no transcode) or select encoding profiles for adaptive output

For a single-rendition HLS output (no adaptive bitrate), the stream is served directly from the ingest without transcoding:

Input (SRT 1080p60 8 Mbps) → HLS output (passthrough)
Available at: http://your-server:port/hls/stream-name/index.m3u8

Adaptive Multi-Rendition HLS

For adaptive bitrate output, configure multiple renditions on the same HLS output. Vajra Cast creates the master playlist and per-rendition media playlists automatically:

  1. Create the HLS output
  2. Add renditions:
    • 1080p60 at 5,500 Kbps (hardware transcode)
    • 720p30 at 2,000 Kbps (hardware transcode)
    • 480p30 at 1,000 Kbps (hardware transcode)
    • 360p30 at 500 Kbps (hardware transcode)
  3. Enable the output

With Intel QSV hardware transcoding, producing four renditions from a single 1080p60 source uses minimal CPU, because the QSV engine handles all four encodes simultaneously.

The HLS endpoint serves both the master playlist and all rendition playlists and segments.

Live Viewer Count

Vajra Cast’s built-in HLS server tracks active viewers in real-time by monitoring playlist requests. This viewer count is available in the dashboard and via the Prometheus metrics endpoint, giving you live audience analytics without external tools.

CDN Integration

For production deployments serving more than about 100 concurrent viewers, you need a CDN between your HLS origin (Vajra Cast) and your viewers.

Origin Configuration

Vajra Cast acts as the origin server. The CDN pulls segments from Vajra Cast and caches them at edge locations worldwide:

Vajra Cast (origin) → CDN (edge cache) → Viewers

Configure your CDN to pull from Vajra Cast’s HLS endpoint:

Origin URL: http://vajracast-server:port/hls/

CDN Caching Strategy

HLS caching is straightforward because segments are immutable (a segment’s content never changes once written) and the playlist is the only file that changes:

File Type               | Cache TTL              | Notes
Master playlist (.m3u8) | No cache or 1 second   | Rarely changes (static rendition list)
Media playlist (.m3u8)  | 0.5 × segment duration | Must refresh to see new segments
Segments (.ts / .m4s)   | 24 hours or longer     | Immutable once created

Media playlist cache TTL is critical. If the CDN caches the media playlist too aggressively, viewers will not see new segments and the stream will stall. Set the TTL to half the segment duration (e.g., 2 seconds for 4-second segments).
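One way to express this policy is as Cache-Control headers set at the origin or in a CDN rule. The sketch below maps each file type to a header value; the function name, the "kind" labels, and the choice of 1 second for the master playlist are assumptions for illustration, and the exact syntax your CDN accepts may differ.

```python
def cache_control(kind, segment_duration_s):
    """Return a Cache-Control value for 'master', 'media', or 'segment' files."""
    if kind == "segment":
        # Segments never change once written, so cache them for 24 hours.
        return "public, max-age=86400, immutable"
    if kind == "media":
        # Half the segment duration, but never below one second.
        ttl = max(1, int(segment_duration_s / 2))
        return f"public, max-age={ttl}"
    if kind == "master":
        return "public, max-age=1"
    raise ValueError(f"unknown file kind: {kind}")

media_header = cache_control("media", 4)      # 4-second segments
segment_header = cache_control("segment", 4)
```

With 4-second segments the media playlist gets max-age=2, matching the example in the paragraph above.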

CDN Selection

Any HTTP CDN works for HLS. Popular choices:

CDN            | Notes
Cloudflare     | Good free tier, global edge network
AWS CloudFront | Tight integration with AWS infrastructure
Fastly         | Low-latency edge compute, popular for video
Akamai         | Industry standard for large-scale video
Bunny CDN      | Cost-effective, good for mid-scale

For testing and small audiences (under 100 concurrent viewers), you can serve HLS directly from Vajra Cast without a CDN. For anything larger, use a CDN to offload bandwidth and ensure global performance.

Low-Latency HLS (LL-HLS)

Standard HLS has inherent latency due to segment-based delivery. Low-Latency HLS, added to the HLS specification in 2020, reduces latency to 2-5 seconds while maintaining HLS’s compatibility advantages.

How LL-HLS Works

LL-HLS introduces two key concepts:

1. Partial segments: Instead of waiting for a full segment to complete, the server publishes partial segments (typically 200-300ms each) as they are generated. The player can start downloading a partial segment while the full segment is still being encoded.

2. Blocking playlist reload: Instead of the player polling for playlist updates on a timer, the player sends a blocking request that the server holds open until a new segment or partial segment is available. This eliminates polling delay.

The result: latency drops from 15-30 seconds to 2-5 seconds, competitive with RTMP-based delivery.
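In the LL-HLS specification, a blocking reload is an ordinary playlist request carrying the _HLS_msn (media sequence number) and optional _HLS_part (partial segment index) query parameters, which the server honors when its playlist advertises CAN-BLOCK-RELOAD=YES. The sketch below only builds such a URL; the function name and the example host are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def blocking_reload_url(playlist_url, next_msn, next_part=None):
    """Build a playlist URL that asks the server to wait for a given segment/part."""
    parts = urlsplit(playlist_url)
    query = parse_qsl(parts.query)  # keep any existing parameters (e.g. tokens)
    query.append(("_HLS_msn", str(next_msn)))
    if next_part is not None:
        query.append(("_HLS_part", str(next_part)))
    return urlunsplit(parts._replace(query=urlencode(query)))

# Ask for the playlist version that contains part 2 of segment 1550.
url = blocking_reload_url(
    "https://cdn.example.com/hls/stream/720p30/index.m3u8", 1550, 2
)
```

The server holds this request open until that part exists, then responds with the updated playlist, so the player learns about new media without a polling interval.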

LL-HLS Requirements

LL-HLS requires:

  • Server support: The origin server must generate partial segments and support blocking playlist responses
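  • CDN support: The CDN must forward blocking playlist requests to the origin and propagate partial segments quickly. Early LL-HLS drafts relied on HTTP/2 push, but the published specification replaced it with preload hints, so push support is no longer required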
  • Player support: The video player must implement the LL-HLS extensions

Player support is broad: Safari (native), hls.js (the dominant web HLS library), and most native mobile players support LL-HLS.

When to Use LL-HLS

Scenario                  | Recommendation
Sports streaming          | LL-HLS recommended (2-4s latency)
News/events               | LL-HLS recommended
Entertainment/music       | Standard HLS is fine (latency less important)
VOD                       | Not applicable (not live)
Very poor viewer networks | Standard HLS more reliable

LL-HLS adds server-side complexity and is more sensitive to CDN configuration. Use it when latency matters; use standard HLS when it does not.

Player Compatibility

HLS works on virtually every platform, but the specifics vary.

Native HLS Support

Platform            | Native HLS | Notes
iOS Safari          | Yes        | Apple’s reference implementation
macOS Safari        | Yes        | Full LL-HLS support
Android (ExoPlayer) | Yes        | Google’s recommended player
Smart TVs           | Yes        | Most brands via built-in browser
Roku                | Yes        | Native HLS support
Apple TV            | Yes        | tvOS player

Browser HLS Support (via hls.js)

Browsers other than Safari do not natively support HLS. The standard approach is to use hls.js, a JavaScript library that parses HLS manifests and feeds segments to the browser’s Media Source Extensions (MSE) API:

<video id="video" controls></video>
<script src="https://cdn.jsdelivr.net/npm/hls.js@latest"></script>
<script>
  const video = document.getElementById('video');
  if (Hls.isSupported()) {
    const hls = new Hls({
      lowLatencyMode: true,
      liveSyncDurationCount: 3,
      liveMaxLatencyDurationCount: 5,
    });
    hls.loadSource('https://your-cdn.com/hls/stream/index.m3u8');
    hls.attachMedia(video);
  } else if (video.canPlayType('application/vnd.apple.mpegurl')) {
    // Safari native HLS
    video.src = 'https://your-cdn.com/hls/stream/index.m3u8';
  }
</script>

hls.js configuration options for live streaming:

Option                      | Recommended Value | Purpose
lowLatencyMode              | true              | Enable LL-HLS support
liveSyncDurationCount       | 3                 | Number of segments behind live edge
liveMaxLatencyDurationCount | 5                 | Maximum segments behind before catching up
maxBufferLength             | 30                | Maximum buffer ahead in seconds
maxMaxBufferLength          | 60                | Absolute maximum buffer

Vajra Cast Built-in Web Player

Vajra Cast includes a built-in web player for each HLS output, accessible directly from the dashboard. This player displays:

  • The live video stream
  • A latency indicator showing the viewer’s delay from live
  • Stream metadata (resolution, codec, bitrate)

This is useful for quick monitoring and for sharing a direct viewing link with stakeholders who need to preview the stream without setting up their own player.

HLS for VOD (Video on Demand)

While this guide focuses on live streaming, HLS is equally effective for VOD delivery. The main differences:

VOD Playlist

A VOD playlist includes all segments and the #EXT-X-ENDLIST tag:

#EXTM3U
#EXT-X-VERSION:6
#EXT-X-TARGETDURATION:6
#EXT-X-PLAYLIST-TYPE:VOD

#EXTINF:6.000,
segment-0000.ts
#EXTINF:6.000,
segment-0001.ts
...
#EXTINF:4.200,
segment-0847.ts
#EXT-X-ENDLIST

VOD Encoding

For VOD, you can use slower encoding presets (medium or slow) since there is no real-time constraint. This produces significantly better quality at the same bitrate compared to live encoding. Pre-encode all renditions and upload to your CDN or storage.

DVR / Timeshift

A hybrid approach between live and VOD is DVR mode (also called timeshift). The server keeps old segments available beyond the normal live window, allowing viewers to pause and rewind live content:

Live window: 3 most recent segments (play near-live)
DVR window: Last 2 hours of segments (pause, rewind, seek)

This requires more storage at the origin and CDN, but provides a better viewer experience for long events.
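A back-of-the-envelope sketch of that storage cost, using the approximate total bitrates of a four-rendition ladder from this guide. The function names are hypothetical, and the estimate ignores container overhead and playlist files.

```python
def dvr_segment_count(window_s, segment_duration_s):
    """Number of segments each rendition keeps for a DVR window."""
    return window_s // segment_duration_s

def dvr_storage_bytes(window_s, total_bitrates_kbps):
    """Approximate bytes held at the origin for all renditions over the window."""
    # Kbps -> bits per second -> bytes per second, summed over renditions.
    return sum(total_bitrates_kbps) * 1000 * window_s // 8

two_hours = 2 * 60 * 60
segments = dvr_segment_count(two_hours, 4)                 # 4-second segments
storage = dvr_storage_bytes(two_hours, [5700, 2150, 1100, 570])
```

Under these assumptions, a 2-hour window holds 1,800 segments per rendition and roughly 8.6 GB in total.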

Troubleshooting Common HLS Issues

Buffering

Symptom: The player stops and shows a loading spinner.

Causes and solutions:

  1. Insufficient CDN bandwidth: Scale your CDN or upgrade your plan
  2. Origin too slow: Segments are not being generated fast enough. Check server CPU load.
  3. Segment duration too short for the viewer’s network: Increase segment duration or add a lower-bitrate rendition
  4. Player buffer too small: Increase maxBufferLength in hls.js

Quality Switching (Visible)

Symptom: The viewer sees noticeable quality jumps (sharp to blurry and back).

Solutions:

  1. Increase the bitrate gap between adjacent renditions (minimum 1.5x)
  2. Make the player’s ABR algorithm less aggressive by lowering abrBandWidthFactor (and abrBandWidthUpFactor for up-switches) in hls.js
  3. Ensure keyframes are aligned across all renditions

Audio/Video Desync

Symptom: Audio leads or trails video.

Causes:

  1. Encoding pipeline introduces variable delay between audio and video tracks
  2. Segment boundaries misalign audio and video
  3. Player-side synchronization issue

Solution: Ensure your encoder produces synchronized audio and video timestamps. Use fMP4 segments (instead of MPEG-TS) for more precise timing.

Stale Playlist (Stream Appears Frozen)

Symptom: The player stops receiving new segments and the stream freezes at a frame.

Causes:

  1. CDN caching the media playlist too aggressively. Reduce playlist cache TTL.
  2. Origin stopped generating segments. Check the encoding pipeline.
  3. Network issue between origin and CDN. Check origin connectivity.
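To tell these causes apart, fetch the media playlist twice, more than one target duration apart, from both the CDN and the origin, and check whether the newest segment number advanced. The sketch below does the comparison on two playlist texts; the function names are hypothetical and the fetching itself is left out.

```python
import re

def last_sequence(playlist_text):
    """Sequence number of the newest segment listed in a media playlist."""
    match = re.search(r"^#EXT-X-MEDIA-SEQUENCE:(\d+)", playlist_text, re.MULTILINE)
    first = int(match.group(1)) if match else 0  # the tag defaults to 0 when absent
    segments = len(re.findall(r"^#EXTINF:", playlist_text, re.MULTILINE))
    return first + segments - 1

def is_stale(previous_text, current_text):
    """True when no new segment appeared between two fetches of a live playlist."""
    return last_sequence(current_text) <= last_sequence(previous_text)
```

If the origin's playlist advances but the CDN's copy does not, the playlist TTL is the likely culprit; if neither advances, look at the encoding pipeline.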

Bottom Line

HLS adaptive streaming is the foundation of modern video delivery. Its combination of adaptive bitrate, CDN compatibility, and universal player support makes it the default choice for reaching large audiences on any device.

The keys to a well-configured HLS setup are: design your bitrate ladder with clear gaps between renditions, choose your segment duration based on your latency requirements, align keyframes across all renditions, and configure CDN caching correctly for the playlist files.

Vajra Cast’s built-in HLS server simplifies the origin side, handling segment generation, playlist management, and adaptive rendition creation with hardware-accelerated transcoding. For contribution and ingest, pair it with SRT for reliable, encrypted transport from the field. See the SRT Streaming Gateway guide for the complete ingest architecture, and the multi-destination streaming guide for combining HLS output with RTMP platform delivery.