WIRELESS NETWORKING, 430.752B, 2020 SPRING SEOUL NATIONAL UNIVERSITY
Technical Issues in Wireless Networking - Mobility Support for Streaming-
Kyunghan Lee
Networked Computing Lab (NXC Lab)
Department of Electrical and Computer Engineering Seoul National University
https://nxc.snu.ac.kr
kyunghanlee@snu.ac.kr
We want the highest video quality (“bit rate”)
(c) copyright 2008, Blender Foundation / www.bigbuckbunny.org, CC-BY-3.0
We want the highest video quality (“bit rate”)
Blender Foundation / www.bigbuckbunny.org, CC-BY-3.0
WIRELESS NETWORKING, 430.752B, 2020 SPRING SEOUL NATIONAL UNIVERSITY
WIRELESS NETWORKING, 430.752B, 2020 SPRING SEOUL NATIONAL UNIVERSITY
Without seeing this… (rebuffering)
Without seeing this … (called a “rebuffer”)
3
□ Akamai conducted a user study across 1200 viewers
§ Skin conductance (related to sweat and focus level)
§ Facial coding (capturing emotions)
§ Survey
□ “Higher bit rates can boost viewer engagement by 10.4%”
□ “2-second rebuffer is enough for disgust to jump 9%”
Quality à User experience à Revenue
WIRELESS NETWORKING, 430.752B, 2020 SPRING SEOUL NATIONAL UNIVERSITY
WIRELESS NETWORKING, 430.752B, 2020 SPRING SEOUL NATIONAL UNIVERSITY
Video Streaming (and CDNs)
§ Video traffic: major consumer of Internet bandwidth
§ Netflix, YouTube: 37%, 16% of downstream residential ISP traffic (2015)
§ ~1B YouTube users, ~75M Netflix users
§ Challenge: mobility support
§ Streaming may be disconnected while being mobile
§ Challenge: scalability - how to reach ~1B users?
§ Single mega-video server won’t work (why?)
§ Challenge: heterogeneity
§ Different users have different capabilities
(e.g., wired versus mobile; bandwidth rich versus bandwidth poor)
!4
Why do we care?
The owner of NBA'sDallas Mavericks
Multimedia: Video
□ Video: sequence of images displayed at constant rate
§ e.g., 24 images/sec
□ Digital image: array of pixels
§ each pixel represented by bits
□ Encoding: use redundancy within and between images to decrease # bits used to encode image
§ spatial (within image)
§ temporal (from one image to next)
………..
spatial coding example: instead of sending N values of same color (all purple), send only two values:
color value (purple) and number of repeated values (N)
……….…….
frame i frame i+1
temporal coding example:
instead of sending complete frame at i+1, send only differences from frame i
§ CBR: (constant bit rate):
video encoding rate fixed
§ VBR: (variable bit rate):
video encoding rate changes as amount of spatial, temporal coding changes
•MPEG1 (CD-ROM) 1.5 Mbps
•MPEG2 (DVD) 3-6 Mbps
•MPEG4 (used in Internet, < 1 Mbps)
•MPEG4 AVC (Bluray) ~ 36Mbps
WIRELESS NETWORKING, 430.752B, 2020 SPRING SEOUL NATIONAL UNIVERSITY
WIRELESS NETWORKING, 430.752B, 2020 SPRING SEOUL NATIONAL UNIVERSITY
Video Encoding and Streaming
□ Video Encoding: Per-frame vs. GOP (group of pictures)
I: Intra-Coded P: Predicted
B: Bidirectionally Predicted
16
A Guide to MPEG Fundamentals and Protocol Analysis
www.tektronix.com/video_audio/
Figure 2-14 introduces the concept of the GOP or Group Of Pictures. The GOP begins with an I picture and then has P pictures spaced throughout. The
remaining pictures are B pictures. The GOP is defined as ending at the last pic- ture before the next I picture. The GOP length is flexible, but 12 or 15 pictures is a common value. Clearly, if data for B pictures are to be taken from a future picture, that data must already be available at the decoder. Consequently, bidi- rectional coding requires that picture data is sent out of sequence and tem- porarily stored. Figure 2-14 also shows that the P-picture data are sent before the B-picture data. Note that the last B pictures in the GOP cannot be transmit- ted until after the I picture of the next GOP since this data will be needed to bidirectionally decode them. In order to return pictures to their correct sequence, a temporal reference is included with each picture. As the picture rate is also embedded periodically in headers in the bit stream, an MPEG file may be displayed by a personal computer, for example, in the correct order and timescale.
Sending picture data out of sequence requires additional memory at the encoder and decoder and also causes delay. The number of bidirectionally coded pictures between intra- or forward-predicted pictures must be restricted to reduce cost and minimize delay, if delay is an issue.
Figure 2-15 shows the tradeoff that must be made between compression factor and coding delay. For a given quality, sending only I pictures requires more than twice the bit rate of an IBBP sequence. Where the ability to edit is impor- tant, an IB sequence is a useful compromise.
Figure 2-15.
Bit Rate MBit/Sec
Lower Quality
Higher
Quality Constant
Quality Curve 10 –
20 – 30 – 40 – 50 –
I IB IBBP
Figure 2-14.
Rec 601 Video Frames
Elementary Stream temporal_reference
I B B P B B P B B I B B P
Frame Frame FrameFrameFrameFrame Frame Frame Frame Frame Frame Frame Frame
I P B B P B B I B B P B B
0 3 1 2 6 4 5 0 7 8 3 1 2
Streaming sequenceGOP-based encoding
A Naïve approach – One video for all
Naive approach
6
[bitmovin.com][bitmovin.com]
WIRELESS NETWORKING, 430.752B, 2020 SPRING SEOUL NATIONAL UNIVERSITY
WIRELESS NETWORKING, 430.752B, 2020 SPRING SEOUL NATIONAL UNIVERSITY
□ Encode video in multiple bitrates
□ Replicate using a content delivery network
□ Video player (i.e., receiver) picks bitrate adaptively
§ Estimate connection’s available bandwidth
§ Pick a bitrate ≤ available bandwidth
è DASH (dynamic adaptive streaming over HTTP)
Broad Solution Approach
Smooth streaming
HTTP dynamic streaming
HTTP live streaming
2011
Streaming multimedia: DASH
□ DASH: Dynamic, Adaptive Streaming over HTTP
□ Server:
§ divides video file into multiple chunks
§ each chunk stored, encoded at different rates
§ manifest file: provides URLs for different chunks
□ Client:
§ periodically measures server-to-client bandwidth
§ consulting manifest, requests one chunk at a time
• chooses maximum coding rate sustainable given current bandwidth
• can choose different coding rates at different points in time
(depending on available bandwidth at time)
WIRELESS NETWORKING, 430.752B, 2020 SPRING SEOUL NATIONAL UNIVERSITY
WIRELESS NETWORKING, 430.752B, 2020 SPRING SEOUL NATIONAL UNIVERSITY
Streaming multimedia: DASH
□ “intelligence” at client: client determines
§ when to request chunk (so that buffer starvation, or overflow does not occur)
§ what encoding rate to request (higher quality when more bandwidth available)
§ where to request chunk (can request from URL server that is
“close” to client or has high available bandwidth)
“Chunks” of video at each bitrate
8
…
…
Time
1s 2s
[netflix.com]
“Chunks” of video at each bitrate
WIRELESS NETWORKING, 430.752B, 2020 SPRING SEOUL NATIONAL UNIVERSITY
WIRELESS NETWORKING, 430.752B, 2020 SPRING SEOUL NATIONAL UNIVERSITY
Client gets metadata about chunks via “Manifest”
[witestlab.poly.edu] 9
Client gets Metadata about Chunks via “Manifest”
Client can fetch Chunks of Different Qualities
Video and audio chunks can be mixed differently.
WIRELESS NETWORKING, 430.752B, 2020 SPRING SEOUL NATIONAL UNIVERSITY
WIRELESS NETWORKING, 430.752B, 2020 SPRING SEOUL NATIONAL UNIVERSITY
Client can fetch Chunks of Different Qualities
WIRELESS NETWORKING, 430.752B, 2020 SPRING SEOUL NATIONAL UNIVERSITY
WIRELESS NETWORKING, 430.752B, 2020 SPRING SEOUL NATIONAL UNIVERSITY
Estimating available capacity
1615
A Buffer-Based Approach to Rate Adaptation:
Evidence from a Large Video Streaming Service
Te-Yuan Huang, Ramesh Johari, Nick McKeown, Matthew Trunnell
⇤, Mark Watson
⇤Stanford University, Netflix
⇤{huangty,rjohari,nickm}@stanford.edu, {mtrunnell,watsonm}@netflix.com
ABSTRACT
Existing ABR algorithms face a significant challenge in esti- mating future capacity: capacity can vary widely over time, a phenomenon commonly observed in commercial services.
In this work, we suggest an alternative approach: rather than presuming that capacity estimation is required, it is perhaps better to begin by using only the bu↵er, and then ask when capacity estimation is needed. We test the viabil- ity of this approach through a series of experiments spanning millions of real users in a commercial service. We start with a simple design which directly chooses the video rate based on the current bu↵er occupancy. Our own investigation re- veals that capacity estimation is unnecessary in steady state;
however using simple capacity estimation (based on immedi- ate past throughput) is important during the startup phase, when the bu↵er itself is growing from empty. This approach allows us to reduce the rebu↵er rate by 10–20% compared to Netflix’s then-default ABR algorithm, while delivering a similar average video rate, and a higher video rate in steady state.
Categories and Subject Descriptors
C.2.0 [Computer Systems Organization]: Computer- Communication Networks—General
Keywords
HTTP-based Video Streaming, Video Rate Adaptation Al- gorithm
1. INTRODUCTION
During the evening peak hours (8pm–1am EDT), well over 50% of US Internet traffic is video streamed from Netflix and YouTube [16, 17]. Unlike traditional video downloads that must complete fully before playback can begin, streaming video starts playing within seconds. Each video is encoded at a number of di↵erent rates (typically 235kb/s standard definition to 5Mb/s high definition) and stored on servers
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
SIGCOMM’14, August 17–22, 2014, Chicago, Illinois, USA.
Copyright is held by the owner/author(s). Publication rights licensed to ACM.
0 500 1000 1500 2000 2500
Time (s) 0
2000 4000 6000 8000 10000 12000 14000 16000 18000
Figure 1: Video streaming clients experience highly variable end-to-end throughput.
as separate files. The video client—running on a home TV, game console, web browser, DVD player, etc.—chooses which video rate to stream by monitoring network condi- tions and estimating the available network capacity. This process is referred to as adaptive bit rate selection or ABR.
ABR algorithms used by such services balance two over- arching goals. On one hand, they try to maximize the video quality by picking the highest video rate the network can support. On the other hand, they try to minimize rebu↵er- ing events which cause the video to halt if the client’s play- back bu↵er goes empty.
It is easy for a streaming service to meet either one of the objectives on its own. To maximize video quality, a service could just stream at the maximum video rate R
maxall the time. Of course, this would risk extensive rebu↵ering. On the other hand, to minimize rebu↵ering, the service could just stream at the minimum video rate R
minall the time—
but this extreme would lead to low video quality. The design goal of an ABR algorithm is to simultaneously obtain high performance on both metrics in order to give users a good viewing experience [7].
One approach is to pick a video rate by estimating fu- ture capacity from past observations. In an environment with constant throughput, past observations are reliable to predict future capacity. However, in an environment with highly variable throughput, although past observations still provide valuable ballpark figures, accurate estimation of fu- ture capacity becomes challenging. Figure 1 is a sample
Avg. throughput over chunk download (kbps)
Time(s)
[A Buffer-Based Approach to Rate Adaptation: Evidence from a Large Video Streaming Service, Huang et al., ACM SIGCOMM 2014]
[A Buffer-Based Approach to Rate Adaptation: Evidence from a Large Video Streaming Service, Huang et al., ACM SIGCOMM 2014]
“A random sample of 300,000 Netflix sessions shows that roughly 10% of sessions experience a median throughput less than half of the 95
thpercentile throughput.”
“20-30% of rebuffers are unnecessary”
Estimating available capacity
WIRELESS NETWORKING, 430.752B, 2020 SPRING SEOUL NATIONAL UNIVERSITY
WIRELESS NETWORKING, 430.752B, 2020 SPRING SEOUL NATIONAL UNIVERSITY
Nearly full buffer è high bitrate Nearly empty buffer è low bitrate
How about Buffer-based Adaptation?
WIRELESS NETWORKING, 430.752B, 2020 SPRING SEOUL NATIONAL UNIVERSITY
WIRELESS NETWORKING, 430.752B, 2020 SPRING SEOUL NATIONAL UNIVERSITY
0 2 4 6 8 10 12 14 16 18 20 22
Hours in GMT 40
50 60 70 80 90 100 110 120
Normalized Number of Rebuffers per Hour (%)
Peak Hours
Performance:"Rebuffer"Rate"
10&–&20%&less&rebuffer&than&Control&&
BBA&
Lower&bound&
33"
Control&
21
Normalized number of rebuffers per hour (%)
Hours in GMT
[A Buffer-Based Approach to Rate Adaptation: Evidence from a Large Video Streaming Service, Huang et al., ACM SIGCOMM 2014]
Buffer-based adaptation
22
the startup phase, the playback bu↵er starts out empty and carries no useful information to help us choose a video rate.
BBA-1 follows the usual chunk map, starting out with a low video rate since the bu↵er level is low. It gradually increases the rate as the bu↵er fills, as shown by the red line in Figure 16. BBA-1 is too conservative during startup.
The network can sustain a much higher video rate, but the algorithm is just not aware of it yet.
In this section, we test the following hypothesis. During the startup, we can improve the video rate by entering into the risky area; in the steady state, we can improve both video rate and rebu↵er rate by using a chunk map. Our next algorithm, BBA-2, tries to be more aggressive during the startup phase, by incorporating a simple capacity estimation into the startup behavior. When possible, BBA-2 ramps up quickly and fills the bu↵er with a much higher rate than what the map suggests.
This two phases of operation can be found in many net- work protocols, such as the slow-start and congestion avoid- ance phases in TCP. For TCP, when a connection starts, the congestion control algorithm knows nothing about net- work conditions from the sending window, and the window is quickly opened to use available capacity until packet losses are induced. Similar to TCP, ABR algorithms get little or no information from the playback bu↵er at the beginning of a session. However, while ABR algorithms also ramp up the video rate quickly, unlike TCP, they need to do it in a controlled manner to prevent unnecessary rebu↵ers.
From Figure 11, we know that the change of the bu↵er, B = V (ChunkSize/c[k]), captures the di↵erence be- tween the instantaneous video rate and system capacity.
Now, assuming the current video rate is Ri, to safely step up a rate, c[k] needs to be at least Ri+1 to avoid rebu↵ers.
In other words, we require B V (ChunkSize/Ri+1).
Further, since videos are encoded in VBR, the instantaneous video rate can be much higher than the nominal rate. Let the max-to-average ratio in a VBR stream be e, so that eRi+1 represents the maximum instantaneous video rate in Ri+1. When the player first starts up, since there is no bu↵er to absorb the variation, c[k] needs to be at least larger than eRi+1 in order to safely step up a rate. In other words, when considering VBR and the bu↵er is empty, B needs to be larger than V (ChunkSize/(eRi+1)) for the algorithm to safely step up from Ri to Ri+1. According to Figure 10, the max-to-average ratio e is around 2 in our system. Since e = 2, Ri/Ri+1 ⇠ 2, and a chunk size can be smaller than half the average chunk size (ChunkSize 0.5V Ri), B needs to be larger than 0.875V s to safely step up a rate when the bu↵er is empty in our system.
Based on the preceding observation, BBA-2 works as fol- lows. At time t = 0, since the bu↵er is empty, BBA-2 only picks the next highest video rate, if the B increases by more than 0.875V s. Since B = V ChunkSize/c[k], B > 0.875V also means that the chunk is downloaded eight times faster than it is played. As the bu↵er grows, we use the accumulated bu↵er to absorb the chunk size variation and we let BBA-2 increase the video rate faster. Whereas at the start, BBA-2 only increases the video rate if the chunk downloads eight times faster than it is played, by the time
6Note that the startup phase does not refer to the join delay.
Figure 17: BBA-2 achieved a similar video rates to the Control algorithm overall.
0 2 4 6 8 10 12 14 16 18 20 22
Hours in GMT 100
150 100 50 0 50 100
(kb/s)
Peak Hours
Figure 18: BBA-2 achieved better video rate at the steady state. The steady state is approximated as the period after the first two minutes in each session.
it fills the cushion, BBA-2 is prepared to step up the video rate if the chunk downloads twice as fast as it is played. The threshold decreases linearly, from the first chunk until the cushion is full. The blue line in Figure 16 shows BBA-2 ramping up faster. BBA-2 continues to use this startup al- gorithm until (1) the bu↵er is decreasing, or (2) the chunk map suggests a higher rate. Afterwards, we use the f (B) defined in the BBA-1 algorithm to pick a rate.
Note that BBA-2 is using B during startup, which en- codes a simple capacity estimate: the throughput of the last chunk. This design helps make the algorithm more aggres- sive at a point when the bu↵er has not yet accumulated enough information to accurately determine the video rate to use. Nevertheless, note that our use of capacity estima- tion is restrained. We only look at the throughput of the last chunk, and crucially, once the bu↵er is built up and the chunk map starts to suggest a higher rate, BBA-2 be- comes bu↵er-based—it picks a rate from the chunk map, in- stead of using B. In this way, BBA-2 enables us to enjoy the improved steady-state performance of the bu↵er-based approach, without sacrificing overall bitrate due to a slow startup ramp.
6.1 Results
We ran our experiments during the same time period and with the same pool of users as the previously described ex-
Video rate
difference (kbps)
Hours in GMT
Control algorithm
BBA
[A Buffer-Based Approach to Rate Adaptation: Evidence from a Large Video Streaming Service, Huang et al., ACM SIGCOMM 2014]
BBA: Less rebuffering with higher bitrate
compared to baseline DASH
How about Buffer-based Adaptation?
WIRELESS NETWORKING, 430.752B, 2020 SPRING SEOUL NATIONAL UNIVERSITY
WIRELESS NETWORKING, 430.752B, 2020 SPRING SEOUL NATIONAL UNIVERSITY
Content distribution networks
□ Challenge: how to stream content (selected from millions of videos) to hundreds of thousands of simultaneous users?
□ option 1: single, large “mega-server”
§ single point of failure
§ point of network congestion
§ long path to distant clients
§ multiple copies of video sent over outgoing link
….quite simply: this solution doesn't scale
Content distribution networks
□ Challenge: how to stream content (selected from millions of videos) to hundreds of thousands of simultaneous users?
□ option 2: store/serve multiple copies of videos at multiple geographically distributed sites (CDN)
§ Push CDN servers deep into many access networks (closer to users)
• used by Akamai, 1700 locations
§ Smaller number (10’s) of larger clusters in POPs (points of presence) near (but not within) access networks
• used by Limelight
• From which CDN node to retrieve content?
• What content to place in which CDN node?
§ More challenges
WIRELESS NETWORKING, 430.752B, 2020 SPRING SEOUL NATIONAL UNIVERSITY
WIRELESS NETWORKING, 430.752B, 2020 SPRING SEOUL NATIONAL UNIVERSITY
CDN content access: a closer look
Bob (client) requests video (http://netcinema.com/6Y7B23V)
§ Video stored in CDN at http://KingCDN.com/NetC6y&B23V
netcinema.com
KingCDN.com 1
1. Bob gets URL for video http://netcinema.com/6Y7B23V from netcinema.com web page
2
2. resolve http://netcinema.com/6Y7B23V via Bob’s local DNS
netcinema’s
authoratative DNS 3
3. netcinema’s DNS returns URL
http://KingCDN.com/NetC6y&B23V 4 4&5. Resolve
http://KingCDN.com/NetC6y&B23 via KingCDN’s authoritative DNS, which returns IP address of KingCDN server with video
6. request video from 5 KINGCDN server, streamed via HTTP
KingCDN
authoritative DNS Bob’s
local DNS server