What you need to know about Live and OTT Video Delivery
Charlie Kraus: Hello, and welcome to episode two of the Bit by Bit webinar series, Getting Started with Live Video. We're going to have two presenters. I'm Charlie Kraus in product marketing, focused on video solutions, and we also have Ernest Russell on the call, and we'll share the presenting duties. Episode two focuses on the various live delivery workflows. So this is for people who are already delivering live video and want to improve their workflow and the experience they deliver to audiences, or for people who are just thinking about it, don't have any firm plans yet, and want to learn more. This session will cover a wide range of topics. We'll talk about different workflows and the codecs used in cameras, and about the various technologies for chunked video streaming that can deal with different latencies. We'll talk about the spectrum of use cases involved with live video, which is quite extensive, and we'll follow up with a summary and an introduction to the other Bit by Bit webinars. We'll begin with the technical aspects, and I'm going to turn it over to Ernest.
Ernest Russell: Thanks, Charlie. So how do we get started with live video streaming? Well, the first place we start is by choosing our camera, right? Usually that choice is based on budget, or on what you already have, but one important factor when you're selecting a camera is to make sure that it supports the codecs and protocols you want in your live video workflow. We'll talk a little bit about those codecs and protocols in this webinar. One of the challenges we have is that the video must be prepared not just for multiple devices but also for different network conditions, some having more bandwidth than others. We'll also need to determine what our latency requirements are. Usually, the more interactivity we require, the lower our latency needs to be.
And so now we'll break down the live video distribution workflow into five simple steps. Step one is live video capture, which is capturing video from your camera. Step two, after we've captured that live video, is encoding it into what we call a contribution protocol. That protocol is used to ingest the video into our media server, or into our video distribution workflow. Step three is video packaging. That includes converting the video for the different devices we want to deliver to, and also packaging for adaptive bit rate, or ABR. In this industry we use a good number of acronyms, so as I use them, I'll try to make sure that I describe what they mean. Step four is network delivery, and here we want to use a content delivery network, or CDN, for the best performance and scale when delivering video.
And then finally, step five is when that video reaches its destination, which is our video player. That video player could be on a smartphone, a tablet, a computer, or even an OTT device, or over-the-top device. So now, let's move into step one, which is capturing our video, right? For that video capture, part of what goes on in the background is the compression and decompression of video, which is what codec stands for: compression and decompression. Two popular forms of compression and decompression are H.264 and H.265. H.264 is Advanced Video Coding, and H.265 is the newer standard, High Efficiency Video Coding. Those are definitely two of the most popular codecs that we see in the market today, but there's also AV1, which is an open-source, royalty-free codec. And then we have the popular audio codecs, AAC and MP3. AAC, Advanced Audio Coding, has higher sound quality at about the same bit rate as MP3.
The next step is encoding our video for the first mile of delivery. For that step, I talked about contribution protocols. Contribution protocols are used for ingest, or the first mile of delivery. A couple of popular protocols for that are RTMP, or Real-Time Messaging Protocol, and SRT, which is Secure Reliable Transport, a newer protocol that focuses on delivering good quality video over poor connections. And then there's RTP and RTSP, Real-time Transport Protocol and Real-Time Streaming Protocol. These two are often used in conjunction, but RTMP is definitely one of the more popular protocols. So how do you decide which protocol is right for you? Again, it goes back to which protocols your camera or encoder supports, and also which format is expected by the transcoder, and that's the next step.
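The selection rule described here, picking a protocol that both the camera or encoder and the transcoder support, can be sketched in a few lines of Python. This is only an illustration: the function name and the capability lists are hypothetical, and real values would come from your own equipment's documentation.

```python
def common_protocols(encoder_supports, transcoder_accepts):
    """Return the protocols both ends support, in the encoder's preference order."""
    # Compare case-insensitively, since vendors write protocol names differently.
    accepted = {p.upper() for p in transcoder_accepts}
    return [p for p in encoder_supports if p.upper() in accepted]


# Hypothetical capability lists for illustration only.
encoder = ["SRT", "RTMP", "RTSP"]
transcoder = ["rtmp", "srt"]

print(common_protocols(encoder, transcoder))
```

An empty result would mean the camera or encoder and the transcoder have no contribution protocol in common, so one of them needs to change.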
So our next step is our transcode or transmux process, where we do our video packaging, and a lot of times the video is packaged into different formats for the different devices on the other end. Some of the most popular are definitely Apple HLS and MPEG-DASH. HLS is HTTP Live Streaming. DASH is Dynamic Adaptive Streaming over HTTP, and it's made by the Moving Picture Experts Group; that's the MPEG. And then we have Microsoft Smooth Streaming, and also RTMP, Real-Time Messaging Protocol, and finally WebRTC, Web Real-Time Communication, which I'll talk about again a little bit later in the webinar. That's becoming a very new way of delivering video.
All right! So now, let's look at the whole picture of delivering live video. Starting on the left here, you see that we have our ingest, and our contribution protocol here is shown as an RTMP input. So it's showing our video going from our camera to our encoder over RTMP. That goes into our convert stage here for transcode and transmux. That's our video packaging, where we transcode and transmux into some of the different formats: HLS, DASH, Microsoft Smooth Streaming, even HDS, one of the older protocols. A video can be packaged into these different formats to get playback on the different devices out there. And one key part of delivering successful live video is also the monitoring and analytics, right? So if we can glean some of those analytics from the video player, that's always good.
Next, let's talk a little bit about latency. Different live streaming use cases will require different latency. Typically, our traditional live stream latency is anywhere between 30 seconds and one minute. That's standard HLS or standard DASH without any tuning; we'll talk about what tuning is in a couple of slides. After we tune chunk streaming, we can get that number down to as low as 6 seconds, usually between 6 and 10 seconds for tuned chunk streaming. And then we have CMAF, which comes in even lower at sub-5 seconds. So especially when we're trying to beat traditional broadcast, which is usually around 6 seconds, that's when we want to get into the lower latency options: tuned chunk streaming, CMAF, and real-time streaming, which is sub-second streaming.
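As a rough sketch of how those latency tiers might guide a choice, the Python below lists the approaches whose typical worst-case latency fits a target. The numbers are the approximate figures quoted in this talk, not measurements, and the table and function names are hypothetical. Using the worst case makes the selection deliberately conservative.

```python
# Typical worst-case latency in seconds for each approach, using the
# approximate figures quoted in this talk (not measurements).
TYPICAL_MAX_LATENCY_SECONDS = [
    ("standard HLS/DASH", 60.0),
    ("tuned chunk streaming", 10.0),
    ("low-latency CMAF", 5.0),
    ("real-time streaming (WebRTC)", 1.0),
]


def options_for(target_seconds):
    """Return the approaches whose typical worst case is within the target."""
    return [
        name
        for name, worst_case in TYPICAL_MAX_LATENCY_SECONDS
        if worst_case <= target_seconds
    ]


# Example: matching or beating a broadcast delay of roughly 6 seconds.
print(options_for(6))
```

For a target of about 6 seconds, this conservative rule leaves CMAF and real-time streaming, since tuned chunk streaming is quoted at 6 to 10 seconds.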
Now let's get into delivering over the network. Usually, video is delivered by what we call chunked video streaming. That's where the video is broken into video segments, along with a manifest that describes those video segments. I have it listed as an XML manifest here, but we can see different file extensions for those depending on what format is used. For example, HLS uses a .m3u8 extension for the manifest, while DASH uses an .mpd manifest, but both work very similarly in that the manifest describes those video segments, and those video segments need to be buffered by the video player before playback. I'll show you here an example of how those chunks are buffered.
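To make the idea of a manifest describing segments concrete, here is a minimal Python sketch that reads an HLS media playlist, which is a plain-text .m3u8 file, and lists each segment with its duration. The sample playlist and segment names are invented for illustration, and the parser handles only the `#EXTINF` tag, so it is not a complete HLS implementation.

```python
# A made-up media playlist; real playlists come from your packager.
SAMPLE_PLAYLIST = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:6
#EXT-X-MEDIA-SEQUENCE:100
#EXTINF:6.000,
segment100.ts
#EXTINF:6.000,
segment101.ts
#EXTINF:5.500,
segment102.ts
"""


def parse_segments(playlist_text):
    """Return a list of (uri, duration_seconds) pairs from a media playlist."""
    segments = []
    pending_duration = None
    for raw_line in playlist_text.splitlines():
        line = raw_line.strip()
        if line.startswith("#EXTINF:"):
            # The duration is the value between the colon and the first comma.
            pending_duration = float(line[len("#EXTINF:"):].split(",", 1)[0])
        elif line and not line.startswith("#") and pending_duration is not None:
            # The first non-tag line after #EXTINF is the segment URI.
            segments.append((line, pending_duration))
            pending_duration = None
    return segments


for uri, duration in parse_segments(SAMPLE_PLAYLIST):
    print(uri, duration)
```

A DASH .mpd manifest carries similar information as XML, so it would need an XML parser instead of this line-by-line approach.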
So now to tuning chunk sizes. I talked about this a little bit earlier, but let's look at an example. Most video players need to buffer three chunks of video before playback begins. So by reducing the size of the chunks we deliver over HTTP, we can reduce the amount of data that's being buffered by the player. Here we have the example of our standard 10-second chunks of video, right? If each chunk is 10 seconds and we're buffering three of those segments, we have 30 seconds of video data that has to be buffered by the player. Compare that with shortening those chunks to 6-second tuned video chunks. Now, when we buffer three of those, we have 18 seconds of video that needs to be buffered. So not only are we reducing the chunk size, we're also able to deliver information faster. There is a balance, though: over-reducing segments can cause rebuffering, so there's no universal chunk size setting. This needs to be tuned for your workflow. We have been able to tune down to 1-second video chunks.
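The arithmetic in this example is simply chunk duration multiplied by the number of chunks the player buffers. A small Python sketch, assuming the three-chunk buffer mentioned in the talk (the function name is hypothetical, and real players may buffer a different number of chunks):

```python
def startup_buffer_seconds(chunk_seconds, chunks_buffered=3):
    """Seconds of video a player holds before playback begins."""
    return chunk_seconds * chunks_buffered


# The chunk sizes discussed in the talk: standard, tuned, and the smallest tuned value.
for chunk_seconds in (10, 6, 1):
    total = startup_buffer_seconds(chunk_seconds)
    print(f"{chunk_seconds}-second chunks: {total} seconds buffered")
```

This covers only the player-side buffer. It does not model the rebuffering risk that comes with very small chunks, which is why the chunk size still has to be tuned per workflow.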
Now let's look at another form of video delivery, and that's CMAF and low-latency CMAF. CMAF is supported by major industry players like Apple and Opera, so it is very well supported in the industry. One of the big advantages of CMAF is that it simplifies the workflow by allowing HLS and DASH manifests to be accessed simultaneously. So instead of having a separate workflow for HLS video and DASH video, we can collapse those workflows and store only one set of encodings, which is definitely more cost-effective. CMAF delivers micro chunks that can be decoded and played back before the entire segment is received. And that's one of the biggest differences between tuned delivery and CMAF: with CMAF, not only are we delivering smaller chunks, but we're able to output those chunks from the encoder more often.
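To illustrate why micro chunks help, the sketch below lists when each chunk of a single segment would leave the encoder. The 6-second segment and 500-millisecond chunk size are hypothetical values chosen for the example, not figures from the talk or from the CMAF specification, and the function name is made up.

```python
def chunk_emit_times_ms(segment_ms, chunk_ms):
    """Times (ms after the segment starts) at which each micro chunk is complete.

    Assumes the segment divides evenly into chunks, which keeps the sketch simple.
    """
    if segment_ms % chunk_ms != 0:
        raise ValueError("segment_ms must be a multiple of chunk_ms")
    chunk_count = segment_ms // chunk_ms
    return [chunk_ms * (index + 1) for index in range(chunk_count)]


# Hypothetical sizes: one 6-second segment split into 500 ms micro chunks.
times = chunk_emit_times_ms(segment_ms=6000, chunk_ms=500)
print("first decodable data after", times[0], "ms")
print("whole segment complete after", times[-1], "ms")
```

With whole-segment delivery the player would wait for the full segment before decoding anything, whereas with micro chunks the first decodable data is available after one chunk duration.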
Now that brings me to the final method of delivery that we'll talk about here, which is WebRTC. WebRTC, or Web Real-Time Communication, was originally used for browser-to-browser communication. Unlike chunked streaming, which runs over TCP/IP, WebRTC broadcasts over UDP and therefore allows sub-second streaming. But when you look at the workflow here, it works in a very similar way: we have our contribution protocol coming in on RTMP, and that is converted to WebRTC and delivered out to the edge for delivery to multiple devices. And WebRTC plays natively in most modern browsers, so it's a good option for live streaming.
So to wrap up here: you want to choose a camera, encoder, and network that support your chosen protocols for live video delivery, that support adaptive bit rate, and that support delivery to multiple devices, including over-the-top devices. We also want to be prepared to deliver not just to multiple devices, but under varied network conditions. And if latency matters to your workflow, choose a protocol that supports lower latency, like CMAF or real-time streaming. And that wraps it up. I'll hand back to Charlie to tell you what's next.
Charlie Kraus: Okay. I just want to add a little bit more of a summary and some key points as well. You've seen that there is a wide variety of live streaming use cases, and several flexible workflows to support them. Leveraging the latest CDN video workflows will help you deliver the highest quality online video, and will also help you keep up with technology advances, because the workflow will be upgraded in the CDNs. We talked about sub-second streaming with interactivity, which will enable new revenue-generating business models. And audience expectations for low latency in live online streaming can be met with HLS and DASH small chunks, with the newer low-latency CMAF with micro chunks, or with the even newer WebRTC technology. So we hope that this webinar was informative for you, and I'll put in a little pitch here for the next one in the series, which is Getting Started with Video on Demand. We thank you for joining us, and we look forward to having you on a future webinar.