Technology

Bada Networks Technology Primer:

Real-Time Video Processing

There are several challenging technical issues in delivering real-time video over an IP network. The technology we have developed and implemented in our products addresses these issues to provide the best possible video quality over any IP network. More specifically, we have developed a Multimedia Application Routing Server (MARS™) that solves all these problems with one design in one hardware box.

Before going into the details of the technical problems and how our technology solves them, we should define the term "real-time," because the line between real-time video delivery and downloading a video file is now blurred. The line used to be clear: downloading meant transferring the entire video file from a server to a PC before a video player could start playing it. Advanced file formats and the video players that support them have blurred this distinction. When such a player accesses a video file on a server, downloading and playback start at almost the same time; playback begins once some video data have been downloaded, after a delay too short for a person to notice. While the video plays, the download continues using the available bandwidth. If the available bandwidth is greater than or equal to the video playback bit rate, the video data arrive ahead of or in pace with playback, and the user sees "real-time" video. If the available bandwidth is less than the video bit rate, playback stops for a while, resumes after more video data are downloaded, and stops again whenever the downloaded data cannot keep pace with playback.

Our definition of "real-time" video is that playback starts without a user-noticeable delay after the user requests the video, AND playback continues without stopping for the entire duration. This means that the video bit rate must not exceed the available bandwidth almost all the time, except possibly for very short periods that already-downloaded data can cover, and that any initial buffering delay must be too short for a person to notice.
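As a minimal sketch of this definition, the following Python simulates a progressive download in one-second steps and checks both conditions. The function names, the one-second startup buffer, and the one-second threshold for a "noticeable" delay are our illustrative assumptions, not measured values or product parameters.

```python
def simulate_playback(bandwidth_kbps, bitrate_kbps, startup_buffer_s=1.0):
    """Simulate a progressive download in one-second steps.

    bandwidth_kbps: available bandwidth samples, one per second.
    bitrate_kbps: constant playback bit rate of the video.
    startup_buffer_s: seconds of video buffered before playback (assumed).
    Returns (startup_delay_s, stall_count); the delay is None if playback
    never starts.
    """
    buffered_s = 0.0          # seconds of video downloaded but not yet played
    playing = False
    startup_delay_s = None
    stalls = 0
    for second, bw in enumerate(bandwidth_kbps, start=1):
        # One second of downloading adds bw / bitrate seconds of video.
        buffered_s += bw / bitrate_kbps
        if not playing:
            if buffered_s >= startup_buffer_s:
                playing = True
                if startup_delay_s is None:
                    startup_delay_s = second
            continue
        if buffered_s >= 1.0:
            buffered_s -= 1.0  # play one second of video
        else:
            playing = False    # buffer ran dry: playback stops
            stalls += 1
    return startup_delay_s, stalls


def is_real_time(startup_delay_s, stalls, max_startup_delay_s=1):
    """Both conditions of the definition: prompt start AND no stalls."""
    return (startup_delay_s is not None
            and startup_delay_s <= max_startup_delay_s
            and stalls == 0)


# Bandwidth equal to the bit rate: starts after 1 s and never stalls.
print(simulate_playback([1000] * 10, 1000))   # (1, 0)
# Bandwidth at half the bit rate: late start and repeated stalls.
print(simulate_playback([500] * 10, 1000))    # (2, 2)
```

In this model, a link that merely matches the bit rate already satisfies the definition, while a link at half the bit rate fails on both counts.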

Bandwidth & Real-Time

One of the technical challenges in delivering video over IP is how to conserve bandwidth when a video bitstream must be delivered to multiple users at the same time. This issue arises in many applications (e.g. video broadcasting and video conferencing). Unicast in a client-server system does not work well, because the server has to send a separate copy of the same video bitstream to each client, and the bandwidth wasted on identical copies is too high. A natural alternative is IP multicast, which has long been part of the Internet Protocol suite. IP multicast would be perfect for saving bandwidth except for one problem: who should be in charge of multicast address allocation? Without an authority to assign multicast IP addresses, different bitstreams sent to the same multicast IP address would collide (interfere) with each other, and a client subscribed to that address would receive a random mix of them. For this reason, IP multicast is not practical on the open Internet and is generally not enabled on Internet routers. For a closed IP network, such as a local area network (LAN) or an IP network completely controlled by a single organization, IP multicast may be used.

As discussed above, to deliver video content over an open IP network we cannot rely on IP multicast and must save bandwidth using unicast alone. A hierarchical content delivery network (CDN) is one way to deliver video efficiently. Figure 1 illustrates such a CDN. The video content is first distributed to a few root servers, and each root server sends a single copy of each video bitstream to a delivery server. Between any two delivery servers, only a single copy of any given video bitstream is transmitted. The delivery servers at the edge of the CDN send unicast bitstreams to the clients.

Figure 1. Illustration of a Hierarchical Content Delivery Network
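To make the bandwidth saving concrete, the sketch below counts how many copies of one bitstream leave a root server under plain unicast and under a hierarchical CDN. The tree, the server names, and the client counts are hypothetical and are not taken from Figure 1.

```python
def count_clients(tree, clients, node):
    """Total clients attached to `node` and to every server below it."""
    below = sum(count_clients(tree, clients, child)
                for child in tree.get(node, []))
    return clients.get(node, 0) + below


def root_copies_unicast(tree, clients, root):
    """Plain client-server unicast: the root sends one copy per client."""
    return count_clients(tree, clients, root)


def root_copies_hierarchical(tree, clients, root):
    """Hierarchical CDN: one copy per child server that has clients below
    it, plus one copy per client attached directly to the root."""
    active_children = sum(1 for child in tree.get(root, [])
                          if count_clients(tree, clients, child) > 0)
    return clients.get(root, 0) + active_children


# Hypothetical topology: one root, two delivery servers, three edge servers.
tree = {"root": ["d1", "d2"], "d1": ["e1", "e2"], "d2": ["e3"]}
clients = {"e1": 100, "e2": 50, "e3": 200}

print(root_copies_unicast(tree, clients, "root"))       # 350
print(root_copies_hierarchical(tree, clients, "root"))  # 2
```

In this example the root's outgoing load drops from 350 copies to 2, and only the edge servers carry per-client unicast streams.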

The CDN delivery servers should be able to route video bitstreams dynamically based on network usage conditions. To eliminate bandwidth bottlenecks, the delivery servers should be placed as close to the edge of the network as possible. One of the functions of MARS™ is what we call "application-layer multicast," in which MARS™ acts as a delivery server in a CDN.

P2P and P4P

If we push the "edge" delivery server one step further, the CDN becomes a so-called peer-to-peer (P2P) network. In a sense, a P2P network is a CDN too, but the "edge" server is pushed all the way to the client terminal. Between the two extremes of a central server and P2P, there should be an optimal distribution of the delivery servers for a given network. This optimal distribution should combine the advantage of a P2P network for large-scale deployment with the advantage of a central server for system-level management. The optimally distributed delivery servers thus form a "managed P2P" network, or what some refer to as a "P4P, Peer-for-Peer" network. There have been many publications about P2P and its characteristics. One open issue is how to make P2P a win-win technology for both the network service provider and the consumer. On one hand, the network service provider has no control over P2P traffic. On the other hand, consumers do not get a reliable application. The "managed P2P" network resolves this issue to the satisfaction of both the network service provider and the consumer. The MARS™ network is a "managed P2P" network.

In addition to bandwidth usage, other factors need to be considered in optimizing the distribution of the delivery servers. For example, statistically, a client terminal in a P2P network uses more upstream bandwidth than downstream bandwidth because, for each received video bitstream, the client has to serve more than one other client. This runs against the asymmetric nature of the DSL access network, which provides more downstream than upstream bandwidth. Another issue is the management and reliability of the delivery servers. Usually, the closer a delivery server is to the edge, the harder it is to manage. The extreme case is again the P2P network: if someone turns off a client terminal, the other client terminals receiving video bitstreams from it suffer a brief interruption until a new route is established. The MARS™ network, as a "managed P2P" network, overcomes these issues.
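The upstream problem can be illustrated with a few lines of Python. The fan-out of two and the DSL line rates below are hypothetical numbers chosen only to show the mismatch.

```python
def peer_upstream_kbps(bitrate_kbps, fanout):
    """Upstream bandwidth a peer needs to forward one stream to `fanout` peers."""
    return bitrate_kbps * fanout


def can_relay(bitrate_kbps, fanout, down_kbps, up_kbps):
    """True if the access link can both receive the stream and forward it."""
    return (down_kbps >= bitrate_kbps
            and up_kbps >= peer_upstream_kbps(bitrate_kbps, fanout))


# Hypothetical asymmetric DSL line: 3000 Kbps down, 768 Kbps up.
print(peer_upstream_kbps(512, 2))                          # 1024
print(can_relay(512, 2, down_kbps=3000, up_kbps=768))      # False
print(can_relay(512, 1, down_kbps=3000, up_kbps=768))      # True
```

Under these assumed rates the line receives a 512 Kbps stream with ease but cannot forward it to two other peers.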

Variation in Bandwidth, Resolution & Format

Another challenge for video over IP is that the bandwidth of the IP network is a resource shared among various applications. In addition, the access network bandwidth for one user may be very different from that of another user. Therefore, multiple users receiving the same video content at the same time may have very different bandwidth. On top of the large bandwidth variation, the terminal devices that decode the video bitstreams and display the reconstructed video range from a high-definition TV to a PC to a hand-held device. Figure 2 illustrates the problem with various bandwidth connections (256Kbps, 512Kbps, 1Mbps, and 2Mbps) between the video source and four different receiving sites.

Figure 2. Illustration of Bandwidth Variation

If the video source sends out a high-quality video bitstream at 2Mbps, it will satisfy the users at site 3, but users at sites 1, 2, and 4 will not be able to see the video at all because the bandwidth connections to these three sites are lower than 2Mbps. If the video source sends out a low-quality video bitstream at 256Kbps, it can be delivered to all sites, but the users at the high bandwidth sites will not be happy: they get the same low-quality video as the users with a low bandwidth connection, even though they may be paying a higher price for their high bandwidth connections. Another possibility is for the video source to send out four different video bitstreams at 256Kbps, 512Kbps, 1Mbps, and 2Mbps so that each site receives a video bitstream that fits its bandwidth connection. This is the so-called simulcast approach. The problems with simulcast are (a) the video source has to generate four different bitstreams and manage them; (b) the total bandwidth for sending out the four bitstreams from the video source is 256Kbps + 512Kbps + 1Mbps + 2Mbps = 3.768Mbps (taking 1Mbps as 1,000Kbps), nearly twice the rate of the highest-quality bitstream alone. Therefore, none of the three options, namely a single high bit rate bitstream, a single low bit rate bitstream, or simulcast of four bitstreams, is satisfactory.
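The arithmetic behind the three options can be checked with a short Python sketch. The function names are ours, and we assume 1Mbps = 1,000Kbps, which is what the 3.768Mbps total implies.

```python
# Link capacities to the four receiving sites, in Kbps (1 Mbps = 1000 Kbps).
LINKS_KBPS = [256, 512, 1000, 2000]


def sites_served(stream_kbps, links_kbps):
    """Number of sites whose link can carry a stream of the given bit rate."""
    return sum(1 for link in links_kbps if link >= stream_kbps)


def simulcast_source_load_kbps(links_kbps):
    """Source bandwidth for simulcast: one bitstream per distinct link rate."""
    return sum(set(links_kbps))


print(sites_served(2000, LINKS_KBPS))          # 1  (single high-rate stream)
print(sites_served(256, LINKS_KBPS))           # 4  (single low-rate stream)
print(simulcast_source_load_kbps(LINKS_KBPS))  # 3768 Kbps = 3.768 Mbps
```

A single 2Mbps stream reaches one site, a single 256Kbps stream reaches all four at the lowest quality, and simulcast serves every site at the cost of 3.768Mbps leaving the source.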

On the other hand, by adding video processing capability in the network as illustrated in Figure 3, this problem of bandwidth and terminal device variation is solved to everyone’s satisfaction.

Figure 3. Illustration of Video Processing inside Network

The exact number of network processing units and their locations depend on the network structure. The exact video processing operation performed by a network processing unit depends on the video encoding algorithm used at the video source. If the video source encodes the video using a non-scalable video coding algorithm, the network processing unit needs to perform a transcoding operation that changes quantization, frame rate, picture resolution, picture content size, or a combination of these four dimensions, to fit each individual bandwidth link and/or terminal device capability. The transcoding operation usually consists of a decoding operation followed by an encoding operation. Many fast transcoding techniques have been studied that use the information contained in the input bitstream to simplify the encoding operation. A well-known example is reusing the motion vectors decoded from the input video bitstream to reduce the computational complexity of motion estimation in the encoder. Even with fast transcoding techniques, transcoding still requires substantial computation and thus limits the number of video channels a network processing unit can support.

On the other hand, if a scalable video coding algorithm is used at the video source to generate a scalable video bitstream, the network processing unit does not have to perform the transcoding operation. Unlike the non-scalable video coding algorithms discussed above, a scalable video coding algorithm can scale in four different dimensions (quantization, temporal, spatial, and content). The syntax of a scalable video bitstream should indicate how the bits for each layer of each dimension are arranged in the bitstream relative to other layers of the same dimension and to other dimensions. The video processing unit simply decodes the syntax and drops some bits according to a priority algorithm to fit the individual bandwidth link and/or the terminal device capability. MARS™ is high-powered network processing equipment that can work with either non-scalable or scalable video coding bitstreams. The difference is that MARS™ can process a much larger number of simultaneous video channels if the bitstreams are scalable.
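The bit-dropping step can be sketched in Python as follows. This is an illustrative assumption of how a priority algorithm might work, not the MARS™ implementation: the `Layer` record, the layer names and rates, and the rule of stopping at the first layer that does not fit (because enhancement layers typically depend on the layers below them) are all hypothetical.

```python
from typing import NamedTuple


class Layer(NamedTuple):
    name: str
    priority: int   # 0 = base layer; higher numbers are dropped first
    kbps: int


def select_layers(layers, budget_kbps):
    """Keep layers in priority order until the next one would exceed the budget.

    No decoding or re-encoding is involved: layers are only kept or dropped.
    """
    kept, used = [], 0
    for layer in sorted(layers, key=lambda l: l.priority):
        if used + layer.kbps > budget_kbps:
            break   # this layer and everything above it is dropped
        kept.append(layer)
        used += layer.kbps
    return kept


# Hypothetical scalable bitstream whose cumulative rates are
# 256, 512, 1000, and 2000 Kbps.
stream = [
    Layer("base", 0, 256),
    Layer("enh1", 1, 256),
    Layer("enh2", 2, 488),
    Layer("enh3", 3, 1000),
]

print([l.name for l in select_layers(stream, 512)])   # ['base', 'enh1']
print([l.name for l in select_layers(stream, 2000)])  # all four layers
```

Because the unit only parses layer boundaries and discards bits, the per-channel cost is far lower than a decode and re-encode cycle, which is consistent with the larger channel count described above.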

Another technical challenge in delivering video over an open IP network is how to deal with the wide variety of video coding formats: MPEG-1, MPEG-2, MPEG-4 Part 2, MPEG-4 Part 10/AVC/H.264, and H.263, just to name a few. When content creators create video content or application developers create video applications, the last thing they want to worry about is which video coding format to use for the coded video bitstreams. They use whatever is conveniently available to them at the time. Therefore, video content may be coded in many different formats. On the other hand, it would be best if a video receiver only had to decode a single video coding format. To resolve this mismatch between the video content creators and the video receiver, video transcoding is needed. If the transcoding task is performed in a central server, the computational load on the server becomes a heavy burden. Ideally, the network processing units illustrated in Figure 3 should perform the transcoding operation. In addition, such an approach allows user software or hardware for video decoding to be upgraded gradually as video coding technology advances. When new video content is coded in a new video coding format, we don’t have to upgrade the user software or hardware right away in order to receive it. The network processing units can transcode the new video coding format into the old one. Of course, the network processing units have to be upgraded to decode the new video coding format. Since there are far fewer network processing units than user end-points, and it is easier for an operator to upgrade the network processing units in its network than to upgrade the user end-points, such an approach makes economic sense. By using MARS™ in the network, we can achieve such a "future proof" system for the service providers.

Quality of service (QoS) in an IP network is a concept that is as old as the Internet Protocol (IP) itself. It provides a means to set different priorities for different packets delivered over an IP network. An IP packet may be stamped with a higher priority than other packets so that it is delivered faster. QoS-enabled routers have in fact been developed and deployed on the Internet for a long time. However, few applications have been able to take advantage of QoS so far.

QoS-enabled routers can treat IP packets with different priority settings in different ways, but they cannot decide the priority of any IP packet because they do not understand its payload. Therefore, someone else has to stamp the IP packets with different priorities. In a perfect world, an application running on a PC would set the priority of its IP packets, because the application knows the type of each payload: audio, video, or data. Combined with QoS-enabled routers, this would solve the QoS problem. However, such a scheme only works in a perfect world, not in the real world we are in. One of the reasons is that system administrators do not trust an application running on a PC to set packet priority, because everyone writing application code would want the packets of their own application to have the highest priority. Therefore, the system administrators in charge of configuring the routers do not want to turn on the QoS feature without knowing that the priority setting in the IP packets can be trusted.

This situation reveals a gap between the application level and the infrastructure level. The gap needs to be bridged so that QoS can be used for real-time multimedia applications. Such a bridge may be called an application infrastructure level. It should be network equipment managed by the same system administrators who manage the infrastructure routers, so that they can trust it. It should understand the payload of the IP packets coming from the application level so that it can stamp the priority of each packet. With such a bridge for priority setting, the application level does not need to set the priority of its IP packets, and the infrastructure level routers can turn on the QoS feature for packets coming from the trusted devices in the application infrastructure level. MARS™ is exactly such an "application infrastructure" device: it is managed by the system administrator and understands the media data type. Therefore, MARS™ is a trusted QoS setting device. It labels IP packets with different priorities according to the data payload, AND its priority setting is "trusted" by QoS-enabled routers. With such a device in place, QoS can finally become a reality.
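A minimal sketch of such a trusted marking step is shown below. The text above does not specify a marking scheme, so we assume the commonly used DSCP code points EF (46) for audio and AF41 (34) for video, with best effort (0) for everything else. The dictionary-based packet and the `media` field stand in for real payload inspection and are hypothetical.

```python
# Assumed mapping from media type to DSCP code point (not from the source).
DSCP_BY_MEDIA = {
    "audio": 46,  # EF, expedited forwarding
    "video": 34,  # AF41, assured forwarding
    "data": 0,    # best effort
}


def remark(packet):
    """Return a copy of `packet` with a priority chosen by the trusted device.

    Whatever priority the application wrote is ignored and overwritten, so
    an application cannot promote its own traffic.
    """
    dscp = DSCP_BY_MEDIA.get(packet["media"], 0)
    # DSCP occupies the upper six bits of the IP header's ToS byte.
    return {**packet, "dscp": dscp, "tos": dscp << 2}


# A data packet that claimed the highest priority is demoted to best effort.
print(remark({"media": "data", "dscp": 46}))
# {'media': 'data', 'dscp': 0, 'tos': 0}

# An unmarked audio packet is promoted.
print(remark({"media": "audio", "dscp": 0}))
# {'media': 'audio', 'dscp': 46, 'tos': 184}
```

The design point is that the marking decision is made by equipment the administrator controls, so routers can honor the resulting priorities.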

Next Phase of Internet Evolution:

We have all felt the impact of the Internet in our lives. What we, at Bada Networks, are working on is the next wave of technology that will have even greater impact in our personal lives and businesses.

Real-time delivery, interactivity, multimedia, and collaboration are the trends that define the next phase of Internet evolution. Bada Networks' revolutionary core technology offers a unique communications system that solves the technical challenges in these areas. Bada Networks delivers unsurpassed productivity, opportunities, and experience for our customers. Our continuously expanding set of enabling applications and services based on the core technology helps our customers achieve immediate benefits as well as future-proof protection.