Abstract:
A video demultiplexer and video decoder include features for efficient video data recovery in the event of channel error. The demultiplexer detects a boundary between physical layer data units and adds boundary information to the bitstream produced by the demultiplexer. The demultiplexer produces adaptation layer data units, which are processed by the adaptation layer to produce an application layer bitstream. When the video decoder encounters an error in the bitstream, it uses the boundary information to limit the amount of data that must be concealed. In particular, the boundary information permits the error to be associated with a small segment of data. The video decoder conceals data from the beginning of the segment of data, rather than an entire slice or frame in which the segment resides. In this manner, the video decoder provides efficient data recovery, limiting the loss of useful data that otherwise would be purposely discarded for concealment purposes.
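The way boundary information narrows the concealed region can be illustrated with a minimal sketch. It assumes the demultiplexer has recorded the byte offsets at which each physical layer data unit begins; the names `boundaries`, `error_offset`, and `slice_start` are hypothetical and do not come from the disclosure.

```python
from bisect import bisect_right

def concealment_start(boundaries, error_offset, slice_start):
    """Return the byte offset from which the decoder starts concealing.

    boundaries   -- sorted offsets where a physical layer data unit begins
                    (assumed to be what the demultiplexer embeds)
    error_offset -- offset at which the decoder detected the error
    slice_start  -- offset where the enclosing slice begins
    """
    # Find the last boundary at or before the error position.
    i = bisect_right(boundaries, error_offset) - 1
    if i < 0:
        # No boundary information available: fall back to the slice start.
        return slice_start
    # Conceal from the start of the damaged segment, never earlier
    # than the start of the slice itself.
    return max(boundaries[i], slice_start)
```

With boundaries at offsets 0, 40, 80, and 120 and an error detected at offset 95, this sketch conceals from offset 80 onward, so the data between the slice start and offset 80 remains usable.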
Abstract:
The disclosure is directed to a video slicing technique that promotes low complexity, bandwidth efficiency and error resiliency. A video encoder places a resynchronization marker (RM) close to the beginning of each logical transmission unit (LTU) so that all but a very small end segment of each video slice fits substantially within an LTU. Instead of requiring placement of RMs exactly at the LTU boundaries, the video encoder applies an approximate alignment technique. Video slices are encoded so that RMs are placed close to the beginning of each LTU, e.g., at the end of the first macroblock (MB) falling within the LTU. A portion of the last MB from the preceding slice carries over into the next LTU. Loss of an LTU results in loss of virtually the entire current slice plus a very small portion of the previous slice.
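The approximate alignment can be sketched as follows. The sketch assumes fixed-size LTUs, MB sizes known in bits, and no bit overhead for the markers themselves; it reads "the first MB falling within the LTU" as the first MB whose end lies at or past the LTU boundary. The function name and parameters are hypothetical.

```python
def place_resync_markers(mb_bits, ltu_bits):
    """Return the indices of the MBs after which a marker is placed.

    mb_bits  -- encoded size of each MB in bits, in bitstream order
    ltu_bits -- size of one LTU in bits (assumed constant)
    """
    markers = []
    pos = 0                    # running bit position in the stream
    next_boundary = ltu_bits   # first LTU boundary to approximate
    for idx, bits in enumerate(mb_bits):
        pos += bits
        if pos >= next_boundary:
            # This MB ends at or just past the boundary: close the slice here.
            markers.append(idx)
            # Skip any boundaries already covered by this MB.
            while next_boundary <= pos:
                next_boundary += ltu_bits
    return markers
```

For ten MBs of 30 bits each and 100-bit LTUs, the markers fall after MBs 3, 6, and 9, i.e., at bit positions 120, 210, and 300, each at most one MB past an LTU boundary.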
Abstract:
The disclosure is directed to techniques for region-of-interest (ROI) processing for video telephony (VT) applications. According to the disclosed techniques, a recipient device defines ROI information for video information transmitted by a sender device, i.e., far-end video information. The recipient device transmits the ROI information to the sender device. Using the ROI information transmitted by the recipient device, the sender device applies preferential encoding to an ROI within a video scene. ROI extraction may be applied to process a user description of an ROI to generate information specifying the ROI based on the description. The user description may be textual, graphical, or speech-based. An extraction module applies appropriate processing to generate the ROI information from the user description. The extraction module may reside locally within a video communication device, or reside in a distinct intermediate server configured for ROI extraction.
Abstract:
The disclosure relates to techniques for video source rate control for video telephony (VT) applications. The source video encoding rate may be controlled using a dual-buffer based estimation of a frame budget that defines a number of encoding bits available for a frame of the video. The dual-buffer based estimation technique may track the fullness of a physical video buffer and the fullness of a virtual video buffer. The source video encoding rate is then controlled based on the resulting frame budget. The contents of the virtual buffer depend on constraints imposed by a target encoding rate, while the contents of the physical buffer depend on constraints imposed by varying channel conditions. Consideration of physical video buffer fullness permits the video source rate control technique to be channel-adaptive. Consideration of virtual video buffer fullness permits the video source rate control technique to avoid encoding excessive video that could overwhelm the channel.
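A minimal sketch of a dual-buffer frame budget is given below. The abstract does not state how the two fullness values are combined; taking the smaller of the two remaining capacities is an assumption made here for illustration, and the class and field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class DualBufferRateControl:
    target_bits_per_frame: int   # drain rate of the virtual buffer
    virtual_capacity: int        # size of the virtual buffer in bits
    physical_capacity: int       # size of the physical buffer in bits
    virtual_fullness: int = 0
    physical_fullness: int = 0

    def frame_budget(self):
        """Bits available for the next frame (assumed rule: min of both rooms)."""
        virtual_room = self.virtual_capacity - self.virtual_fullness
        physical_room = self.physical_capacity - self.physical_fullness
        return max(0, min(virtual_room, physical_room))

    def update(self, encoded_bits, channel_drained_bits):
        """Account for one encoded frame and one frame interval of draining."""
        # The virtual buffer drains at the target encoding rate.
        self.virtual_fullness = max(
            0, self.virtual_fullness + encoded_bits - self.target_bits_per_frame)
        # The physical buffer drains by whatever the channel actually carried.
        self.physical_fullness = max(
            0, self.physical_fullness + encoded_bits - channel_drained_bits)
```

In this sketch a slow channel leaves the physical buffer fuller than the virtual one, so the physical term limits the budget; when the channel is fast, the virtual buffer holds the encoder to the target rate.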
Abstract:
The disclosure is directed to techniques for picture-in-picture (PIP) processing for video telephony (VT). According to the disclosed techniques, a local video communication device transmits PIP information to a remote video communication device. Using the PIP information, the remote video communication device applies preferential encoding to non-PIP regions of video transmitted to the local video communication device.
Abstract:
The disclosure is directed to techniques for region-of-interest (ROI) processing for video telephony (VT) applications. According to the disclosed techniques, a recipient device defines ROI information for video information transmitted by a sender device, i.e., far-end video information. The recipient device transmits the ROI information to the sender device. Using the ROI information transmitted by the recipient device, the sender device applies preferential encoding to an ROI within a video scene. In this manner, the recipient device is able to remotely control ROI encoding of far-end video information by the sender device.
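One way a sender could apply preferential encoding is sketched below, assuming the ROI information arrives as a rectangle in macroblock units and that preference is expressed as a lower quantization parameter (QP) inside the ROI. Both assumptions, and all names and default values, are illustrative and are not taken from the disclosure.

```python
def roi_qp_map(mb_cols, mb_rows, roi, base_qp=32, roi_qp_offset=-6):
    """Build a per-macroblock QP map favoring the ROI.

    roi -- (left, top, right, bottom) in macroblock units; right and
           bottom are exclusive (hypothetical message format)
    """
    left, top, right, bottom = roi
    qp_map = []
    for y in range(mb_rows):
        row = []
        for x in range(mb_cols):
            inside = left <= x < right and top <= y < bottom
            # Lower QP means finer quantization, i.e., more bits for the ROI.
            row.append(base_qp + roi_qp_offset if inside else base_qp)
        qp_map.append(row)
    return qp_map
```

Because the rectangle is supplied by the recipient, changing it on the recipient side changes which macroblocks the sender favors on the next frame.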
Abstract:
The disclosure is directed to techniques for encoder-assisted adaptive interpolation of video frames. According to the disclosed techniques, an encoder generates information to assist a decoder in interpolation of a skipped video frame, i.e., an S frame. The information permits the decoder to reduce visual artifacts in the interpolated frame and thereby achieve improved visual quality. The information may include interpolation equation labels that identify selected interpolation equations to be used by the decoder for individual video blocks. As an option, to conserve bandwidth, the equation labels may be transmitted for only selected video blocks that meet a criterion for encoder-assisted interpolation. Other video blocks without equation labels may be interpolated according to a default interpolation technique.
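The label-based scheme can be sketched as follows. The three candidate equations, the sum-of-absolute-differences criterion, and the threshold test are assumptions chosen for illustration; blocks are modeled as flat lists of pixel values, and all names are hypothetical.

```python
# Candidate interpolation equations, keyed by label (assumed set).
EQUATIONS = {
    0: lambda p, n: [(a + b) // 2 for a, b in zip(p, n)],  # average
    1: lambda p, n: list(p),                               # copy previous
    2: lambda p, n: list(n),                               # copy next
}
DEFAULT_LABEL = 0

def sad(x, y):
    """Sum of absolute differences between two blocks."""
    return sum(abs(a - b) for a, b in zip(x, y))

def encode_labels(prev_blocks, next_blocks, actual_blocks, threshold):
    """Encoder side: pick labels only for blocks the default handles poorly."""
    labels = {}
    for i, (p, n, actual) in enumerate(
            zip(prev_blocks, next_blocks, actual_blocks)):
        errors = {lbl: sad(eq(p, n), actual) for lbl, eq in EQUATIONS.items()}
        if errors[DEFAULT_LABEL] > threshold:
            best = min(errors, key=errors.get)
            if best != DEFAULT_LABEL:
                labels[i] = best   # only these labels are transmitted
    return labels

def decode_skipped_frame(prev_blocks, next_blocks, labels):
    """Decoder side: use the labeled equation, else the default."""
    return [EQUATIONS[labels.get(i, DEFAULT_LABEL)](p, n)
            for i, (p, n) in enumerate(zip(prev_blocks, next_blocks))]
```

Blocks that the default equation already reconstructs within the threshold carry no label, which is where the bandwidth saving in this sketch comes from.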
Abstract:
Configurations disclosed herein include systems, methods, and apparatus that may be applied in a voice communications and/or storage application to remove, enhance, and/or replace the existing context. Example embodiments may first remove any existing context from a digital audio signal to obtain a context-suppressed signal. The context-suppressed signal may then be encoded. An audio context may be selected from among a plurality of audio contexts, with the selected audio context inserted into a signal based on the encoded context-suppressed signal.
Abstract:
Configurations disclosed herein include systems, methods, and apparatus that may be applied in a voice communications and/or storage application to remove, enhance, and/or replace the existing context. Particularly, certain embodiments contemplate suppressing the context component from the digital audio signal to obtain a context-suppressed signal; generating an audio context signal that is based on a first filter and a first plurality of sequences, each of the first plurality of sequences having a different time resolution; and mixing a first signal that is based on the generated audio context signal with a second signal that is based on the context-suppressed signal to obtain a context-enhanced signal. Generating the audio context signal includes applying the first filter to each of the first plurality of sequences.
Abstract:
Configurations disclosed herein include systems, methods, and apparatus that may be applied in a voice communications and/or storage application to remove, enhance, and/or replace the existing context. In one aspect, a method of processing a digital audio signal that includes a first audio context is disclosed. The method comprises suppressing the first audio context from the digital audio signal, based on a first audio signal that is produced by a first microphone, to obtain a context-suppressed signal. The method may further comprise selecting a second audio context based on the first audio context, and mixing the second audio context with a signal that is based on the context-suppressed signal to obtain a context-enhanced signal.
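The mixing step alone can be sketched as a sample-wise sum. The sketch assumes both signals are lists of samples at the same rate, that the second audio context is looped to cover the speech, and that a single fixed gain is applied; suppression and context selection are outside its scope, and the names are hypothetical.

```python
def mix_context(context_suppressed, context, gain=0.5):
    """Mix a looped audio context into a context-suppressed signal.

    context_suppressed -- samples after the first context was suppressed
    context            -- samples of the selected second audio context
    gain               -- level applied to the context (assumed constant)
    """
    if not context:
        # Nothing to mix in: return the suppressed signal unchanged.
        return list(context_suppressed)
    return [s + gain * context[i % len(context)]
            for i, s in enumerate(context_suppressed)]
```

A practical system would likely vary the gain over time and guard against clipping, which this sketch does not attempt.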