Abstract:
A projection onto convex sets (POCS)-based method for consistent reconstruction of a signal from a subset of quantized coefficients received from an N×K overcomplete transform. By choosing a frame operator F to be the concatenation of two or more K×K invertible transforms, the POCS projections are calculated in R^K space using only the K×K transforms and their inverses, rather than in the larger R^N space using pseudo-inverse transforms. Practical reconstructions are enabled based on, for example, wavelet, subband, or lapped transforms of an entire image. In one embodiment, unequal error protection for multiple description source coding is provided. In particular, given a bit-plane representation of the coefficients in an overcomplete representation of the source, one embodiment of the present invention codes the most significant bits with the highest redundancy and the least significant bits with the lowest redundancy. In one embodiment, this is accomplished by varying the quantization step size for the different coefficients. The available received quantized coefficients are then decoded using a method based on alternating projections onto convex sets.
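As a rough illustration of the alternating projections described above, the following is a minimal sketch, not the patented method. It assumes a frame built from two orthonormal K×K transforms (the identity and a DCT), a single uniform step size, and a random subset of received coefficients. With orthonormal transforms, clipping each received coefficient to its quantization cell is the exact projection in R^K. The names `dct_matrix` and `pocs_reconstruct` are hypothetical.

```python
import numpy as np

def dct_matrix(K):
    """Orthonormal DCT-II matrix (K x K); its inverse is its transpose."""
    idx = np.arange(K)
    C = np.sqrt(2.0 / K) * np.cos(np.pi * (idx[None, :] + 0.5) * idx[:, None] / K)
    C[0, :] /= np.sqrt(2.0)
    return C

def pocs_reconstruct(transforms, q_values, received, step, iters=100):
    """Alternate projections onto the quantization cells of each K x K transform.

    transforms: list of orthonormal K x K matrices (assumption of this sketch)
    q_values:   quantized coefficients for each transform
    received:   boolean masks marking which coefficients actually arrived
    """
    K = transforms[0].shape[0]
    x = np.zeros(K)
    for _ in range(iters):
        for T, q, mask in zip(transforms, q_values, received):
            c = T @ x  # forward K x K transform
            # Clip only the received coefficients into their quantization cells.
            c[mask] = np.clip(c[mask], q[mask] - step / 2, q[mask] + step / 2)
            x = T.T @ c  # inverse transform (transpose, since T is orthonormal)
    return x

# Toy example: K = 8, frame F = [I; DCT], so N = 2K = 16.
rng = np.random.default_rng(0)
K, step = 8, 0.5
x_true = rng.normal(size=K)
transforms = [np.eye(K), dct_matrix(K)]
q_values = [step * np.round(T @ x_true / step) for T in transforms]
received = [rng.random(K) < 0.7 for _ in transforms]
x_hat = pocs_reconstruct(transforms, q_values, received, step)
```

Every operation here works on length-K vectors; no N×K pseudo-inverse is formed. For invertible but non-orthonormal transforms, the clipping step would no longer be an orthogonal projection in signal space, so this sketch would need to be adapted.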
Abstract:
The production of an interleaved multimedia stream for servers and client computers coupled to each other by a diverse computer network which includes local area networks (LANs) and/or wide area networks (WANs) such as the Internet. Interleaved multimedia streams can include compressed video frames for display in a video window, accompanying compressed audio frames, and annotation frames. In one embodiment, a producer captures separate video/audio frames and generates an interleaved multimedia file. In another embodiment, the interleaved file includes annotation frames which either provide pointer(s) to the event(s) of interest or include displayable data embedded within the annotation stream. The interleaved file is then stored in the web server for subsequent retrieval by client computer(s) in a coordinated manner, so that the client computer(s) can synchronously display the video frames and displayable event(s) in a video window and event window(s), respectively. In some embodiments, the interleaved file includes packets with variable length fields, each of which is at least one numerical unit in length.
Abstract:
An image decoding and recognition system and method comprising a fast heuristic algorithm using hidden Markov models (HMM). The new search algorithm, called an "iterative complete path" (ICP) algorithm, patterned after well-known branch-and-bound (B&B) methods, significantly reduces the complexity and improves the speed of HMM image decoding without sacrificing the optimality of the straightforward procedure. An advantageous form of the heuristic functions which is useful in applying the ICP algorithm to text-like images is described. The ICP algorithm is directly applicable to the separable type of finite-state source models. Also disclosed is a technique for transforming more general source models into such a separable form.
Abstract:
Non-linguistic signal information relating to one or more participants to an interaction may be determined using communication data received from the one or more participants. Feedback can be provided based on the determined non-linguistic signals. The participants may be given an opportunity to opt in to having their non-linguistic signal information collected, and may be provided complete control over how their information is shared or used.
Abstract:
Techniques for managing visual compositions for a multimedia conference call are described. An apparatus may comprise a processor to allocate a display object bit rate for multiple display objects where a total display object bit rate for all display objects is equal to or less than a total input bit rate, and decode video information from multiple video streams each having different video layers with different levels of spatial resolution, temporal resolution and quality for two or more display objects. Other embodiments are described and claimed.
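The bit rate constraint above can be sketched as a simple greedy allocation. This is a hypothetical illustration, not the claimed technique. It assumes each display object offers a ladder of video layers with known cumulative bit rates, and it upgrades objects round-robin while the total stays within the total input bit rate. The function name `allocate_layers` and the rates in kbps are illustrative assumptions.

```python
def allocate_layers(layer_rates, total_input_rate):
    """Pick one layer per display object so the summed bit rate fits the budget.

    layer_rates[i] lists the cumulative bit rates of object i, in ascending
    order (layer 0 is the base layer). Returns (chosen layer indices, rate used).
    """
    choice = [0] * len(layer_rates)
    used = sum(rates[0] for rates in layer_rates)
    if used > total_input_rate:
        raise ValueError("base layers alone exceed the total input bit rate")
    upgraded = True
    while upgraded:
        upgraded = False
        # Round-robin: give each object at most one upgrade per pass.
        for i, rates in enumerate(layer_rates):
            nxt = choice[i] + 1
            if nxt < len(rates):
                extra = rates[nxt] - rates[choice[i]]
                if used + extra <= total_input_rate:
                    choice[i] = nxt
                    used += extra
                    upgraded = True
    return choice, used

# Three display objects, 900 kbps total input bit rate (illustrative numbers).
layers, used = allocate_layers([[100, 300, 700], [100, 300, 700], [50, 150]], 900)
```

A round-robin pass keeps the objects at similar quality levels. A real system might instead weight objects by display size or by who is currently speaking.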
Abstract:
The subject disclosure is directed towards an immersive conference, in which participants in separate locations are brought together into a common virtual environment (scene), such that they appear to each other to be in a common space, with geometry, appearance, and real-time natural interaction (e.g., gestures) preserved. In one aspect, depth data and video data are processed to place remote participants in the common scene from the first person point of view of a local participant. Sound data may be spatially controlled, and parallax computed to provide a realistic experience. The scene may be augmented with various data, videos and other effects/animations.
Abstract:
A “Media Transmission Optimizer” provides a media transmission optimization framework for lossy or bursty networks such as the Internet. This optimization framework provides a novel form of dynamic Forward Error Correction (FEC) that focuses on the perceived quality of a recovered media signal rather than on the absolute accuracy of the recovered media signal. In general, the Media Transmission Optimizer provides an encoder that optimizes the transmission of redundant frames of electronic media information encoded at different bit rates, and provides optimized playback quality by providing a decoder that automatically selects an optimal path through one or more available representations of each frame as a function of overall rate/distortion criteria.
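One way to picture the decoder's path selection is as a shortest-path search over the representations that actually arrived for each frame. The sketch below is an assumption-laden illustration, not the disclosed optimizer. It assigns each available representation a distortion value, charges a fixed penalty for switching bit rates between consecutive frames, and uses dynamic programming to find the lowest-cost path. The function `best_path`, the distortion numbers, and the switching penalty are all hypothetical.

```python
def best_path(frames, switch_cost):
    """Choose one representation per frame, minimizing distortion plus switches.

    frames: list of dicts mapping bit rate -> distortion for the representations
            available at the decoder (each frame needs at least one entry).
    Returns (total cost, list of chosen bit rates).
    """
    # Best known (cost, path) ending at each representation of the first frame.
    prev = {rate: (dist, [rate]) for rate, dist in frames[0].items()}
    for reps in frames[1:]:
        cur = {}
        for rate, dist in reps.items():
            # Cheapest way to arrive at this representation from the prior frame.
            cost, path = min(
                ((pc + (switch_cost if pr != rate else 0.0), pp)
                 for pr, (pc, pp) in prev.items()),
                key=lambda t: t[0],
            )
            cur[rate] = (cost + dist, path + [rate])
        prev = cur
    return min(prev.values(), key=lambda t: t[0])

# The 64 kbps copy of the middle frame was lost; only the 16 kbps copy arrived.
frames = [{64: 1.0, 16: 4.0}, {16: 4.0}, {64: 1.0, 16: 4.0}]
cost_a, path_a = best_path(frames, switch_cost=2.0)
cost_b, path_b = best_path(frames, switch_cost=4.0)
```

With a small switching penalty the decoder returns to the high-rate representation as soon as it is available. With a larger penalty it stays on the low-rate path, which reflects a preference for perceived quality over per-frame accuracy.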
Abstract:
Multi-device capture and spatial browsing of conferences is described. In one implementation, a system detects cameras and microphones, such as the webcams on participants' notebook computers, in a conference room, group meeting, or table game, and enlists an ad-hoc array of available devices to capture each participant and the spatial relationships between participants. A video stream composited from the array is browsable by a user to navigate a 3-dimensional representation of the meeting. Each participant may be represented by a video pane, a foreground object, or a 3-D geometric model of the participant's face or body displayed in spatial relation to the other participants in a 3-dimensional arrangement analogous to the spatial arrangement of the meeting. The system may automatically re-orient the 3-dimensional representation as needed to best show the currently interesting event such as current speaker or may extend navigation controls to a user for manually viewing selected participants or nuanced interactions between participants.
Abstract:
An “adaptive audio playback controller” operates by decoding and reading received packets of an audio signal into a signal buffer. Samples of the decoded audio signal are then played out of the signal buffer according to the needs of a player device. Jitter control and packet loss concealment are accomplished by continuously analyzing buffer content in real time and determining, as a function of buffer content, whether to provide unmodified playback from the buffer, to compress buffer content, to stretch buffer content, or to provide packet loss concealment for overly delayed or lost packets. Further, the adaptive audio playback controller also determines where to stretch or compress particular frames or signal segments in the signal buffer, and how much to stretch or compress such segments, in order to optimize perceived playback quality.
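The buffer-driven decision described above might be sketched as a small policy function. This is a simplified illustration under assumed thresholds, not the controller from the disclosure. The name `choose_action` and the watermark values `low_ms` and `high_ms` are hypothetical, and the sketch ignores the question of where in the signal to stretch or compress.

```python
def choose_action(buffered_ms, low_ms=40, high_ms=120):
    """Pick a playback action from the amount of audio currently buffered.

    buffered_ms: milliseconds of decoded audio waiting in the signal buffer.
    low_ms / high_ms: assumed watermarks; real values would be tuned or adaptive.
    """
    if buffered_ms <= 0:
        # Nothing to play: the next packet is late or lost, so conceal the gap.
        return "conceal"
    if buffered_ms < low_ms:
        # Buffer is running low: stretch the signal to buy time for late packets.
        return "stretch"
    if buffered_ms > high_ms:
        # Buffer is too full: compress the signal to reduce playback delay.
        return "compress"
    return "play"  # Buffer level is healthy: play samples unmodified.

actions = [choose_action(ms) for ms in (0, 20, 80, 200)]
```

Calling the function once per playout request keeps the decision tied to current buffer content, so the delay shrinks when the network is calm and grows when jitter increases.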
Abstract:
A method and system provide the ability to share access information for external data over a digital voice communication channel. The access information for the external data may be exchanged instead of the external data itself. More specifically, a recipient device may receive contextual information which relates to the access information. The contextual information may be processed to identify the source of the external data and other information necessary to access it. For example, a hyperlink directed to the external data on a Web server may be exchanged while the recipient device and the sending device are involved in a digital conversation. The recipient device can access the external data by activating the hyperlink.