REAL-TIME JITTER CONTROL AND PACKET-LOSS CONCEALMENT IN AN AUDIO SIGNAL
    1.
    Invention application
    REAL-TIME JITTER CONTROL AND PACKET-LOSS CONCEALMENT IN AN AUDIO SIGNAL (under examination, published)

    Publication No.: US20090304032A1

    Publication date: 2009-12-10

    Application No.: US12542558

    Filing date: 2009-08-17

    IPC classification: H04J3/06

    Abstract: An “adaptive audio playback controller” operates by decoding and reading received packets of an audio signal into a signal buffer. Samples of the decoded audio signal are then played out of the signal buffer according to the needs of a player device. Jitter control and packet loss concealment are accomplished by continuously analyzing buffer content in real-time, and determining whether to provide unmodified playback from the buffer contents, whether to compress buffer content, stretch buffer content, or whether to provide for packet loss concealment for overly delayed or lost packets as a function of buffer content. Further, the adaptive audio playback controller also determines where to stretch or compress particular frames or signal segments in the signal buffer, and how much to stretch or compress such segments in order to optimize perceived playback quality.
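
The four-way decision described in the abstract (play unmodified, compress, stretch, or conceal) can be sketched as a function of how much decoded audio is waiting in the signal buffer. This is only an illustrative sketch: the names `Action` and `choose_action` and the thresholds `low_ms` and `high_ms` are assumptions made for this example and do not come from the patent.

```python
from enum import Enum


class Action(Enum):
    PLAY = "play unmodified"
    COMPRESS = "compress buffer content"
    STRETCH = "stretch buffer content"
    CONCEAL = "conceal lost or late packet"


def choose_action(buffered_ms: float,
                  low_ms: float = 40.0,
                  high_ms: float = 120.0) -> Action:
    """Pick a playback action from the amount of buffered audio.

    low_ms and high_ms are hypothetical thresholds, not patent values.
    """
    if buffered_ms <= 0.0:
        # Nothing left to play: the next packet is late or lost.
        return Action.CONCEAL
    if buffered_ms < low_ms:
        # Buffer is running low: stretch to buy time for late packets.
        return Action.STRETCH
    if buffered_ms > high_ms:
        # Buffer is too full: compress to reduce playback delay.
        return Action.COMPRESS
    return Action.PLAY
```

A real controller would also decide where in the buffer to stretch or compress and by how much, which this sketch leaves out.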

    System and method for real-time jitter control and packet-loss concealment in an audio signal
    2.
    Invention grant
    System and method for real-time jitter control and packet-loss concealment in an audio signal (in force)

    Publication No.: US07596488B2

    Publication date: 2009-09-29

    Application No.: US10663390

    Filing date: 2003-09-15

    IPC classification: G10L19/04 G10L21/04 G10L11/06

    Abstract: An “adaptive audio playback controller” operates by decoding and reading received packets of an audio signal into a signal buffer. Samples of the decoded audio signal are then played out of the signal buffer according to the needs of a player device. Jitter control and packet loss concealment are accomplished by continuously analyzing buffer content in real-time, and determining whether to provide unmodified playback from the buffer contents, whether to compress buffer content, stretch buffer content, or whether to provide for packet loss concealment for overly delayed or lost packets as a function of buffer content. Further, the adaptive audio playback controller also determines where to stretch or compress particular frames or signal segments in the signal buffer, and how much to stretch or compress such segments in order to optimize perceived playback quality.

    CLIENT-SIDE ECHO CANCELLATION FOR MULTI-PARTY AUDIO CONFERENCING
    3.
    Invention application
    CLIENT-SIDE ECHO CANCELLATION FOR MULTI-PARTY AUDIO CONFERENCING (in force)

    Publication No.: US20080310328A1

    Publication date: 2008-12-18

    Application No.: US11763224

    Filing date: 2007-06-14

    IPC classification: H04L12/16 H03B29/00 H04Q11/00

    CPC classification: H04L12/16

    Abstract: A “Client-Side Echo Canceller” provides a unique system and method for reducing Multipoint Control Unit (MCU) computational overhead in a multi-point audio conference. In general, the local audio input signal of each client is transmitted in real-time to the MCU. The MCU then combines the audio input signals of all clients to create a single composite signal that is transmitted back to all clients in real-time. Each client then locally processes the composite signal to remove each client's local contribution to the composite signal prior to local playback in order to eliminate a local echo of each client's local audio input. In various embodiments, local cancellation of the local audio input from the composite signal is performed on either a time domain or a transform domain representation of the composite signal. Further, since each client receives the same signal, MCU transmission bandwidth can be reduced via multicast transmissions.
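
The time-domain variant of this idea can be illustrated with a few lines of code. The sketch below assumes idealized conditions that the abstract does not state: integer samples, unity mixing gain, perfect time alignment between the composite and the local signal, and no codec distortion. The function names `mix` and `remove_local` are hypothetical.

```python
from typing import List, Sequence


def mix(inputs: Sequence[Sequence[int]]) -> List[int]:
    """MCU side: sum equally long client frames sample by sample."""
    return [sum(samples) for samples in zip(*inputs)]


def remove_local(composite: Sequence[int], local: Sequence[int]) -> List[int]:
    """Client side: subtract the client's own contribution before playback."""
    return [c - l for c, l in zip(composite, local)]


# Three clients send one short frame each to the MCU.
alice = [1, 2, 3]
bob = [10, 20, 30]
carol = [100, 200, 300]

composite = mix([alice, bob, carol])           # same signal goes to every client
alice_hears = remove_local(composite, alice)   # only bob + carol remain
```

Because the MCU produces one composite for everyone instead of a separate mix per client, it can send the same packet to all clients, which is what makes multicast transmission possible.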

    DECENTRALIZED ARCHITECTURE AND PROTOCOL FOR VOICE CONFERENCING
    4.
    Invention application
    DECENTRALIZED ARCHITECTURE AND PROTOCOL FOR VOICE CONFERENCING (in force)

    Publication No.: US20070237099A1

    Publication date: 2007-10-11

    Application No.: US11277905

    Filing date: 2006-03-29

    IPC classification: H04L12/16

    Abstract: A decentralized computer network architecture and method gathers metadata from local and remote clients and, based on that metadata, decides locally whether to send a packet over the network. Each client listens to what other clients are doing, and only sends when the total number of concurrent speakers is below some threshold. In a multi-party voice conferencing embodiment, the threshold restricts the number of concurrent speakers to less than a certain number. Under the decentralized computer network architecture, the type of network topology used to connect the clients is flexible, as long as each client is running a peer-aware system to decide locally whether to send its packets. The decentralized computer network architecture and method is distributed to run on each client, making it suitable for a wide variety of network topologies (such as full-mesh, bridge-based, or a hybrid of the two).
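
The local send decision can be sketched as follows. This is a minimal illustration, assuming each client keeps a map of which remote peers are currently speaking. The name `should_send`, the `remote_speaking` map, and the default limit of 3 are assumptions for this example; the abstract only says that the threshold is "a certain number".

```python
from typing import Mapping


def should_send(local_is_speaking: bool,
                remote_speaking: Mapping[str, bool],
                max_speakers: int = 3) -> bool:
    """Decide locally whether to put the next voice packet on the network.

    remote_speaking maps a peer id to whether that peer is currently
    speaking, as learned from metadata gathered from remote clients.
    max_speakers is a hypothetical limit on concurrent speakers.
    """
    if not local_is_speaking:
        return False
    concurrent = sum(1 for speaking in remote_speaking.values() if speaking)
    # Send only while the number of concurrent speakers is below the limit.
    return concurrent < max_speakers
```

Since every client runs the same rule on its own, no central server has to arbitrate, which is why the scheme works on full-mesh, bridge-based, or hybrid topologies.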

    PEER-AWARE RANKING OF VOICE STREAMS
    5.
    Invention application
    PEER-AWARE RANKING OF VOICE STREAMS (in force)

    Publication No.: US20070230372A1

    Publication date: 2007-10-04

    Application No.: US11277932

    Filing date: 2006-03-29

    IPC classification: H04L12/16

    Abstract: A peer-aware voice stream ranking method makes decisions based on information about participants of a voice conference over a network. Whether to send a participant's own audio packet out on the network is based both on information about the participant's own voice packet and on voice packets that the participant receives from other clients. A Voice Activity Score (VAS) is computed for each frame of a particular voice stream. The VAS includes a voiceness component, indicating the likelihood that the audio frame contains speech or voice, and an energy level component indicating the ratio of current frame energy to the long-term average of energy for a current speaker. Using the VAS from the participants, the method also ranks the client's voice stream as compared to other clients' voice streams in the voice conference. If other participants rank higher, the client's voice stream is not sent.
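
A rough sketch of the scoring and ranking step is given below. The abstract names the two VAS components but not how they are combined, so multiplying them is purely an assumption of this example, as are the function names and the `max_streams` parameter.

```python
from typing import Iterable


def voice_activity_score(voiceness: float,
                         frame_energy: float,
                         long_term_energy: float) -> float:
    """Combine the two VAS components for one frame.

    voiceness is the likelihood (0..1) that the frame contains speech.
    The energy component is the ratio of the current frame energy to the
    speaker's long-term average energy. Taking the product of the two is
    an assumption; the abstract does not give the formula.
    """
    if long_term_energy <= 0.0:
        return 0.0
    return voiceness * (frame_energy / long_term_energy)


def should_send_stream(own_vas: float,
                       peer_vas: Iterable[float],
                       max_streams: int = 1) -> bool:
    """Send only if fewer than max_streams peers currently rank higher."""
    higher = sum(1 for score in peer_vas if score > own_vas)
    return higher < max_streams
```

With `max_streams=1`, a client stays silent whenever any peer outranks it, which matches the last sentence of the abstract most literally.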

    Variable speed playback of digital audio

    Publication No.: US20060277052A1

    Publication date: 2006-12-07

    Application No.: US11143022

    Filing date: 2005-06-01

    IPC classification: G10L21/04

    CPC classification: G10L21/04

    Abstract: A method and system for modifying a digital audio signal to vary its playback speed while preserving the signal's pitch and quality. The variable speed playback (VSP) system and method mitigates artifacts remaining after processing by existing techniques. The VSP system and method gives an audio file a consistent and pleasing sound, even while its speed is varied during playback. The VSP method includes selecting and estimating an input frame, adjusting the frame position, and overlapping and adding the adjusted frame to an output signal. The frame position adjustment is achieved using an enhanced correlation technique that finds all local maxima over a cross-correlation function. The local maximum having the highest correlation score is designated as the cut position, where the adjusted frame is cut from the input buffer. The VSP system and method uses four input frames to generate one output frame.
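
The cut-position search can be illustrated with a small sketch: compute a cross-correlation between a template and the input buffer, collect all local maxima, and take the one with the highest score. The code uses a plain unnormalized correlation, which is an assumption of this example, and all function names are hypothetical. It does not reproduce the "enhanced" technique of the patent.

```python
from typing import List, Optional, Sequence


def cross_correlation(template: Sequence[float],
                      buffer: Sequence[float]) -> List[float]:
    """Unnormalized correlation of the template at every lag in the buffer."""
    n = len(template)
    return [sum(t * buffer[lag + i] for i, t in enumerate(template))
            for lag in range(len(buffer) - n + 1)]


def local_maxima(scores: Sequence[float]) -> List[int]:
    """Indices whose score rises from the left and does not rise to the right."""
    return [i for i in range(1, len(scores) - 1)
            if scores[i - 1] < scores[i] >= scores[i + 1]]


def cut_position(template: Sequence[float],
                 buffer: Sequence[float]) -> Optional[int]:
    """Lag of the local maximum with the highest correlation score."""
    scores = cross_correlation(template, buffer)
    peaks = local_maxima(scores)
    if not peaks:
        return None
    return max(peaks, key=lambda i: scores[i])


template = [1, 2, 1]
buffer = [0, 0, 1, 2, 1, 0, 0, 3, 6, 3, 0]
print(cut_position(template, buffer))  # prints 7
```

In this toy input the correlation has two local maxima, at lags 2 and 7, and the louder match at lag 7 wins. A practical implementation would normalize the correlation so that loudness alone does not decide the result.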

    System and method for providing high-quality stretching and compression of a digital audio signal
    7.
    Invention grant
    System and method for providing high-quality stretching and compression of a digital audio signal (in force)

    Publication No.: US07337108B2

    Publication date: 2008-02-26

    Application No.: US10660325

    Filing date: 2003-09-10

    IPC classification: G10L11/06 G10L21/04 H04B1/66

    CPC classification: G10L21/04 G10L2025/935

    Abstract: An adaptive “temporal audio scaler” is provided for automatically stretching and compressing frames of audio signals received across a packet-based network. Prior to stretching or compressing segments of a current frame, the temporal audio scaler first computes a pitch period for each frame for sizing signal templates used for matching operations in stretching and compressing segments. Further, the temporal audio scaler also determines the type or types of segments comprising each frame. These segment types include “voiced” segments, “unvoiced” segments, and “mixed” segments which include both voiced and unvoiced portions. The stretching or compression methods applied to segments of each frame are then dependent upon the type of segments comprising each frame. Further, the amount of stretching and compression applied to particular segments is automatically variable for minimizing signal artifacts while still ensuring that an overall target stretching or compression ratio is maintained for each frame.
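
The abstract says a pitch period is computed per frame but not how. One common approach, used here only as an assumed stand-in, is to pick the lag that maximizes the frame's autocorrelation within a plausible range. The function name and the `min_lag` and `max_lag` parameters are hypothetical.

```python
from typing import Sequence


def estimate_pitch_period(frame: Sequence[float],
                          min_lag: int,
                          max_lag: int) -> int:
    """Return the lag (in samples) with the highest autocorrelation.

    min_lag and max_lag bound the search to plausible pitch periods and
    must satisfy 0 < min_lag <= max_lag < len(frame).
    """
    def autocorr(lag: int) -> float:
        return sum(frame[i] * frame[i + lag]
                   for i in range(len(frame) - lag))

    return max(range(min_lag, max_lag + 1), key=autocorr)


# A synthetic "voiced" frame that repeats every 8 samples.
pattern = [0, 3, 5, 3, 0, -3, -5, -3]
frame = [pattern[i % 8] for i in range(64)]
print(estimate_pitch_period(frame, 4, 20))  # prints 8
```

The estimated period would then set the template size for the matching step. Unvoiced segments have no clear pitch, so a real scaler would need a different rule for them, in line with the abstract's point that the method depends on segment type.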

    Client-side echo cancellation for multi-party audio conferencing
    8.
    Invention grant
    Client-side echo cancellation for multi-party audio conferencing (in force)

    Publication No.: US08005023B2

    Publication date: 2011-08-23

    Application No.: US11763224

    Filing date: 2007-06-14

    IPC classification: H04L12/16 H04B3/20

    CPC classification: H04L12/16

    Abstract: A “Client-Side Echo Canceller” provides a unique system and method for reducing Multipoint Control Unit (MCU) computational overhead in a multi-point audio conference. In general, the local audio input signal of each client is transmitted in real-time to the MCU. The MCU then combines the audio input signals of all clients to create a single composite signal that is transmitted back to all clients in real-time. Each client then locally processes the composite signal to remove each client's local contribution to the composite signal prior to local playback in order to eliminate a local echo of each client's local audio input. In various embodiments, local cancellation of the local audio input from the composite signal is performed on either a time domain or a transform domain representation of the composite signal. Further, since each client receives the same signal, MCU transmission bandwidth can be reduced via multicast transmissions.

    System and method for real-time jitter control and packet-loss concealment in an audio signal
    9.
    Invention application
    System and method for real-time jitter control and packet-loss concealment in an audio signal (in force)

    Publication No.: US20050058145A1

    Publication date: 2005-03-17

    Application No.: US10663390

    Filing date: 2003-09-15

    IPC classification: G10L19/00 H04L12/56

    Abstract: An “adaptive audio playback controller” operates by decoding and reading received packets of an audio signal into a signal buffer. Samples of the decoded audio signal are then played out of the signal buffer according to the needs of a player device. Jitter control and packet loss concealment are accomplished by continuously analyzing buffer content in real-time, and determining whether to provide unmodified playback from the buffer contents, whether to compress buffer content, stretch buffer content, or whether to provide for packet loss concealment for overly delayed or lost packets as a function of buffer content. Further, the adaptive audio playback controller also determines where to stretch or compress particular frames or signal segments in the signal buffer, and how much to stretch or compress such segments in order to optimize perceived playback quality.

    System and method for providing high-quality stretching and compression of a digital audio signal
    10.
    Invention application
    System and method for providing high-quality stretching and compression of a digital audio signal (in force)

    Publication No.: US20050055204A1

    Publication date: 2005-03-10

    Application No.: US10660325

    Filing date: 2003-09-10

    CPC classification: G10L21/04 G10L2025/935

    Abstract: An adaptive “temporal audio scaler” is provided for automatically stretching and compressing frames of audio signals received across a packet-based network. Prior to stretching or compressing segments of a current frame, the temporal audio scaler first computes a pitch period for each frame for sizing signal templates used for matching operations in stretching and compressing segments. Further, the temporal audio scaler also determines the type or types of segments comprising each frame. These segment types include “voiced” segments, “unvoiced” segments, and “mixed” segments which include both voiced and unvoiced portions. The stretching or compression methods applied to segments of each frame are then dependent upon the type of segments comprising each frame. Further, the amount of stretching and compression applied to particular segments is automatically variable for minimizing signal artifacts while still ensuring that an overall target stretching or compression ratio is maintained for each frame.
