Method, Apparatus, and System for Processing Audio Data
    Type: Invention application
    Legal status: In force

    Publication number: US20160300578A1

    Publication date: 2016-10-13

    Application number: US15188518

    Filing date: 2016-06-21

    Inventor: Zhe Wang

    Abstract: A method for processing audio data includes obtaining a noise frame of an audio signal, and decomposing the current noise frame into a noise low-band signal and a noise high-band signal; and encoding and transmitting the noise low-band signal by using a first discontinuous transmission mechanism, and encoding and transmitting the noise high-band signal by using a second discontinuous transmission mechanism. According to the present disclosure, different processing manners are used for the high-band signal and the low-band signal, calculation loads and encoded bits may be saved under a premise of not lowering subjective quality of a codec, and bits that are saved may help to achieve an objective of reducing a transmission bandwidth or improving overall encoding quality.
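
The split described in this abstract can be illustrated with a minimal Python sketch. The abstract does not name the filter bank or either discontinuous transmission rule, so everything concrete here is an assumption made for illustration: a Haar-style two-band split, a fixed-interval update for the low band, and a change-triggered update for the high band. The names `DualDtx`, `low_interval`, and `high_delta_db` are hypothetical, and the "encoded" payload is just a log energy.

```python
import math


def split_bands(frame):
    """Haar-style two-band split (assumed; the abstract names no filter bank)."""
    low = [(frame[i] + frame[i + 1]) / 2 for i in range(0, len(frame) - 1, 2)]
    high = [(frame[i] - frame[i + 1]) / 2 for i in range(0, len(frame) - 1, 2)]
    return low, high


def log_energy(band):
    """Mean energy of a band in dB, floored to avoid log(0)."""
    return 10 * math.log10(sum(s * s for s in band) / len(band) + 1e-12)


class DualDtx:
    """Hypothetical encoder side: one update policy per band."""

    def __init__(self, low_interval=8, high_delta_db=3.0):
        self.low_interval = low_interval    # assumed: low band sent every N frames
        self.high_delta_db = high_delta_db  # assumed: high band sent on level change
        self.frames_since_low = None        # None means nothing sent yet
        self.last_high_db = None

    def process(self, noise_frame):
        """Return the band descriptors that would be transmitted for this frame."""
        low, high = split_bands(noise_frame)
        out = {}
        # First mechanism (assumed): periodic low-band update.
        if self.frames_since_low is None or self.frames_since_low + 1 >= self.low_interval:
            out["low"] = log_energy(low)
            self.frames_since_low = 0
        else:
            self.frames_since_low += 1
        # Second mechanism (assumed): high-band update only when its level moves.
        high_db = log_energy(high)
        if self.last_high_db is None or abs(high_db - self.last_high_db) >= self.high_delta_db:
            out["high"] = high_db
            self.last_high_db = high_db
        return out
```

With stationary noise, this sketch sends the high-band descriptor once and then stays silent on that band, which is the kind of bit saving the abstract attributes to treating the two bands differently.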


    METHOD AND APPARATUS FOR PERFORMING VOICE ACTIVITY DETECTION
    Type: Invention application
    Legal status: In force

    Publication number: US20130282367A1

    Publication date: 2013-10-24

    Application number: US13924637

    Filing date: 2013-06-24

    Inventor: Zhe Wang

    CPC classification number: G10L25/93 G10L25/78 G10L2025/786

    Abstract: This application relates to a voice activity detection (VAD) apparatus configured to provide a voice activity detection decision for an input audio signal. The VAD apparatus includes a state detector and a voice activity calculator. The state detector is configured to determine, based on the input audio signal, a current working state of the VAD apparatus among at least two different working states. Each of the at least two different working states is associated with a corresponding working state parameter decision set which includes at least one voice activity decision parameter. The voice activity calculator is configured to calculate a voice activity detection parameter value for the at least one voice activity decision parameter of the working state parameter decision set associated with the current working state, and to provide the voice activity detection decision by comparing the calculated voice activity detection parameter value with a threshold.
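
The state detector and voice activity calculator can be pictured with a small Python sketch. The abstract does not say which working states, decision parameters, or thresholds are used, so the two states (`low_noise`, `high_noise`), the parameters (`snr_db`, `peak_to_mean`), the threshold values, and the caller-supplied `noise_energy` estimate are all hypothetical choices made for illustration.

```python
import math

# Hypothetical parameter decision sets: one decision parameter and one
# threshold per working state. None of these values come from the patent.
PARAMETER_SETS = {
    "low_noise": ("snr_db", 6.0),
    "high_noise": ("peak_to_mean", 4.0),
}


def frame_energy(frame):
    return sum(s * s for s in frame) / len(frame)


def detect_state(noise_energy, noisy_above=1e-3):
    """State detector (assumed rule): pick the state from a background noise estimate."""
    return "high_noise" if noise_energy > noisy_above else "low_noise"


def compute_parameter(name, frame, noise_energy):
    """Voice activity calculator: evaluate only the parameter the state asks for."""
    if name == "snr_db":
        return 10 * math.log10((frame_energy(frame) + 1e-12) / (noise_energy + 1e-12))
    if name == "peak_to_mean":
        mean_abs = sum(abs(s) for s in frame) / len(frame)
        return max(abs(s) for s in frame) / (mean_abs + 1e-12)
    raise ValueError(f"unknown decision parameter: {name}")


def vad_decision(frame, noise_energy):
    """Return (working_state, is_voice) by comparing the parameter with its threshold."""
    state = detect_state(noise_energy)
    name, threshold = PARAMETER_SETS[state]
    value = compute_parameter(name, frame, noise_energy)
    return state, value > threshold
```

For example, `vad_decision([0.1, -0.1] * 8, 1e-6)` selects the `low_noise` state and evaluates only the SNR parameter, so the parameter reserved for the noisy state is never computed.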


    Audio signal classification based on frequency spectrum fluctuation

    Publication number: US12198719B2

    Publication date: 2025-01-14

    Application number: US18360675

    Filing date: 2023-07-27

    Inventor: Zhe Wang

    Abstract: An audio signal classification method includes determining, according to voice activity of a current audio frame, whether to obtain a frequency spectrum fluctuation of the current audio frame and store the frequency spectrum fluctuation in a frequency spectrum fluctuation memory, and updating, according to whether the audio frame is percussive music or activity of a historical audio frame, frequency spectrum fluctuations stored in the frequency spectrum fluctuation memory, and classifying the current audio frame as a speech frame or a music frame according to statistics of a part or all of effective data of the frequency spectrum fluctuations stored in the frequency spectrum fluctuation memory.
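
A rough Python sketch of the memory-based classification follows. The abstract does not define the fluctuation measure, the statistic, or any threshold, so this assumes a summed absolute spectral difference, a mean over the stored values, and a rule that a low mean indicates music. The class `FluctuationClassifier` and the parameters `music_threshold` and `percussive_cap` are hypothetical, and the update based on historical frame activity is left out.

```python
from collections import deque


class FluctuationClassifier:
    """Hypothetical speech/music classifier driven by stored spectrum fluctuations."""

    def __init__(self, size=60, music_threshold=5.0, percussive_cap=4.0):
        self.memory = deque(maxlen=size)        # fluctuation memory, oldest dropped first
        self.music_threshold = music_threshold  # assumed: mean below this means music
        self.percussive_cap = percussive_cap    # assumed: cap applied on percussive frames

    @staticmethod
    def fluctuation(prev_spectrum, spectrum):
        """Assumed measure: summed absolute change between adjacent frame spectra."""
        return sum(abs(a - b) for a, b in zip(spectrum, prev_spectrum))

    def update(self, prev_spectrum, spectrum, active, percussive=False):
        # Only active frames contribute a fluctuation value to the memory.
        if active:
            self.memory.append(self.fluctuation(prev_spectrum, spectrum))
        # Percussive music produces large fluctuations; cap the stored values so
        # they do not push the statistic toward speech (assumed update rule).
        if percussive:
            capped = [min(v, self.percussive_cap) for v in self.memory]
            self.memory.clear()
            self.memory.extend(capped)

    def classify(self):
        """Return "music", "speech", or None when no data has been stored yet."""
        if not self.memory:
            return None
        mean = sum(self.memory) / len(self.memory)
        return "music" if mean < self.music_threshold else "speech"
```

Keeping the statistic over a bounded memory rather than a single frame is what lets the decision ride through short inactive stretches without resetting.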

    Multi-channel audio signal coding method and apparatus

    Publication number: US12165660B2

    Publication date: 2024-12-10

    Application number: US18154486

    Filing date: 2023-01-13

    Abstract: A multi-channel audio signal coding method includes obtaining a to-be-encoded first audio frame, pairing at least five channel signals according to a first pairing manner to obtain a first channel pair set, obtaining a first sum of correlation values of the first channel pair set, where one channel pair has one correlation value, pairing the at least five channel signals according to a second pairing manner to obtain a second channel pair set, obtaining a second sum of correlation values of the second channel pair set, determining a target pairing manner of the at least five channel signals based on the first sum of correlation values and the second sum of correlation values, and encoding the at least five channel signals based on a channel pair set corresponding to the target pairing manner, where the target pairing manner is the first pairing manner or the second pairing manner.
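
The comparison of two pairing manners can be sketched in Python as follows. The abstract says only that the target pairing is determined "based on" the two sums; picking the pairing with the larger sum, and using the absolute normalized cross-correlation as the correlation value, are assumptions made here. The function names are hypothetical.

```python
import math


def correlation(x, y):
    """Absolute normalized cross-correlation of two channel signals (assumed measure)."""
    num = abs(sum(a * b for a, b in zip(x, y)))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den if den else 0.0


def pairing_score(channels, pairs):
    """Sum of correlation values, one value per channel pair in the set."""
    return sum(correlation(channels[i], channels[j]) for i, j in pairs)


def choose_pairing(channels, first_pairs, second_pairs):
    """Return (pairs, score) for the pairing with the larger sum (assumed rule)."""
    first_score = pairing_score(channels, first_pairs)
    second_score = pairing_score(channels, second_pairs)
    if first_score >= second_score:
        return first_pairs, first_score
    return second_pairs, second_score
```

With five channels, each pairing manner leaves at least one channel unpaired; in this sketch such a channel simply does not appear in the pair list and contributes nothing to the sum.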

    Multi-Channel Signal Encoding and Decoding Method and Apparatus

    Publication number: US20240169998A1

    Publication date: 2024-05-23

    Application number: US18423990

    Filing date: 2024-01-26

    CPC classification number: G10L19/008 G10L19/022

    Abstract: In a multi-channel signal encoding method, a current frame includes a first sound channel and a second sound channel. First group information of M blocks of the first sound channel and second group information of M blocks of the second sound channel are obtained. When the first group information and the second group information meet a preset condition, first adjusted group information and second adjusted group information are obtained based on the first group information and the second group information. Then, a first to-be-encoded spectrum is obtained based on the first adjusted group information and the spectrums of the M blocks of the first sound channel. Similarly, a second to-be-encoded spectrum may be obtained. Finally, the first to-be-encoded spectrum and the second to-be-encoded spectrum are encoded by using an encoding neural network to obtain a spectrum encoding result. The spectrum encoding result may be carried by a bitstream.

    AUDIO ENCODING METHOD AND APPARATUS, AND AUDIO DECODING METHOD AND APPARATUS

    Publication number: US20240079016A1

    Publication date: 2024-03-07

    Application number: US18504102

    Filing date: 2023-11-07

    CPC classification number: G10L19/008 G10L25/03

    Abstract: An audio encoding method and apparatus and an audio decoding method and apparatus are disclosed. During encoding of an audio channel signal of a current frame, whether a first target virtual loudspeaker and a second target virtual loudspeaker corresponding to an audio channel signal of a previous frame of the current frame meet a specified condition is first determined. When the first target virtual loudspeaker and the second target virtual loudspeaker meet the specified condition, a first encoding parameter of the audio channel signal of the current frame is determined based on a second encoding parameter of the audio channel signal of the previous frame, so that the audio channel signal of the current frame is encoded based on the first encoding parameter to obtain an encoding result, and the encoding result is written into a bitstream.

    Method and Apparatus for Obtaining a Higher-Order Ambisonics (HOA) Coefficient

    Publication number: US20230421978A1

    Publication date: 2023-12-28

    Application number: US18460861

    Filing date: 2023-09-05

    CPC classification number: H04S7/30 H04S2420/11

    Abstract: A method for obtaining a higher-order ambisonics (HOA) coefficient includes obtaining location information of a virtual speaker on a preset spherical surface, where the preset spherical surface includes M circles of longitude and N circles of latitude, and obtaining, based on the location information and a preset reference trigonometric function table, a trigonometric function value corresponding to the location information, where the reference trigonometric function table includes an elevation trigonometric function table and/or an azimuth trigonometric function table, and obtaining an HOA coefficient for the virtual speaker based on the trigonometric function value corresponding to the location information.
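
The table lookup can be illustrated with a short Python sketch limited to first-order coefficients. The grid layout (M evenly spaced longitudes, N latitudes running from pole to pole), the ACN channel order with SN3D normalization, and the function names are assumptions; the abstract does not specify the table contents or the ambisonics order.

```python
import math


def build_tables(M, N):
    """Precompute (sin, cos) pairs for each longitude and latitude circle.

    Assumed layout: azimuth 2*pi*m/M for m in [0, M), and elevation running
    linearly from -pi/2 to +pi/2 over n in [0, N), with N >= 2.
    """
    az_table = [(math.sin(2 * math.pi * m / M), math.cos(2 * math.pi * m / M))
                for m in range(M)]
    el_table = [(math.sin(-math.pi / 2 + math.pi * n / (N - 1)),
                 math.cos(-math.pi / 2 + math.pi * n / (N - 1)))
                for n in range(N)]
    return az_table, el_table


def first_order_hoa(m, n, az_table, el_table):
    """First-order coefficients [W, Y, Z, X] for the speaker at grid point (m, n).

    Only table lookups and multiplications are used; no trigonometric
    function is evaluated per speaker.
    """
    sin_az, cos_az = az_table[m]
    sin_el, cos_el = el_table[n]
    return [1.0, sin_az * cos_el, sin_el, cos_az * cos_el]
```

Because every virtual speaker sits on one of the M by N grid points, the two tables hold M + N entries in total while covering M times N speaker positions.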
