AUDIO DECODER, AUDIO ENCODER, METHOD FOR DECODING, METHOD FOR ENCODING AND BITSTREAM, USING SCENE CONFIGURATION PACKET A CELL INFORMATION DEFINES AN ASSOCIATION BETWEEN THE ONE OR MORE CELLS AND RESPECTIVE ONE OR MORE DATA STRUCTURES

    Publication No.: WO2023083921A1

    Publication date: 2023-05-19

    Application No.: PCT/EP2022/081373

    Filing date: 2022-11-09

    Abstract: Embodiments according to the invention are related to an audio decoder, for providing a decoded audio representation on the basis of an encoded audio representation, wherein the audio decoder is configured to spatially render one or more audio signals; wherein the audio decoder is configured to receive a plurality of packets of different packet types, the packets comprising one or more scene configuration packets providing a renderer configuration information defining a usage of scene objects and/or a usage of scene characteristics, the packets comprising one or more scene update packets defining an update of scene metadata for the rendering, the packets comprising one or more scene payload packets comprising definitions of one or more of the scene objects and/or definitions of one or more of the scene characteristics; wherein the audio decoder is configured to select definitions of one or more scene objects and/or definitions of one or more scene characteristics, which are included in the scene payload packets, for the rendering in dependence on the renderer configuration information; and wherein the audio decoder is configured to update one or more scene metadata in dependence on a content of the one or more scene update packets. Further embodiments are related to encoders, methods and bitstreams. Further embodiments are related to decoders, encoders, methods and bitstreams with scene update packets with update conditions, with scene configuration packets providing a renderer configuration information defining a temporal evolution of a rendering scenario and with a timestamp information and/or with subscene cell information, wherein the cell information defines an association between the one or more cells and respective one or more data structures.
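
The packet handling described in this abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the class `SceneDecoder`, the packet type strings, and the dictionary layout are hypothetical and are not taken from the application or from any bitstream syntax. The sketch only shows the three roles: payload packets deliver definitions, the configuration packet selects which of them are used, and update packets change scene metadata.

```python
from dataclasses import dataclass, field


@dataclass
class SceneDecoder:
    """Toy model of the three packet roles (all names are hypothetical)."""
    payloads: dict = field(default_factory=dict)   # every definition received so far
    active_ids: set = field(default_factory=set)   # ids selected by the configuration
    metadata: dict = field(default_factory=dict)   # scene metadata used for rendering

    def receive(self, packet_type, content):
        if packet_type == "scene_config":
            # Renderer configuration: which scene objects are in use.
            self.active_ids = set(content["use"])
        elif packet_type == "scene_payload":
            # Definitions of scene objects or scene characteristics.
            self.payloads.update(content)
        elif packet_type == "scene_update":
            # Update of scene metadata for the rendering.
            self.metadata.update(content)
        else:
            raise ValueError(f"unknown packet type: {packet_type}")

    def objects_to_render(self):
        # Select only the definitions that the configuration asks for.
        return {k: v for k, v in self.payloads.items() if k in self.active_ids}
```

A decoder built this way keeps all received definitions and filters at render time, so a later configuration packet can switch objects on or off without resending payloads. Whether the claimed decoder behaves like that is not stated in the abstract.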

    AUDIO SIGNAL RECONSTRUCTION
    Invention application

    Publication No.: WO2023069805A1

    Publication date: 2023-04-27

    Application No.: PCT/US2022/076172

    Filing date: 2022-09-09

    Abstract: A method includes receiving audio data that includes magnitude spectrum data descriptive of an audio signal. The method also includes providing the audio data as input to a neural network to generate an initial phase estimate for one or more samples of the audio signal. The method further includes determining, using a phase estimation algorithm, target phase data for the one or more samples of the audio signal based on the initial phase estimate and a magnitude spectrum of the one or more samples of the audio signal indicated by the magnitude spectrum data. The method also includes reconstructing the audio signal based on a target phase of the one or more samples of the audio signal indicated by the target phase data and based on the magnitude spectrum.
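
The flow in this abstract (initial phase estimate, then refinement by a phase estimation algorithm, then reconstruction from magnitude and phase) can be sketched as follows. The abstract does not name the phase estimation algorithm; this sketch swaps in Griffin-Lim iterations as one plausible choice. The neural network is not modeled: `initial_phase` is a stand-in for its output. All function names, the frame size, and the hop size are assumptions.

```python
import numpy as np


def stft(x, n_fft=64, hop=16):
    """Hann-windowed STFT; returns shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft + 1)[:-1]  # periodic Hann window
    frames = [x[i:i + n_fft] * window
              for i in range(0, len(x) - n_fft + 1, hop)]
    return np.fft.rfft(np.array(frames), axis=1)


def istft(spec, n_fft=64, hop=16):
    """Least-squares overlap-add inverse of stft()."""
    window = np.hanning(n_fft + 1)[:-1]
    length = hop * (spec.shape[0] - 1) + n_fft
    out = np.zeros(length)
    norm = np.zeros(length)
    for k, frame in enumerate(np.fft.irfft(spec, n=n_fft, axis=1)):
        start = k * hop
        out[start:start + n_fft] += frame * window
        norm[start:start + n_fft] += window ** 2
    return out / np.maximum(norm, 1e-8)  # guard samples with zero window weight


def refine_phase(magnitude, initial_phase, n_iter=50, n_fft=64, hop=16):
    """Griffin-Lim iterations starting from a given phase estimate."""
    phase = initial_phase
    for _ in range(n_iter):
        signal = istft(magnitude * np.exp(1j * phase), n_fft, hop)
        phase = np.angle(stft(signal, n_fft, hop))  # keep phase, discard magnitude
    return phase


def inconsistency(magnitude, phase, n_fft=64, hop=16):
    """Distance between the target magnitude and the re-analysed magnitude."""
    signal = istft(magnitude * np.exp(1j * phase), n_fft, hop)
    return float(np.linalg.norm(np.abs(stft(signal, n_fft, hop)) - magnitude))


def reconstruct(magnitude, initial_phase, n_iter=50, n_fft=64, hop=16):
    """Refine the phase, then rebuild the waveform from magnitude and phase."""
    phase = refine_phase(magnitude, initial_phase, n_iter, n_fft, hop)
    return istft(magnitude * np.exp(1j * phase), n_fft, hop)
```

Starting the iterations from a learned estimate rather than from random phase is the point of the two-stage design: a better starting point typically needs fewer iterations to reach a consistent spectrogram.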

    ENCODING AND DECODING METHOD, APPARATUS, DEVICE, STORAGE MEDIUM, AND COMPUTER PROGRAM

    Publication No.: WO2022258036A1

    Publication date: 2022-12-15

    Application No.: PCT/CN2022/098016

    Filing date: 2022-06-10

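    Abstract: The embodiments of this application disclose an encoding and decoding method, apparatus, device, storage medium, and computer program, belonging to the field of coding technology. In the embodiments, a first whitened spectrum of media data is shaped to obtain a second whitened spectrum, and encoding is then performed based on the second whitened spectrum. The spectral magnitude of the second whitened spectrum within a target frequency band is greater than or equal to the spectral magnitude of the first whitened spectrum within that band. By raising the spectral magnitude of the first whitened spectrum within the target band, the scheme reduces the difference in statistical average energy between spectral lines at different frequencies in the resulting second whitened spectrum. As a result, when an encoding neural network model processes the second whitened spectrum, more of its spectral lines can be retained; that is, the scheme can encode more spectral lines, preserving more spectral features and improving coding quality.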
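
The shaping step can be illustrated with a minimal sketch. The abstract states only that the second whitened spectrum is at least as large in magnitude as the first within the target band; it does not say how the magnitudes are raised. A constant gain is assumed here purely for illustration, and the function name, the `(start, stop)` band convention, and the default gain are all hypothetical.

```python
def shape_whitened_spectrum(spectrum, band, gain=1.5):
    """Raise magnitudes inside the target band (assumed constant-gain shaping).

    spectrum: list of spectral line values (may be signed coefficients)
    band:     (start, stop) bin indices, stop exclusive
    gain:     factor >= 1, so magnitudes in the band never decrease
    """
    if gain < 1.0:
        raise ValueError("gain must be >= 1 so band magnitudes do not shrink")
    start, stop = band
    return [v * gain if start <= i < stop else v
            for i, v in enumerate(spectrum)]
```

For example, `shape_whitened_spectrum([1.0, -0.5, 0.2, 0.1], (2, 4), gain=2.0)` doubles the two weakest lines and leaves the rest unchanged, which narrows the energy gap between the low-energy band and the remaining lines.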

    AUDIO DIRECTIVITY CODING
    Invention application

    Publication No.: WO2022248632A1

    Publication date: 2022-12-01

    Application No.: PCT/EP2022/064343

    Filing date: 2022-05-25

    Abstract: The application discloses techniques for compressively encoding and decoding an audio signal representing a directivity pattern, the audio values having different values according to different discrete positions defined on a unit sphere. The audio signal values are encoded in a bitstream as prediction residual values. The prediction residual values are used in sequences to obtain predicted audio signal values by moving along positions defined on parallel lines, parallel to an equator of the sphere, the parallel lines being defined from a first pole toward a second pole of the sphere. The predicted values are obtained based on an initial prediction sequence, on adjacent discrete positions preceding a given position, or on interpolated versions of the audio values of a previously predicted adjacent parallel line.
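
The idea of coding values as prediction residuals can be shown with a minimal sketch that covers only the simplest case named in the abstract: along a single parallel line, each position is predicted from the adjacent position that precedes it. Interpolation from a neighbouring parallel line and the initial prediction sequence at the pole are not modeled. The function names and the scalar `initial_prediction` are assumptions, and integer values are used so the round trip is exact.

```python
def encode_parallel(values, initial_prediction=0):
    """Return residuals for one parallel line (previous-neighbour prediction)."""
    residuals = []
    prediction = initial_prediction
    for value in values:
        residuals.append(value - prediction)
        prediction = value  # the next position is predicted from this one
    return residuals


def decode_parallel(residuals, initial_prediction=0):
    """Rebuild the values by adding each residual to the running prediction."""
    values = []
    prediction = initial_prediction
    for residual in residuals:
        value = prediction + residual
        values.append(value)
        prediction = value
    return values
```

Because directivity patterns tend to vary smoothly between neighbouring positions, the residuals are usually smaller than the values themselves, which is what makes them cheaper to entropy-code.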

    AUDIO PROCESSING AND VIDEO PROCESSING METHODS, APPARATUS, DEVICE, AND STORAGE MEDIUM

    Publication No.: WO2022227037A1

    Publication date: 2022-11-03

    Application No.: PCT/CN2021/091615

    Filing date: 2021-04-30

    Inventor: 席迎来

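    Abstract: This application provides audio processing and video processing methods, apparatus, devices, and storage media. The audio processing method includes: obtaining audio to be processed; identifying beat points of the audio; and, according to the energy amplitudes of the beat points, selecting at least one rhythm point from the multiple beat points of the audio and outputting it, where the energy amplitude of each rhythm point is greater than the energy amplitude of every other beat point that is not a rhythm point. The scheme of the embodiments can mark rhythm points in audio automatically. The method places no restriction on the audio to be processed, so a user can specify any file that contains audio and use the marked rhythm points for subsequent processing, which makes the method highly flexible.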
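
The selection rule in this abstract can be sketched in a few lines. Beat detection itself is not modeled: the input is assumed to be a list of already detected beats as `(time_seconds, energy)` pairs. The function name and the `count` parameter are hypothetical.

```python
def select_rhythm_points(beats, count=1):
    """Pick the `count` highest-energy beats and return them in time order.

    beats: list of (time_seconds, energy) tuples for detected beat points
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    strongest = sorted(beats, key=lambda beat: beat[1], reverse=True)[:count]
    return sorted(strongest)  # tuples sort by time first
```

Taking the top `count` beats by energy satisfies the stated condition that every rhythm point has a larger energy amplitude than every remaining beat point, provided there is no tie at the cut-off.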

    SPEECH SYNTHESIS METHOD, APPARATUS, DEVICE, AND STORAGE MEDIUM

    Publication No.: WO2022141678A1

    Publication date: 2022-07-07

    Application No.: PCT/CN2021/072428

    Filing date: 2021-01-18

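    Abstract: A speech synthesis method, apparatus, device, and storage medium. The speech synthesis method includes: obtaining an original text, a phoneme sequence corresponding to the original text, and speaker features of the speech to be synthesized (S100); fusing features of the original text and the phoneme sequence to obtain a fused feature (S110); performing encoding and decoding based on the fused feature and the speaker features to obtain an acoustic spectrum (S120); and performing speech synthesis based on the acoustic spectrum to obtain synthesized speech (S130). Fusing the original text with the phoneme sequence enriches the input information and makes it possible to capture pronunciation information specific to different languages, so the synthesized speech is more natural and matches the pronunciation characteristics of the corresponding language; that is, the synthesized speech is of higher quality.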

    A VIBRATION FREQUENCY DESIGN METHOD BASED ON MUSIC FREQUENCY

    Publication No.: WO2022134213A1

    Publication date: 2022-06-30

    Application No.: PCT/CN2021/070421

    Filing date: 2021-01-06

    Inventor: 张燕昕 郑亚军

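    Abstract: The invention provides a vibration frequency design method based on music frequency, comprising the following steps. S1: preset a group of quantization modules, including a personalized input quantization module, a music feature quantization module, and a vibration effect quantization module. S2: the user enters personalized parameters, and the personalized input quantization module yields specific quantized values for the personalized input. S3: music features are extracted from the music signal, and the music feature quantization module yields specific quantized values for the music features. S4: quantization calculation: a formula is applied to obtain a relative vibration effect frequency value. S5: the vibration effect quantization module maps the relative vibration effect frequency value to an absolute vibration effect frequency value. S6: the motor plays the vibration based on the absolute vibration effect frequency value. The vibration frequency design method of the invention achieves a perfect conversion from auditory music frequency to tactile vibration frequency and provides designers or users with an efficient and rich haptic experience.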
