SPEECH CODING USING AUTO-REGRESSIVE GENERATIVE NEURAL NETWORKS

    Publication number: US20230368804A1

    Publication date: 2023-11-16

    Application number: US18144413

    Filing date: 2023-05-08

    Applicant: Google LLC

    CPC classification number: G10L19/0204 G10L25/30

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
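
The decoding loop described in the abstract above can be sketched roughly as follows. This is a minimal illustration and not the claimed implementation. `toy_score_fn` is a hypothetical stand-in for the auto-regressive generative neural network. The 256-level sample alphabet, the repeat-based upsampling of frame-level parameters, and the softmax normalization are all assumptions that the abstract does not state.

```python
import math
import random

NUM_LEVELS = 256  # assumption: speech samples quantized to 256 possible values


def toy_score_fn(history, cond):
    """Hypothetical stand-in for the network: one unnormalized score per sample value."""
    prev = history[-1] if history else NUM_LEVELS // 2
    center = 0.5 * prev + 0.5 * cond  # scores peak between last sample and conditioning
    return [-((v - center) ** 2) / 200.0 for v in range(NUM_LEVELS)]


def softmax(scores):
    """Turn unnormalized scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def upsample(params, steps_per_frame):
    """Assumed conditioning step: repeat each frame-level parameter per decoder step."""
    return [p for p in params for _ in range(steps_per_frame)]


def decode(params, steps_per_frame, rng):
    """Generate one speech sample per decoder time step, auto-regressively."""
    cond_seq = upsample(params, steps_per_frame)
    recon = []  # current reconstruction sequence
    for t in range(len(cond_seq)):
        # Score distribution over possible sample values, conditioned on cond_seq[t].
        probs = softmax(toy_score_fn(recon, cond_seq[t]))
        # Sample the next speech sample and append it to the reconstruction.
        recon.append(rng.choices(range(NUM_LEVELS), weights=probs, k=1)[0])
    return recon
```

For example, `decode([100, 150], 4, random.Random(0))` yields eight samples, four per frame of the hypothetical parameters. Each sample is drawn rather than chosen greedily, matching the sampling step the abstract describes.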

    SPEECH CODING USING AUTO-REGRESSIVE GENERATIVE NEURAL NETWORKS

    Publication number: US20210366495A1

    Publication date: 2021-11-25

    Application number: US17332898

    Filing date: 2021-05-27

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.

    Mixture model based soft-clipping detection

    Publication number: US10110187B1

    Publication date: 2018-10-23

    Application number: US15632498

    Filing date: 2017-06-26

    Applicant: GOOGLE LLC

    Abstract: Mixture model based soft-clipping detection includes receiving input audio samples, generating soft-clipping information indicating whether the input audio samples include soft-clipping distortion, and outputting the soft-clipping information. Generating the soft-clipping information includes fitting a mixture model to the input audio samples, wherein fitting the mixture model to the input audio samples includes generating a fitted mixture model, such that the fitted mixture model has fitted parameters, and evaluating a soft-clipping distortion metric based on the parameters of the fitted mixture model, wherein evaluating the soft-clipping distortion metric includes identifying a soft-clipping distortion value.
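
As a rough illustration of the fit-then-evaluate flow in the abstract above, the sketch below fits a mixture model to audio samples and derives a metric from the fitted parameters. Everything specific here is an assumption, because the abstract names neither the mixture family nor the metric: the two-component Gaussian mixture over normalized sample magnitudes, the EM fitting routine `fit_gmm_1d`, and the weight-over-spread metric in `soft_clipping_metric` are all hypothetical choices made for the example.

```python
import math
import random


def fit_gmm_1d(data, iters=50):
    """Fit a two-component 1-D Gaussian mixture with EM; returns (weights, means, stds)."""
    lo, hi = min(data), max(data)
    means = [lo + 0.25 * (hi - lo), lo + 0.9 * (hi - lo)]
    stds = [(hi - lo) / 4 or 1.0] * 2
    weights = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            p = [w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
                 for w, m, s in zip(weights, means, stds)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate the fitted parameters from the responsibilities.
        for k in range(2):
            nk = sum(r[k] for r in resp) or 1e-12
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            stds[k] = max(math.sqrt(var), 1e-3)  # floor keeps the density finite
    return weights, means, stds


def soft_clipping_metric(samples):
    """Hypothetical metric: mass of the upper-magnitude component divided by its spread."""
    peak = max(abs(x) for x in samples) or 1.0
    mags = [abs(x) / peak for x in samples]
    weights, means, stds = fit_gmm_1d(mags)
    hi = 0 if means[0] > means[1] else 1
    # Soft clipping piles samples up near the peak: a heavy, narrow upper component.
    return weights[hi] / stds[hi]
```

With this assumed metric, a signal passed through a saturating nonlinearity such as `math.tanh(5 * x)` should score higher than the unclipped signal, since its magnitudes cluster tightly near the peak. A threshold on the metric would then give the soft-clipping decision, and choosing that threshold is outside this sketch.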

    SPEECH CODING USING AUTO-REGRESSIVE GENERATIVE NEURAL NETWORKS

    Publication number: US20200176004A1

    Publication date: 2020-06-04

    Application number: US16206823

    Filing date: 2018-11-30

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.

    Speech coding using auto-regressive generative neural networks

    Publication number: US12062380B2

    Publication date: 2024-08-13

    Application number: US18144413

    Filing date: 2023-05-08

    Applicant: Google LLC

    CPC classification number: G10L19/0204 G10L25/30

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.

    Speech coding using auto-regressive generative neural networks

    Publication number: US11676613B2

    Publication date: 2023-06-13

    Application number: US17332898

    Filing date: 2021-05-27

    Applicant: Google LLC

    CPC classification number: G10L19/0204 G10L25/30

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.

    Speech coding using auto-regressive generative neural networks

    Publication number: US11024321B2

    Publication date: 2021-06-01

    Application number: US16206823

    Filing date: 2018-11-30

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
