-
Publication number: US20230368804A1
Publication date: 2023-11-16
Application number: US18144413
Application date: 2023-05-08
Applicant: Google LLC
Inventor: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim
CPC classification number: G10L19/0204, G10L25/30
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
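The decoding loop in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: `toy_score_fn` is a hypothetical stand-in for the auto-regressive generative neural network, `NUM_LEVELS` is an assumed number of quantized sample values, and the conditioning sequence is assumed to be already derived from the parametric coder parameters, with one value per frame.

```python
import math
import random

NUM_LEVELS = 16  # assumed number of quantized speech sample values


def toy_score_fn(history, cond):
    """Hypothetical stand-in for the auto-regressive generative network.

    Returns one unnormalized score per possible sample value, favoring
    values near a blend of the previous sample and the conditioning value.
    """
    prev = history[-1] if history else NUM_LEVELS // 2
    target = 0.5 * prev + 0.5 * cond
    return [-abs(v - target) for v in range(NUM_LEVELS)]


def softmax(scores):
    """Turn unnormalized scores into a probability distribution."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode(conditioning, steps_per_frame, score_fn=toy_score_fn, seed=0):
    """Generate one sample per decoder time step, conditioned per frame."""
    rng = random.Random(seed)
    reconstruction = []  # the current reconstruction sequence
    for cond in conditioning:
        for _ in range(steps_per_frame):
            # Score distribution over possible sample values.
            probs = softmax(score_fn(reconstruction, cond))
            # Sample a speech sample from the possible values.
            sample = rng.choices(range(NUM_LEVELS), weights=probs, k=1)[0]
            reconstruction.append(sample)
    return reconstruction
```

Calling `decode([3, 12], steps_per_frame=4)` yields eight samples, four per conditioning value. A real decoder would replace `toy_score_fn` with a trained network.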
-
Publication number: US20210366495A1
Publication date: 2021-11-25
Application number: US17332898
Application date: 2021-05-27
Applicant: Google LLC
Inventor: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
-
Publication number: US10110187B1
Publication date: 2018-10-23
Application number: US15632498
Application date: 2017-06-26
Applicant: GOOGLE LLC
Inventor: Alejandro Luebs, Fritz Obermeyer
Abstract: Mixture model based soft-clipping detection includes receiving input audio samples, generating soft-clipping information indicating whether the input audio samples include soft-clipping distortion, and outputting the soft-clipping information. Generating the soft-clipping information includes fitting a mixture model to the input audio samples, wherein fitting the mixture model to the input audio samples includes generating a fitted mixture model, such that the fitted mixture model has fitted parameters, and evaluating a soft-clipping distortion metric based on the parameters of the fitted mixture model, wherein evaluating the soft-clipping distortion metric includes identifying a soft-clipping distortion value.
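As a rough illustration of the idea in this abstract, the sketch below fits a small mixture model to audio samples and reads a distortion value off the fitted parameters. Everything specific here is an assumption, not taken from the patent: the three-component Gaussian mixture (a zero-mean core plus two narrow components pinned at the positive and negative peak), the function name `soft_clipping_metric`, the `edge_std_frac` parameter, and the choice of the combined edge weight as the metric.

```python
import math
import random


def _gauss_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def soft_clipping_metric(samples, iters=20, edge_std_frac=0.05):
    """Fit an assumed 3-component mixture with EM; return the edge weight.

    Component means are fixed at -peak, 0, +peak. EM fits the mixture
    weights and the core component's standard deviation only.
    """
    peak = max(abs(x) for x in samples)
    if peak == 0.0:
        return 0.0
    means = (-peak, 0.0, peak)
    edge_std = edge_std_frac * peak
    core_std = peak / 3.0
    weights = [0.1, 0.8, 0.1]
    n = len(samples)
    for _ in range(iters):
        stds = (edge_std, core_std, edge_std)
        resp_sums = [0.0, 0.0, 0.0]
        core_sq = 0.0
        # E-step: responsibility of each component for each sample.
        for x in samples:
            likes = [w * _gauss_pdf(x, m, s)
                     for w, m, s in zip(weights, means, stds)]
            total = sum(likes)
            if total == 0.0:
                continue
            for k in range(3):
                resp_sums[k] += likes[k] / total
            core_sq += (likes[1] / total) * x * x
        # M-step: update the weights and the core spread.
        weights = [r / n for r in resp_sums]
        if resp_sums[1] > 0.0:
            core_std = max(math.sqrt(core_sq / resp_sums[1]), 1e-3 * peak)
    # Assumed metric: share of samples explained by the two edge components.
    return weights[0] + weights[2]


# Usage: compare a clean signal with a tanh-saturated copy of it.
rng = random.Random(0)
clean = [rng.gauss(0.0, 0.1) for _ in range(2000)]
clipped = [math.tanh(30.0 * x) for x in clean]
clean_metric = soft_clipping_metric(clean)
clipped_metric = soft_clipping_metric(clipped)
```

In this toy setup the saturated signal piles samples up near its peaks, so `clipped_metric` comes out larger than `clean_metric`. A practical detector would compare the value against a tuned threshold.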
-
Publication number: US20200176004A1
Publication date: 2020-06-04
Application number: US16206823
Application date: 2018-11-30
Applicant: Google LLC
Inventor: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
-
Publication number: US12062380B2
Publication date: 2024-08-13
Application number: US18144413
Application date: 2023-05-08
Applicant: Google LLC
Inventor: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim
CPC classification number: G10L19/0204, G10L25/30
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
-
Publication number: US11676613B2
Publication date: 2023-06-13
Application number: US17332898
Application date: 2021-05-27
Applicant: Google LLC
Inventor: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim
CPC classification number: G10L19/0204, G10L25/30
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
-
Publication number: US11024321B2
Publication date: 2021-06-01
Application number: US16206823
Application date: 2018-11-30
Applicant: Google LLC
Inventor: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.