-
公开(公告)号:US20200176004A1
公开(公告)日:2020-06-04
申请号:US16206823
申请日:2018-11-30
Applicant: Google LLC
Inventor: Willem Bastiaan Kleijn , Jan K. Skoglund , Alejandro Luebs , Sze Chie Lim
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
-
公开(公告)号:US20230368804A1
公开(公告)日:2023-11-16
申请号:US18144413
申请日:2023-05-08
Applicant: Google LLC
Inventor: Willem Bastiaan Kleijn , Jan K. Skoglund , Alejandro Luebs , Sze Chie Lim
CPC classification number: G10L19/0204 , G10L25/30
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
-
公开(公告)号:US20210366495A1
公开(公告)日:2021-11-25
申请号:US17332898
申请日:2021-05-27
Applicant: Google LLC
Inventor: Willem Bastiaan Kleijn , Jan K. Skoglund , Alejandro Luebs , Sze Chie Lim
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
-
公开(公告)号:US12062380B2
公开(公告)日:2024-08-13
申请号:US18144413
申请日:2023-05-08
Applicant: Google LLC
Inventor: Willem Bastiaan Kleijn , Jan K. Skoglund , Alejandro Luebs , Sze Chie Lim
CPC classification number: G10L19/0204 , G10L25/30
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
-
公开(公告)号:US11676613B2
公开(公告)日:2023-06-13
申请号:US17332898
申请日:2021-05-27
Applicant: Google LLC
Inventor: Willem Bastiaan Kleijn , Jan K. Skoglund , Alejandro Luebs , Sze Chie Lim
CPC classification number: G10L19/0204 , G10L25/30
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
-
公开(公告)号:US11024321B2
公开(公告)日:2021-06-01
申请号:US16206823
申请日:2018-11-30
Applicant: Google LLC
Inventor: Willem Bastiaan Kleijn , Jan K. Skoglund , Alejandro Luebs , Sze Chie Lim
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
-
-
-
-
-