Patent search ap:("GOOGLE LLC") AND inv:"Sze Chie Lim" Page 1

1.

发明授权
Spatial audio recording from home assistant devices 有权

公开(公告)号：US12200465B2

公开(公告)日：2025-01-14

申请号：US17748356

申请日：2022-05-19

Applicant: Google LLC

Inventor： Rajeev Conrad Nongpiur , Qian Zhang , Andrew James Sutter , Kung-Wei Liu , Jihan Li , Hélène Bahu , Leonardo Kusumo , Sze Chie Lim , Marco Tagliasacchi , Neil Zeghidour , Michael Takezo Chinen

IPC: H04S3/00 , G06N20/00 , G10L19/008 , H04R3/00 , H04R5/027 , H04S7/00

Abstract: The technology generally relates to spatial audio communication between devices. For example, a first device and a second device may be connected via a communication link. The first device may capture audio signals in an environment through two or more microphones. The first device may encode the captured audio with spatial configuration data. The first device may transmit the encoded audio via the communication link to the second device. The second device may decode the encoded audio into binaural or ambisonic audio to be output by one or more speakers of the second device. The binaural or ambisonic audio may be converted into spatial audio to be output. The second device may output the binaural or spatial audio to create an immersive listening experience.

2.

发明申请
SPEECH CODING USING AUTO-REGRESSIVE GENERATIVE NEURAL NETWORKS 有权

公开(公告)号：US20210366495A1

公开(公告)日：2021-11-25

申请号：US17332898

申请日：2021-05-27

Applicant: Google LLC

Inventor： Willem Bastiaan Kleijn , Jan K. Skoglund , Alejandro Luebs , Sze Chie Lim

IPC: G10L19/02 , G10L25/30

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.

3.

发明授权
Speech coding using auto-regressive generative neural networks 有权

公开(公告)号：US12062380B2

公开(公告)日：2024-08-13

申请号：US18144413

申请日：2023-05-08

Applicant: Google LLC

Inventor： Willem Bastiaan Kleijn , Jan K. Skoglund , Alejandro Luebs , Sze Chie Lim

IPC: G10L19/02 , G10L25/30

CPC classification number: G10L19/0204 , G10L25/30

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.

4.

发明授权
Speech coding using auto-regressive generative neural networks 有权

公开(公告)号：US11676613B2

公开(公告)日：2023-06-13

申请号：US17332898

申请日：2021-05-27

Applicant: Google LLC

Inventor： Willem Bastiaan Kleijn , Jan K. Skoglund , Alejandro Luebs , Sze Chie Lim

IPC: G10L19/02 , G10L25/30

CPC classification number: G10L19/0204 , G10L25/30

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.

5.

发明授权
Speech coding using auto-regressive generative neural networks 有权

公开(公告)号：US11024321B2

公开(公告)日：2021-06-01

申请号：US16206823

申请日：2018-11-30

Applicant: Google LLC

Inventor： Willem Bastiaan Kleijn , Jan K. Skoglund , Alejandro Luebs , Sze Chie Lim

IPC: G10L19/02 , G10L25/30

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.

6.

发明申请
SPEECH CODING USING AUTO-REGRESSIVE GENERATIVE NEURAL NETWORKS 审中-公开

公开(公告)号：US20200176004A1

公开(公告)日：2020-06-04

申请号：US16206823

申请日：2018-11-30

Applicant: Google LLC

Inventor： Willem Bastiaan Kleijn , Jan K. Skoglund , Alejandro Luebs , Sze Chie Lim

IPC: G10L19/02 , G10L25/30

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.

7.

发明公开
SPECIFYING LOUDNESS IN AN IMMERSIVE AUDIO PACKAGE 审中-公开

公开(公告)号：US20240329915A1

公开(公告)日：2024-10-03

申请号：US18353037

申请日：2023-07-14

Applicant: GOOGLE LLC

Inventor： Sze Chie Lim , Shawn Singh

IPC: G06F3/16

CPC classification number: G06F3/165 , G06F3/162

Abstract: A method including generating an audio stream including a first substream as first audio data and a second substream as second audio data, generating a first loudness parameter associated with playback of the first substream, generating a second loudness parameter associated with playback of the second substream, and generating an audio package including an identification corresponding to the first audio data, an identification corresponding to the second audio data, and a codec agnostic container including the first loudness parameter, and the second loudness parameter.

8.

发明申请
CODING OF A SOUNDFIELD REPRESENTATION 审中-公开

公开(公告)号：US20190259397A1

公开(公告)日：2019-08-22

申请号：US16404076

申请日：2019-05-06

Applicant: Google LLC

Inventor： Willem Bastiaan Kleijn , Jan Skoglund , Sze Chie Lim

IPC: G10L19/008 , G10L19/20 , H04S3/00 , G10L19/24

Abstract: A method includes: receiving a representation of a soundfield, the representation characterizing the soundfield around a point in space; decomposing the received representation into independent signals; and encoding the independent signals, wherein a quantization noise for any of the independent signals has a common spatial profile with the independent signal.

9.

发明授权
Identifying salient features for generative networks 有权

公开(公告)号：US12242567B2

公开(公告)日：2025-03-04

申请号：US17250506

申请日：2019-05-16

Applicant: Google LLC

Inventor： Willem Bastiaan Kleijn , Sze Chie Lim , Michael Chinen , Jan Skoglund

IPC: G06N3/045 , G06F18/2113 , G06F18/213 , G06N3/08 , G06N3/088

Abstract: Implementations identify a small set of independent, salient features from an input signal. The salient features may be used for conditioning a generative network, making the generative network robust to noise. The salient features may facilitate compression and data transmission. An example method includes receiving an input signal and extracting salient features for the input signal by providing the input signal to an encoder trained to extract salient features. The salient features may be independent and have a sparse distribution. The encoder may be configured to generate almost identical features from two input signals a system designer deems equivalent. The method also includes conditioning a generative network using the salient features. In some implementations, the method may also include extracting a plurality of time sequences from the input signal and extracting the salient features for each time sequence.

10.

发明公开
IMMERSIVE AUDIO PACKAGE 审中-公开

公开(公告)号：US20240331709A1

公开(公告)日：2024-10-03

申请号：US18355928

申请日：2023-07-20

Applicant: GOOGLE LLC

Inventor： Sze Chie Lim , Shawn Singh , Anjali Wheeler , Jani Huoponen , Jan Skoglund

IPC: G10L19/02

CPC classification number: G10L19/02

Abstract: A method including receiving first audio data, receiving second audio data, compressing the first audio data as first compressed audio data, compressing the second audio data as second compressed audio data, generating a codec dependent container including a parameter associated with compressing the first audio data, compressing the second audio data, a reference to the first compressed audio data, and a reference to the second compressed audio data, generating a codec agnostic container including at least one parameter representing time-varying data associated with playback of the first audio data and the second audio data, and generating an audio package including the codec dependent container and the codec agnostic container.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification