-
Publication number: US11924618B2
Publication date: 2024-03-05
Application number: US17959734
Application date: 2022-10-04
Applicant: Google LLC
Inventor: Rajeev Conrad Nongpiur, Ananya Misra, Chanwoo Kim
CPC classification number: H04R3/005, H04R5/027, H04R29/005, H04R29/006, H04R2201/401, H04R2430/20
Abstract: A method for auralizing a multi-microphone device. Path information for one or more sound paths using dimensions and room reflection coefficients of a simulated room for one of a plurality of microphones included in a multi-microphone device is determined. An array-related transfer functions (ARTFs) for the one of the plurality of microphones is retrieved. The auralized impulse response for the one of the plurality of microphones is generated based at least on the retrieved ARTFs and the determined path information.
-
Publication number: US20230379645A1
Publication date: 2023-11-23
Application number: US17748356
Application date: 2022-05-19
Applicant: Google LLC
Inventor: Rajeev Conrad Nongpiur, Qian Zhang, Andrew James Sutter, Kung-Wei Liu, Jihan Li, Hélène Bahu, Leonardo Kusumo, Sze Chie Lim, Marco Tagliasacchi, Neil Zeghidour, Michael Takezo Chinen
CPC classification number: H04S7/30, G10L19/008, H04R5/027, H04R3/005, H04S3/008, G06N20/00, H04S2420/11, H04S2400/11, H04S2400/15, H04S2420/03, H04S2400/01, H04R2420/07
Abstract: The technology generally relates to spatial audio communication between devices. For example, a first device and a second device may be connected via a communication link. The first device may capture audio signals in an environment through two or more microphones. The first device may encode the captured audio with spatial configuration data. The first device may transmit the encoded audio via the communication link to the second device. The second device may decode the encoded audio into binaural or ambisonic audio to be output by one or more speakers of the second device. The binaural or ambisonic audio may be converted into spatial audio to be output. The second device may output the binaural or spatial audio to create an immersive listening experience.
-
Publication number: US20220247978A1
Publication date: 2022-08-04
Application number: US17659718
Application date: 2022-04-19
Applicant: Google LLC
Inventor: Jason Evans Goulden, Rengarajan Aravamudhan, Hae Rim Jeong, Michael Dixon, James Edward Stewart, Sayed Yusef Shafi, Sahana Mysore, Seungho Yang, Yu-An Lien, Christopher Charles Burns, Rajeev Conrad Nongpiur, Jeffrey Boyd
Abstract: A method of presenting appropriate actions for responding to a visitor to a smart home environment via an electronic greeting system of the smart home environment, including detecting a visitor of the smart home environment; obtaining context information from the smart home environment regarding the visitor; based on the context information, identifying a plurality of appropriate actions available to a user of a client device for interacting with the visitor via the electronic greeting system; and causing the identified actions to be presented to the user of the client device.
-
Publication number: US20190387315A1
Publication date: 2019-12-19
Application number: US16555118
Application date: 2019-08-29
Applicant: Google LLC
Inventor: Rajeev Conrad Nongpiur, Ananya Misra, Chanwoo Kim
Abstract: A method for auralizing a multi-microphone device. Path information for one or more sound paths using dimensions and room reflection coefficients of a simulated room for one of a plurality of microphones included in a multi-microphone device is determined. An array-related transfer functions (ARTFs) for the one of the plurality of microphones is retrieved. The auralized impulse response for the one of the plurality of microphones is generated based at least on the retrieved ARTFs and the determined path information.
-
Publication number: US12073319B2
Publication date: 2024-08-27
Application number: US16940294
Application date: 2020-07-27
Applicant: GOOGLE LLC
Inventor: Rajeev Conrad Nongpiur, Byungchul Kim, Marie Vachovsky, Monica Song, Khe Chai Sim, Qian Zhang
Abstract: Systems and techniques are provided for sound model localization within an environment. Sound recordings of sounds in the environment may be received from devices in the environment. Preliminary labels for the sound recordings may be determined using pre-trained sound models. The preliminary labels may have associated probabilities. Sound clips with preliminary labels may be generated based on sound recordings that have preliminary labels whose probability is over a high-recall threshold for the pre-trained sound model that determined the preliminary label. The sound clips with preliminary labels may be sent to a user device. Labeled sound clips may be received from the user device. The labeled sound clips may be based on the sound clips with preliminary labels. Training data sets may be generated for the pre-trained sound models using the labeled sound clips. The pre-trained sound models may be trained using the training data sets to generate localized sound models.
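The clip-selection step in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `PreliminaryLabel` record, the `select_clips` helper, the model names, and the threshold values are all hypothetical and chosen only for illustration. It shows one plausible reading of the filter, in which a preliminary label is kept when its probability exceeds the high-recall threshold of the model that produced it.

```python
from dataclasses import dataclass


@dataclass
class PreliminaryLabel:
    """Hypothetical record for one preliminary label on one recording."""
    recording_id: str
    model_name: str
    label: str
    probability: float


def select_clips(labels, high_recall_thresholds):
    """Keep labels whose probability is over the threshold of their own model.

    `high_recall_thresholds` maps a model name to its threshold (assumed values).
    """
    return [
        item for item in labels
        if item.probability > high_recall_thresholds[item.model_name]
    ]


# Illustrative data only; none of these values come from the patent.
labels = [
    PreliminaryLabel("rec1", "doorbell_model", "doorbell", 0.42),
    PreliminaryLabel("rec2", "doorbell_model", "doorbell", 0.08),
    PreliminaryLabel("rec3", "dog_bark_model", "dog_bark", 0.31),
]
thresholds = {"doorbell_model": 0.2, "dog_bark_model": 0.4}

selected = select_clips(labels, thresholds)
selected_ids = [item.recording_id for item in selected]
```

A low threshold favors recall: more candidate clips reach the user for labeling, at the cost of more false positives that the user then corrects.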
-
Publication number: US11470419B2
Publication date: 2022-10-11
Application number: US16555118
Application date: 2019-08-29
Applicant: Google LLC
Inventor: Rajeev Conrad Nongpiur, Ananya Misra, Chanwoo Kim
Abstract: A method for auralizing a multi-microphone device. Path information for one or more sound paths using dimensions and room reflection coefficients of a simulated room for one of a plurality of microphones included in a multi-microphone device is determined. An array-related transfer functions (ARTFs) for the one of the plurality of microphones is retrieved. The auralized impulse response for the one of the plurality of microphones is generated based at least on the retrieved ARTFs and the determined path information.
-
Publication number: US20220238112A1
Publication date: 2022-07-28
Application number: US17722960
Application date: 2022-04-18
Applicant: Google LLC
Inventor: Chanwoo Kim, Rajeev Conrad Nongpiur, Michiel A.U. Bacchiani
Abstract: Systems and methods are described for improving endpoint detection of a voice query submitted by a user. In some implementations, a synchronized video data and audio data is received. A sequence of frames of the video data that includes images corresponding to lip movement on a face is determined. The audio data is endpointed based on first audio data that corresponds to a first frame of the sequence of frames and second audio data that corresponds to a last frame of the sequence of frames. A transcription of the endpointed audio data is generated by an automated speech recognizer. The generated transcription is then provided for output.
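The endpointing idea in this abstract can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the method claimed in the patent: the `endpoint_audio` function, the per-frame boolean `lip_moving` flags, and the fixed frame rate and sample rate are hypothetical, and the lip-movement detector itself is assumed to exist elsewhere. The sketch trims the audio to the span between the first and last video frames that show lip movement.

```python
def endpoint_audio(samples, lip_moving, fps, sample_rate):
    """Trim `samples` to the audio spanning the first..last lip-movement frames.

    samples:     sequence of audio samples synchronized with the video
    lip_moving:  one boolean per video frame (assumed detector output)
    fps:         video frames per second
    sample_rate: audio samples per second
    """
    moving = [i for i, flag in enumerate(lip_moving) if flag]
    if not moving:
        return samples[:0]  # no lip movement detected: empty result

    first_frame, last_frame = moving[0], moving[-1]
    start = int(first_frame * sample_rate / fps)
    # Include the audio for the whole last frame, clamped to the audio length.
    end = min(int((last_frame + 1) * sample_rate / fps), len(samples))
    return samples[start:end]


# Illustrative input: 5 frames at 10 fps, 200 samples/s, so 20 samples per frame.
audio = list(range(100))
flags = [False, True, True, False, False]
trimmed = endpoint_audio(audio, flags, fps=10, sample_rate=200)
```

In this example, frames 1 and 2 show lip movement, so the trimmed audio covers samples 20 through 59. The trimmed audio would then be passed to a speech recognizer.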
-
Publication number: US11356643B2
Publication date: 2022-06-07
Application number: US17400887
Application date: 2021-08-12
Applicant: Google LLC
Inventor: Jason Evans Goulden, Rengarajan Aravamudhan, Hae Rim Jeong, Michael Dixon, James Edward Stewart, Sayed Yusef Shafi, Sahana Mysore, Seungho Yang, Yu-An Lien, Christopher Charles Burns, Rajeev Conrad Nongpiur, Jeffrey Boyd
IPC: H04N7/18, G05B15/02, G06K9/00, G08B7/06, G06K9/78, H04M11/02, G08B13/196, G08B3/10, H04N7/14, G06F3/0482
Abstract: A method of presenting appropriate actions for responding to a visitor to a smart home environment via an electronic greeting system of the smart home environment, including detecting a visitor of the smart home environment; obtaining context information from the smart home environment regarding the visitor; based on the context information, identifying a plurality of appropriate actions available to a user of a client device for interacting with the visitor via the electronic greeting system; and causing the identified actions to be presented to the user of the client device.
-
Publication number: US20220027725A1
Publication date: 2022-01-27
Application number: US16940294
Application date: 2020-07-27
Applicant: GOOGLE LLC
Inventor: Rajeev Conrad Nongpiur, Byungchul Kim, Marie Vachovsky, Monica Song, Khe Chai Sim, Qian Zhang
Abstract: Systems and techniques are provided for sound model localization within an environment. Sound recordings of sounds in the environment may be received from devices in the environment. Preliminary labels for the sound recordings may be determined using pre-trained sound models. The preliminary labels may have associated probabilities. Sound clips with preliminary labels may be generated based on sound recordings that have preliminary labels whose probability is over a high-recall threshold for the pre-trained sound model that determined the preliminary label. The sound clips with preliminary labels may be sent to a user device. Labeled sound clips may be received from the user device. The labeled sound clips may be based on the sound clips with preliminary labels. Training data sets may be generated for the pre-trained sound models using the labeled sound clips. The pre-trained sound models may be trained using the training data sets to generate localized sound models.
-
Publication number: US10755714B2
Publication date: 2020-08-25
Application number: US16412677
Application date: 2019-05-15
Applicant: Google LLC
Inventor: Chanwoo Kim, Rajeev Conrad Nongpiur, Michiel A. U. Bacchiani
Abstract: Systems and methods are described for improving endpoint detection of a voice query submitted by a user. In some implementations, a synchronized video data and audio data is received. A sequence of frames of the video data that includes images corresponding to lip movement on a face is determined. The audio data is endpointed based on first audio data that corresponds to a first frame of the sequence of frames and second audio data that corresponds to a last frame of the sequence of frames. A transcription of the endpointed audio data is generated by an automated speech recognizer. The generated transcription is then provided for output.