-
Publication No.: US12073844B2
Publication Date: 2024-08-27
Application No.: US17601042
Filing Date: 2020-10-01
Applicant: Google LLC
Inventor: Anatoly Efros , Noam Etzion-Rosenberg , Tal Remez , Oran Lang , Inbar Mosseri , Israel Or Weinstein , Benjamin Schlesinger , Michael Rubinstein , Ariel Ephrat , Yukun Zhu , Stella Laurenzo , Amit Pitaru , Yossi Matias
IPC: G10L21/0208 , G10L17/00 , G10L21/0272 , G10L25/57
CPC classification number: G10L21/0208 , G10L17/00 , G10L21/0272 , G10L25/57 , G10L2021/02087
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio-visual speech separation. A method includes: receiving, by a user device, a first indication of one or more first speakers visible in a current view recorded by a camera of the user device, in response, generating a respective isolated speech signal for each of the one or more first speakers that isolates speech of the first speaker in the current view and sending the isolated speech signals for each of the one or more first speakers to a listening device operatively coupled to the user device, receiving, by the user device, a second indication of one or more second speakers visible in the current view recorded by the camera of the user device, and in response generating and sending a respective isolated speech signal for each of the one or more second speakers to the listening device.
-
Publication No.: US20240320912A1
Publication Date: 2024-09-26
Application No.: US18611236
Filing Date: 2024-03-20
Applicant: Google LLC
Inventor: Yuanzhen Li , Amit Raj , Varun Jampani , Benjamin Joseph Mildenhall , Benjamin Michael Poole , Jonathan Tilton Barron , Kfir Aberman , Michael Niemeyer , Michael Rubinstein , Nataniel Ruiz Gutierrez , Shiran Elyahu Zada , Srinivas Kaza
IPC: G06T17/00 , H04N13/279 , H04N13/351
CPC classification number: G06T17/00 , H04N13/279 , H04N13/351
Abstract: A fractional training process can be performed with a plurality of training images to train an instance of a machine-learned generative image model, obtaining a partially trained instance of the model. A fractional optimization process can be performed with the partially trained instance to optimize an instance of a machine-learned three-dimensional (3D) implicit representation model, obtaining a partially optimized instance of the model. Based on the plurality of training images, pseudo multi-view subject images can be generated with the partially optimized instance of the 3D implicit representation model and a fully trained instance of the generative image model. The partially trained instance of the model can be trained with a set of training data. The partially optimized instance of the machine-learned 3D implicit representation model can be trained with the machine-learned multi-view image model.
-
Publication No.: US20230267942A1
Publication Date: 2023-08-24
Application No.: US17601042
Filing Date: 2020-10-01
Applicant: Google LLC
Inventor: Anatoly Efros , Noam Etzion-Rosenberg , Tal Remez , Oran Lang , Inbar Mosseri , Israel Or Weinstein , Benjamin Schlesinger , Michael Rubinstein , Ariel Ephrat , Yukun Zhu , Stella Laurenzo , Amit Pitaru , Yossi Matias
IPC: G10L21/0208 , G10L25/57
CPC classification number: G10L21/0208 , G10L25/57 , G10L2021/02087
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio-visual speech separation. A method includes: receiving, by a user device, a first indication of one or more first speakers visible in a current view recorded by a camera of the user device, in response, generating a respective isolated speech signal for each of the one or more first speakers that isolates speech of the first speaker in the current view and sending the isolated speech signals for each of the one or more first speakers to a listening device operatively coupled to the user device, receiving, by the user device, a second indication of one or more second speakers visible in the current view recorded by the camera of the user device, and in response generating and sending a respective isolated speech signal for each of the one or more second speakers to the listening device.
-
Publication No.: US20230015117A1
Publication Date: 2023-01-19
Application No.: US17856370
Filing Date: 2022-07-01
Applicant: Google LLC
Inventor: Kfir Aberman , David Edward Jacobs , Kai Jochen Kohlhoff , Michael Rubinstein , Yossi Gandelsman , Junfeng He , Inbar Mosseri , Yael Pritch Knaan
Abstract: Techniques for tuning an image editing operator for reducing a distractor in raw image data are presented herein. The image editing operator can access the raw image data and a mask. The mask can indicate a region of interest associated with the raw image data. The image editing operator can process the raw image data and the mask to generate processed image data. Additionally, a trained saliency model can process at least the processed image data within the region of interest to generate a saliency map that provides saliency values. Moreover, a saliency loss function can compare the saliency values provided by the saliency map for the processed image data within the region of interest to one or more target saliency values. Subsequently, the one or more parameter values of the image editing operator can be modified based at least in part on the saliency loss function.
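The tuning loop this abstract describes can be sketched as follows. This is a minimal illustrative loop, not the patented implementation: `operator`, `saliency_model`, and the single-parameter finite-difference update are hypothetical stand-ins for the image editing operator, the trained saliency model, and whatever optimizer the actual system uses.

```python
import numpy as np

def tune_operator(image, mask, operator, saliency_model, target=0.0,
                  lr=0.1, steps=50, eps=1e-3):
    """Tune an editing operator so saliency inside the masked region of
    interest approaches a target value (finite-difference sketch).

    operator(image, mask, params) -> processed image
    saliency_model(image) -> per-pixel saliency map
    """
    params = np.zeros(1)  # e.g. a single distractor-attenuation strength

    def loss(p):
        edited = operator(image, mask, p)
        sal = saliency_model(edited)
        # Saliency loss: compare saliency in the region of interest
        # to the target saliency value.
        return np.mean((sal[mask > 0] - target) ** 2)

    for _ in range(steps):
        # Central finite-difference gradient of the saliency loss.
        grad = (loss(params + eps) - loss(params - eps)) / (2 * eps)
        params = params - lr * grad  # modify the operator's parameters
    return params
```

With a toy operator that darkens the masked region by `params[0]` and brightness used as a stand-in saliency, the loop drives the parameter toward full attenuation of the distractor.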
-
Publication No.: US11526996B2
Publication Date: 2022-12-13
Application No.: US17055831
Filing Date: 2019-06-20
Applicant: GOOGLE LLC
Inventor: Michael Rubinstein , Derek Debusschere , Mike Krainin , Ce Liu
Abstract: Example embodiments allow for fast, efficient motion-magnification of video streams by decomposing image frames of the video stream into local phase information at multiple spatial scales and/or orientations. The phase information for each image frame is then scaled to magnify local motion and the scaled phase information is transformed back into image frames to generate a motion-magnified video stream. Scaling of the phase information can include temporal filtering of the phase information across image frames, for example, to magnify motion at a particular frequency. In some embodiments, temporal filtering of phase information at a frequency of breathing, cardiovascular pulse, or some other process of interest allows for motion-magnification of motions within the video stream corresponding to the breathing or the other particular process of interest. The phase information can also be used to determine time-varying motion signals corresponding to motions of interest within the video stream.
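The decompose-filter-amplify-reconstruct pipeline of this abstract can be illustrated at a single spatial scale and orientation. This is a simplified sketch, assuming a complex Gabor filter as the local-phase decomposition and an FFT-based ideal temporal bandpass; the patent's embodiments operate over multiple scales and orientations.

```python
import numpy as np

def magnify_motion(frames, alpha, fl, fh, fs):
    """Phase-based motion magnification sketch (one scale, one orientation).

    frames: (T, N) array of 1-D image rows over T video frames.
    alpha: magnification factor; fl, fh: temporal passband in Hz
    (e.g. a breathing or cardiovascular-pulse frequency); fs: frame rate.
    """
    # Complex Gabor filter: extracts local amplitude and local phase.
    x = np.arange(-8, 9)
    gabor = np.exp(-x**2 / 18.0) * np.exp(1j * np.pi / 4 * x)
    analytic = np.array([np.convolve(f, gabor, mode="same") for f in frames])
    phase = np.angle(analytic)

    # Temporal bandpass of the phase isolates motion at the chosen frequency.
    freqs = np.fft.fftfreq(len(frames), d=1.0 / fs)
    band = (np.abs(freqs) >= fl) & (np.abs(freqs) <= fh)
    phase_fft = np.fft.fft(phase, axis=0)
    band_phase = np.real(np.fft.ifft(phase_fft * band[:, None], axis=0))

    # Scale the band-passed phase and transform back to image space.
    magnified = analytic * np.exp(1j * alpha * band_phase)
    return np.real(magnified)
```

The band-passed phase signal itself is also the kind of time-varying motion signal the abstract mentions, before any amplification is applied.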
-
Publication No.: US20240296596A1
Publication Date: 2024-09-05
Application No.: US18569844
Filing Date: 2023-08-23
Applicant: Google LLC
Inventor: Kfir Aberman , Nataniel Ruiz Gutierrez , Michael Rubinstein , Yuanzhen Li , Yael Pritch Knaan , Varun Jampani
IPC: G06T11/00 , G06V10/764
CPC classification number: G06T11/00 , G06V10/764 , G06V2201/07
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a text-to-image model so that the text-to-image model generates images that each depict a variable instance of an object class when the object class without the unique identifier is provided as a text input, and that generates images that each depict a same subject instance of the object class when the unique identifier is provided as the text input.
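The identifier-bound versus class-only behavior the abstract describes can be illustrated by the two kinds of text inputs involved. This sketch only builds example prompt strings; the identifier token `"sks"` and the prompt template are illustrative choices, not taken from the patent.

```python
def build_prompts(unique_id, object_class, n_prior=4):
    """Build the two text inputs the abstract contrasts: a prompt with the
    unique identifier (should yield the same subject instance) and
    class-only prompts (should yield varied instances of the class)."""
    subject_prompt = f"a photo of {unique_id} {object_class}"
    prior_prompts = [f"a photo of a {object_class}" for _ in range(n_prior)]
    return subject_prompt, prior_prompts
```

During training, pairing subject images with the identifier prompt while also training on class-only prompts preserves the model's varied notion of the object class.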
-
Publication No.: US20230206955A1
Publication Date: 2023-06-29
Application No.: US17927101
Filing Date: 2020-05-22
Applicant: Google LLC
Inventor: Forrester H. Cole , Erika Lu , Tali Dekel , William T. Freeman , David Henry Salesin , Michael Rubinstein
IPC: G11B27/00 , G06V10/82 , G06V20/40 , G11B27/031
CPC classification number: G11B27/005 , G06V10/82 , G06V20/46 , G11B27/031
Abstract: A computer-implemented method for decomposing videos into multiple layers (212, 213) that can be re-combined with modified relative timings includes obtaining video data including a plurality of image frames (201) depicting one or more objects. For each of the plurality of frames, the computer-implemented method includes generating one or more object maps descriptive of a respective location of at least one object of the one or more objects within the image frame. For each of the plurality of frames, the computer-implemented method includes inputting the image frame and the one or more object maps into a machine-learned layer renderer model (220). For each of the plurality of frames, the computer-implemented method includes receiving, as output from the machine-learned layer renderer model, a background layer illustrative of a background of the video data and one or more object layers respectively associated with one of the one or more object maps. The object layers include image data illustrative of the at least one object and one or more trace effects at least partially attributable to the at least one object such that the one or more object layers and the background layer can be re-combined with modified relative timings.
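The re-combination step at the end of the abstract can be sketched as a time-shifted "over" composite. This illustrates only the recombination with modified relative timings; the background and object layers (with their alpha mattes) would come from the learned layer renderer, and the layer/matte representation here is an assumption.

```python
import numpy as np

def retime_layers(background, layers, alphas, offsets):
    """Re-composite decomposed video layers with modified relative timings.

    background: (T, H, W) background layer; layers/alphas: per-object
    (T, H, W) color and matte sequences; offsets: per-layer frame shifts.
    """
    T = background.shape[0]
    out = background.copy()
    for layer, alpha, off in zip(layers, alphas, offsets):
        # Shift the object layer in time; frames shifted outside the
        # clip contribute nothing.
        idx = np.arange(T) - off
        valid = (idx >= 0) & (idx < T)
        shifted_l = np.zeros_like(layer)
        shifted_a = np.zeros_like(alpha)
        shifted_l[valid] = layer[idx[valid]]
        shifted_a[valid] = alpha[idx[valid]]
        # Standard "over" composite of the shifted layer onto the result.
        out = shifted_a * shifted_l + (1.0 - shifted_a) * out
    return out
```

Because trace effects (shadows, reflections) are baked into each object layer, they shift along with the object under this recombination.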
-
Publication No.: US11456005B2
Publication Date: 2022-09-27
Application No.: US16761707
Filing Date: 2018-11-21
Applicant: GOOGLE LLC
Inventor: Inbar Mosseri , Michael Rubinstein , Ariel Ephrat , William Freeman , Oran Lang , Kevin William Wilson , Tali Dekel , Avinatan Hassidim
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio-visual speech separation. A method includes: obtaining, for each frame in a stream of frames from a video in which faces of one or more speakers have been detected, a respective per-frame face embedding of the face of each speaker; processing, for each speaker, the per-frame face embeddings of the face of the speaker to generate visual features for the face of the speaker; obtaining a spectrogram of an audio soundtrack for the video; processing the spectrogram to generate an audio embedding for the audio soundtrack; combining the visual features for the one or more speakers and the audio embedding for the audio soundtrack to generate an audio-visual embedding for the video; determining a respective spectrogram mask for each of the one or more speakers; and determining a respective isolated speech spectrogram for each speaker.
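The data flow of the method steps above can be sketched at the level of array shapes. This is a toy single-layer sketch: the projection matrices `w_v`, `w_a`, `w_m` stand in for trained network weights, and the real system uses deep visual and audio streams rather than single tanh/sigmoid layers.

```python
import numpy as np

def separate_speech(face_embeds, spectrogram, w_v, w_a, w_m):
    """Shape-level sketch of the audio-visual separation pipeline.

    face_embeds: (S, T, d_face) per-frame face embeddings for S speakers.
    spectrogram: (T, F) magnitude spectrogram of the audio soundtrack.
    Returns (S, T, F) isolated speech spectrograms, one per speaker.
    """
    visual = np.tanh(face_embeds @ w_v)           # visual features per speaker
    audio = np.tanh(spectrogram @ w_a)            # audio embedding
    fused = np.concatenate(                       # audio-visual embedding
        [visual, np.broadcast_to(audio, visual.shape)], axis=-1)
    masks = 1.0 / (1.0 + np.exp(-(fused @ w_m)))  # per-speaker masks in (0, 1)
    return masks * spectrogram[None]              # isolated spectrograms
```

Each predicted mask multiplies the shared soundtrack spectrogram, so every isolated spectrogram is bounded by the mixture, mirroring the spectrogram-mask formulation in the abstract.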
-
Publication No.: US20200335121A1
Publication Date: 2020-10-22
Application No.: US16761707
Filing Date: 2018-11-21
Applicant: GOOGLE LLC
Inventor: Inbar Mosseri , Michael Rubinstein , Ariel Ephrat , William Freeman , Oran Lang , Kevin William Wilson , Tali Dekel , Avinatan Hassidim
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio-visual speech separation. A method includes: obtaining, for each frame in a stream of frames from a video in which faces of one or more speakers have been detected, a respective per-frame face embedding of the face of each speaker; processing, for each speaker, the per-frame face embeddings of the face of the speaker to generate visual features for the face of the speaker; obtaining a spectrogram of an audio soundtrack for the video; processing the spectrogram to generate an audio embedding for the audio soundtrack; combining the visual features for the one or more speakers and the audio embedding for the audio soundtrack to generate an audio-visual embedding for the video; determining a respective spectrogram mask for each of the one or more speakers; and determining a respective isolated speech spectrogram for each speaker.
-
Publication No.: US10675955B2
Publication Date: 2020-06-09
Application No.: US15812622
Filing Date: 2017-11-14
Applicant: Google LLC
Inventor: Julia Winn , Abraham Stephens , Daniel Pettigrew , Aaron Maschinot , Ce Liu , Michael Krainin , Michael Rubinstein , Jingyu Cui
IPC: G06K9/34 , B60J3/04 , G02F1/1333 , G02F1/137 , H04N5/00 , H04N1/60 , G06T5/00 , H04N1/38 , H04N5/232 , G06T5/50 , H05B45/20
Abstract: Some implementations relate to determining whether glare is present in captured image(s) of an object (e.g., a photo) and/or to determining one or more attributes of any present glare. Some of those implementations further relate to adapting one or more parameters for a glare removal process based on whether the glare is determined to be present and/or based on one or more of the determined attributes of any glare determined to be present. Some additional and/or alternative implementations disclosed herein relate to correcting color of a flash image of an object (e.g., a photo). The flash image is based on one or more images captured by a camera of a client device with a flash component of the client device activated. In various implementations, correcting the color of the flash image is based on a determined color space of an ambient image of the object.
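The ambient-based flash color correction in the last sentences can be illustrated with a per-channel gain. This is a simple gray-world-style sketch standing in for the patent's determined-color-space correction, not the actual implementation.

```python
import numpy as np

def correct_flash_color(flash_img, ambient_img):
    """Rescale each channel of the flash image so its mean color matches
    the ambient image's (a crude stand-in for correcting the flash image
    based on the ambient image's determined color space).

    Both images are float arrays of shape (H, W, 3) in [0, 1].
    """
    flash_mean = flash_img.reshape(-1, 3).mean(axis=0)
    ambient_mean = ambient_img.reshape(-1, 3).mean(axis=0)
    gain = ambient_mean / (flash_mean + 1e-8)  # per-channel correction gain
    return np.clip(flash_img * gain, 0.0, 1.0)
```

For example, a flash image with a blue cast is pulled toward the neutral tones of a corresponding ambient exposure.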