Patent search ap:("Google LLC") AND inv:"Dounia Berrada" Page 1

1.

Invention application
Video and Audio Multimodal Searching System (In force)

Publication (announcement) No.: US20240403362A1

Publication (announcement) date: 2024-12-05

Application No.: US18326496

Application date: 2023-05-31

Applicant: Google LLC

Inventor: Harshit Kharbanda, Belinda Luna Zeng, Viviana Caso Corella, Aashi Jain, David William Hendon, Christopher James Kelley, Jessica Lee, Dounia Berrada, Kai Yu, Louis Wang, Thomas J. Duerig, Radu Soricut, Robin Dua

IPC: G06F16/735, G06F16/732, G06F16/783, G06T7/70, G06V10/62, G06V10/774, G06V20/40

Abstract: A multimodal search system using a video query is described. The system can receive video data captured by a camera of a user device. The video data can have a sequence of image frames. Additionally, the system can receive audio data associated with the video data captured by the user device. Moreover, the system can process, using one or more machine-learned models, the sequence of image frames to generate video embeddings related to the sequence of the image frames. The video embeddings can have a plurality of image embeddings associated with the sequence of image frames. Furthermore, the system can determine one or more video results based on the video embeddings and the audio data. Subsequently, the system can transmit, to the user device, the one or more video results.

2.

Invention publication
Visual and Audio Multimodal Searching System (Under examination, published)

Publication (announcement) No.: US20240362279A1

Publication (announcement) date: 2024-10-31

Application No.: US18306638

Application date: 2023-04-25

Applicant: Google LLC

Inventor: Harshit Kharbanda, Belinda Luna Zeng, Viviana Caso Corella, Christopher James Kelley, Jessica Lee, Pendar Yousefi, Dounia Berrada, Sundeep Vaddadi, Kai Yu, Balint Miklos, Severin Heiniger, Louis Wang

IPC: G06F16/9532, G06F16/538, G06F40/40

CPC classification number: G06F16/9532, G06F16/538, G06F40/40

Abstract: A multimodal search system is described. The system can receive image data captured by a camera of a user device. Additionally, the system can receive audio data associated with the image data. The audio data can be captured by a microphone of the user device. Moreover, the system can process the image data to generate visual features. Furthermore, the system can process the audio data to generate a plurality of words. The system can generate a plurality of search terms based on the plurality of words and the visual features. Subsequently, the system can determine one or more search results associated with the plurality of search terms and provide the one or more search results as an output.

3.

Invention application
Systems and Methods for Analyzing Text Extracted from Images and Performing Appropriate Transformations on the Extracted Text (In force)

Publication (announcement) No.: US20250087207A1

Publication (announcement) date: 2025-03-13

Application No.: US18736113

Application date: 2024-06-06

Applicant: Google LLC

Inventor: Harshit Kharbanda, Jessica Lee, Christopher James Kelley, Fabian Roth, Dounia Berrada, Samer Hassan Hassan, Afroz Mohiuddin, Misha Khalman, Ali Essam Ali Elqursh, Belinda Luna Zeng

IPC: G10L15/183, G06F16/583, G06V10/778, G06V30/14, G06V30/148, G10L15/22, G10L15/30

Abstract: The present disclosure provides computer-implemented methods, systems, and devices for responding to requests associated with an image. A computing system obtains an image, wherein the image depicts a first set of textual content. The computing system determines one or more characteristics of the first set of textual content. The computing system determines a response type from a plurality of response types based on the one or more characteristics. The computing system generates a model input, wherein the model input comprises data descriptive of the first set of textual content and a prompt associated with the response type. The computing system provides the model input as an input to a machine-learned language model. The computing system receives a second set of text as an output of the machine-learned language model as a result of the machine-learned language model processing the model input. The computing system provides the second set of text for display to a user, wherein the second set of textual content is associated with the response type.

4.

Invention publication
Medical Condition Visual Search (Under examination, published)

Publication (announcement) No.: US20240339217A1

Publication (announcement) date: 2024-10-10

Application No.: US18620434

Application date: 2024-03-28

Applicant: Google LLC

Inventor: Peggy Yen Phuong Bui, Bianca Madalina Buisman, Quang Anh Duong, Anastasia Martynova, Ayush Jain, Yuan Liu, Jonathan David Krause, Amit Sanjay Talreja, Rajeev Vijay Rikhye, Mahvish A. Nagda, Pinal Bavishi, Christopher James Eicher, Abigail Ward, Jieming Yu, Louis Wang, Dounia Berrada, Dale Richard Webster, Harshit Kharbanda, Igor Bonaci, Kai Yu, Ke Lan, Kaan Yücer, Willa Angel Chen Miller, Lars Thomas Hansen

IPC: G16H50/20, G06T7/00, G16H30/40

CPC classification number: G16H50/20, G06T7/0012, G16H30/40, G06T2207/20104, G06T2207/30088

Abstract: Systems and methods for diagnostic visual search can include processing a search query with a plurality of classification models to determine a search query intent and predict a potential diagnosis. The search query can include an image that is processed to determine the presence of a body part and may be processed to determine whether the search query is a diagnostic search query. Based on the intent determination, the image may then be processed by a conditions classification model to determine one or more predicted condition classifications. Condition information can then be obtained and provided based on the one or more predicted condition classifications.

5.

Invention grant
Systems and methods for analyzing text extracted from images and performing appropriate transformations on the extracted text (In force)

Publication (announcement) No.: US12033620B1

Publication (announcement) date: 2024-07-09

Application No.: US18463951

Application date: 2023-09-08

Applicant: Google LLC

Inventor: Harshit Kharbanda, Jessica Lee, Christopher James Kelley, Fabian Roth, Dounia Berrada, Samer Hassan Hassan, Afroz Mohiuddin, Mikhail Khalman, Ali Essam Ali Elqursh, Belinda Luna Zeng

IPC: G06F3/0483, G06F16/30, G06F16/33, G06F16/583, G06V10/778, G06V30/14, G06V30/148, G10L15/183, G10L15/22, G10L15/30

CPC classification number: G10L15/183, G06F16/5846, G06V10/778, G06V30/1456, G06V30/153, G10L15/22, G10L15/30

Abstract: The present disclosure provides computer-implemented methods, systems, and devices for responding to requests associated with an image. A computing system obtains an image, wherein the image depicts a first set of textual content. The computing system determines one or more characteristics of the first set of textual content. The computing system determines a response type from a plurality of response types based on the one or more characteristics. The computing system generates a model input, wherein the model input comprises data descriptive of the first set of textual content and a prompt associated with the response type. The computing system provides the model input as an input to a machine-learned language model. The computing system receives a second set of text as an output of the machine-learned language model as a result of the machine-learned language model processing the model input. The computing system provides the second set of text for display to a user, wherein the second set of textual content is associated with the response type.