Patent search ap:("Google LLC") AND inv:"Guolong Su"

1.

Invention application
Query-Based Document Extraction with Large Vision-Language Models (Granted)

Publication (announcement) number: US20240394284A1

Publication (announcement) date: 2024-11-28

Application number: US18386803

Application date: 2023-11-03

Applicant: Google LLC

Inventor: Daniel Vlasic , Yiming Gu , Daniel Hernandez Diaz , Ilaï Deutel , Xi Xiong , Tianli Yu , Joseph Pagadora , Mingyang Ling , Jill Daley , Guolong Su

IPC: G06F16/332 , G06F40/40 , G06T7/11 , G06V30/414

Abstract: An aspect of the disclosed technology is a system and process that are able to answer a document query as text and also provide the location in an image where the answer text is detected. In one aspect of the disclosed technology, a machine learning model combines vision and language features for joint learning.

2.

Invention publication
ZERO-SHOT FORM ENTITY QUERY FRAMEWORK (Under examination, published)

Publication (announcement) number: US20240153297A1

Publication (announcement) date: 2024-05-09

Application number: US18501982

Application date: 2023-11-03

Applicant: Google LLC

Inventor: Zizhao Zhang , Zifeng Wang , Vincent Perot , Jacob Devlin , Chen-Yu Lee , Guolong Su , Hao Zhang , Tomas Jon Pfister

IPC: G06V30/24 , G06F16/21 , G06V30/19 , G06V30/412

CPC classification number: G06V30/24 , G06F16/211 , G06V30/19147 , G06V30/412

Abstract: A method for extracting entities comprises obtaining a document that includes a series of textual fields that includes a plurality of entities. Each entity represents information associated with a predefined category. The method includes generating, using the document, a series of tokens representing the series of textual fields. The method includes generating an entity prompt that includes the series of tokens and one of the plurality of entities and generating a schema prompt that includes a schema associated with the document. The method includes generating a model query that includes the entity prompt and the schema prompt and determining, using an entity extraction model and the model query, a location of the one of the plurality of entities among the series of tokens. The method includes extracting, from the document, the one of the plurality of entities using the location of the one of the plurality of entities.
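The flow in this abstract (tokens, entity prompt, schema prompt, model query, location, extraction) can be sketched roughly as follows. This is only an illustration under stated assumptions: the whitespace tokenizer, the `SEP` separator, the `ModelQuery` container, and `toy_model` are all hypothetical stand-ins and are not taken from the patent, which does not specify these details here.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # half-open token range [start, end)
SEP = " ||| "           # hypothetical separator inside the entity prompt


@dataclass
class ModelQuery:
    entity_prompt: str  # target entity plus the document tokens
    schema_prompt: str  # schema associated with the document


def tokenize(document: str) -> List[str]:
    # Whitespace splitting stands in for whatever tokenizer is actually used.
    return document.split()


def build_query(tokens: List[str], entity: str, schema: List[str]) -> ModelQuery:
    entity_prompt = entity + SEP + " ".join(tokens)
    schema_prompt = "schema: " + ", ".join(schema)
    return ModelQuery(entity_prompt, schema_prompt)


def toy_model(query: ModelQuery) -> Span:
    # Hypothetical stand-in for the entity extraction model: it looks for a
    # "<entity>:" label token and returns the span of the token right after it.
    entity, token_text = query.entity_prompt.split(SEP, 1)
    tokens = token_text.split()
    start = tokens.index(entity + ":") + 1
    return (start, start + 1)


def extract_entity(
    document: str,
    entity: str,
    schema: List[str],
    model: Callable[[ModelQuery], Span] = toy_model,
) -> str:
    tokens = tokenize(document)
    start, end = model(build_query(tokens, entity, schema))
    # Use the predicted location to pull the entity out of the token series.
    return " ".join(tokens[start:end])


DOC = "invoice_number: INV-001 total: 42.50 date: 2023-11-03"
SCHEMA = ["invoice_number", "total", "date"]
print(extract_entity(DOC, "total", SCHEMA))  # prints 42.50
```

A real system would replace `toy_model` with a learned model that predicts the span; the point of the sketch is only that the model returns a location among the tokens and the extraction step slices the tokens at that location.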

3.

Invention application
Document Entity Extraction Using Machine-Learned Models (Granted)

Publication (announcement) number: US20250068847A1

Publication (announcement) date: 2025-02-27

Application number: US18453236

Application date: 2023-08-21

Applicant: Google LLC

Inventor: Vincent Perot , Florian Luisier , Kai Kang , Ramya Sree Boppana , Jiaqi Mu , Xiaoyu Sun , Carl Elie Saroufim , Guolong Su , Hao Zhang , Nikolay Alexeevich Glushnev , Nan Hua , Yun-Hsuan Sung , Michael Yiupun Kwong

IPC: G06F40/295 , G06V30/19

Abstract: Systems and methods for performing document entity extraction are described herein. The method can include receiving an inference document and a target schema. The method can also include generating one or more document inputs from the inference document and one or more schema inputs from the target schema. The method can further include, for each combination of the document input and schema input, obtaining one or more extraction inputs by generating a respective extraction input based on the combination, providing the respective extraction input to the machine-learned model, and receiving a respective output of the machine-learned model based on the respective extraction. The method can also include validating the extracted entity data based on reference spatial locations and inference spatial locations and outputting the validated extracted entity data.
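One possible reading of this abstract is a loop over every (document input, schema input) combination, followed by a spatial check on each model output. The sketch below assumes that reading. The chunk layout, the `CANNED` answers, `toy_model`, and the exact-match validation rule are all hypothetical; the abstract does not say how reference and inference spatial locations are compared.

```python
from itertools import product
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), hypothetical pixel coordinates

# Each document input is a chunk mapping recognized text to the box where it appears.
DOCUMENT_INPUTS: List[Dict[str, Box]] = [
    {"INV-001": (10, 10, 80, 20), "42.50": (10, 40, 50, 50)},
    {"2023-08-21": (10, 10, 90, 20)},
]
# Each schema input is a subset of the target schema's fields.
SCHEMA_INPUTS: List[List[str]] = [["invoice_number", "total"], ["date"]]

# Canned answers for the toy model; "total" is deliberately wrong so that
# validation has something to reject.
CANNED: Dict[str, Tuple[str, Box]] = {
    "invoice_number": ("INV-001", (10, 10, 80, 20)),
    "total": ("99.99", (10, 40, 50, 50)),
    "date": ("2023-08-21", (10, 10, 90, 20)),
}


def toy_model(extraction_input: dict) -> Dict[str, Tuple[str, Box]]:
    # Hypothetical stand-in for the machine-learned model: it ignores the
    # document and returns a canned (value, box) pair per requested field.
    return {field: CANNED[field] for field in extraction_input["schema"]}


def validate(chunk: Dict[str, Box], value: str, box: Box) -> bool:
    # Assumed rule: keep a value only if the chunk contains that exact text
    # at the location the model reported.
    return chunk.get(value) == box


def extract(document_inputs, schema_inputs, model=toy_model) -> Dict[str, str]:
    validated: Dict[str, str] = {}
    for chunk, fields in product(document_inputs, schema_inputs):
        extraction_input = {"document": chunk, "schema": fields}
        for field, (value, box) in model(extraction_input).items():
            if validate(chunk, value, box):
                validated[field] = value
    return validated


print(extract(DOCUMENT_INPUTS, SCHEMA_INPUTS))
```

In this toy run the invoice number and date survive validation, while the fabricated total is dropped because no text at the reported location matches it.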

4.

Invention publication
STRUCTURAL ENCODING AND ATTENTION PARADIGMS FOR SEQUENCE MODELING (Under examination, published)

Publication (announcement) number: US20240354504A1

Publication (announcement) date: 2024-10-24

Application number: US18684557

Application date: 2021-08-25

Applicant: Google LLC

Inventor: Chen-Yu Lee , Chun-Liang Li , Timothy Dozat , Vincent Perot , Guolong Su , Nan Hua , Joshua Ainslie , Renshen Wang , Yasuhisa Fujii , Tomas Pfister

IPC: G06F40/284 , G06V30/10 , G06V30/416

CPC classification number: G06F40/284 , G06V30/10 , G06V30/416

Abstract: Systems and methods for providing a structure-aware sequence model that can interpret a document's text without first inferring the proper reading order of the document. In some examples, the model may use a graph convolutional network to generate contextualized “supertoken” embeddings for each token, which are then fed to a transformer that employs a sparse attention paradigm in which attention weights for at least some supertokens are modified based on differences between predicted and actual values of the order and distance between the attender and attendee supertokens.

5.

Invention publication
Complementary Prompting For Rehearsal-Free Continual Learning (Under examination, published)

Publication (announcement) number: US20230274143A1

Publication (announcement) date: 2023-08-31

Application number: US18173985

Application date: 2023-02-24

Applicant: Google LLC

Inventor: Zizhao Zhang , Zifeng Wang , Chen-Yu Lee , Ruoxi Sun , Sayna Ebrahimi , Xiaoqi Ren , Guolong Su , Vincent Perot , Tomas Pfister , Han Zhang

IPC: G06N3/08

CPC classification number: G06N3/08

Abstract: A method for rehearsal-free continual learning includes obtaining a set of training samples where each training sample in the set of training samples is associated with a respective task of a plurality of different tasks. The method includes obtaining a task-invariant prompt representative of learned knowledge common to each respective task of the plurality of different tasks. The method includes, for each respective task of the plurality of different tasks, obtaining a respective task-specific prompt representative of learned knowledge specific to the respective task. The method includes, during each of one or more training iterations, for each respective training sample in the set of training samples, selecting the respective task-specific prompt representative of the respective task of the respective training sample and training a model using the task-invariant prompt and the selected respective task-specific prompt.
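The training loop described in this abstract can be illustrated with a deliberately tiny example. Everything concrete here is an assumption made for illustration: prompts are single scalars, the "model" is a fixed identity function plus the two prompts, and only the prompts are updated by gradient descent on squared error. The abstract itself does not commit to any of these choices.

```python
from typing import Dict, List, Tuple

Sample = Tuple[float, float, str]  # (input, target, task id)


def train(
    samples: List[Sample],
    tasks: List[str],
    iterations: int = 200,
    lr: float = 0.05,
) -> Tuple[float, Dict[str, float]]:
    shared_prompt = 0.0                      # task-invariant prompt
    task_prompts = {t: 0.0 for t in tasks}   # one task-specific prompt per task
    for _ in range(iterations):
        for x, y, task in samples:
            # Select the task-specific prompt for this sample's task.
            selected = task_prompts[task]
            # Toy model: fixed identity backbone plus both prompts.
            pred = x + shared_prompt + selected
            grad = 2.0 * (pred - y)          # derivative of squared error w.r.t. pred
            # Update the shared prompt and only the selected task prompt.
            shared_prompt -= lr * grad
            task_prompts[task] -= lr * grad
    return shared_prompt, task_prompts


# Hypothetical data: task "a" needs an offset of 1, task "b" an offset of 3.
SAMPLES: List[Sample] = [
    (0.0, 1.0, "a"), (2.0, 3.0, "a"),
    (1.0, 4.0, "b"), (5.0, 8.0, "b"),
]
shared, per_task = train(SAMPLES, ["a", "b"])
print(shared + per_task["a"], shared + per_task["b"])
```

After training, the shared prompt plus each task prompt should approach that task's offset (about 1 for "a" and about 3 for "b"). The split between shared and task-specific values is not unique in this toy setup; only their sums are determined.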
