-
Publication No.: US20240394284A1
Publication Date: 2024-11-28
Application No.: US18386803
Application Date: 2023-11-03
Applicant: Google LLC
Inventor: Daniel Vlasic, Yiming Gu, Daniel Hernandez Diaz, Ilaï Deutel, Xi Xiong, Tianli Yu, Joseph Pagadora, Mingyang Ling, Jill Daley, Guolong Su
IPC: G06F16/332, G06F40/40, G06T7/11, G06V30/414
Abstract: An aspect of the disclosed technology is a system and process that are able to answer a document query as text and also provide the location in an image where the answer text is detected. In one aspect of the disclosed technology, a machine learning model combines vision and language features for joint learning.
-
Publication No.: US20240362940A1
Publication Date: 2024-10-31
Application No.: US18306604
Application Date: 2023-04-25
Applicant: Google LLC
Inventor: Jing Xiong, Tianli Yu, Shengyang Dai
IPC: G06V30/19
CPC classification number: G06V30/1912, G06V30/1916
Abstract: A method includes receiving, from a user device associated with a user, a plurality of annotated documents. Each respective annotated document includes one or more fields and each respective field labeled by a respective annotation. The method includes, for a threshold number of iterations, randomly selecting a respective subset of annotated documents from the plurality of annotated documents; training a respective model on the respective subset of annotated documents; and generating, using the plurality of annotated documents not selected for the respective subset of annotated documents, a respective evaluation of the respective model. The method also includes providing, to the user device, each respective evaluation.
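Illustrative sketch (not part of the patent record): the loop described in the abstract above resembles repeated random hold-out evaluation. The Python below is a minimal sketch under stated assumptions. The names `repeated_holdout`, `train_fn`, `eval_fn`, `iterations`, `train_fraction`, and `seed` are hypothetical, and the abstract does not specify the subset size, the model, or the evaluation metric.

```python
import random
from typing import Callable, List, Sequence, TypeVar

Doc = TypeVar("Doc")
Model = TypeVar("Model")
Score = TypeVar("Score")


def repeated_holdout(
    docs: Sequence[Doc],
    train_fn: Callable[[List[Doc]], Model],
    eval_fn: Callable[[Model, List[Doc]], Score],
    iterations: int = 5,          # assumed stand-in for the "threshold number of iterations"
    train_fraction: float = 0.5,  # assumed; the abstract does not give a subset size
    seed: int = 0,
) -> List[Score]:
    """Train on a random subset each round and evaluate on the unselected documents."""
    if len(docs) < 2:
        raise ValueError("need at least two annotated documents")
    rng = random.Random(seed)
    # Keep at least one document on each side of the split.
    n_train = max(1, min(len(docs) - 1, int(len(docs) * train_fraction)))
    evaluations: List[Score] = []
    for _ in range(iterations):
        # Randomly select the indices of this round's training subset.
        chosen = set(rng.sample(range(len(docs)), n_train))
        train = [d for i, d in enumerate(docs) if i in chosen]
        held_out = [d for i, d in enumerate(docs) if i not in chosen]
        model = train_fn(train)
        evaluations.append(eval_fn(model, held_out))
    # One evaluation per round, to be returned to the caller (the "user device").
    return evaluations


if __name__ == "__main__":
    # Toy stand-ins: a "document" is a dict mapping field name to annotation.
    docs = [{"invoice_id": str(i), "total": str(i * 10)} for i in range(8)]
    docs.append({"invoice_id": "8", "due_date": "2024-01-01"})

    def train_fn(train_docs):
        # Toy "model": the set of field names seen during training.
        return {field for doc in train_docs for field in doc}

    def eval_fn(model, held_out_docs):
        # Toy metric: fraction of held-out fields the model has seen.
        fields = [f for doc in held_out_docs for f in doc]
        return sum(f in model for f in fields) / len(fields)

    print(repeated_holdout(docs, train_fn, eval_fn, iterations=3))
```

Each round draws a fresh split, so the spread of the returned evaluations gives a rough sense of how sensitive the model is to which annotated documents it was trained on.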
-