-
Publication No.: US11789990B1
Publication date: 2023-10-17
Application No.: US17733581
Filing date: 2022-04-29
Inventors: Meena AbdelMaseeh Adly Fouad, Zhihong Zeng, Anirudh Prabakaran, Samriddhi Shakya, Tom Sebastian, Tallam Sai Teja, Simon Ioffe, Narasimha Goli
IPC classification: G06F16/30, G06F16/35, G06V30/416, G06F16/383, G06N3/045
CPC classification: G06F16/35, G06F16/383, G06N3/045, G06V30/416
Abstract: A process for document processing (e.g., automated package splitting) may involve producing, for each document page of an ordered plurality of document pages, an image of the document page and a representation of text from the document page; generating, for each document page of the ordered plurality, and based on the image of the document page and the representation of text from the document page, an embedding of the document page; and generating, for each document page among a subset of the ordered plurality, a label for the document page that indicates whether the document page is a document first page, based on the embedding of the document page, the embedding of each of at least one document page that precedes the document page in the ordered plurality, and the embedding of each of at least one document page that follows the document page in the ordered plurality.
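Illustrative sketch (not part of the patent record): the labeling step in the abstract can be pictured as sliding a context window over per-page embeddings. The Python below is a minimal sketch under stated assumptions, not the patented implementation. It takes the page embeddings as given, whereas the abstract derives them from each page's image and text. It reads the "subset" as the pages that have at least one preceding and one following page. It also substitutes a toy distance rule (`toy_classifier`) for whatever learned model the patent uses. All function names and the sample embeddings are hypothetical.

```python
import math
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

Embedding = List[float]


def context_windows(
    embeddings: Sequence[Embedding], before: int = 1, after: int = 1
) -> Iterator[Tuple[int, List[Embedding]]]:
    """Yield (page index, window) for pages with enough neighbors on both sides."""
    for i in range(before, len(embeddings) - after):
        yield i, list(embeddings[i - before : i + after + 1])


def label_first_pages(
    embeddings: Sequence[Embedding],
    classify: Callable[[List[Embedding]], bool],
    before: int = 1,
    after: int = 1,
) -> Dict[int, bool]:
    """Label each windowed page as a document first page (True) or not (False)."""
    return {i: classify(w) for i, w in context_windows(embeddings, before, after)}


def split_package(num_pages: int, labels: Dict[int, bool]) -> List[List[int]]:
    """Group page indices into documents, opening a new one at each first page."""
    documents: List[List[int]] = []
    for i in range(num_pages):
        # Page 0 always opens a document; unlabeled pages are treated as continuations.
        if i == 0 or labels.get(i, False):
            documents.append([])
        documents[-1].append(i)
    return documents


def toy_classifier(window: List[Embedding]) -> bool:
    """Hypothetical stand-in for a learned model (assumes before=1, after=1).

    Flags the center page as a first page when it is much farther from the
    preceding page than from the following page.
    """
    prev, cur, nxt = window
    return math.dist(prev, cur) > 2 * math.dist(cur, nxt)


# Hypothetical 1-D embeddings for a five-page package.
page_embeddings = [[0.0], [0.1], [1.0], [1.1], [1.2]]
labels = label_first_pages(page_embeddings, toy_classifier)
documents = split_package(len(page_embeddings), labels)
print(labels)
print(documents)
```

In this sketch the first and last pages receive no label because they lack a neighbor on one side; `split_package` treats page 0 as a document start and any other unlabeled page as a continuation, which is a design choice of the example rather than something the abstract specifies.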
-
Publication No.: US20230350932A1
Publication date: 2023-11-02
Application No.: US17733581
Filing date: 2022-04-29
Inventors: Meena AbdelMaseeh Adly Fouad, Zhihong Zeng, Anirudh Prabakaran, Samriddhi Shakya, Tom Sebastian, Tallam Sai Teja, Simon Ioffe, Narasimha Goli
IPC classification: G06F16/35, G06F16/383, G06V30/416, G06N3/04
CPC classification: G06F16/35, G06F16/383, G06V30/416, G06N3/0454
Abstract: A process for document processing (e.g., automated package splitting) may involve producing, for each document page of an ordered plurality of document pages, an image of the document page and a representation of text from the document page; generating, for each document page of the ordered plurality, and based on the image of the document page and the representation of text from the document page, an embedding of the document page; and generating, for each document page among a subset of the ordered plurality, a label for the document page that indicates whether the document page is a document first page, based on the embedding of the document page, the embedding of each of at least one document page that precedes the document page in the ordered plurality, and the embedding of each of at least one document page that follows the document page in the ordered plurality.
-