-
Publication No.: US20250021602A1
Publication date: 2025-01-16
Application No.: US18900105
Application date: 2024-09-27
Applicant: Amazon Technologies, Inc.
Inventor: Shrikant G Nayak, Sathya Prakash Podila Venkata Subramanya, Divya Nalam, Vijay Daniel Manason, Valluri Subbanna Chowdary
IPC: G06F16/84, G06F18/214, G06F40/154, G06F40/16, G06N5/02, G06N20/20
Abstract: Results of applying a set of voting rules to a target corpus of documents are used to obtain a set of derived probabilistic labels indicating the probabilities of the presence of a particular attribute within the documents' constituent objects. A machine learning model is trained to identify a candidate portion of a document from which a value of the attribute is to be extracted. The training data for the model includes learned representations obtained from paths of constituent objects, and the corresponding derived labels. A proposed value for the attribute, obtained based on an assigned attribute value presence probability score for an individual constituent object from a selected candidate portion of a document, is provided.
-
Publication No.: US12130863B1
Publication date: 2024-10-29
Application No.: US17107633
Application date: 2020-11-30
Applicant: Amazon Technologies, Inc.
Inventor: Shrikant G Nayak, Sathya Prakash Podila Venkata Subramanya, Divya Nalam, Vijay Daniel Manason, Valluri Subbanna Chowdary
IPC: G06F16/80, G06F16/84, G06F18/214, G06F40/154, G06F40/16, G06N5/02, G06N20/20
CPC classification number: G06F16/86, G06F18/2148, G06F40/154, G06F40/16, G06N5/02, G06N20/20
Abstract: Results of applying a set of voting rules to a target corpus of documents are used to obtain a set of derived probabilistic labels indicating the probabilities of the presence of a particular attribute within the documents' constituent objects. A machine learning model is trained to identify a candidate portion of a document from which a value of the attribute is to be extracted. The training data for the model includes learned representations obtained from paths of constituent objects, and the corresponding derived labels. A proposed value for the attribute, obtained based on an assigned attribute value presence probability score for an individual constituent object from a selected candidate portion of a document, is provided.
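Illustrative note: the first step in the abstract, turning the results of voting rules into derived probabilistic labels, can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the method claimed in the patent. The object representation (a `path` and a `text` field), the three example rules, the "price" attribute, and the aggregation by fraction of positive votes are all hypothetical choices made for illustration; the abstract does not specify any of them.

```python
from typing import Callable, Dict, List

# Hypothetical representation of a constituent object: its path within the
# document and its text content. These field names are assumptions.
DocObject = Dict[str, str]

# A voting rule returns +1 (attribute present), -1 (absent), or 0 (abstain).
Rule = Callable[[DocObject], int]


def rule_currency_symbol(obj: DocObject) -> int:
    """Votes 'present' when the text contains a currency symbol."""
    return 1 if "$" in obj["text"] else 0


def rule_path_hint(obj: DocObject) -> int:
    """Votes on the object's path: 'price' suggests presence, 'footer' absence."""
    if "price" in obj["path"]:
        return 1
    if "footer" in obj["path"]:
        return -1
    return 0


def rule_long_text(obj: DocObject) -> int:
    """Votes 'absent' for long passages, which rarely hold a bare price."""
    return -1 if len(obj["text"]) > 40 else 0


RULES: List[Rule] = [rule_currency_symbol, rule_path_hint, rule_long_text]


def derive_label(obj: DocObject, rules: List[Rule]) -> float:
    """Derive a probabilistic label as the fraction of non-abstaining rules
    that voted 'present'. Returns 0.5 when every rule abstains.
    (Assumed aggregation scheme; the source does not specify one.)"""
    votes = [v for v in (rule(obj) for rule in rules) if v != 0]
    if not votes:
        return 0.5
    return sum(1 for v in votes if v > 0) / len(votes)


# Hypothetical constituent objects from one document.
objects: List[DocObject] = [
    {"path": "html/body/div.price/span", "text": "$19.99"},
    {"path": "html/body/footer/p",
     "text": "All prices in $ are subject to change without notice."},
    {"path": "html/body/h1", "text": "Widget"},
]

labels = [derive_label(obj, RULES) for obj in objects]
for obj, label in zip(objects, labels):
    print(f"{obj['path']}: {label:.2f}")
```

In a fuller pipeline of the kind the abstract describes, labels like these would be paired with learned representations of each object's path and used as training data for the model that selects candidate portions; that part is not shown here.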
-