-
Publication No.: US11501549B2
Publication date: 2022-11-15
Application No.: US16917572
Filing date: 2020-06-30
Inventors: Sawani Bade, Srinivasan Ponpathirkoottam Raghavan, Samatha Kottha, Shruti Chhabra, Praneeth Medhatithi Shishtla, Debayan Chakraborty, Sreerekha T. V., Himani Bhatt, Amit Nandi, Akanksha Juneja, Soubhagya Ranjan Mohapatra, Ashok Kumar Shivarajan, Kedar Bhat, Karthick Selvamuthukumaran
IPC classification: G06F30/00, G06V30/413, G06F40/284, G06V30/412, G06V30/414
Abstract: A hybrid rule-based Artificial Intelligence (AI) document processing system processes a non-editable document containing at least one invoice to accurately extract data from tables in the invoice. The non-editable document is preprocessed for conversion into a markup format, and the pages that include the invoice are identified. The invoice is processed via a document process, which parses the pages in different directions to generate a first set of predictions, and via a block process, in which logical information blocks from the invoice are processed to generate a second set of predictions. Missing entries from a selected table are identified by applying rules to the first and second sets of predictions. Any discrepancy in the missing entry values between the first and second sets of predictions is resolved, and the resulting data is exported to downstream systems for further use.
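The reconciliation step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the actual rules, so everything here is an assumption made for illustration: the `reconcile` function, the field names, the `(value, confidence)` representation of a prediction, and the tie-breaking rule that prefers the higher-confidence value.

```python
from typing import Dict, Optional, Tuple

# Hypothetical representation: field name -> (value, confidence),
# or None when a process produced no prediction for that field.
Prediction = Dict[str, Optional[Tuple[str, float]]]


def reconcile(doc_preds: Prediction, block_preds: Prediction) -> Dict[str, Optional[str]]:
    """Merge two prediction sets for one table row (illustrative rules only)."""
    resolved: Dict[str, Optional[str]] = {}
    for field in sorted(set(doc_preds) | set(block_preds)):
        a = doc_preds.get(field)
        b = block_preds.get(field)
        if a is None and b is None:
            # Missing in both sets: the entry stays missing.
            resolved[field] = None
        elif a is None:
            # Missing in the document process: fill from the block process.
            resolved[field] = b[0]
        elif b is None:
            # Missing in the block process: fill from the document process.
            resolved[field] = a[0]
        elif a[0] == b[0]:
            # Both processes agree.
            resolved[field] = a[0]
        else:
            # Discrepancy: assumed rule, keep the higher-confidence value.
            resolved[field] = a[0] if a[1] >= b[1] else b[0]
    return resolved


# Hypothetical predictions for a single invoice line item.
doc_preds = {"quantity": ("2", 0.9), "unit_price": None, "total": ("40.00", 0.6)}
block_preds = {"quantity": ("2", 0.8), "unit_price": ("20.00", 0.7), "total": ("48.00", 0.4)}
row = reconcile(doc_preds, block_preds)
```

In this sketch `unit_price` is filled from the block process because the document process missed it, and the conflicting `total` is resolved in favor of the document process because its assumed confidence is higher.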
-
Publication No.: US20210357633A1
Publication date: 2021-11-18
Application No.: US16917572
Filing date: 2020-06-30
Inventors: Sawani BADE, Srinivasan Ponpathirkoottam Raghavan, Samatha Kottha, Shruti Chhabra, Praneeth Medhatithi Shishtla, Debayan Chakraborty, Sreerekha T. V., Himani Bhatt, Amit Nandi, Akanksha Juneja, Soubhagya Ranjan Mohapatra, Ashok Kumar Shivarajan, Kedar Bhat, Karthick Selvamuthukumaran
IPC classification: G06K9/00, G06F40/284
Abstract: A hybrid rule-based Artificial Intelligence (AI) document processing system processes a non-editable document containing at least one invoice to accurately extract data from tables in the invoice. The non-editable document is preprocessed for conversion into a markup format, and the pages that include the invoice are identified. The invoice is processed via a document process, which parses the pages in different directions to generate a first set of predictions, and via a block process, in which logical information blocks from the invoice are processed to generate a second set of predictions. Missing entries from a selected table are identified by applying rules to the first and second sets of predictions. Any discrepancy in the missing entry values between the first and second sets of predictions is resolved, and the resulting data is exported to downstream systems for further use.
-