Abstract:
Systems and methods for matching document images to tabulated records. Essential features are extracted from document images and converted into a set of document-image vectors. Spreadsheet data is converted into a set of sheet-line vectors. A differential parametric algorithm is applied to determine a similarity score metric, which is used in a matching algorithm. The matching algorithm is applied to the set of document images and the spreadsheet data.
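
As a rough illustration of the pipeline described in the abstract above, the following minimal sketch scores every document-image vector against every sheet-line vector and then assigns pairs one-to-one. The abstract does not define the "differential parametric algorithm", so plain cosine similarity is swapped in here as a stand-in, and a greedy highest-score-first assignment stands in for the matching algorithm. The function names, the threshold, and the example vectors are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors.

    Stand-in for the similarity score metric; not the method from the abstract.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction to compare
    return dot / (norm_a * norm_b)


def match_documents_to_rows(doc_vectors, row_vectors, threshold=0.5):
    """Greedy one-to-one matching: the highest-scoring pairs are assigned first.

    Returns a dict mapping document index -> spreadsheet row index.
    Pairs scoring below `threshold` (an assumed cut-off) are left unmatched.
    """
    scored = [
        (cosine_similarity(d, r), i, j)
        for i, d in enumerate(doc_vectors)
        for j, r in enumerate(row_vectors)
    ]
    scored.sort(key=lambda item: item[0], reverse=True)

    matches, used_docs, used_rows = {}, set(), set()
    for score, i, j in scored:
        if score < threshold:
            break  # every remaining pair scores lower still
        if i in used_docs or j in used_rows:
            continue
        matches[i] = j
        used_docs.add(i)
        used_rows.add(j)
    return matches


# Hypothetical, already-normalised feature vectors.
doc_vectors = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.3]]
row_vectors = [[0.1, 0.9, 0.2], [1.0, 0.0, 0.1], [0.0, 0.1, 1.0]]
matches = match_documents_to_rows(doc_vectors, row_vectors)
```

With these example vectors, document 0 pairs with row 1 and document 1 pairs with row 0, while row 2 stays unmatched. A greedy assignment is easy to follow but is not guaranteed to maximise the total score; an optimal assignment method could be substituted where that matters.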
Abstract:
Systems and methods for processing documents based on a cardinal graph convolution network. Cardinal graph representations are generated in which words are represented as single nodes, with edges connecting neighbouring nodes in the four cardinal directions. Feature tensors are generated for the nodes of the cardinal graph representation, and the cardinal directions are encoded to generate an adjacency tensor holding node neighbour indices. Entries of the adjacency tensor are transformed into a one-hot encoding of the node neighbour indices. Neighbourhood feature tensors are created over node indices, and the features in each block may be scaled, convolved and reduced into new feature tensors.
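
The data structures named in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions: each word is reduced to its centre point, the nearest word in each direction is taken as the neighbour, the direction order (north, east, south, west) is arbitrary, and -1 marks a missing neighbour. None of these choices, nor the function names, come from the abstract itself.

```python
# Assumed direction order for the second axis of the adjacency tensor.
DIRECTIONS = ("north", "east", "south", "west")


def build_adjacency(centres):
    """Return an N x 4 list of neighbour indices (-1 where there is no neighbour).

    centres: list of (x, y) word centres, with y growing downwards as in
    image coordinates. Each other word is binned into one direction by its
    dominant displacement axis, and the nearest word per direction is kept.
    """
    n = len(centres)
    adjacency = [[-1] * 4 for _ in range(n)]
    for i, (xi, yi) in enumerate(centres):
        best = [None] * 4  # smallest squared distance seen per direction
        for j, (xj, yj) in enumerate(centres):
            if i == j:
                continue
            dx, dy = xj - xi, yj - yi
            if abs(dx) >= abs(dy):
                d = 1 if dx > 0 else 3  # east or west
            else:
                d = 2 if dy > 0 else 0  # south or north
            dist = dx * dx + dy * dy
            if best[d] is None or dist < best[d]:
                best[d] = dist
                adjacency[i][d] = j
    return adjacency


def one_hot(adjacency):
    """Turn N x 4 neighbour indices into an N x 4 x N one-hot tensor.

    A missing neighbour (-1) becomes an all-zero row.
    """
    n = len(adjacency)
    return [
        [[1 if adjacency[i][d] == j else 0 for j in range(n)] for d in range(4)]
        for i in range(n)
    ]


def gather_neighbour_features(one_hot_adj, features):
    """Select each neighbour's feature vector via the one-hot mask (N x 4 x F)."""
    n, f = len(features), len(features[0])
    return [
        [
            [sum(one_hot_adj[i][d][j] * features[j][k] for j in range(n))
             for k in range(f)]
            for d in range(4)
        ]
        for i in range(n)
    ]


# Four words laid out on a 2 x 2 grid, with a hypothetical 1-d feature per word.
centres = [(0, 0), (10, 0), (0, 10), (10, 10)]
features = [[1.0], [2.0], [3.0], [4.0]]
adjacency = build_adjacency(centres)
one_hot_adj = one_hot(adjacency)
neighbourhood = gather_neighbour_features(one_hot_adj, features)
```

Here the top-left word (node 0) has node 1 to its east and node 2 to its south, so `adjacency[0]` is `[-1, 1, 2, -1]`. The gathered neighbourhood tensor is the kind of per-node block that a later step could scale, convolve and reduce; those steps are left out because the abstract does not specify them.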
Abstract:
Systems and methods for automatic information retrieval from imaged documents. Deep network architectures retrieve information from imaged documents using a neuronal visual-linguistic mechanism including a geometrically trained neuronal network. An expense management platform uses the neuronal visual-linguistic mechanism to determine geometric-semantic information of the imaged document.
Abstract:
The disclosure herein relates to business content analysis. In particular, the disclosure relates to systems and methods of an expense management system operable to perform automatic content analysis of business documents in order to generate business reports associated with automated value added tax (VAT) reclaim, Travel and Expenses (T&E) management, Import / Export management and the like. The system is further operable to provide various organizational expense management aspects for the corporate finance department and the business traveler based upon stored data. Additionally, the system uses a content recognition engine, configured as an enhanced OCR mechanism, to extract tagged text from invoice images. The engine also provides a continuous learning mechanism in a structured mode that allows invoice images to be classified by type and supports a continual process of improvement.