Abstract:
An efficient method and system to enhance digital acquisition devices for analog data are presented. The enhancements offered by the method and system are available to the user in local as well as in remote deployments, yielding efficiency gains for a large variety of business processes. The quality enhancements of the acquired digital data are achieved efficiently by employing virtual reacquisition. Virtual reacquisition renders physical reacquisition of the analog data unnecessary when the digital data obtained by the acquisition device are of insufficient quality. The method and system allow multiple users to access the same acquisition device for analog data. In some embodiments, one or more users can virtually reacquire data provided by multiple analog or digital sources. The acquired raw data can be processed by each user according to that user's personal preferences and/or requirements. The preferred processing settings and attributes are determined interactively in real time or non-real time, automatically, or by a combination thereof.
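The idea of virtual reacquisition can be illustrated with a minimal sketch: the raw data are stored once, and each user derives a new result from the stored copy by changing settings rather than rescanning. The class name `VirtualScanner`, the `Settings` fields, and the brightness/threshold operations are assumptions chosen for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Settings:
    """Hypothetical per-user processing preferences."""
    brightness: int = 0               # additive offset applied to every pixel
    threshold: Optional[int] = None   # if set, binarize at this level


class VirtualScanner:
    """Keeps the raw acquisition so it can be reprocessed many times."""

    def __init__(self, raw_pixels: Sequence[int]) -> None:
        # The raw data are stored once and never modified afterwards.
        self._raw = tuple(raw_pixels)

    def reacquire(self, settings: Settings) -> List[int]:
        """Derive a new image from the stored raw data (no physical rescan)."""
        out = []
        for pixel in self._raw:
            value = max(0, min(255, pixel + settings.brightness))
            if settings.threshold is not None:
                value = 255 if value >= settings.threshold else 0
            out.append(value)
        return out


scanner = VirtualScanner([10, 120, 200, 250])
user_a = scanner.reacquire(Settings(brightness=20))
user_b = scanner.reacquire(Settings(threshold=128))
```

Because both users read from the same stored raw data, neither result affects the other, which is the property the abstract attributes to multi-user access.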
Abstract:
According to one embodiment, a computer-implemented method for cleaning up a data set having a possible incorrect label includes: selecting a plurality of training documents; estimating a quality of an organization of a plurality of categories; and determining whether the quality of the organization is greater than a predetermined quality threshold. Corresponding system and computer program product embodiments are also presented. Other aspects and advantages of the present invention will become apparent from the following detailed description, which, when taken in conjunction with the drawings, illustrates by way of example the principles of the invention.
Abstract:
A method is provided for organizing data sets. In use, an automatic decision system is created or updated for determining whether data elements fit a predefined organization or not, where the decision system is based on a set of preorganized data elements. A plurality of data elements is organized using the decision system. At least one organized data element is selected for output to a user based on a score or confidence from the decision system for the at least one organized data element. Additionally, at least a portion of the at least one organized data element is output to the user. A response is received from the user comprising at least one of a confirmation, modification, and a negation of the organization of the at least one organized data element. The automatic decision system is recreated or updated based on the user response. Other embodiments are also presented.
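The feedback loop described above can be sketched as follows. This is a toy illustration under stated assumptions, not the claimed method: the word-count scoring, the margin-based confidence, the choice to review the least confident element, and the names `DecisionSystem`, `review_round`, and `ask_user` are all hypothetical.

```python
from collections import Counter, defaultdict


class DecisionSystem:
    """Toy decision system built from preorganized (text, category) pairs."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)

    def update(self, text, category):
        """Create or update the system with one organized element."""
        self.word_counts[category].update(text.lower().split())

    def organize(self, text):
        """Return (category, confidence) for one data element."""
        words = text.lower().split()
        scores = {
            category: sum(counts[w] for w in words)
            for category, counts in self.word_counts.items()
        }
        # Sort by descending score; break ties alphabetically for determinism.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        best_category, best = ranked[0]
        runner_up = ranked[1][1] if len(ranked) > 1 else 0
        total = best + runner_up
        confidence = (best - runner_up) / total if total else 0.0
        return best_category, confidence


def review_round(system, elements, ask_user):
    """Organize elements, show the least confident one, learn from the reply."""
    organized = [(text, *system.organize(text)) for text in elements]
    text, category, _ = min(organized, key=lambda item: item[2])
    # ask_user returns the confirmed category or a modified one.
    corrected = ask_user(text, category)
    system.update(text, corrected)
    return text, category, corrected


system = DecisionSystem()
system.update("invoice total amount due", "invoice")
system.update("dear sir regards letter", "letter")
shown = review_round(
    system,
    ["invoice amount", "regards total"],
    ask_user=lambda text, category: "letter",  # simulated user modification
)
```

Selecting by lowest confidence is only one possible policy; the abstract says only that selection is based on a score or confidence.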
Abstract:
Systems, methods and computer program products for classifying documents are presented. Systems, methods and computer program products for analyzing documents, e.g. for verifying an association of an invoice with an entity are also presented. Systems, methods and computer program products for managing medical records are presented. One exemplary system includes a memory; and a processor in communication with the memory, the processor being configured to process at least some instructions stored in the memory. The memory stores computer executable program code comprising instructions for: training a classifier based on an invoice format associated with a first entity; accessing a plurality of invoices labeled as being associated with at least one of the first entity and other entities; and outputting an identifier of at least one of the invoices having a high probability of not being associated with the first entity.
Abstract:
In one embodiment, a method includes performing optical character recognition (OCR) on an image of a financial document and at least one of: (a) correcting OCR errors in the financial document using at least one of textual information from a complementary document and predefined business rules; (b) normalizing data from the complementary document using at least one of textual information from the financial document and the predefined business rules; and (c) normalizing data from the financial document using at least one of textual information from the complementary document and the predefined business rules. Exemplary systems and computer program products are also disclosed.
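As a rough illustration of option (a), the sketch below corrects OCR output in two ways: with a business rule (amount fields contain only digits and a decimal point) and with text from a complementary document (a list of known vendor names). The look-alike character table, the function names, and the 0.8 similarity cutoff are assumptions made for this example. The fuzzy match uses `difflib.get_close_matches` from the Python standard library.

```python
import difflib

# Hypothetical business rule: amount fields hold only digits and ".",
# so letters that OCR commonly confuses with digits are mapped back.
DIGIT_LOOKALIKES = str.maketrans(
    {"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"}
)


def correct_amount(ocr_text):
    """Apply the digit-only rule to an OCR'd amount field."""
    return ocr_text.translate(DIGIT_LOOKALIKES)


def correct_with_complement(ocr_value, candidates, cutoff=0.8):
    """Replace an OCR'd value with the closest value found in the
    complementary document, or keep it if nothing is similar enough."""
    matches = difflib.get_close_matches(ocr_value, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else ocr_value


# Vendor names taken from a hypothetical complementary document.
known_vendors = ["ACME Corp", "Globex Inc"]

amount = correct_amount("1O0.5O")
vendor = correct_with_complement("ACNE Corp", known_vendors)
unknown = correct_with_complement("Initech", known_vendors)
```

Keeping the original value when no candidate passes the cutoff avoids replacing a correct but unlisted name with an unrelated one.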
Abstract:
A method includes storing raw or normalized video data in a computer accessible storage medium; analyzing portions of the video data with a first analytic engine to determine whether the raw video data is within a first set of parameters and to generate a first set of processor settings; processing the raw or normalized video data with the first set of processor settings; analyzing portions of the processed data with a second analytic engine to determine whether the processed data is within a second set of parameters; generating with the second analytic engine a second set of processor settings to reprocess the raw or normalized video data; sending the second set of processor settings to the first analytic engine; and reprocessing the raw or normalized video data with the first analytic engine using the second set of processor settings.
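The two-engine flow can be sketched in simplified form. Everything concrete here is an assumption for illustration: frames are reduced to a list of brightness values, the only processor setting is a `gain`, and the parameter ranges (16 to 240 for raw data, 120 to 136 for processed data) are invented. The sketch also folds the step of sending settings back to the first engine into a single `run` function.

```python
from statistics import mean


def first_engine(raw, target=128.0):
    """Check the raw data and propose initial processor settings."""
    m = mean(raw)
    within = 16 <= m <= 240            # assumed first set of parameters
    gain = target / m if m else 1.0
    return within, {"gain": gain}


def process(raw, settings):
    """Apply the settings to the stored raw data, clipping at 255."""
    return [min(255, round(p * settings["gain"])) for p in raw]


def second_engine(processed, settings, lo=120.0, hi=136.0, target=128.0):
    """Check the processed data; if out of range, propose new settings."""
    m = mean(processed)
    if lo <= m <= hi:                  # assumed second set of parameters
        return True, settings
    return False, {"gain": settings["gain"] * target / m}


def run(raw):
    _, settings = first_engine(raw)
    processed = process(raw, settings)
    ok, new_settings = second_engine(processed, settings)
    if not ok:
        # Reprocess the stored raw data, not the already processed output.
        processed = process(raw, new_settings)
    return processed, new_settings


good_frames, good_settings = run([32, 64, 96])
clipped_frames, clipped_settings = run([10, 10, 220])
```

In the second call, clipping pulls the processed mean below the assumed range, so the second engine raises the gain and the raw data are processed again from the stored copy.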
Abstract:
According to one embodiment, a computer-implemented method for confirming/rejecting a most relevant example includes: generating a binary decision model by training a binary classifier using a plurality of training documents; classifying one or more test documents into one of a plurality of categories using the binary decision model, wherein the one or more test documents lack a user-defined category label; selecting a most relevant example of the classified test documents from among the classified test documents; displaying, using a display of the computer, the most relevant example of the classified test documents to a user; receiving, via the computer and from the user, a confirmation or a negation of a classification label of the most relevant example of the classified test documents; and storing the confirmation or the negation of the classification label of the most relevant example of the classified test documents to a memory of the computer.
Abstract:
A method according to one embodiment includes performing optical character recognition (OCR) on an image of a first document; and at least one of: correcting OCR errors in the first document using at least one of textual information from a complementary document and predefined business rules; normalizing data from the complementary document using at least one of textual information from the first document and the predefined business rules; and normalizing data from the first document using at least one of textual information from the complementary document and the predefined business rules. Additional systems, methods and computer program products are also presented.