Abstract:
Words of an input string are morphologically analyzed to identify their alternative base forms and parts of speech. The analyzed words of the input string are used to compile the input string into a first finite-state network. The first finite-state network is matched with a second finite-state network of multiword expressions to identify all subpaths of the first finite-state network that match one or more complete paths in the second finite-state network. Each matching subpath of the first finite-state network and path of the second finite-state network identify a multiword expression in the input string. The morphological analysis is performed without disambiguating words and without segmenting the input string into sentences, so that the first finite-state network is compiled with at least one path that identifies alternative base forms or parts of speech of a word in the input string.
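The matching step can be illustrated with a minimal sketch. It assumes a toy lexicon (`ANALYSES`) and a toy expression list (`MWES`), both hypothetical, and it uses a trie as a simple acyclic stand-in for the second finite-state network. Parts of speech are omitted for brevity. Each input position keeps all of its alternative base forms, so no disambiguation or sentence segmentation is needed before matching.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy morphological lexicon: surface form -> alternative base forms.
ANALYSES: Dict[str, Set[str]] = {
    "kicked": {"kick"},
    "the": {"the"},
    "bucket": {"bucket"},
    "saw": {"see", "saw"},  # ambiguous word, both analyses are kept
    "red": {"red"},
    "tape": {"tape"},
}

# Hypothetical multiword expressions, stored as sequences of base forms.
MWES: List[Tuple[str, ...]] = [("kick", "the", "bucket"), ("red", "tape")]


def build_trie(expressions: List[Tuple[str, ...]]) -> dict:
    """Build a trie of base forms; "$end" marks a complete path."""
    root: dict = {}
    for expr in expressions:
        node = root
        for base in expr:
            node = node.setdefault(base, {})
        node["$end"] = expr
    return root


def find_mwes(tokens: List[str]) -> List[Tuple[int, int, Tuple[str, ...]]]:
    """Return (start, end, expression) for every matching token span."""
    trie = build_trie(MWES)
    # Stand-in for the first network: one set of alternatives per position.
    lattice = [ANALYSES.get(t.lower(), {t.lower()}) for t in tokens]
    matches = []
    for start in range(len(lattice)):
        frontier = [trie]  # trie nodes reachable after consuming the span
        for end in range(start, len(lattice)):
            frontier = [n[b] for n in frontier for b in lattice[end] if b in n]
            if not frontier:
                break
            for node in frontier:
                if "$end" in node:
                    matches.append((start, end + 1, node["$end"]))
    return matches
```

Calling `find_mwes("He kicked the bucket of red tape".split())` reports one span for each of the two expressions, with `end` exclusive.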
Abstract:
Multiword expressions are mapped to identifiers using finite-state networks. Each of a plurality of multiword expressions is encoded into a regular expression. Each regular expression encodes a base form common to a plurality of derivative forms defined by ones of the multiword expressions. Each of the plurality of regular expressions is compiled with factorization into a set of finite-state networks. A union of the finite-state networks in the set of finite-state networks is performed to define a multiword finite-state network and a set of subnets. The multiword finite-state network and the set of subnets are traversed to identify a path corresponding to one of the plurality of multiword expressions, wherein only transitions originating from the multiword finite-state network are accounted for to ascertain a path number identifying a base form of the one of the plurality of multiword expressions.
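A rough sketch of the path-numbering idea follows. It is a simplification under stated assumptions, not the compiled networks described above: the main network is a trie, each subnet is a flat set of derivative forms referenced by a label such as `@KICK`, and the names `SUBNETS`, `ENTRIES`, `build` and `path_number` are all hypothetical. Only branches of the main trie contribute to the number, so every derivative form of an expression maps to the same identifier.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical subnets: a label stands for all derivative forms of a word.
SUBNETS: Dict[str, set] = {"@KICK": {"kick", "kicks", "kicked", "kicking"}}

# Hypothetical expressions written over literal words and subnet labels.
ENTRIES: List[Tuple[str, ...]] = [
    ("@KICK", "the", "bucket"),
    ("red", "tape"),
    ("@KICK", "off"),
]


class Node:
    def __init__(self) -> None:
        self.children: Dict[str, "Node"] = {}
        self.final = False
        self.count = 0  # number of complete paths passing through this node


def _count(node: Node) -> int:
    node.count = int(node.final) + sum(_count(c) for c in node.children.values())
    return node.count


def build(entries: Sequence[Tuple[str, ...]]) -> Node:
    root = Node()
    for entry in entries:
        node = root
        for label in entry:
            node = node.children.setdefault(label, Node())
        node.final = True
    _count(root)
    return root


def path_number(root: Node, tokens: Sequence[str]) -> Optional[int]:
    """Return the index of the matched expression in sorted label order."""
    node, number = root, 0
    for token in tokens:
        if node.final:
            number += 1  # a shorter expression ending here comes first
        for label in sorted(node.children):
            child = node.children[label]
            # A subnet match consumes the token but adds nothing to the number.
            if token == label or token in SUBNETS.get(label, ()):
                node = child
                break
            number += child.count  # skip all paths under an earlier branch
        else:
            return None  # no transition accepts this token
    return number if node.final else None


root = build(ENTRIES)
```

With these entries, `path_number(root, ["kicked", "the", "bucket"])` and `path_number(root, ["kicks", "the", "bucket"])` return the same number. The sketch assumes that each token matches at most one label at each node.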
Abstract:
Disclosed are a method of configuring a widget and a tactile user interface that displays the widget, which together enable menu-setting operations with few touch gestures and little use of screen space. The interface includes a touch-sensitive display device and memory which stores instructions for displaying the widget and a set of graphic objects together on the display device. The widget has two or more virtual sides, each of the sides being associated with a respective functionality. The widget is flipped, in response to a recognized touch gesture, from a first of the sides to a second of the sides, whereby the functionality of the widget is changed. The graphic objects are associated, in memory, with respective items having attributes. The graphic objects exhibit a response to the widget functionality of a currently displayed one of the sides of the widget based on the attributes of the respective items.
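The flip behavior can be pictured as a small data structure. The sketch below is only an illustration: the class names `Item` and `FlipWidget`, the example attributes and the predicate-per-side design are assumptions, and no touch handling or rendering is modeled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Item:
    """An item represented on screen by a graphic object."""
    name: str
    attributes: Dict[str, object]


@dataclass
class FlipWidget:
    """A widget whose virtual sides each carry one functionality."""
    sides: List[Tuple[str, Callable[[Item], bool]]]
    current: int = 0

    def flip(self) -> str:
        """Handle a recognized flip gesture by showing the next side."""
        self.current = (self.current + 1) % len(self.sides)
        return self.sides[self.current][0]

    def responding(self, items: List[Item]) -> List[str]:
        """Names of items whose graphic objects respond to the shown side."""
        _, predicate = self.sides[self.current]
        return [item.name for item in items if predicate(item)]


# Hypothetical items and sides, for illustration only.
items = [
    Item("report.pdf", {"color": True, "pages": 12}),
    Item("memo.txt", {"color": False, "pages": 1}),
]
widget = FlipWidget(sides=[
    ("color filter", lambda item: bool(item.attributes["color"])),
    ("single page", lambda item: item.attributes["pages"] == 1),
])
```

Flipping the widget changes which items respond, without opening a menu: the same `responding` call returns a different list after `flip()`.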
Abstract:
A system and method for document management are provided. The method relies on a logging system which automatically generates image logs for input documents for each job (print, copy, fax, scan, etc.) processed by the multifunction printing device(s) of an organization. The image logs are processed to identify keywords which are the basis of a search for similar documents among those which have been previously archived as well as documents in other accessible document repositories, including Web documents. The method identifies matching documents and optionally also revisions and related documents. A procedure is provided for ensuring that for each document processed by a multifunction device or other image output device of the organization, image data is archived (or identified as a public document without archiving). The method avoids duplication by using a digital matching document, where available, enabling the images of the image log for the input document to be discarded.
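The keyword-based search for an already archived match might look roughly like the following. This is a sketch under assumptions: the text is taken to be already extracted from the image log, keywords are simply the most frequent non-stopwords, and similarity is Jaccard overlap with an arbitrary threshold. The function names and the `STOPWORDS` set are hypothetical.

```python
import re
from collections import Counter
from typing import Dict, Optional, Set

# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "is", "on"}


def keywords(text: str, top_n: int = 5) -> Set[str]:
    """Pick the most frequent non-stopwords as keywords."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    return {word for word, _ in Counter(words).most_common(top_n)}


def best_match(log_text: str, archive: Dict[str, str],
               threshold: float = 0.5) -> Optional[str]:
    """Return the id of the most similar archived document, if any."""
    query = keywords(log_text)
    best_id, best_score = None, threshold
    for doc_id, text in archive.items():
        candidate = keywords(text)
        union = query | candidate
        score = len(query & candidate) / len(union) if union else 0.0
        if score >= best_score:
            best_id, best_score = doc_id, score
    return best_id


# Hypothetical archive of previously stored documents.
archive = {
    "a1": "quarterly budget report for finance",
    "a2": "holiday party invitation",
}
```

When `best_match` returns an identifier, the images of the new log could be discarded in favor of the existing digital document. When it returns `None`, the image data would be archived.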
Abstract:
A system and method for reviewing documents are provided. A collection of documents is partitioned into sets of documents for review by a plurality of reviewers. For each set, documents in the set are displayed on a display device for review by a reviewer and temporarily organized through grouping and sorting. The reviewer's labels for the displayed documents are received. Based on the reviewer's labels, a class from a plurality of classes is assigned to each of the reviewed documents. A classifier model stored in computer memory is progressively trained, based on features extracted from the reviewed documents in the set and their assigned classes. Prior to review of all documents in the set, a calculated subset of documents for which the classifier model assigns a class different from the one assigned based on the reviewer's label is returned for a second review by a reviewer. Models generated from one or more other document sets can be used to assess the review of a first of the sets.
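Progressive training with a second-review queue can be sketched as below. The classifier here is a tiny multinomial naive Bayes chosen only for illustration; the abstract does not name a model. The `warmup` parameter, the bag-of-words features and the choice to check each document before training on it are all assumptions. A flagged document still updates the model in this simplified version.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Set, Tuple


class IncrementalNaiveBayes:
    """Tiny multinomial naive Bayes, updated one document at a time."""

    def __init__(self) -> None:
        self.word_counts: Dict[str, Counter] = defaultdict(Counter)
        self.doc_counts: Counter = Counter()
        self.vocab: Set[str] = set()

    def update(self, words: List[str], label: str) -> None:
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def predict(self, words: List[str]) -> str:
        total_docs = sum(self.doc_counts.values())

        def score(label: str) -> float:
            counts = self.word_counts[label]
            # Add-one smoothing; the extra +1 reserves mass for unseen words.
            denom = sum(counts.values()) + len(self.vocab) + 1
            log_prob = math.log(self.doc_counts[label] / total_docs)
            for word in words:
                log_prob += math.log((counts[word] + 1) / denom)
            return log_prob

        return max(self.doc_counts, key=score)


def review(stream: List[Tuple[str, List[str], str]], warmup: int = 2) -> List[str]:
    """Train progressively; return ids to send back for a second review."""
    model = IncrementalNaiveBayes()
    flagged = []
    for seen, (doc_id, words, label) in enumerate(stream):
        # Compare the model's class with the reviewer's label before training.
        if seen >= warmup and model.predict(words) != label:
            flagged.append(doc_id)
        model.update(words, label)
    return flagged


# Hypothetical review stream: (document id, features, reviewer label).
stream = [
    ("d1", ["contract", "breach"], "responsive"),
    ("d2", ["lunch", "menu"], "irrelevant"),
    ("d3", ["contract", "damages"], "responsive"),
    ("d4", ["contract", "breach", "damages"], "irrelevant"),
]
```

Here `review(stream)` flags `d4`, whose label disagrees with what the model has learned from the earlier documents.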
Abstract:
A system and method for document image acquisition and retrieval which find application in litigation for responding to discovery requests are disclosed. The method includes automatically acquiring image data and associated records for documents being processed by a plurality of image output devices within an organization and archiving the image data and associated records as image logs for the processed documents. When a request for document production is received by the organization, the image logs (and/or information extracted therefrom) are automatically filtered through at least one classifier trained to return documents responsive to the document request, and documents corresponding to the filtered out image logs are output. One of the filters may be configured for filtering privileged from non-privileged documents.
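The filtering stage can be pictured as a cascade of classifiers applied to the archived image logs. In the sketch below the two "classifiers" are hypothetical keyword rules standing in for trained models, and the `ImageLog` fields are assumptions. Documents accepted by every filter are the ones treated as output.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class ImageLog:
    doc_id: str
    text: str    # text extracted from the archived image data
    device: str  # image output device that processed the document


Classifier = Callable[[ImageLog], bool]


def produce(logs: Iterable[ImageLog], filters: List[Classifier]) -> List[str]:
    """Return ids of logs accepted by every filter, applied in order."""
    kept = list(logs)
    for accept in filters:
        kept = [log for log in kept if accept(log)]
    return [log.doc_id for log in kept]


# Hypothetical stand-ins for classifiers trained on the document request.
def is_responsive(log: ImageLog) -> bool:
    return "merger" in log.text.lower()


def is_not_privileged(log: ImageLog) -> bool:
    return "attorney-client" not in log.text.lower()


logs = [
    ImageLog("log-1", "Merger timeline draft", "printer-3"),
    ImageLog("log-2", "Attorney-client memo on the merger", "scanner-1"),
    ImageLog("log-3", "Cafeteria schedule", "copier-2"),
]
```

With these stand-ins, `produce(logs, [is_responsive, is_not_privileged])` keeps only the responsive, non-privileged log.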
Abstract:
An imaging system includes a processing component which receives images to be rendered and a rendering device, such as a marking engine, fax machine or email system, in communication with the processing component for rendering an image supplied by the processing component. A haptic interface is in communication with the processing component for inputting commands from the user to the processing component for rendering the image, and for outputting feedback from the processing component to the user as force feedback.
Abstract:
A probabilistic clustering system is defined at least in part by probabilistic model parameters indicative of word counts, ratios, or frequencies characterizing classes of the clustering system. An association of one or more documents in the probabilistic clustering system is changed from one or more source classes to one or more destination classes. Probabilistic model parameters characterizing classes affected by the changed association are locally updated without updating probabilistic model parameters characterizing classes not affected by the changed association.
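A local update of this kind can be sketched with per-class word counts. The class below is an illustration under assumptions: the model parameters are plain word counts, each document belongs to exactly one class, and the names `ClusterModel`, `add`, `move` and `word_prob` are hypothetical. Moving a document touches only the counts of its source and destination classes.

```python
from collections import Counter
from typing import Dict, List


class ClusterModel:
    """Per-class word counts serving as the probabilistic model parameters."""

    def __init__(self) -> None:
        self.word_counts: Dict[str, Counter] = {}
        self.assignment: Dict[str, str] = {}
        self.docs: Dict[str, Counter] = {}

    def add(self, doc_id: str, words: List[str], cls: str) -> None:
        bag = Counter(words)
        self.docs[doc_id] = bag
        self.assignment[doc_id] = cls
        self.word_counts.setdefault(cls, Counter()).update(bag)

    def move(self, doc_id: str, destination: str) -> None:
        """Reassign a document, updating only the two affected classes."""
        source = self.assignment[doc_id]
        if source == destination:
            return
        bag = self.docs[doc_id]
        self.word_counts[source] -= bag  # in-place; drops zero counts
        self.word_counts.setdefault(destination, Counter()).update(bag)
        self.assignment[doc_id] = destination

    def word_prob(self, cls: str, word: str) -> float:
        """Relative frequency of a word within a class."""
        counts = self.word_counts[cls]
        total = sum(counts.values())
        return counts[word] / total if total else 0.0


# Hypothetical documents and classes.
model = ClusterModel()
model.add("d1", ["tax", "tax", "form"], "finance")
model.add("d2", ["goal", "team"], "sports")
model.add("d3", ["tax", "audit"], "sports")
model.add("d4", ["flight"], "travel")
```

After `model.move("d3", "finance")`, the counts for `finance` and `sports` change while the `travel` counts are never read or written.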
Abstract:
A system and method for document image acquisition and retrieval find application in litigation for responding to discovery requests. The method includes receiving automatically acquired electronic image logs comprising image data and associated records for documents processed by a plurality of image output devices within an organization. When a request for document production is received, the image logs (and/or information extracted therefrom) are automatically filtered through at least one classifier trained to return documents responsive to the document request, and documents corresponding to the filtered out image logs are output. One of the filters may be configured for filtering out documents that include attorney-client exchanges.