Abstract:
A method, a system, and a computer program product for analyzing a document are disclosed. When the document is received, it is partitioned into a plurality of segments using a set of pre-defined attributes. The segments of the document are mapped to corresponding segments of at least one template selected from a set of stored templates. A first template is selected from the set of stored templates, and a group of segments in the first template is identified by computing at least one of a structural similarity and a textual similarity between the group of segments and the plurality of segments of the document. A subset of segments from the group is aligned with corresponding segments of the document. In response to the mapping, a set of scores is computed using a set of pre-defined criteria, and the document is analyzed based on the computed scores.
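The partition, map, and score steps of this abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: blank lines stand in for the "pre-defined attributes", `difflib.SequenceMatcher` stands in for the textual similarity measure, and template coverage stands in for the "pre-defined criteria". The abstract itself specifies none of these choices, and structural similarity is not modeled.

```python
from difflib import SequenceMatcher


def partition(document):
    """Split a document into segments; blank lines are the assumed attribute."""
    return [block.strip() for block in document.split("\n\n") if block.strip()]


def textual_similarity(a, b):
    """Ratio in [0, 1] from difflib, used as a stand-in similarity measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def map_segments(doc_segments, template_segments, threshold=0.5):
    """Align each template segment with its best-matching document segment.

    Returns {template_index: (document_index, similarity)} for matches whose
    similarity reaches the (hypothetical) threshold.
    """
    mapping = {}
    for t_index, t_seg in enumerate(template_segments):
        scored = [(textual_similarity(t_seg, d_seg), d_index)
                  for d_index, d_seg in enumerate(doc_segments)]
        best_score, best_index = max(scored, default=(0.0, None))
        if best_index is not None and best_score >= threshold:
            mapping[t_index] = (best_index, best_score)
    return mapping


def coverage_score(mapping, template_segments):
    """Fraction of template segments that found a match in the document."""
    return len(mapping) / len(template_segments) if template_segments else 0.0
```

A caller would run `partition` on the incoming document, call `map_segments` once per stored template, and compare the resulting `coverage_score` values to decide which template the document most resembles.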
Abstract:
Embodiments of the present invention relate to an approach for reusing information and knowledge. Specifically, they provide an approach for retrieving previously stored data to satisfy queries (e.g., jobs/tickets) for solutions to problems, while maintaining the privacy and security of the data and ensuring the quality of the results. In a typical embodiment, a query for a solution to a problem is received and details are extracted from it. Using the details, a search is performed on a set of data stored in at least one computer storage device. Based on the search, a set of results is generated and classified into a set of categories. The quality of each result is then assessed based on the usefulness of the results.
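The query flow in this abstract (extract details, search, classify, assess quality) might look roughly like the sketch below. The record fields `problem`, `category`, `helpful`, and `unhelpful` are hypothetical, as are keyword overlap as the search method and helpful-vote share as the usefulness measure. The privacy and security handling mentioned in the abstract is not modeled.

```python
from collections import defaultdict

# Assumed stopword list; the abstract does not say how details are extracted.
STOPWORDS = {"a", "an", "the", "to", "on", "in", "is", "for", "of", "and"}


def extract_details(query):
    """Reduce a query to lowercase keywords (an assumed notion of 'details')."""
    words = (w.strip(".,!?").lower() for w in query.split())
    return {w for w in words if w and w not in STOPWORDS}


def search(details, records):
    """Return records sharing at least one keyword with the query, best first."""
    hits = []
    for record in records:
        overlap = len(details & extract_details(record["problem"]))
        if overlap:
            hits.append((overlap, record))
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in hits]


def classify(results):
    """Group results by their (hypothetical) 'category' field."""
    groups = defaultdict(list)
    for record in results:
        groups[record["category"]].append(record)
    return dict(groups)


def quality(record):
    """Usefulness as the share of past users who marked the result helpful."""
    total = record["helpful"] + record["unhelpful"]
    return record["helpful"] / total if total else 0.0
```

Under these assumptions, a ticket such as "Printer fails to connect on network" would be reduced to four keywords, matched against stored problem descriptions, grouped by category, and each hit ranked by its recorded helpfulness.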
Abstract:
A method, an apparatus, and an article of manufacture for automatic speech recognition are disclosed. The method includes obtaining at least one language model word and at least one rule-based grammar word; determining an acoustic similarity of at least one pair consisting of a language model word and a rule-based grammar word; and increasing a transition cost to the at least one language model word based on its acoustic similarity to the at least one rule-based grammar word, thereby generating a modified language model for automatic speech recognition.
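As a rough illustration of the cost adjustment, the sketch below approximates acoustic similarity by a normalized edit distance over phoneme sequences and adds a penalty proportional to that similarity. The phoneme transcriptions, the `penalty` and `threshold` parameters, and the edit-distance proxy are all assumptions; the abstract does not state how similarity is measured or by how much the cost is increased.

```python
def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    previous = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        current = [i]
        for j, pb in enumerate(b, start=1):
            substitution = previous[j - 1] + (pa != pb)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def acoustic_similarity(a, b):
    """Similarity in [0, 1]: 1.0 for identical sequences (assumed proxy)."""
    longest = max(len(a), len(b))
    return 1.0 - edit_distance(a, b) / longest if longest else 1.0


def modify_language_model(lm_costs, lm_phones, grammar_phones,
                          penalty=2.0, threshold=0.6):
    """Return a copy of lm_costs with higher costs for confusable words.

    A language model word is penalized when its best similarity to any
    rule-based grammar word reaches the (hypothetical) threshold.
    """
    modified = dict(lm_costs)
    for word, phones in lm_phones.items():
        best = max((acoustic_similarity(phones, g)
                    for g in grammar_phones.values()), default=0.0)
        if best >= threshold:
            modified[word] += penalty * best
    return modified
```

With this scheme, a language model word that sounds like a grammar word becomes more expensive to transition to, so the recognizer is nudged toward the grammar word when the two compete.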
Abstract:
A system and method are described for generating semantically similar sentences for a statistical language model. A semantic class generator determines, for each word in an input utterance, a set of corresponding semantically similar words. A sentence generator computes a set of candidate sentences, each containing at most one member from each set of semantically similar words. A sentence verifier grammatically tests each candidate sentence to determine a set of grammatically correct sentences that are semantically similar to the input utterance. The generated sentences need not be selected from an existing sentence database.
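The three components named in this abstract can be sketched as three small functions. The `thesaurus` dictionary is a hypothetical stand-in for the semantic class generator, and the article-agreement check is only a toy stand-in for the sentence verifier; the abstract does not describe either. This sketch picks exactly one member from each class, which is one way to satisfy the "at most one" condition.

```python
from itertools import product


def semantic_classes(utterance, thesaurus):
    """For each word, the word itself plus its listed similar words."""
    return [[word] + thesaurus.get(word, []) for word in utterance.split()]


def candidate_sentences(classes):
    """Build candidates with one member from each class, in word order."""
    return [" ".join(choice) for choice in product(*classes)]


def is_grammatical(sentence):
    """Toy check: 'a' must precede a consonant-initial word, 'an' a vowel."""
    words = sentence.split()
    for article, following in zip(words, words[1:]):
        starts_with_vowel = following[0] in "aeiou"
        if article == "a" and starts_with_vowel:
            return False
        if article == "an" and not starts_with_vowel:
            return False
    return True


def similar_sentences(utterance, thesaurus):
    """Generate candidates and keep those that pass the toy verifier."""
    classes = semantic_classes(utterance.lower(), thesaurus)
    return [s for s in candidate_sentences(classes) if is_grammatical(s)]
```

Because candidates are composed from word classes rather than looked up, the output can include sentences that appear in no existing sentence database, which matches the last sentence of the abstract.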