Abstract:
A method of training a language model trains discriminative parameters of the model based on a performance measure that takes discrete values.
Abstract:
A system determines a relationship between first and second textual inputs. The system identifies constituents in the first textual input that have predetermined characteristics indicative of usefulness in determining the relationship. The relationship is then determined based on the identified constituents. The constituents can be eliminated from the first textual input, weighted in the first textual input, or simply annotated in one of a variety of ways.
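As a rough illustration of the weighting and elimination options described above, the following minimal sketch scores how well a second text covers the useful constituents of a first text. It rests on assumptions that are not in the abstract: constituents are taken to be single words, the "predetermined characteristic" is membership in a hypothetical LOW_VALUE_WORDS list, and the relationship is a weighted overlap ratio. All function and parameter names are illustrative only.

```python
import re

# Hypothetical list of words assumed to carry little information about the
# relationship. The abstract does not say which characteristics are used.
LOW_VALUE_WORDS = frozenset({"the", "a", "an", "of", "in", "to", "is", "and", "for"})


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def weight_constituents(first_text, low_weight=0.0, high_weight=1.0):
    """Annotate each word of the first input with a weight.

    A low_weight of 0.0 effectively eliminates low-value words; a value
    between 0 and 1 merely down-weights them.
    """
    return [
        (word, low_weight if word in LOW_VALUE_WORDS else high_weight)
        for word in tokenize(first_text)
    ]


def relationship_score(first_text, second_text, low_weight=0.0):
    """Return the weighted fraction of first-input words found in the second input."""
    weighted = weight_constituents(first_text, low_weight=low_weight)
    total = sum(weight for _, weight in weighted)
    if total == 0:
        return 0.0
    second_words = set(tokenize(second_text))
    matched = sum(weight for word, weight in weighted if word in second_words)
    return matched / total
```

For example, relationship_score("the price of a laptop", "laptop price list") ignores "the", "of", and "a" by default, so the score depends only on "price" and "laptop". Passing low_weight=1.0 treats every word equally, which shows how much the low-value words would otherwise dilute the score.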
Abstract:
A word breaking facility operates to identify words within a Japanese text string. The word breaking facility performs morphological processing to identify postfix bound morphemes and prefix bound morphemes. The word breaking facility also performs opheme matching to identify likely stem characters. A scoring heuristic is applied to determine an optimal analysis that includes a postfix analysis, a stem analysis, and a prefix analysis. The morphological analyses are stored in an efficient compressed format to minimize the amount of memory they occupy and maximize the analysis speed. The morphological analyses of postfixes, stems, and prefixes are performed in a right-to-left fashion. The word breaking facility may be used in applications that demand identity of selection granularity, autosummarization applications, content indexing applications, and natural language processing applications.
Abstract:
Grammatical element prediction is used in the context of machine translation. Features from both the source language and the target language sentences (or other text fragments) are used in predicting the grammatical elements.
Abstract:
Grammatical element prediction is used to predict grammatical elements in text fragments (such as phrases or sentences). In one embodiment, a statistical model, using syntax features, is used to predict grammatical elements.
Abstract:
Document summarization is performed by scoring individual words in sentences in a document or document cluster. Sentences from the document or document cluster are selected to form a summary based on the scores of the words contained in those sentences.
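The word-scoring approach described above can be sketched as follows. The abstract does not say how individual words are scored or how word scores are combined, so this sketch assumes a word's score is its relative frequency in the document and a sentence's score is the average score of its words. The sentence splitter and all function names are illustrative assumptions.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and extract word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def score_words(sentences):
    """Assumed scoring: each word's score is its relative frequency."""
    counts = Counter(word for s in sentences for word in tokenize(s))
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def summarize(text, max_sentences=2):
    """Select the highest-scoring sentences and return them in document order."""
    sentences = split_sentences(text)
    if not sentences:
        return []
    word_scores = score_words(sentences)

    def sentence_score(sentence):
        words = tokenize(sentence)
        if not words:
            return 0.0
        # Average rather than sum, so long sentences are not favored by length alone.
        return sum(word_scores[w] for w in words) / len(words)

    ranked = sorted(
        range(len(sentences)),
        key=lambda i: sentence_score(sentences[i]),
        reverse=True,
    )
    chosen = sorted(ranked[:max_sentences])
    return [sentences[i] for i in chosen]
```

Calling summarize on a document cluster would only require concatenating the documents first. Averaging the word scores is one design choice among several; summing them, or discounting words already covered by selected sentences, would fit the same description.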