Discriminative training of models for sequence classification

Patent application

US20080162117A1 Discriminative training of models for sequence classification (Pending, published)

Patent title: Discriminative training of models for sequence classification
Patent title (Chinese): 序列分类模型的辨别性训练
Application number: US11646983

Filing date: 2006-12-28
Publication number: US20080162117A1

Publication date: 2008-07-03
Inventors: Srinivas Bangalore, Patrick Haffner, Stephan Kanthak
Applicants: Srinivas Bangalore, Patrick Haffner, Stephan Kanthak
Main classification: G06F17/21
IPC classification: G06F17/21

Abstract:

Classification of sequences, such as the translation of natural language sentences, is carried out using an independence assumption. The independence assumption is an assumption that the probability of a correct translation of a source sentence word into a particular target sentence word is independent of the translation of other words in the sentence. Although this assumption is not a correct one, a high level of word translation accuracy is nonetheless achieved. In particular, discriminative training is used to develop models for each target vocabulary word based on a set of features of the corresponding source word in training sentences, with at least one of those features relating to the context of the source word. Each model comprises a weight vector for the corresponding target vocabulary word. The weights comprising the vectors are associated with respective ones of the features; each weight is a measure of the extent to which the presence of that feature for the source word makes it more probable that the target word in question is the correct one.
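The idea in the abstract can be illustrated with a minimal sketch. It makes several assumptions that the abstract does not state: training sentences are aligned one-to-one by word position, each per-word model is a plain binary logistic regression trained by stochastic gradient ascent, and the features are the source word plus its left and right neighbours. The function names (`features`, `train`, `translate`) and the toy sentences are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


def features(source, i):
    """Binary features of the source word at position i, including its context."""
    prev_word = source[i - 1] if i > 0 else "<s>"
    next_word = source[i + 1] if i + 1 < len(source) else "</s>"
    return {f"word={source[i]}", f"prev={prev_word}", f"next={next_word}"}


def train(pairs, epochs=50, lr=0.5):
    """Train one weight vector (a binary logistic model) per target vocabulary word.

    Assumption: source[i] aligns with target[i] in every training pair.
    """
    examples = [
        (features(src, i), tgt[i]) for src, tgt in pairs for i in range(len(src))
    ]
    vocab = sorted({label for _, label in examples})
    weights = {t: defaultdict(float) for t in vocab}
    for _ in range(epochs):
        for feats, label in examples:
            for t in vocab:
                score = sum(weights[t][f] for f in feats)
                prob = 1.0 / (1.0 + math.exp(-score))
                # Gradient of the log-likelihood for a binary logistic model.
                error = (1.0 if t == label else 0.0) - prob
                for f in feats:
                    weights[t][f] += lr * error
    return weights


def translate(weights, source):
    """Pick the best target word for each source position independently."""
    output = []
    for i in range(len(source)):
        feats = features(source, i)
        best = max(
            sorted(weights),
            key=lambda t: sum(weights[t].get(f, 0.0) for f in feats),
        )
        output.append(best)
    return output


# Toy data: "banco" is ambiguous and only the following word disambiguates it.
pairs = [
    (["el", "banco", "abre"], ["the", "bank", "opens"]),
    (["el", "banco", "roto"], ["the", "bench", "broken"]),
]
weights = train(pairs)
print(translate(weights, ["el", "banco", "abre"]))
print(translate(weights, ["el", "banco", "roto"]))
```

In this sketch each target word is scored without reference to the other target words chosen for the sentence, which mirrors the independence assumption; the context features are what let the "bank" and "bench" models separate the two uses of the same source word.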
