Invention Application
- Patent title: Discriminative training of models for sequence classification
- Application number: US11646983
- Filing date: 2006-12-28
- Publication number: US20080162117A1
- Publication date: 2008-07-03
- Inventors: Srinivas Bangalore, Patrick Haffner, Stephan Kanthak
- Applicants: Srinivas Bangalore, Patrick Haffner, Stephan Kanthak
- Main classification: G06F17/21
- IPC classification: G06F17/21
Abstract:
Classification of sequences, such as the translation of natural language sentences, is carried out using an independence assumption. The independence assumption is an assumption that the probability of a correct translation of a source sentence word into a particular target sentence word is independent of the translation of other words in the sentence. Although this assumption is not a correct one, a high level of word translation accuracy is nonetheless achieved. In particular, discriminative training is used to develop models for each target vocabulary word based on a set of features of the corresponding source word in training sentences, with at least one of those features relating to the context of the source word. Each model comprises a weight vector for the corresponding target vocabulary word. The weights comprising the vectors are associated with respective ones of the features; each weight is a measure of the extent to which the presence of that feature for the source word makes it more probable that the target word in question is the correct one.
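The idea in the abstract can be illustrated with a small Python sketch. It is not the method claimed in the application: the abstract does not name a training algorithm, so a multiclass perceptron is swapped in here as one simple form of discriminative training. The feature set (the source word plus its left and right neighbours), the function names, the one-to-one word alignment, and the toy English-to-German data are all assumptions made for illustration.

```python
from collections import defaultdict

def features(source, i):
    """Hypothetical features for source word i: the word itself plus its
    left and right neighbours (the context features)."""
    prev_w = source[i - 1] if i > 0 else "<s>"
    next_w = source[i + 1] if i + 1 < len(source) else "</s>"
    return [f"word={source[i]}", f"prev={prev_w}", f"next={next_w}"]

def score(weights, target, feats):
    """Sum the weights that this target word's vector assigns to the active features."""
    return sum(weights[target][f] for f in feats)

def train(pairs, epochs=5):
    """Learn one weight vector per target vocabulary word.

    pairs: list of (source_words, target_words), assumed aligned one-to-one.
    Perceptron updates stand in for whatever discriminative training is used.
    """
    weights = defaultdict(lambda: defaultdict(float))
    vocab = sorted({t for _, tgt in pairs for t in tgt})
    for _ in range(epochs):
        for src, tgt in pairs:
            for i, gold in enumerate(tgt):
                feats = features(src, i)
                pred = max(vocab, key=lambda t: score(weights, t, feats))
                if pred != gold:
                    # Reward the correct target word, penalise the wrong guess.
                    for f in feats:
                        weights[gold][f] += 1.0
                        weights[pred][f] -= 1.0
    return weights, vocab

def translate(weights, vocab, source):
    """Independence assumption: each source word is classified on its own."""
    return [
        max(vocab, key=lambda t: score(weights, t, features(source, i)))
        for i in range(len(source))
    ]

# Toy data: "bank" needs context to choose between "Ufer" and "Bank".
pairs = [
    (["river", "bank"], ["Fluss", "Ufer"]),
    (["money", "bank"], ["Geld", "Bank"]),
]
weights, vocab = train(pairs)
print(translate(weights, vocab, ["river", "bank"]))
print(translate(weights, vocab, ["money", "bank"]))
```

In this toy run the ambiguous source word "bank" is resolved only by the context feature: the learned weight for `prev=river` is higher in the vector for "Ufer" than in the vector for "Bank". That matches the abstract's description of a weight as a measure of how much a feature raises the probability of a given target word.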