Granted invention patent
- Patent title: Sequence classification for machine translation
- Patent title (Chinese): 机器翻译序列分类
- Application number: US11647080
- Filing date: 2006-12-28
- Publication (announcement) number: US07783473B2
- Publication (announcement) date: 2010-08-24
- Inventors: Srinivas Bangalore, Patrick Haffner, Stephan Kanthak
- Applicants: Srinivas Bangalore, Patrick Haffner, Stephan Kanthak
- Applicant address: US NV Reno
- Patent holder: AT&T Intellectual Property II, L.P.
- Current patent holder: AT&T Intellectual Property II, L.P.
- Current patent holder address: US NV Reno
- Agent: Ronald D. Slusky
- Main classification: G06F17/28
- IPC classification: G06F17/28; G10L21/00
Abstract:
Classification of sequences, such as the translation of natural language sentences, is carried out using an independence assumption. The independence assumption is an assumption that the probability of a correct translation of a source sentence word into a particular target sentence word is independent of the translation of other words in the sentence. Although this assumption is not a correct one, a high level of word translation accuracy is nonetheless achieved. In particular, discriminative training is used to develop models for each target vocabulary word based on a set of features of the corresponding source word in training sentences, with at least one of those features relating to the context of the source word. Each model comprises a weight vector for the corresponding target vocabulary word. The weights comprising the vectors are associated with respective ones of the features; each weight is a measure of the extent to which the presence of that feature for the source word makes it more probable that the target word in question is the correct one.
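The idea in the abstract, one weight vector per target vocabulary word applied to context features of each source word independently, can be illustrated with a minimal sketch. It is not the patented method. The abstract does not name a training algorithm, so a simple multiclass perceptron stands in for "discriminative training". The feature set (`word=`, `prev=`, `next=`), the one-to-one word alignment, the toy English-to-German glosses, and all function names are assumptions made for illustration.

```python
from collections import defaultdict

def features(words, i):
    """Features of the source word at position i, including its context.
    The three feature templates are illustrative assumptions."""
    prev_w = words[i - 1] if i > 0 else "<s>"
    next_w = words[i + 1] if i + 1 < len(words) else "</s>"
    return [f"word={words[i]}", f"prev={prev_w}", f"next={next_w}"]

def score(weights, target, feats):
    """Sum the target word's weights for the features that are present."""
    w = weights[target]
    return sum(w[f] for f in feats)

def predict(weights, vocab, feats):
    """Pick the best target word for one source position, ignoring all
    other positions (the independence assumption)."""
    return max(vocab, key=lambda t: score(weights, t, feats))

def train(pairs, epochs=5):
    """Multiclass perceptron (a stand-in for discriminative training).
    pairs: (source_words, target_words), assumed aligned one-to-one."""
    weights = defaultdict(lambda: defaultdict(float))  # target -> feature -> weight
    vocab = sorted({t for _, tgt in pairs for t in tgt})
    for _ in range(epochs):
        for src, tgt in pairs:
            for i, gold in enumerate(tgt):
                feats = features(src, i)
                guess = predict(weights, vocab, feats)
                if guess != gold:
                    # Reward the correct target word, penalize the wrong guess.
                    for f in feats:
                        weights[gold][f] += 1.0
                        weights[guess][f] -= 1.0
    return weights, vocab

def translate(weights, vocab, src):
    """Classify each source position independently."""
    return [predict(weights, vocab, features(src, i)) for i in range(len(src))]

# Toy word-by-word glosses (hypothetical data): "bank" is ambiguous.
pairs = [
    (["river", "bank"], ["Fluss", "Ufer"]),
    (["money", "bank"], ["Geld", "Bank"]),
]
weights, vocab = train(pairs)
print(translate(weights, vocab, ["river", "bank"]))
print(translate(weights, vocab, ["money", "bank"]))
```

In this toy setup the `prev=` context feature is what separates the two senses of "bank": after training, `weights["Ufer"]["prev=river"]` and `weights["Bank"]["prev=money"]` are positive, matching the abstract's description of a weight as a measure of how much a feature makes a given target word more probable.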
Published/granted documents
- US20080162111A1 Sequence classification for machine translation. Publication/grant date: 2008-07-03