Granted invention patent
- Patent title: Detecting writing systems and languages
- Patent title (Chinese): 检测书写系统和语言
- Application number: US12479522
- Filing date: 2009-06-05
- Publication number: US08326602B2
- Publication date: 2012-12-04
- Inventor: Richard L. Sites
- Applicant: Richard L. Sites
- Applicant address: US CA Mountain View
- Assignee: Google Inc.
- Current assignee: Google Inc.
- Current assignee address: US CA Mountain View
- Agent: Harness, Dickey & Pierce, P.L.C.
- Main classification: G06F17/20
- IPC classifications: G06F17/20 ; G06F17/28 ; G06F17/27 ; G06F17/21
Abstract:
Methods, systems, and apparatus, including computer program products, for detecting writing systems and languages are disclosed. In one implementation, a method is provided. The method includes receiving text; detecting a first segment of the text, where a substantial amount of the first segment represents a first language; detecting a second segment of the text, where a substantial amount of the second segment represents a second language; identifying scores for each n-gram of size x included in the text; and detecting an edge that identifies a transition from the first language to the second language in the text based on variations of the scores.
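The edge-detection idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: the character-bigram models, the add-one smoothing, the toy training strings, and the helper names `ngrams`, `train`, `score`, and `find_edge` are all hypothetical. The sketch scores each n-gram under two language models and picks the split point where the score differences change sign most cleanly.

```python
from collections import Counter
import math


def ngrams(text, n):
    """Return the overlapping character n-grams of text, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def train(sample, n):
    """Build a toy n-gram count table from a sample string (hypothetical data)."""
    return Counter(ngrams(sample, n))


def score(gram, model, total, vocab):
    """Add-one smoothed log-probability of gram under model (an assumed scoring rule)."""
    return math.log((model[gram] + 1) / (total + vocab))


def find_edge(text, model_a, model_b, n=2):
    """Return the index of the first n-gram attributed to language B.

    Assumes the text starts in language A and switches to B at most once.
    """
    grams = ngrams(text, n)
    vocab = len(set(model_a) | set(model_b) | set(grams))
    total_a = sum(model_a.values())
    total_b = sum(model_b.values())
    # Per-position score difference: positive favors A, negative favors B.
    diffs = [
        score(g, model_a, total_a, vocab) - score(g, model_b, total_b, vocab)
        for g in grams
    ]
    # Pick the split k that maximizes sum(diffs[:k]) - sum(diffs[k:]).
    total = sum(diffs)
    best_k, best_val, prefix = 0, -total, 0.0
    for k, d in enumerate(diffs, start=1):
        prefix += d
        val = 2 * prefix - total
        if val > best_val:  # strict comparison keeps the earliest split on ties
            best_k, best_val = k, val
    return best_k


if __name__ == "__main__":
    # Toy "languages" with disjoint alphabets, purely for demonstration.
    model_a = train("abababab", 2)
    model_b = train("xyxyxyxy", 2)
    print(find_edge("abababxyxyxy", model_a, model_b, n=2))
```

The returned value is an n-gram index rather than an exact character boundary, because an n-gram that straddles the transition belongs cleanly to neither language. A real detector would need trained models and a way to handle more than one transition.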
Published/granted documents
- US20100312545A1 Detecting Writing Systems and Languages, publication date: 2010-12-09