Granted invention patent
- Patent title: Linguistic based determination of text location origin
- Patent title (Chinese): 基于语言的文本位置来源的确定
- Application number: US15132969
- Filing date: 2016-04-19
- Publication number: US09514125B1
- Publication date: 2016-12-06
- Inventors: Corville O. Allen, Roberto DeLima, Andrew R. Freed, Robert L. Nielsen
- Applicant: International Business Machines Corporation
- Applicant address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current assignee: International Business Machines Corporation
- Current assignee address: US NY Armonk
- Agent: Nathan M. Rau
- Main classification: G06F17/20
- IPC classification: G06F17/20; G06F17/27; G06F17/30
Abstract:
A method includes receiving a text and identifying a set of linguistic characteristics contained in the text, where linguistic characteristics include grammatical, syntactic, and idiomatic features of the text. The method also includes determining a plurality of locations of origin in which the text was potentially written based on the set of linguistic characteristics. The method also includes retrieving a set of reference documents for each location of origin in the plurality of locations of origin and producing a set of proximity scores by performing a set of proximity checks using the set of linguistic characteristics, the set of reference documents, and the text, wherein the proximity checks analyze how often and how close linguistic characteristics are to one another. The method also includes ranking the plurality of locations of origin based on the set of proximity scores and returning a set of one or more ranked locations of origin.
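The steps named in the abstract (identify characteristics, determine candidate locations, retrieve reference documents, run proximity checks, rank) can be illustrated with a minimal Python sketch. It is not the patented implementation. The marker vocabulary `MARKERS`, the tiny corpus `REFERENCE_DOCS`, the 10-token window, and the inverse-distance scoring are all assumptions made for illustration; the abstract does not specify how characteristics are extracted or how proximity is scored.

```python
import re
from itertools import combinations

# Hypothetical marker terms per location. A real system would draw on much
# richer grammatical, syntactic, and idiomatic features.
MARKERS = {
    "UK": {"colour", "lorry", "queue", "whilst"},
    "US": {"color", "truck", "line", "while"},
}

# Hypothetical reference documents for each location of origin.
REFERENCE_DOCS = {
    "UK": ["The lorry joined the queue whilst the colour of the sky changed."],
    "US": ["The truck waited in line while the color of the sky changed."],
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def identify_characteristics(text):
    """Return the known marker terms that occur in the text."""
    all_markers = set().union(*MARKERS.values())
    return set(tokenize(text)) & all_markers


def candidate_locations(characteristics):
    """Return locations sharing at least one marker with the text."""
    return [loc for loc, markers in MARKERS.items() if markers & characteristics]


def proximity_score(characteristics, doc, window=10):
    """Score how often and how closely the characteristics co-occur in doc.

    Each pair of occurrences within `window` tokens adds 1/distance, so
    frequent and tightly clustered characteristics score higher.
    """
    tokens = tokenize(doc)
    positions = [i for i, tok in enumerate(tokens) if tok in characteristics]
    score = 0.0
    for i, j in combinations(positions, 2):
        distance = j - i  # always >= 1 because positions are increasing
        if distance <= window:
            score += 1.0 / distance
    return score


def rank_locations(text):
    """Return (location, score) pairs sorted from most to least likely."""
    characteristics = identify_characteristics(text)
    scores = {
        loc: sum(proximity_score(characteristics, d) for d in REFERENCE_DOCS[loc])
        for loc in candidate_locations(characteristics)
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Under these assumptions, calling `rank_locations("The lorry stood in the queue while its colour faded.")` yields both "UK" and "US" as candidates and places "UK" first, because three of the text's characteristics cluster together in the UK reference document while only one appears in the US document.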