Granted invention patent
- Patent title: Pair character string retrieval system
- Patent title (Chinese): 配对字符串检索系统
- Application number: US13264037
- Filing date: 2010-04-05
- Publication number: US08788522B2
- Publication date: 2014-07-22
- Inventor: Kouichi Kimura
- Applicant: Kouichi Kimura
- Applicant address: JP Tokyo
- Patentee: Hitachi, Ltd.
- Current patentee: Hitachi, Ltd.
- Current patentee address: JP Tokyo
- Agency: Mattingly & Malur, PC
- Priority: JP2009-097270 20090413
- International application: PCT/JP2010/056146 WO 20100405
- International publication: WO2010/119783 WO 20101021
- Main classification: G06F7/00
- IPC classification: G06F7/00 ; G06F19/22 ; G06F17/30
Abstract:
A data structure for index information that enables high-speed retrieval of pair character strings on a computer is provided. A method that uses this index information to retrieve, at high speed, pair character strings appearing in close proximity to each other in a document is also provided. The bits of a suffix array of the reference document data are rearranged to create index information, LSA, that is localizable, that is, usable as an index for a subregion of the document. Using this index, a process of dichotomizing a region is repeated, with the entire document designated as the initial region, so that the position information for a query character string in the reference document data is progressively refined. The distance between the pair is evaluated and candidates are narrowed down. Finally, the positions where the pair character strings occur in close proximity to each other are identified.
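The overall task described in the abstract (locate two query strings through a suffix-array index, then keep only the occurrences that lie close together) can be illustrated with a minimal sketch. This is not the patented LSA structure or its region-dichotomizing procedure: it uses a plain, naively built suffix array with ordinary binary search. The function names, the `max_gap` parameter, and the definition of distance as the absolute difference between start positions are all assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right


def build_suffix_array(text):
    # Naive construction by sorting all suffixes; adequate for a small example.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, query):
    # Binary search for the contiguous block of suffixes that start with `query`.
    m = len(query)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose m-character prefix is >= query
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < query:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    hi = len(sa)
    while lo < hi:  # first suffix whose m-character prefix is > query
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= query:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])  # start positions in document order


def find_close_pairs(text, first, second, max_gap):
    # Assumed proximity rule: start positions differ by at most `max_gap`.
    sa = build_suffix_array(text)
    pos_a = find_occurrences(text, sa, first)
    pos_b = find_occurrences(text, sa, second)
    pairs = []
    for a in pos_a:
        lo = bisect_left(pos_b, a - max_gap)
        hi = bisect_right(pos_b, a + max_gap)
        pairs.extend((a, b) for b in pos_b[lo:hi])
    return pairs


if __name__ == "__main__":
    doc = "abcxxabcyyyyxyz"
    print(find_close_pairs(doc, "abc", "xyz", 7))
```

Unlike this sketch, which resolves every occurrence of both strings before comparing distances, the abstract describes refining positions gradually and discarding distant candidates along the way.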
Published/granted documents
- US20120041977A1 PAIR CHARACTER STRING RETRIEVAL SYSTEM Publication/grant date: 2012-02-16