Publication number: WO2015078273A1
Publication date: 2015-06-04
Application number: PCT/CN2014/090370
Filing date: 2014-11-05
Inventor: LIU, Xiaojun
IPC: G06F17/30
CPC classification number: G06F17/30622, G06F17/30675, G06F17/30864
Abstract: Methods and apparatuses for search are provided and related to the field of search technology. A method may include: performing term segmentation for grabbed documents to count a term frequency of each term, the term frequency of the term representing a number of the grabbed documents containing the term; generating a high frequency term inverted index and a low frequency term inverted index respectively, wherein the high frequency term inverted index contains terms having a term frequency higher than a predefined threshold, and the low frequency term inverted index contains terms having a term frequency not higher than the predefined threshold; and loading the high frequency term inverted index and the low frequency term inverted index respectively to different retrieval modules, the different retrieval modules respectively corresponding to mutually independent storage devices.
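The indexing flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the whitespace tokenizer stands in for term segmentation, and the names `segment`, `build_split_indexes`, `RetrievalModule`, and `search`, along with the sample documents and the threshold of 2, are all hypothetical. Following the abstract, a term's "term frequency" is counted as the number of documents containing it, and each retrieval module here is just an in-memory object rather than a separate storage device.

```python
from collections import defaultdict


def segment(text):
    """Stand-in for term segmentation: lowercase whitespace split (assumption)."""
    return text.lower().split()


def build_split_indexes(docs, threshold):
    """Build a high-frequency and a low-frequency inverted index.

    docs maps a document id to its text. A term's frequency is the number
    of documents that contain it, as described in the abstract.
    """
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        # set() so that a term is counted once per document
        for term in set(segment(text)):
            postings[term].add(doc_id)

    high, low = {}, {}
    for term, doc_ids in postings.items():
        # Strictly higher than the threshold goes to the high-frequency index;
        # everything else (not higher) goes to the low-frequency index.
        target = high if len(doc_ids) > threshold else low
        target[term] = sorted(doc_ids)
    return high, low


class RetrievalModule:
    """Hypothetical retrieval module holding one inverted index in memory."""

    def __init__(self, index):
        self.index = index

    def lookup(self, term):
        return self.index.get(term, [])


def search(term, modules):
    """Query every module and merge the posting lists."""
    results = set()
    for module in modules:
        results.update(module.lookup(term.lower()))
    return sorted(results)


docs = {
    1: "the cat sat",
    2: "the dog ran",
    3: "the cat ran fast",
}
high, low = build_split_indexes(docs, threshold=2)

# Each index is loaded into its own module; in a real deployment these
# would sit on mutually independent storage devices.
modules = [RetrievalModule(high), RetrievalModule(low)]
the_hits = search("the", modules)
cat_hits = search("cat", modules)
```

With these sample documents, only "the" appears in more than two documents, so it alone lands in the high-frequency index, while "cat" (two documents) stays in the low-frequency index because its count is not higher than the threshold.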