System and method to enable parallel text search using in-charge index ranges
    21.
    Granted patent
    System and method to enable parallel text search using in-charge index ranges (legal status: lapsed)

    Publication No.: US07689545B2

    Publication date: 2010-03-30

    Application No.: US11185733

    Filing date: 2005-07-21

    IPC classification: G06F7/00

    Abstract: When a document to be searched is registered, two tables are created: a document identifier management table, which manages, for each page, the range of document identifiers stored on that page together with the page's identifier, and an individual-search-server search range management table, which manages the range of document identifiers that each search server is in charge of. When a search server performs a search, it refers to the search range management table to acquire its allocated range of document identifiers. For each index key forming a query term specified as a query condition, it refers to the document identifier management table to acquire the identifiers of the pages that store document identifiers in the allocated range. The search is then carried out by referring to the pages indicated by the acquired page identifiers.
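
The two-table lookup described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the table contents, the names `PageEntry`, `DOC_ID_TABLE`, `SERVER_RANGES`, and `pages_for_server`, and the choice to key the document identifier management table by index key are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageEntry:
    page_id: int
    first_doc_id: int  # smallest document identifier stored on the page
    last_doc_id: int   # largest document identifier stored on the page


# Hypothetical document identifier management table: index key -> pages.
DOC_ID_TABLE = {
    "search": [PageEntry(1, 1, 100), PageEntry(2, 101, 250), PageEntry(3, 251, 400)],
    "index": [PageEntry(7, 1, 180), PageEntry(8, 181, 400)],
}

# Hypothetical search range management table:
# search server -> (first, last) document identifier it is in charge of.
SERVER_RANGES = {"server-a": (1, 200), "server-b": (201, 400)}


def pages_for_server(server, index_keys):
    """Return, per index key, the page IDs this server has to read."""
    low, high = SERVER_RANGES[server]
    result = {}
    for key in index_keys:
        # Keep only pages whose identifier range overlaps the server's range.
        result[key] = [
            page.page_id
            for page in DOC_ID_TABLE.get(key, [])
            if page.first_doc_id <= high and page.last_doc_id >= low
        ]
    return result


print(pages_for_server("server-a", ["search", "index"]))
print(pages_for_server("server-b", ["search", "index"]))
```

Because each server reads only the pages that overlap its own identifier range, several servers can process the same query in parallel without scanning each other's pages.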

    Information retrieving system
    22.
    Patent application
    Information retrieving system (legal status: lapsed)

    Publication No.: US20070100873A1

    Publication date: 2007-05-03

    Application No.: US11344835

    Filing date: 2006-01-31

    IPC classification: G06F7/00

    Abstract: Disclosed is a technology for changing the nodes in a computer-based information retrieving system. When information items are registered by allocating them to n nodes, index information is extracted as a set of pairs of index keys and addresses of the information items, the index information is divided into m (m>n) buckets, and a partial inverted file that is closed within each bucket is produced. Here, m and n are integers of 1 or above. When the allocation of search-targeted ranges to the nodes is altered, the allocation of buckets to each node is changed, and the partial inverted file of each bucket is merged with the inverted file of the existing indexes to produce new indexes, so that indexes can be produced and updated at high speed.
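
The bucket scheme in this abstract can be sketched as follows, under stated assumptions: the modulo bucketing rule and the names `build_bucket_indexes` and `merge_into` are hypothetical, and in-memory dictionaries stand in for inverted files.

```python
from collections import defaultdict


def build_bucket_indexes(items, m):
    """Split (address, index keys) items into m buckets and build one
    partial inverted file (index key -> addresses) per bucket."""
    buckets = [defaultdict(list) for _ in range(m)]
    for address, keys in items:
        bucket = buckets[address % m]  # assumed bucketing rule
        for key in keys:
            bucket[key].append(address)
    return buckets


def merge_into(existing_index, partial_files):
    """Merge partial inverted files into a copy of an existing inverted file."""
    merged = {key: list(addresses) for key, addresses in existing_index.items()}
    for partial in partial_files:
        for key, addresses in partial.items():
            merged.setdefault(key, []).extend(addresses)
    for addresses in merged.values():
        addresses.sort()
    return merged


items = [(0, ["apple"]), (1, ["apple", "pear"]), (2, ["pear"]), (3, ["apple"])]
buckets = build_bucket_indexes(items, m=4)  # m = 4 buckets for n = 2 nodes

# Initial allocation: node 0 holds buckets 0 and 1, node 1 holds buckets 2 and 3.
node1_index = merge_into({}, [buckets[2], buckets[3]])

# Reallocation: bucket 1 moves to node 1. Only that bucket's partial file is merged.
node1_index = merge_into(node1_index, [buckets[1]])
print(node1_index)
```

Since every partial inverted file is self-contained, moving a bucket only requires merging that bucket's file into the receiving node's index; the original items do not have to be re-indexed.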

    LOG MANAGEMENT COMPUTER AND LOG MANAGEMENT METHOD
    23.
    Patent application
    LOG MANAGEMENT COMPUTER AND LOG MANAGEMENT METHOD (legal status: under examination, published)

    Publication No.: US20140317137A1

    Publication date: 2014-10-23

    Application No.: US14355139

    Filing date: 2012-03-12

    IPC classification: G06F17/30

    Abstract: The purpose of the invention is to provide a log management computer that shortens log search time while reducing log storage volume. The log management computer manages a log, which is an operation record, acquired from the log generating system that generates it. The log management computer is characterized by: extracting, from a log message contained in the log, both a common portion that it shares with another log message and a different portion that differs from another log message; storing the extracted common portion in common portion information of a storage area; storing the extracted different portion in different portion information of the storage area; and, when a search request containing a search condition is received, searching for a log message that matches the search condition.
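
One way to picture the common/different split described in this abstract is the sketch below. It is an illustration under assumptions, not the claimed method: messages are compared token by token against one other message, the names `split_portions` and `LogStore` and the `<*>` placeholder are hypothetical, and messages are assumed not to contain the placeholder token themselves.

```python
PLACEHOLDER = "<*>"  # marks a position whose value differs between messages


def split_portions(message, other):
    """Return (common portion, different portion) of message relative to other."""
    a, b = message.split(), other.split()
    if len(a) != len(b):
        # Assumption: messages with different token counts share no template.
        return tuple(a), ()
    common = tuple(x if x == y else PLACEHOLDER for x, y in zip(a, b))
    different = tuple(x for x, y in zip(a, b) if x != y)
    return common, different


class LogStore:
    def __init__(self):
        self.common_portions = []     # each distinct common portion is stored once
        self.different_portions = []  # (index into common_portions, values)

    def add(self, message, other):
        common, different = split_portions(message, other)
        if common not in self.common_portions:
            self.common_portions.append(common)
        self.different_portions.append((self.common_portions.index(common), different))

    def _rebuild(self, common_id, different):
        # Re-insert the stored values at the placeholder positions.
        values = iter(different)
        return " ".join(next(values) if token == PLACEHOLDER else token
                        for token in self.common_portions[common_id])

    def search(self, condition):
        messages = (self._rebuild(c, d) for c, d in self.different_portions)
        return [message for message in messages if condition in message]


store = LogStore()
first = "user alice logged in from 10.0.0.1"
second = "user bob logged in from 10.0.0.2"
store.add(first, second)
store.add(second, first)
print(store.common_portions)
print(store.search("bob"))
```

Both messages share one stored common portion, so only the short per-message values are stored twice, which is the storage saving the abstract refers to.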

    Method, program and apparatus for document retrieval system
    24.
    Patent application
    Method, program and apparatus for document retrieval system (legal status: in force)

    Publication No.: US20070192274A1

    Publication date: 2007-08-16

    Application No.: US11625983

    Filing date: 2007-01-23

    IPC classification: G06F17/30

    Abstract: The present invention realizes high-speed retrieval in a document retrieval system that refers to partial data of documents containing structured data, such as XML documents and e-mail, without requiring additional memory. The invention includes storage means for storing documents to be retrieved on a disk device, calculation means for calculating an allocated capacity of the memory, and storage means for saving in the memory, up to the calculated allocated capacity, partial data of the documents stored on the disk device. The invention also includes first retrieval means for retrieving the partial data stored in the memory, determining means for determining, based on the result of the first retrieval, whether to retrieve the documents stored on the disk device, and second retrieval means for retrieving the documents stored on the disk device based on the result of that determination.
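
The two-stage retrieval in this abstract can be sketched as below. The abstract does not say what the partial data is or how the determination is made, so this example assumes the partial data is a prefix of each document, that the memory capacity is divided evenly among documents, and that the disk is consulted only when a truncated prefix does not already contain the term. The class name `TwoStageRetriever` is hypothetical.

```python
class TwoStageRetriever:
    def __init__(self, disk_documents, memory_capacity):
        # doc ID -> full text; a dictionary stands in for the disk device.
        self.disk = disk_documents
        # Assumed allocation rule: split the memory capacity evenly per document.
        per_doc = memory_capacity // max(len(disk_documents), 1)
        # Memory holds (prefix, whether the prefix is the whole document).
        self.memory = {
            doc_id: (text[:per_doc], len(text) <= per_doc)
            for doc_id, text in disk_documents.items()
        }

    def search(self, term):
        """Return (matching doc IDs, number of disk reads performed)."""
        hits, disk_reads = [], 0
        for doc_id, (partial, complete) in self.memory.items():
            if term in partial:
                hits.append(doc_id)          # first retrieval decided it
            elif not complete:
                disk_reads += 1              # second retrieval on the disk copy
                if term in self.disk[doc_id]:
                    hits.append(doc_id)
        return hits, disk_reads


documents = {1: "alpha beta", 2: "gamma delta epsilon zeta", 3: "eta"}
retriever = TwoStageRetriever(documents, memory_capacity=30)
print(retriever.search("gamma"))
print(retriever.search("zeta"))
```

Queries that the in-memory partial data can answer never touch the disk; only documents whose prefix is truncated and does not match fall through to the second retrieval.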

    Method and system for retrieving a document
    25.
    Patent application
    Method and system for retrieving a document (legal status: lapsed)

    Publication No.: US20060101004A1

    Publication date: 2006-05-11

    Application No.: US11185733

    Filing date: 2005-07-21

    IPC classification: G06F17/30

    Abstract: When a document to be searched is registered, two tables are created: a document identifier management table, which manages, for each page, the range of document identifiers stored on that page together with the page's identifier, and an individual-search-server search range management table, which manages the range of document identifiers that each search server is in charge of. When a search server performs a search, it refers to the search range management table to acquire its allocated range of document identifiers. For each index key forming a query term specified as a query condition, it refers to the document identifier management table to acquire the identifiers of the pages that store document identifiers in the allocated range. The search is then carried out by referring to the pages indicated by the acquired page identifiers.

    Information retrieving system
    26.
    Granted patent
    Information retrieving system (legal status: lapsed)

    Publication No.: US07558802B2

    Publication date: 2009-07-07

    Application No.: US11344835

    Filing date: 2006-01-31

    IPC classification: G06F17/30

    Abstract: Disclosed is a technology for changing the nodes in a computer-based information retrieving system. When information items are registered by allocating them to n nodes, index information is extracted as a set of pairs of index keys and addresses of the information items, the index information is divided into m (m>n) buckets, and a partial inverted file that is closed within each bucket is produced. Here, m and n are integers of 1 or above. When the allocation of search-targeted ranges to the nodes is altered, the allocation of buckets to each node is changed, and the partial inverted file of each bucket is merged with the inverted file of the existing indexes to produce new indexes, so that indexes can be produced and updated at high speed.

    Retrieval apparatus, retrieval method and retrieval program
    27.
    Patent application
    Retrieval apparatus, retrieval method and retrieval program (legal status: lapsed)

    Publication No.: US20080154882A1

    Publication date: 2008-06-26

    Application No.: US11788478

    Filing date: 2007-04-20

    IPC classification: G06F17/30

    CPC classification: G06F17/30011

    Abstract: A retrieval apparatus 100 for searching document data comprises a document storage area 141 for storing documents to be searched and a document management table 142 for storing the data size of each document in association with a document ID that identifies the document. The retrieval apparatus 100 reads out from the document management table the data sizes of the documents to be searched, calculates a retrieval document size by adding up the read-out data sizes, and, based on the retrieval document size, calculates an estimated time t1 for a retrieval process using the index scan method and an estimated time t2 for the retrieval process using the text scan method. The retrieval apparatus 100 then compares the estimated times t1 and t2 and decides whether to use the index scan method or the text scan method for the retrieval process.
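
The method selection in this abstract can be sketched as below. The abstract does not give the formulas for t1 and t2, so the linear cost model, its constants, the table contents, and the name `choose_method` are all assumptions chosen for illustration.

```python
# Hypothetical document management table: document ID -> data size in bytes.
DOCUMENT_MANAGEMENT_TABLE = {"d1": 2_000, "d2": 50_000, "d3": 1_000_000}

# Assumed cost model: the index scan pays a fixed lookup overhead plus a small
# per-byte cost, while the text scan pays only a larger per-byte cost.
INDEX_FIXED_COST_S = 0.05
INDEX_COST_PER_BYTE_S = 1e-8
TEXT_SCAN_COST_PER_BYTE_S = 1e-6


def choose_method(doc_ids):
    """Return (chosen method, t1, t2) for the documents to be searched."""
    retrieval_size = sum(DOCUMENT_MANAGEMENT_TABLE[d] for d in doc_ids)
    t1 = INDEX_FIXED_COST_S + INDEX_COST_PER_BYTE_S * retrieval_size  # index scan
    t2 = TEXT_SCAN_COST_PER_BYTE_S * retrieval_size                   # text scan
    method = "index scan" if t1 <= t2 else "text scan"
    return method, t1, t2


print(choose_method(["d1"]))
print(choose_method(["d2", "d3"]))
```

Under this assumed model, a small retrieval document size favors the text scan because the index overhead dominates, while a large one favors the index scan.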
