USING MULTIPLE DATA STRUCTURES TO MANAGE DATA IN CACHE
    4.
    Invention application (status: lapsed)

    Publication number: US20080021853A1

    Publication date: 2008-01-24

    Application number: US11459004

    Filing date: 2006-07-20

    IPC classification: G06N3/10

    CPC classification: G06F12/124 G06F12/123

    Abstract: Provided are a method, system and program for using multiple data structures to manage data in cache. A plurality of data structures each have entries identifying data from a first computer readable medium added to a second computer readable medium. A request is received for data in the first computer readable medium. A determination is made as to whether there is an entry for the requested data in one of the data structures. The requested data is retrieved from the first computer readable medium to store in the second computer readable medium in response to determining that there is no entry for the requested data in one of the data structures. One of the data structures is selected in response to determining that there is no entry for the requested data in one of the data structures and an entry for the retrieved data is added to the selected data structure.
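
The request flow in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the class name `MultiListCache`, the use of LRU-ordered dictionaries as the data structures, the random selection of a structure on a miss, and the per-structure eviction are all assumptions that the abstract does not specify.

```python
import random
from collections import OrderedDict


class MultiListCache:
    """Cache whose entries are spread over several data structures (sketch)."""

    def __init__(self, backing_store, num_lists=4, capacity_per_list=2, seed=None):
        # backing_store plays the role of the "first computer readable medium".
        self.backing_store = backing_store
        # Each OrderedDict is one data structure holding cache entries
        # (assumption: LRU order, oldest entry first).
        self.lists = [OrderedDict() for _ in range(num_lists)]
        self.capacity_per_list = capacity_per_list
        self.rng = random.Random(seed)

    def get(self, key):
        # Determine whether one of the data structures has an entry.
        for lst in self.lists:
            if key in lst:
                lst.move_to_end(key)  # mark as most recently used
                return lst[key]
        # No entry anywhere: retrieve the data from the first medium.
        value = self.backing_store[key]
        # Select one data structure (assumption: uniformly at random).
        target = self.rng.choice(self.lists)
        # Assumption: evict the least recently used entry of a full structure.
        if len(target) >= self.capacity_per_list:
            target.popitem(last=False)
        # Add an entry for the retrieved data to the selected structure.
        target[key] = value
        return value
```

With a dictionary such as `{n: n * n for n in range(10)}` as the backing store, the first `cache.get(3)` is a miss that adds one entry to a randomly chosen list, and a second `cache.get(3)` is served from that list without touching the backing store.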

    Clustering hypertext with applications to WEB searching
    5.
    Granted patent (status: in force)

    Publication number: US07233943B2

    Publication date: 2007-06-19

    Application number: US10660242

    Filing date: 2003-09-11

    IPC classification: G06F17/30 G06F7/00

    Abstract: A method of searching a database of documents, wherein the method includes performing a search of the database using a query to produce query result documents; constructing a word dictionary of words within the query result documents; constructing an out-link dictionary of documents within the database that are pointed to by the query result documents; adding the query result documents to the out-link dictionary; constructing an in-link dictionary of documents within the database that point to the query result documents; and adding the query result documents to the in-link dictionary.
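
The three dictionaries named in this abstract can be sketched as below. The function name `build_dictionaries`, the whitespace tokenization, and the representation of the database as two plain mappings (`texts` and `links`) are assumptions made for illustration; the search step itself is left out and its result is passed in as `result_ids`.

```python
def build_dictionaries(result_ids, texts, links):
    """Build word, out-link and in-link dictionaries for query results (sketch).

    texts: document id -> document text (assumed layout)
    links: document id -> set of document ids it points to (assumed layout)
    """
    results = set(result_ids)

    # Word dictionary: words occurring within the query result documents.
    word_dict = {w for d in results for w in texts[d].lower().split()}

    # Out-link dictionary: documents pointed to by the query result
    # documents, plus the query result documents themselves.
    out_link = {target for d in results for target in links.get(d, ())}
    out_link |= results

    # In-link dictionary: documents that point to a query result
    # document, plus the query result documents themselves.
    in_link = {src for src, targets in links.items() if results & set(targets)}
    in_link |= results

    return sorted(word_dict), sorted(out_link), sorted(in_link)
```

For example, if documents `a` and `b` are the query results, `a` points to `c`, `b` points to `a`, and `d` points to `b`, the out-link dictionary is `a, b, c` and the in-link dictionary is `a, b, d`.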

    Concept decomposition using clustering
    6.
    Granted patent (status: lapsed)

    Publication number: US06560597B1

    Publication date: 2003-05-06

    Application number: US09528941

    Filing date: 2000-03-21

    IPC classification: G06F17/30

    Abstract: A system and method operates with a document collection in which documents are represented as normalized document vectors. The document vector space is partitioned into a set of disjoint clusters and a concept vector is computed for each partition, the concept vector comprising the mean vector of all the documents in each partition. Documents are then reassigned to the cluster having their closest concept vector, and a new set of concept vectors for the new partitioning is computed. From an initial partitioning, the concept vectors are iteratively calculated to a stopping threshold value, leaving a concept vector subspace of the document vectors. The documents can then be projected onto the concept vector subspace to be represented as a linear combination of the concept vectors, thereby reducing the dimensionality of the document space. A search query can be received for the content of text documents and a search can then be performed on the projected document vectors to identify text documents that correspond to the search query.
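
The iterative part of this abstract (partition, compute concept vectors, reassign, repeat until a threshold) can be sketched as follows. Several details are assumptions rather than statements from the abstract: closeness is measured by dot product between unit vectors, each mean vector is rescaled to unit length, the initial partitioning is round-robin, and iteration stops when the total similarity improves by no more than `threshold`. The projection onto the concept vector subspace and the search step are not covered.

```python
import math


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def concept_vectors(docs, assignment, k):
    """Mean vector of each partition, rescaled to unit length (assumption)."""
    dim = len(docs[0])
    concepts = []
    for c in range(k):
        members = [d for d, a in zip(docs, assignment) if a == c]
        if not members:
            concepts.append([0.0] * dim)  # empty partition: placeholder vector
            continue
        mean = [sum(col) / len(members) for col in zip(*members)]
        concepts.append(normalize(mean))
    return concepts


def quality(docs, assignment, concepts):
    """Total similarity between documents and their own concept vectors."""
    return sum(dot(d, concepts[a]) for d, a in zip(docs, assignment))


def concept_decomposition(raw_docs, k, threshold=1e-6, max_iter=100):
    docs = [normalize(d) for d in raw_docs]          # normalized document vectors
    assignment = [i % k for i in range(len(docs))]   # assumed initial partitioning
    concepts = concept_vectors(docs, assignment, k)
    prev = quality(docs, assignment, concepts)
    for _ in range(max_iter):
        # Reassign each document to the cluster with the closest concept vector.
        assignment = [max(range(k), key=lambda c: dot(d, concepts[c])) for d in docs]
        # Compute the new set of concept vectors for the new partitioning.
        concepts = concept_vectors(docs, assignment, k)
        cur = quality(docs, assignment, concepts)
        if cur - prev <= threshold:                  # assumed stopping rule
            break
        prev = cur
    return assignment, concepts
```

Calling `concept_decomposition([[1, 0], [0, 1], [0.1, 0.9], [0.9, 0.1]], k=2)` starts from a partitioning that mixes the two directions and ends with the first and fourth documents in one cluster and the second and third in the other.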
