Multi-stage query processing system and method for use with tokenspace repository
    1.
    Granted invention patent
    Legal status: in force

    Publication No.: US08407239B2

    Publication date: 2013-03-26

    Application No.: US10917746

    Filing date: 2004-08-13

    IPC classification: G06F7/00 G06F17/30

    Abstract: A multi-stage query processing system and method enable multi-stage query scoring, including “snippet” generation, through incremental document reconstruction facilitated by a multi-tiered mapping scheme. At one or more stages of a multi-stage query processing system, a set of relevancy scores is used to select a subset of documents for presentation as an ordered list to a user. The set of relevancy scores can be derived in part from one or more sets of relevancy scores determined in prior stages of the multi-stage query processing system. In some embodiments, the multi-stage query processing system is capable of executing one or more passes on a user query, and using information from each pass to expand the user query for use in a subsequent pass to improve the relevancy of documents in the ordered list.
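
    Illustrative sketch (not taken from the patent): one way to read the staged scoring described in this abstract is that each stage blends its own score with the score carried over from earlier stages and passes only the top documents forward. The names multi_stage_rank, term_overlap and phrase_match, the blending weight, and the two toy scorers are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

Scorer = Callable[[str, str], float]  # (query, document text) -> relevancy score


def multi_stage_rank(
    query: str,
    docs: Dict[str, str],
    stages: List[Tuple[Scorer, int, float]],
) -> List[Tuple[str, float]]:
    """Run each stage over the survivors of the previous stage.

    Each stage is (scorer, keep, prior_weight). The stage score is blended
    with the score carried over from earlier stages, and only the top
    `keep` documents move on to the next stage.
    """
    carried = {doc_id: 0.0 for doc_id in docs}
    for scorer, keep, prior_weight in stages:
        blended = {
            doc_id: prior_weight * prior
            + (1.0 - prior_weight) * scorer(query, docs[doc_id])
            for doc_id, prior in carried.items()
        }
        # Highest score first; ties are broken by document id for determinism.
        top = sorted(blended.items(), key=lambda kv: (-kv[1], kv[0]))[:keep]
        carried = dict(top)
    return sorted(carried.items(), key=lambda kv: (-kv[1], kv[0]))


def term_overlap(query: str, text: str) -> float:
    """Cheap early-stage score: fraction of query terms present in the text."""
    terms = query.lower().split()
    words = set(text.lower().split())
    return sum(t in words for t in terms) / len(terms) if terms else 0.0


def phrase_match(query: str, text: str) -> float:
    """Costlier later-stage score: 1.0 if the query appears as a phrase."""
    return 1.0 if query.lower() in text.lower() else 0.0
```

    With stages [(term_overlap, 2, 0.0), (phrase_match, 2, 0.5)], the first stage filters on term overlap alone and the second stage gives the earlier score half of the final weight. Query expansion between passes is not modelled here.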

    System and method of accessing a document efficiently through multi-tier web caching
    2.
    Granted invention patent
    Legal status: in force

    Publication No.: US07587398B1

    Publication date: 2009-09-08

    Application No.: US10882794

    Filing date: 2004-06-30

    IPC classification: G06F17/30

    Abstract: The present invention is directed to a client-server network system implementing a multi-tier caching strategy for a user to access a document efficiently. The system comprises a client cache assistant serving as a proxy for web browsers, a remote cache server managing user-requested documents, and a search engine repository storing a large number of documents as a backup for the remote cache server. Upon receipt of a document request, the client cache assistant examines its client cache to identify the requested document. If that is not successful, the remote cache server identifies a copy of the requested document in its own cache and transmits a content difference between the two copies to the client cache assistant. If the server copy is still not fresh or is not found, the remote cache server seeks another copy of the requested document from the search engine repository and transmits another content difference to the client cache assistant. The client cache assistant merges the content differences and the original copy into a new copy of the requested document.
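
    Illustrative sketch (not taken from the patent): the "content difference" exchange can be pictured as the server sending only the edits needed to turn the client's cached copy into the current copy, with the client merging them. Documents are modelled as lists of lines, the delta format is built on Python's difflib, and the names make_delta, apply_delta and fetch are assumptions. Freshness checks and the second, repository-derived difference are omitted for brevity.

```python
import difflib
from typing import Dict, List, Tuple

# One edit: (tag, start, end, replacement lines), with indexes into the old copy.
Delta = List[Tuple[str, int, int, List[str]]]


def make_delta(old: List[str], new: List[str]) -> Delta:
    """Server side: describe `new` as a list of edits against `old`."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    return [
        (tag, i1, i2, new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_delta(old: List[str], delta: Delta) -> List[str]:
    """Client side: merge a delta into the cached copy."""
    merged: List[str] = []
    cursor = 0
    for _tag, i1, i2, replacement in delta:
        merged.extend(old[cursor:i1])  # unchanged lines before this edit
        merged.extend(replacement)     # inserted or replacing lines
        cursor = i2                    # skip deleted or replaced lines
    merged.extend(old[cursor:])
    return merged


def fetch(
    url: str,
    client_cache: Dict[str, List[str]],
    server_cache: Dict[str, List[str]],
    repository: Dict[str, List[str]],
) -> List[str]:
    """Walk the tiers: client copy, then server cache, then repository."""
    cached = client_cache.get(url, [])
    server_copy = server_cache.get(url)
    if server_copy is None:
        # Server miss: fall back to the repository and remember the copy.
        server_copy = repository[url]
        server_cache[url] = server_copy
    merged = apply_delta(cached, make_delta(cached, server_copy))
    client_cache[url] = merged
    return merged
```

    Only the delta would cross the network in a real deployment; here both halves run in one process so the round trip can be checked end to end.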

    Technique for passive cache compaction using a least recently used cache algorithm
    3.
    Granted invention patent
    Legal status: in force

    Publication No.: US09164922B2

    Publication date: 2015-10-20

    Application No.: US13546452

    Filing date: 2012-07-11

    IPC classification: G06F12/12 H04L29/06 H04L29/08

    Abstract: An example method for passive compaction of a cache includes determining first metadata associated with first data and second metadata associated with second data. The first metadata includes a first retrieval time, and the second metadata includes a second retrieval time. The example method further includes obtaining a first metadata key including a first unique identifier and obtaining a second metadata key including a second unique identifier. The example method also includes generating a first data key and generating a second data key. The example method further includes writing, at a client device, the first and second data to the cache. Each of the first and second data occupies one or more contiguous blocks of physical memory in the cache, and the first and second data are stored in the cache in an order based on the relative values of the first and second retrieval times.
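
    Illustrative sketch (not taken from the patent): one possible reading of storing data "in an order based on the relative values of the retrieval times" is a layout in which the least recently retrieved entries sit at the front of the buffer, so that evicting them frees a contiguous prefix and no separate compaction pass is needed. The Entry type, the "data:<uid>" key format, and the functions layout and evict_oldest are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Index = Dict[str, Tuple[int, int]]  # data key -> (offset, length)


@dataclass
class Entry:
    uid: str               # unique identifier (assumed to come from the metadata key)
    retrieval_time: float  # when the data was last retrieved
    data: bytes


def layout(entries: List[Entry]) -> Tuple[bytes, Index]:
    """Pack entries contiguously, oldest retrieval time first."""
    buffer = bytearray()
    index: Index = {}
    for entry in sorted(entries, key=lambda e: e.retrieval_time):
        index[f"data:{entry.uid}"] = (len(buffer), len(entry.data))
        buffer += entry.data
    return bytes(buffer), index


def evict_oldest(buffer: bytes, index: Index, nbytes: int) -> Tuple[bytes, Index]:
    """Drop entries from the front until at least `nbytes` are freed.

    Because the oldest entries form a prefix of the buffer, the survivors
    stay contiguous and only their offsets need to shift.
    """
    freed = 0
    kept: List[Tuple[str, Tuple[int, int]]] = []
    for key, (offset, length) in sorted(index.items(), key=lambda kv: kv[1][0]):
        if freed < nbytes:
            freed += length  # evict this entry
        else:
            kept.append((key, (offset, length)))
    shifted = {key: (offset - freed, length) for key, (offset, length) in kept}
    return buffer[freed:], shifted
```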

    System and method of accessing a document efficiently through multi-tier web caching
    5.
    Granted invention patent
    Legal status: in force

    Publication No.: US08788475B2

    Publication date: 2014-07-22

    Application No.: US13536701

    Filing date: 2012-06-28

    IPC classification: G06F17/30

    Abstract: Upon receipt of a document request, a client assistant examines its cache for the document. If that is not successful, a server searches for the requested document in its cache. If the server copy is still not fresh or is not found, the server seeks the document from the document's web host. If the host cannot provide the copy, the server seeks it from a document repository. Certain documents are identified from the document repository as being fresh or stable. Information about each of these identified documents is transmitted to the server, which inserts an entry into an index if the index does not already contain an entry for the document. If and when such a document is requested, it will not be present in the server; however, the server will contain an entry directing it to obtain the document from the document repository rather than from the document's web host.
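
    Illustrative sketch (not taken from the patent): the index behaviour in this abstract can be pictured as a map from document URL to the place the server should fetch it from. Entries for documents flagged as fresh or stable are added only when no entry exists yet. The string values "repository" and "web_host" and the function names are assumptions made for the example.

```python
from typing import Dict, Iterable


def register_stable(index: Dict[str, str], stable_urls: Iterable[str]) -> None:
    """Add a 'fetch from repository' entry for each document flagged as
    fresh or stable, unless the index already has an entry for it."""
    for url in stable_urls:
        index.setdefault(url, "repository")


def resolve_source(url: str, server_cache: Dict[str, bytes], index: Dict[str, str]) -> str:
    """Decide where the server should obtain a requested document."""
    if url in server_cache:
        return "cache"
    if index.get(url) == "repository":
        # The document itself is not held by the server, but the index
        # entry redirects the fetch away from the document's web host.
        return "repository"
    return "web_host"
```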

    System and Method of Accessing a Document Efficiently Through Multi-Tier Web Caching
    6.
    Invention patent application
    Legal status: in force

    Publication No.: US20120271852A1

    Publication date: 2012-10-25

    Application No.: US13536701

    Filing date: 2012-06-28

    IPC classification: G06F17/30

    Abstract: Upon receipt of a document request, a client assistant examines its cache for the document. If that is not successful, a server searches for the requested document in its cache. If the server copy is still not fresh or is not found, the server seeks the document from the document's web host. If the host cannot provide the copy, the server seeks it from a document repository. Certain documents are identified from the document repository as being fresh or stable. Information about each of these identified documents is transmitted to the server, which inserts an entry into an index if the index does not already contain an entry for the document. If and when such a document is requested, it will not be present in the server; however, the server will contain an entry directing it to obtain the document from the document repository rather than from the document's web host.

    Automatic identification of related entities
    7.
    Granted invention patent
    Legal status: in force

    Publication No.: US09477758B1

    Publication date: 2016-10-25

    Application No.: US13553731

    Filing date: 2012-07-19

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30864

    Abstract: In one aspect, the present disclosure can be embodied in a method that includes identifying a collection of entities from one or more data sources, calculating a score for subsets of entities from the collection based on one or more seed entities associated with the collection, identifying one or more entities from each of the subsets based on the calculated score, assigning the calculated score to the identified one or more entities from the respective subset, and ranking the one or more entities based on the assigned score, so as to identify entities in the collection that are related to the one or more seed entities.
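
    Illustrative sketch (not taken from the patent): the abstract does not say how a subset is scored, so this example assumes the score is the fraction of seed entities that the subset contains. Each non-seed entity in a subset inherits that subset's score (keeping the best score across subsets), and entities are then ranked by it. The function name rank_related is hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple


def rank_related(subsets: Iterable[Set[str]], seeds: Set[str]) -> List[Tuple[str, float]]:
    """Rank non-seed entities by the best score of any subset they appear in."""
    if not seeds:
        return []
    best: Dict[str, float] = {}
    for subset in subsets:
        # Assumed scoring rule: share of the seed entities found in this subset.
        score = len(subset & seeds) / len(seeds)
        if score == 0.0:
            continue  # a subset with no seed tells us nothing about relatedness
        for entity in subset - seeds:
            best[entity] = max(best.get(entity, 0.0), score)
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
```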

    Technique for Passive Cache Compaction Using A Least Recently Used Cache Algorithm
    8.
    Invention patent application
    Legal status: in force

    Publication No.: US20150169470A1

    Publication date: 2015-06-18

    Application No.: US13546452

    Filing date: 2012-07-11

    IPC classification: G06F12/12 H04L29/08 H04L29/06

    Abstract: An example method for passive compaction of a cache includes determining first metadata associated with first data and second metadata associated with second data. The first metadata includes a first retrieval time, and the second metadata includes a second retrieval time. The example method further includes obtaining a first metadata key including a first unique identifier and obtaining a second metadata key including a second unique identifier. The example method also includes generating a first data key and generating a second data key. The example method further includes writing, at a client device, the first and second data to the cache. Each of the first and second data occupies one or more contiguous blocks of physical memory in the cache, and the first and second data are stored in the cache in an order based on the relative values of the first and second retrieval times.

    System and Method of Accessing a Document Efficiently Through Multi-Tier Web Caching
    9.
    Invention patent application
    Legal status: in force

    Publication No.: US20090037393A1

    Publication date: 2009-02-05

    Application No.: US12251413

    Filing date: 2008-10-14

    IPC classification: G06F7/06 G06F17/30

    Abstract: Upon receipt of a document request, a client assistant examines its cache for the document. If that is not successful, a server searches for the requested document in its cache. If the server copy is still not fresh or is not found, the server seeks the document from the document's web host. If the host cannot provide the copy, the server seeks it from a document repository. Certain documents are identified from the document repository as being fresh or stable. Information about each of these identified documents is transmitted to the server, which inserts an entry into an index if the index does not already contain an entry for the document. If and when such a document is requested, it will not be present in the server; however, the server will contain an entry directing it to obtain the document from the document repository rather than from the document's web host.

    Organizing Data in a Distributed Storage System
    10.
    Invention patent application
    Legal status: in force

    Publication No.: US20130339295A1

    Publication date: 2013-12-19

    Application No.: US13898411

    Filing date: 2013-05-20

    IPC classification: G06F17/30

    Abstract: A distributed storage system is provided. The distributed storage system includes multiple front-end servers and zones for managing data for clients. Data within the distributed storage system is associated with a plurality of accounts and divided into a plurality of groups, each group including a plurality of splits, each split being associated with a respective account, and each group having multiple tablets, each tablet being managed by a respective tablet server of the distributed storage system. Data associated with different accounts may be replicated within the distributed storage system using different data replication policies. Because new splits can be added to the distributed storage system, there is no limit to the amount of data for an account. In response to a client request for a particular account's data, a front-end server communicates the request to a particular zone that has the client-requested data and returns the client-requested data to the requesting client.
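
    Illustrative sketch (not taken from the patent): the account, split, group and zone relationships in this abstract can be pictured as three lookup tables consulted by a front-end server when routing a request. The Directory class, its field names, and the zone-preference rule in route are assumptions made for the example. Tablets and replication policies are not modelled.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Directory:
    """Bookkeeping a front-end server might consult (hypothetical layout)."""
    splits_by_account: Dict[str, List[str]] = field(default_factory=dict)
    group_of_split: Dict[str, str] = field(default_factory=dict)
    zones_of_group: Dict[str, List[str]] = field(default_factory=dict)

    def add_split(self, account: str, split_id: str, group_id: str) -> None:
        """Grow an account by one more split; nothing caps how many it has."""
        self.splits_by_account.setdefault(account, []).append(split_id)
        self.group_of_split[split_id] = group_id

    def zones_for(self, split_id: str) -> List[str]:
        """Zones holding a replica of the group that contains this split."""
        return self.zones_of_group[self.group_of_split[split_id]]


def route(directory: Directory, account: str, split_id: str, preferred_zone: str) -> str:
    """Pick a zone that holds the requested split of the given account."""
    if split_id not in directory.splits_by_account.get(account, []):
        raise KeyError(f"{split_id} does not belong to account {account}")
    zones = directory.zones_for(split_id)
    # Prefer the caller's zone when it has the data; otherwise take the first replica.
    return preferred_zone if preferred_zone in zones else zones[0]
```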
