System and methods for detecting images distracting to a user
    21.
    Type: Granted invention patent
    Status: In force

    Publication No.: US07877382B1

    Publication date: 2011-01-25

    Application No.: US11027020

    Filing date: 2004-12-31

    IPC classification: G06F7/00

    CPC classification: G06F17/3028 G06F17/30867

    Abstract: Methods and apparatus for detecting distracting search engine results are described. In one embodiment, the method includes monitoring the behavior of a user with respect to a group of images that are related in some manner to a query, and using the monitored behavior to calculate the distractiveness of a particular image. The method also includes adding to a group of images related to a query a set of images that are unrelated to the query and monitoring the behavior of a user with respect to all the images.

    Document compression system and method for use with tokenspace repository
    22.
    Type: Granted invention patent
    Status: In force

    Publication No.: US07917480B2

    Publication date: 2011-03-29

    Application No.: US10917739

    Filing date: 2004-08-13

    IPC classification: G06F7/00 G06F17/00 G06F15/18

    Abstract: The disclosed embodiments enable multi-stage query scoring, including “snippet” generation, through incremental document reconstruction facilitated by a multi-tiered mapping scheme. The mapping scheme includes a first mapping between unique tokens contained in a set of documents and unique global token identifiers (e.g., 32-bit integers) contained in a global-lexicon (i.e., dictionary). The mapping scheme also includes a second mapping between the global token identifiers and a set of fixed-length local token identifiers (e.g., 8-bit integers) contained in one or more mini-lexicons (i.e., sub-dictionaries). Each mini-lexicon is associated with a range of token positions in the tokenized documents. The first and second mappings are used to encode/decode documents into local token identifiers having fixed widths which can be compactly stored in the tokenspace repository. The use of fixed-length local token identifiers allows for fast and efficient decoding of tokenized documents.

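The two-tier mapping described in this abstract can be illustrated with a small sketch. This is not the patented implementation: the function names, the `range_size` parameter, and the choice to build one mini-lexicon per fixed-size block of token positions are assumptions made for illustration. The abstract only states that tokens map to global identifiers, and that global identifiers map to fixed-length local identifiers held in mini-lexicons tied to ranges of token positions.

```python
from typing import Dict, List, Tuple

MAX_LOCAL_IDS = 256  # an 8-bit local identifier can take 256 distinct values


def build_global_lexicon(tokens: List[str]) -> Dict[str, int]:
    """First mapping: each unique token gets a global token identifier."""
    lexicon: Dict[str, int] = {}
    for tok in tokens:
        lexicon.setdefault(tok, len(lexicon))
    return lexicon


def encode(tokens: List[str], lexicon: Dict[str, int],
           range_size: int = 4) -> Tuple[List[List[int]], bytes]:
    """Second mapping: per position range, global ID -> one-byte local ID.

    Returns the mini-lexicons (one list of global IDs per position range,
    indexed by local ID) and the encoded token stream, one byte per token.
    """
    mini_lexicons: List[List[int]] = []
    encoded = bytearray()
    for start in range(0, len(tokens), range_size):
        mini: List[int] = []        # local ID -> global ID
        index: Dict[int, int] = {}  # global ID -> local ID
        for tok in tokens[start:start + range_size]:
            gid = lexicon[tok]
            if gid not in index:
                if len(mini) >= MAX_LOCAL_IDS:
                    raise ValueError("position range needs more than 256 local IDs")
                index[gid] = len(mini)
                mini.append(gid)
            encoded.append(index[gid])
        mini_lexicons.append(mini)
    return mini_lexicons, bytes(encoded)


def decode(encoded: bytes, mini_lexicons: List[List[int]],
           lexicon: Dict[str, int], range_size: int = 4) -> List[str]:
    """Reverse both mappings: local ID -> global ID -> token."""
    id_to_token = {gid: tok for tok, gid in lexicon.items()}
    return [id_to_token[mini_lexicons[pos // range_size][local]]
            for pos, local in enumerate(encoded)]


tokens = "the cat sat on the mat the end".split()
lexicon = build_global_lexicon(tokens)
minis, stream = encode(tokens, lexicon)
restored = decode(stream, minis, lexicon)
```

Because every local identifier occupies exactly one byte, the token at position `p` is always at byte `p` of the stream, so a portion of a document can be decoded without scanning from the start.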

    Generating Content Snippets Using a Tokenspace Repository
    23.
    Type: Invention patent application
    Status: In force

    Publication No.: US20130212076A1

    Publication date: 2013-08-15

    Application No.: US13685581

    Filing date: 2012-11-26

    IPC classification: G06F17/30

    Abstract: A search engine server system receives from a client system a search query and identifies a set of documents in accordance with the search query. A content snippet corresponding to content in a respective document of the identified set of documents is generated, the content snippet associated with at least one query term of the one or more query terms in the search query. A response to the search query is returned to the client system, the response including information identifying at least the respective document and including the content snippet. Generating the content snippet includes performing a first decompression operation on first token identifiers, from a compressed document repository, to provide a set of second token identifiers, and performing a second decompression operation on the set of second token identifiers to recover uncompressed content comprising a portion of the respective document.

    Document compression system and method for use with tokenspace repository
    24.
    Type: Invention patent application
    Status: In force

    Publication No.: US20070220023A1

    Publication date: 2007-09-20

    Application No.: US10917739

    Filing date: 2004-08-13

    IPC classification: G06F7/00

    Abstract: The disclosed embodiments enable multi-stage query scoring, including “snippet” generation, through incremental document reconstruction facilitated by a multi-tiered mapping scheme. The mapping scheme includes a first mapping between unique tokens contained in a set of documents and unique global token identifiers (e.g., 32-bit integers) contained in a global-lexicon (i.e., dictionary). The mapping scheme also includes a second mapping between the global token identifiers and a set of fixed-length local token identifiers (e.g., 8-bit integers) contained in one or more mini-lexicons (i.e., sub-dictionaries). Each mini-lexicon is associated with a range of token positions in the tokenized documents. The first and second mappings are used to encode/decode documents into local token identifiers having fixed widths which can be compactly stored in the tokenspace repository. The use of fixed-length local token identifiers allows for fast and efficient decoding of tokenized documents.

    System and method for encoding and decoding variable-length data
    25.
    Type: Granted invention patent
    Status: In force

    Publication No.: US07068192B1

    Publication date: 2006-06-27

    Application No.: US10917745

    Filing date: 2004-08-13

    IPC classification: H03M7/40

    CPC classification: H03M7/40

    Abstract: A system and method for encoding and decoding variable-length data includes storing data values in a data structure including a data field and a tag field. The data field includes one or more variable-length data subfields capable of storing variable-length data (e.g., 1 to N bytes of data). In some embodiments, the data subfields and the tag field of the data structure each start on a byte boundary which simplifies decoding. The tag field includes one or more tag subfields, each corresponding to the one or more data subfields. Each tag subfield includes one or more tag bits which indicate the length of the data stored in the corresponding data subfield. Unpacking or decompressing data values from the data structure can be achieved by using a look-up table of offsets and masks, thus reducing the number of bit operations needed to unpack data values from the data structure.

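A tag-plus-data layout of this kind can be sketched as follows. The concrete parameters are assumptions chosen for illustration and are not taken from the abstract: four values per group, a one-byte tag field made of four 2-bit tag subfields (each storing the byte length minus one), values limited to 32 bits, and little-endian data subfields. The function names are likewise hypothetical.

```python
from typing import List, Tuple

GROUP = 4  # values per group (assumed); four 2-bit tags fill one tag byte


def _byte_len(value: int) -> int:
    """Number of bytes (1 to 4) needed to store a 32-bit unsigned value."""
    if not 0 <= value < 1 << 32:
        raise ValueError("value must fit in 32 bits")
    return max(1, (value.bit_length() + 7) // 8)


def encode_group(values: List[int]) -> bytes:
    """Pack four values as one tag byte followed by 4 to 16 data bytes."""
    if len(values) != GROUP:
        raise ValueError("expected exactly four values")
    tag = 0
    data = bytearray()
    for i, value in enumerate(values):
        n = _byte_len(value)
        tag |= (n - 1) << (2 * i)          # tag subfield i holds length - 1
        data += value.to_bytes(n, "little")
    return bytes([tag]) + bytes(data)


def _build_table() -> List[Tuple[Tuple[Tuple[int, int], ...], int]]:
    """Look-up table: tag byte -> ((offset, mask) per subfield, data length)."""
    table = []
    for tag in range(256):
        entries = []
        offset = 0
        for i in range(GROUP):
            n = ((tag >> (2 * i)) & 0b11) + 1
            entries.append((offset, (1 << (8 * n)) - 1))
            offset += n
        table.append((tuple(entries), offset))
    return table


TABLE = _build_table()


def decode_group(buf: bytes, pos: int = 0) -> Tuple[List[int], int]:
    """Unpack one group starting at buf[pos]; return values and next position."""
    entries, data_len = TABLE[buf[pos]]
    data = buf[pos + 1:pos + 1 + data_len]
    # Read up to four bytes at each offset, then mask off bytes that
    # belong to the following subfields.
    values = [int.from_bytes(data[off:off + 4], "little") & mask
              for off, mask in entries]
    return values, pos + 1 + data_len


packed = encode_group([1, 300, 70000, 2**32 - 1])
unpacked, next_pos = decode_group(packed)
```

The decoder never inspects individual tag bits at decode time: one table look-up on the tag byte yields every offset and mask for the group, which is the reduction in bit operations the abstract refers to.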

    SYSTEMS AND METHODS FOR GENERATING STATISTICS FROM SEARCH ENGINE QUERY LOGS
    26.
    Type: Invention patent application
    Status: In force

    Publication No.: US20110040733A1

    Publication date: 2011-02-17

    Application No.: US11746049

    Filing date: 2007-05-08

    IPC classification: G06F7/00

    Abstract: A computer-implemented method includes calculating first statistics about a user-identified event within a first subset of a database of events; selecting a second subset of the database of events based on said first statistics; calculating second statistics about the user-identified event within the second subset of the database of events; merging the first and second statistics as statistics of the user-identified event within the entire database of events; and generating a result including at least a portion of the merged statistics of the user-identified event.

    Systems and methods for generating statistics from search engine query logs
    27.
    Type: Granted invention patent
    Status: In force

    Publication No.: US09262767B2

    Publication date: 2016-02-16

    Application No.: US13396511

    Filing date: 2012-02-14

    IPC classification: G06F7/00 G06F17/30 G06Q30/02

    Abstract: A computer-implemented method includes calculating first statistics about a user-identified event within a first subset of a database of events; selecting a second subset of the database of events based on said first statistics; calculating second statistics about the user-identified event within the second subset of the database of events; merging the first and second statistics as statistics of the user-identified event within the entire database of events; and generating a result including at least a portion of the merged statistics of the user-identified event.

    Multi-Stage Query Processing System and Method for Use with Tokenspace Repository
    28.
    Type: Invention patent application
    Status: In force

    Publication No.: US20130212092A1

    Publication date: 2013-08-15

    Application No.: US13851036

    Filing date: 2013-03-26

    IPC classification: G06F17/30

    Abstract: A multi-stage query processing system and method enables multi-stage query scoring, including “snippet” generation, through incremental document reconstruction facilitated by a multi-tiered mapping scheme. At one or more stages of a multi-stage query processing system a set of relevancy scores are used to select a subset of documents for presentation as an ordered list to a user. The set of relevancy scores can be derived in part from one or more sets of relevancy scores determined in prior stages of the multi-stage query processing system. In some embodiments, the multi-stage query processing system is capable of executing one or more passes on a user query, and using information from each pass to expand the user query for use in a subsequent pass to improve the relevancy of documents in the ordered list.

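The staged scoring idea can be sketched in a few lines. Everything specific here is an assumption for illustration: the two scoring functions (term counts, then adjacent-pair matches), the `keep` cutoff, and the `weight` used to carry the earlier score forward are hypothetical and not drawn from the abstract. Query expansion between passes is not shown.

```python
from typing import Dict, List, Tuple


def stage1_score(terms: List[str], doc_tokens: List[str]) -> float:
    """Cheap first-stage score: total occurrences of the query terms."""
    return float(sum(doc_tokens.count(t) for t in terms))


def stage2_score(terms: List[str], doc_tokens: List[str]) -> float:
    """Costlier second-stage score: adjacent query-term pairs found in order."""
    pairs = set(zip(terms, terms[1:]))
    return float(sum(1 for p in zip(doc_tokens, doc_tokens[1:]) if p in pairs))


def multi_stage_search(query: str, docs: Dict[str, List[str]],
                       keep: int = 2, weight: float = 0.5) -> List[Tuple[str, float]]:
    """Score all documents cheaply, then rescore only the top `keep`."""
    terms = query.lower().split()
    first = {doc_id: stage1_score(terms, toks) for doc_id, toks in docs.items()}
    # Stage 1 selects the subset of documents that reaches stage 2.
    survivors = sorted(first, key=lambda d: (-first[d], d))[:keep]
    # The final score is derived in part from the stage-1 score.
    final = {d: weight * first[d] + stage2_score(terms, docs[d]) for d in survivors}
    return sorted(final.items(), key=lambda kv: (-kv[1], kv[0]))


docs = {
    "a": "token space repository".split(),
    "b": "space token token".split(),
    "c": "unrelated text".split(),
}
ranked = multi_stage_search("token space", docs)
```

The point of the structure is cost control: the expensive scorer runs only on documents that survived the cheap one, while the earlier score still contributes to the final ordering.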

    Multi-stage query processing system and method for use with tokenspace repository
    29.
    Type: Granted invention patent
    Status: In force

    Publication No.: US08407239B2

    Publication date: 2013-03-26

    Application No.: US10917746

    Filing date: 2004-08-13

    IPC classification: G06F7/00 G06F17/30

    Abstract: A multi-stage query processing system and method enables multi-stage query scoring, including “snippet” generation, through incremental document reconstruction facilitated by a multi-tiered mapping scheme. At one or more stages of a multi-stage query processing system a set of relevancy scores are used to select a subset of documents for presentation as an ordered list to a user. The set of relevancy scores can be derived in part from one or more sets of relevancy scores determined in prior stages of the multi-stage query processing system. In some embodiments, the multi-stage query processing system is capable of executing one or more passes on a user query, and using information from each pass to expand the user query for use in a subsequent pass to improve the relevancy of documents in the ordered list.

    Systems and methods for generating statistics from search engine query logs
    30.
    Type: Granted invention patent
    Status: In force

    Publication No.: US08126874B2

    Publication date: 2012-02-28

    Application No.: US11746049

    Filing date: 2007-05-08

    IPC classification: G06F7/00 G06F17/30

    Abstract: A computer-implemented method includes calculating first statistics about a user-identified event within a first subset of a database of events; selecting a second subset of the database of events based on said first statistics; calculating second statistics about the user-identified event within the second subset of the database of events; merging the first and second statistics as statistics of the user-identified event within the entire database of events; and generating a result including at least a portion of the merged statistics of the user-identified event.
