Locating meaningful stopwords or stop-phrases in keyword-based retrieval systems

    Publication No.: US09817920B1

    Publication date: 2017-11-14

    Application No.: US14628692

    Filing date: 2015-02-23

    Applicant: GOOGLE INC.

    Abstract: A stopword detection component detects stopwords (also stop-phrases) in search queries input to keyword-based information retrieval systems. Potential stopwords are initially identified by comparing the terms in the search query to a list of known stopwords. Context data is then retrieved based on the search query and the identified stopwords. In one implementation, the context data includes documents retrieved from a document index. In another implementation, the context data includes categories relevant to the search query. Sets of retrieved context data are compared to one another to determine if they are substantially similar. If the sets of context data are substantially similar, this fact may be used to infer that the removal of the potential stopword(s) is not material to the search. If the sets of context data are not substantially similar, the potential stopword can be considered material to the search and should not be removed from the query.
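The comparison described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the stopword list, the toy document index, the use of Jaccard overlap as the "substantially similar" test, and the 0.8 threshold are all assumptions made for illustration, as are the names `material_stopwords` and `toy_retrieve`.

```python
from typing import Callable, Iterable, List, Set

# Hypothetical stopword list; the abstract only says "a list of known stopwords".
KNOWN_STOPWORDS = {"the", "a", "an", "of", "in", "to", "who"}

# Toy document index standing in for the "document index" in the abstract.
TOY_INDEX = {
    "d1": "the who live at leeds album",
    "d2": "who discovered penicillin",
    "d3": "the who band tour dates",
    "d4": "live music tour dates in leeds",
}


def toy_retrieve(terms: List[str]) -> List[str]:
    """Return IDs of documents that contain every query term (assumed retrieval model)."""
    return [
        doc_id
        for doc_id, text in TOY_INDEX.items()
        if all(term in text.split() for term in terms)
    ]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Set overlap in [0, 1]; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def material_stopwords(
    query: str,
    retrieve: Callable[[List[str]], Iterable[str]],
    threshold: float = 0.8,  # assumed cut-off for "substantially similar"
) -> List[str]:
    """Return the potential stopwords whose removal changes the context data."""
    terms = query.lower().split()
    baseline = set(retrieve(terms))
    material = []
    # dict.fromkeys removes repeated terms while keeping query order.
    for term in dict.fromkeys(terms):
        if term not in KNOWN_STOPWORDS:
            continue  # only potential stopwords are tested
        reduced = [t for t in terms if t != term]
        context = set(retrieve(reduced))
        # Dissimilar context sets suggest the term is material to the search.
        if jaccard(baseline, context) < threshold:
            material.append(term)
    return material


print(material_stopwords("the who", toy_retrieve))             # ['the']
print(material_stopwords("tour dates in leeds", toy_retrieve))  # []
```

In the first query, dropping "the" pulls in an unrelated document, so the term is kept as material. In the second, dropping "in" leaves the retrieved set unchanged, so it could be removed safely under these assumptions.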

    12. Providing result-based query suggestions
    Granted invention patent (in force)

    Publication No.: US09563692B1

    Publication date: 2017-02-07

    Application No.: US14696020

    Filing date: 2015-04-24

    Applicant: Google Inc.

    Abstract: In general, one aspect of the subject matter described can be embodied in a method that includes, obtaining a plurality of search results responsive to an initial search query, the search results including a first search result that identifies a first resource; determining, using a document-to-query-to-document model, that the first resource is relevant to a first suggested query different from the initial search query; generating a presentation of the search results responsive to the initial search query; and providing the presentation of the search results in response to the initial search query. Each search result in the presentation includes a link to a respective resource, wherein the first search result in the presentation includes a link that, upon a selection by a user, can cause the first suggested query to be submitted to a search engine.

    13. System and method for providing search query refinements
    Granted invention patent (in force)

    Publication No.: US09552388B2

    Publication date: 2017-01-24

    Application No.: US14169879

    Filing date: 2014-01-31

    Applicant: Google Inc.

    Abstract: A system and method for providing search query refinements are presented. A stored query and a stored document are associated as a logical pairing. A weight is assigned to the logical pairing. The search query is issued and a set of search documents is produced. At least one search document is matched to at least one stored document. The stored query and the assigned weight associated with the matching at least one stored document are retrieved. At least one cluster is formed based on the stored query and the assigned weight associated with the matching at least one stored document. The stored query associated with the matching at least one stored document are scored for the at least one cluster relative to at least one other cluster. At least one such scored search query is suggested as a set of query refinements.

    15. Using synthetic descriptive text to rank search results
    Granted invention patent (in force)

    Publication No.: US09208233B1

    Publication date: 2015-12-08

    Application No.: US13731957

    Filing date: 2012-12-31

    Applicant: Google Inc.

    CPC classification number: G06F17/30864

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using synthetic descriptive text to rank search results. One of the methods includes receiving a search query from a user device; receiving data identifying a plurality of search result resources and respective initial scores for each of the search result resources; determining, from a search engine index, that a particular search result resource of the plurality of search result resources is associated with one or more pieces of synthetic descriptive text, wherein each piece of synthetic descriptive text is generated by applying a respective template to a respective linking resource that links to the particular search result resource; computing a synthetic descriptive text score for the particular search result resource; and adjusting the initial score for the particular search result resource based at least in part on the synthetic descriptive text score.

    16. Propagating information among web pages
    Granted invention patent (in force)

    Publication No.: US08990210B2

    Publication date: 2015-03-24

    Application No.: US13968339

    Filing date: 2013-08-15

    Applicant: Google Inc.

    CPC classification number: G06F17/30321 G06F17/30864 G06F17/3089

    Abstract: Web pages of a Website may be processed to improve search results. For example, information likely to pertain to more than just the Web page it is directly associated with may be identified. One or more other, related, Web pages that such information is likely to pertain to is also identified. The identified information is associated with the identified other Web page(s) and this association is saved in a way to affect a search result score of the Web page(s).

    17. Locating meaningful stopwords or stop-phrases in keyword-based retrieval systems
    Granted invention patent (in force)

    Publication No.: US08965919B1

    Publication date: 2015-02-24

    Application No.: US14143161

    Filing date: 2013-12-30

    Applicant: Google Inc.

    Abstract: A stopword detection component detects stopwords (also stop-phrases) in search queries input to keyword-based information retrieval systems. Potential stopwords are initially identified by comparing the terms in the search query to a list of known stopwords. Context data is then retrieved based on the search query and the identified stopwords. In one implementation, the context data includes documents retrieved from a document index. In another implementation, the context data includes categories relevant to the search query. Sets of retrieved context data are compared to one another to determine if they are substantially similar. If the sets of context data are substantially similar, this fact may be used to infer that the removal of the potential stopword(s) is not material to the search. If the sets of context data are not substantially similar, the potential stopword can be considered material to the search and should not be removed from the query.
