PROVISION OF QUERY SUGGESTIONS INDEPENDENT OF QUERY LOGS
    1.
    Patent application
    PROVISION OF QUERY SUGGESTIONS INDEPENDENT OF QUERY LOGS (legal status: in force)

    Publication number: US20130151533A1

    Publication date: 2013-06-13

    Application number: US13314166

    Application date: 2011-12-07

    IPC classification: G06F17/30

    CPC classification: G06F17/30646

    Abstract: Described herein are various technologies pertaining to provision of query suggestions to a user independent of a query log. Key phrases are automatically identified in documents of a document corpus, and a forward index and an inverted index are generated. The forward index indexes key phrases by documents, and the inverted index indexes documents by key phrases. A query is received from a user, and documents relevant to the query are retrieved. Key phrases in the retrieved documents are identified via the forward index, and a subset of the key phrases is selected as query suggestions by determining coverage of the key phrases as identified in the inverted index.
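Illustrative sketch (not part of the patent record): the abstract above describes a forward index, an inverted index, and a coverage-based selection of key phrases. The Python below is one minimal way to read that pipeline. The greedy "cover the most uncovered documents" rule, the function names `build_indexes` and `suggest`, and the toy corpus are all assumptions made for illustration; the abstract does not say how coverage is computed, and retrieval and key-phrase extraction are taken as given.

```python
from collections import defaultdict


def build_indexes(doc_phrases):
    """Build both indexes from a mapping of doc id -> key phrases."""
    # Forward index: document -> key phrases found in it.
    forward = {doc: set(phrases) for doc, phrases in doc_phrases.items()}
    # Inverted index: key phrase -> documents that contain it.
    inverted = defaultdict(set)
    for doc, phrases in forward.items():
        for phrase in phrases:
            inverted[phrase].add(doc)
    return forward, dict(inverted)


def suggest(retrieved_docs, forward, inverted, max_suggestions=2):
    """Greedily pick phrases that cover the most not-yet-covered retrieved docs.

    The greedy rule is an assumption of this sketch, not the patent's method.
    """
    retrieved = set(retrieved_docs)
    # Candidate phrases come from the forward index of the retrieved documents.
    candidates = set()
    for doc in retrieved:
        candidates |= forward.get(doc, set())
    uncovered = set(retrieved)
    suggestions = []
    while uncovered and candidates and len(suggestions) < max_suggestions:
        # sorted() makes tie-breaking deterministic (alphabetical).
        best = max(sorted(candidates), key=lambda p: len(inverted[p] & uncovered))
        newly_covered = inverted[best] & uncovered
        if not newly_covered:
            break
        suggestions.append(best)
        uncovered -= newly_covered
        candidates.discard(best)
    return suggestions


# Hypothetical corpus: key phrases are assumed to be extracted already.
corpus = {
    "d1": ["jaguar car", "engine"],
    "d2": ["jaguar car", "dealer"],
    "d3": ["jaguar cat", "habitat"],
    "d4": ["jaguar cat", "rainforest"],
}
forward, inverted = build_indexes(corpus)
result = suggest(["d1", "d2", "d3", "d4"], forward, inverted)
print(result)
```

In this toy corpus the two selected phrases split the retrieved documents into the two senses of "jaguar", which is the kind of suggestion the abstract aims for without consulting any query log.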

    Provision of query suggestions independent of query logs
    2.
    Granted patent
    Provision of query suggestions independent of query logs (legal status: in force)

    Publication number: US08566340B2

    Publication date: 2013-10-22

    Application number: US13314166

    Application date: 2011-12-07

    IPC classification: G06F17/30

    CPC classification: G06F17/30646

    Abstract: Described herein are various technologies pertaining to provision of query suggestions to a user independent of a query log. Key phrases are automatically identified in documents of a document corpus, and a forward index and an inverted index are generated. The forward index indexes key phrases by documents, and the inverted index indexes documents by key phrases. A query is received from a user, and documents relevant to the query are retrieved. Key phrases in the retrieved documents are identified via the forward index, and a subset of the key phrases is selected as query suggestions by determining coverage of the key phrases as identified in the inverted index.

    Semantic object characterization and search
    3.
    Granted patent
    Semantic object characterization and search (legal status: in force)

    Publication number: US08543598B2

    Publication date: 2013-09-24

    Application number: US12715174

    Application date: 2010-03-01

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/3061 G06F17/3069

    Abstract: Semantic object characterization and its use in indexing and searching a database directory are presented. In general, a first binary hash code is generated to represent a first representation or view of a semantic object. When this code is compared with a second binary hash code that characterizes a second representation or view of the same semantic object, the two codes exhibit a degree of similarity indicating that the objects are the same. In one implementation, the semantic objects correspond to people's names, and the first and second representations or views correspond to two different languages. Thus, a user can search a database of information in one language with a search query in another language.
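Illustrative sketch (not part of the patent record): the abstract above compares two binary hash codes and treats a high degree of similarity as evidence that both describe the same object. The Python below shows only that comparison step. Using normalized Hamming similarity, the 8-bit code length, the threshold, and the directory contents are assumptions of this sketch; how the hash codes are generated from names in different languages is not described in the abstract and is not modeled here.

```python
def hamming_similarity(code_a: int, code_b: int, bits: int) -> float:
    """Fraction of bit positions on which two fixed-width codes agree."""
    mask = (1 << bits) - 1
    differing = bin((code_a ^ code_b) & mask).count("1")
    return 1.0 - differing / bits


def search(query_code: int, directory: dict, bits: int, threshold: float = 0.75):
    """Return (name, similarity) pairs at or above the threshold, best first."""
    scored = [
        (name, hamming_similarity(query_code, code, bits))
        for name, code in directory.items()
    ]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical precomputed codes for directory entries in one language.
directory = {"Rashid": 0b10110010, "Maria": 0b01001101}
# Hypothetical code computed from a query written in another language.
query_code = 0b10110011
hits = search(query_code, directory, bits=8)
print(hits)
```

Because the comparison is a bitwise XOR plus a population count, a directory of such codes can be scanned quickly, which is one plausible reason to characterize names as compact binary codes.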

    Enriched search features based in part on discovering people-centric search intent
    4.
    Granted patent
    Enriched search features based in part on discovering people-centric search intent (legal status: in force)

    Publication number: US08510322B2

    Publication date: 2013-08-13

    Application number: US13162620

    Application date: 2011-06-17

    IPC classification: G06F17/30

    CPC classification: G06F17/30867

    Abstract: A search environment of an embodiment includes name mining and matching features used in part to identify people-centric queries and provide an enriched search experience, but is not so limited. A method of an embodiment operates to provide an expanded query based in part on a geometric similarity measure, an edit distance measure, a string similarity measure, and a cumulative similarity measure. A search system of an embodiment includes a mined candidate generator component and a name matcher component used in part to identify name queries and provide an expanded query that includes original query terms and one or more valid mined names. Other embodiments are also disclosed.
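Illustrative sketch (not part of the patent record): the abstract above expands a name query with mined names that pass several similarity measures. The Python below combines an edit distance measure with a generic string similarity measure into a single cumulative score. The equal-weight average, the 0.8 threshold, and the use of `difflib.SequenceMatcher` as the string similarity are assumptions of this sketch, and the geometric similarity measure mentioned in the abstract is left out because the abstract does not define it.

```python
from difflib import SequenceMatcher


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def cumulative_similarity(a: str, b: str) -> float:
    """Average of an edit-distance similarity and a string similarity."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b)) or 1
    edit_sim = 1.0 - edit_distance(a, b) / longest
    string_sim = SequenceMatcher(None, a, b).ratio()
    return (edit_sim + string_sim) / 2


def expand_query(query: str, mined_names, threshold: float = 0.8):
    """Return the original query plus mined names that match closely enough."""
    matches = [
        name for name in mined_names
        if name.lower() != query.lower()
        and cumulative_similarity(query, name) >= threshold
    ]
    return [query] + matches


# Hypothetical mined name list.
expanded = expand_query("jon smith", ["John Smith", "Jane Doe"])
print(expanded)
```

The expanded query keeps the user's original terms first, matching the abstract's description of an expanded query that includes the original query terms and one or more valid mined names.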

    IDENTIFYING TOPICALLY-RELATED PHRASES IN A BROWSING SEQUENCE
    5.
    Patent application
    IDENTIFYING TOPICALLY-RELATED PHRASES IN A BROWSING SEQUENCE (legal status: in force)

    Publication number: US20120053927A1

    Publication date: 2012-03-01

    Application number: US12873660

    Application date: 2010-09-01

    IPC classification: G06F17/27

    CPC classification: G06F17/30884

    Abstract: Browsing sequence phrase identification technique embodiments are presented that generally extract topically-related phrases from the pages visited by a user in a browsing session. The topically-related phrases can be used for a variety of purposes, including aiding a user in re-finding previously visited sites. This phrase identification task is performed by considering not just the pages of a user's browsing sequence individually, but also pages visited immediately before and immediately after each page. In this way, phrases found in a page can be analyzed in the context in which the page was viewed, rather than in isolation. The identified phrases are further filtered by picking those that appear on a pre-populated topic list, and then clustering to find the most informative ones.
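Illustrative sketch (not part of the patent record): the abstract above analyzes each page's phrases in the context of the pages visited immediately before and after it, then filters against a topic list. The Python below is a deliberately simple reading of that idea. The rule "keep a phrase only if it also occurs on an adjacent page" is an assumption of this sketch, candidate phrases are assumed to be extracted already, and the final clustering step is omitted.

```python
def topical_phrases(pages, topic_list):
    """For each page, keep phrases that recur on a neighboring page
    and that appear on the pre-populated topic list.

    pages: list of sets of candidate phrases, in visit order.
    """
    results = []
    for i, phrases in enumerate(pages):
        # Context = phrases on the pages visited just before and just after.
        context = set()
        if i > 0:
            context |= pages[i - 1]
        if i + 1 < len(pages):
            context |= pages[i + 1]
        kept = sorted(p for p in phrases if p in context and p in topic_list)
        results.append(kept)
    return results


# Hypothetical browsing session with three pages.
session = [
    {"hiking boots", "sale"},
    {"hiking boots", "trail map", "login"},
    {"trail map", "weather"},
]
topics = {"hiking boots", "trail map", "weather"}
per_page = topical_phrases(session, topics)
print(per_page)
```

Note that "weather" is on the topic list but is dropped here because it appears on only one page of the session, which illustrates the difference between analyzing a page in context and in isolation.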

    SEMANTIC OBJECT CHARACTERIZATION AND SEARCH
    6.
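    Patent application
    SEMANTIC OBJECT CHARACTERIZATION AND SEARCH (legal status: in force)

    Publication number: US20110213784A1

    Publication date: 2011-09-01

    Application number: US12715174

    Application date: 2010-03-01

    IPC classification: G06F17/30

    CPC classification: G06F17/3061 G06F17/3069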

    Abstract: Semantic object characterization and its use in indexing and searching a database directory are presented. In general, a first binary hash code is generated to represent a first representation or view of a semantic object. When this code is compared with a second binary hash code that characterizes a second representation or view of the same semantic object, the two codes exhibit a degree of similarity indicating that the objects are the same. In one implementation, the semantic objects correspond to people's names, and the first and second representations or views correspond to two different languages. Thus, a user can search a database of information in one language with a search query in another language.

    MINING TRANSLITERATIONS FOR OUT-OF-VOCABULARY QUERY TERMS
    7.
    Patent application
    MINING TRANSLITERATIONS FOR OUT-OF-VOCABULARY QUERY TERMS (legal status: in force)

    Publication number: US20100185670A1

    Publication date: 2010-07-22

    Application number: US12350981

    Application date: 2009-01-09

    IPC classification: G06F17/30 G06F17/20

    Abstract: An approach is described for using a query expressed in a source language to retrieve information expressed in a target language. The approach uses a translation dictionary to convert terms in the query from the source language to appropriate terms in the target language. The approach determines viable transliterations for out-of-vocabulary (OOV) query terms by retrieving a body of information based on an in-vocabulary component of the query, and then mining the body of information to identify the viable transliterations for the OOV query terms. The approach then adds the viable transliterations to the translation dictionary. The retrieval, mining, and adding operations can be repeated one or more times.
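Illustrative sketch (not part of the patent record): the abstract above describes a loop of retrieving with the in-vocabulary part of a query, mining the results for transliterations of the OOV terms, and adding them to the dictionary. The Python below mimics that loop on a toy scale. The character-level `similarity` function (built on `difflib.SequenceMatcher`), the 0.7 threshold, the romanized source terms, and the word-overlap retrieval are stand-ins assumed for illustration; a real system would need a proper cross-script transliteration model and a real retrieval engine.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in transliteration score: plain character-sequence similarity."""
    return SequenceMatcher(None, a, b).ratio()


def mine_transliterations(query_terms, dictionary, corpus, rounds=2, threshold=0.7):
    """Repeat retrieve -> mine -> add, returning the enlarged dictionary."""
    dictionary = dict(dictionary)  # do not mutate the caller's dictionary
    for _ in range(rounds):
        known = {dictionary[t] for t in query_terms if t in dictionary}
        oov = [t for t in query_terms if t not in dictionary]
        if not oov or not known:
            break
        # Retrieve target-language documents using the in-vocabulary terms.
        retrieved = [doc for doc in corpus if known & set(doc.split())]
        # Candidate transliterations: words in the retrieved documents.
        words = sorted({w for doc in retrieved for w in doc.split()} - known)
        if not words:
            break
        for term in oov:
            best = max(words, key=lambda w: similarity(term, w))
            if similarity(term, best) >= threshold:
                dictionary[term] = best
    return dictionary


# Hypothetical romanized source-language query and tiny target corpus.
query = ["vigyan", "ainstain"]
seed_dictionary = {"vigyan": "science"}
target_corpus = ["einstein changed modern science", "cooking with rice"]
mined = mine_transliterations(query, seed_dictionary, target_corpus)
print(mined)
```

Entries added in one round become in-vocabulary terms for the next round, which is why the abstract notes that the retrieval, mining, and adding operations can be repeated.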

    KEYWORD EXTRACTION FROM UNIFORM RESOURCE LOCATORS (URLS)
    8.
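    Patent application
    KEYWORD EXTRACTION FROM UNIFORM RESOURCE LOCATORS (URLS) (legal status: under examination, published)

    Publication number: US20120239667A1

    Publication date: 2012-09-20

    Application number: US13048678

    Application date: 2011-03-15

    IPC classification: G06F17/30

    CPC classification: G06F16/958 G06F16/955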

    Abstract: The keyword extraction technique described herein extracts keywords from Uniform Resource Locators (URLs) in web logs. The technique leverages the content and the structure of URLs to extract relevant keywords. First, a URL is divided into multiple components based on its structure. A set of keywords is extracted from each component of the URL independently with the help of a controlled vocabulary. Then a second set of keywords is generated by forming combinations of terms from different segments of the URL. Only those combinations that are present in the controlled vocabulary are retained as keywords. Finally, the keywords are scored with a function that takes into account a wide set of features.
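Illustrative sketch (not part of the patent record): the abstract above splits a URL into components, extracts vocabulary keywords per component, and then forms cross-component combinations that are kept only if they are in the controlled vocabulary. The Python below follows those steps using the standard `urllib.parse.urlsplit`. The tokenization rule, the restriction to two-term combinations, the example URL, and the vocabulary are assumptions of this sketch, and the final scoring function is omitted because the abstract does not list its features.

```python
import re
from itertools import permutations
from urllib.parse import urlsplit


def url_components(url: str):
    """Split a URL into host, individual path segments, and query string."""
    parts = urlsplit(url)
    components = [parts.netloc]
    components += [segment for segment in parts.path.split("/") if segment]
    if parts.query:
        components.append(parts.query)
    return components


def tokens(component: str):
    """Lowercase alphanumeric terms found in one URL component."""
    return [t for t in re.split(r"[^a-z0-9]+", component.lower()) if t]


def extract_keywords(url: str, vocabulary):
    """Return (single-component keywords, cross-component combinations)."""
    per_component = [tokens(c) for c in url_components(url)]
    # First set: terms from each component that are in the vocabulary.
    single = {t for comp in per_component for t in comp if t in vocabulary}
    # Second set: ordered pairs of terms taken from two different components,
    # kept only when the resulting phrase is in the vocabulary.
    combined = set()
    for i, j in permutations(range(len(per_component)), 2):
        for a in per_component[i]:
            for b in per_component[j]:
                phrase = f"{a} {b}"
                if phrase in vocabulary:
                    combined.add(phrase)
    return sorted(single), sorted(combined)


# Hypothetical URL and controlled vocabulary.
vocabulary = {"cameras", "canon", "black", "digital cameras"}
url = "http://www.example.com/cameras/digital/canon-eos.html?color=black"
single, combined = extract_keywords(url, vocabulary)
print(single, combined)
```

The combination step recovers "digital cameras" even though the two words sit in separate path segments in reverse order, which is the case that per-component extraction alone would miss.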

    Context dependent keyword suggestion for advertising
    9.
    Granted patent
    Context dependent keyword suggestion for advertising (legal status: in force)

    Publication number: US08700599B2

    Publication date: 2014-04-15

    Application number: US13300678

    Application date: 2011-11-21

    IPC classification: G06F17/30

    CPC classification: G06F17/30663 G06Q30/0251

    Abstract: Various technologies described herein pertain to suggesting context dependent keywords for advertising. A set of seed queries can be identified from a context, where the context is a source keyword, a search query, a category, or a landing page. Moreover, the set of seed queries can be inputted to a search engine. A predetermined number of web pages returned by the search engine upon executing the set of seed queries can be retrieved. Candidate keywords can be extracted from the web pages returned by the search engine. Further, keywords can be selected from the candidate keywords based on their relevance scores.
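Illustrative sketch (not part of the patent record): the abstract above feeds seed queries to a search engine, retrieves a fixed number of result pages, extracts candidate keywords, and selects keywords by relevance score. The Python below wires those stages together. The `search` callable, the in-memory `fake_index`, single-word candidates, and the relevance score (the fraction of retrieved pages containing the candidate) are all assumptions of this sketch; the abstract does not define the score or the extraction method.

```python
from collections import Counter


def suggest_keywords(seed_queries, search, pages_per_query=2, top_k=3,
                     stopwords=frozenset()):
    """Return up to top_k (keyword, score) pairs, best first."""
    # Retrieve a predetermined number of pages for every seed query.
    pages = []
    for query in seed_queries:
        pages.extend(search(query)[:pages_per_query])
    # Count, for each candidate word, how many pages contain it.
    page_counts = Counter()
    for page in pages:
        page_counts.update({w for w in page.lower().split() if w not in stopwords})
    # Words already in the seed queries are not useful as suggestions.
    seed_words = {w for query in seed_queries for w in query.lower().split()}
    scored = [
        (word, count / len(pages))
        for word, count in page_counts.items()
        if word not in seed_words
    ]
    # Highest score first; alphabetical order breaks ties deterministically.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]


# Hypothetical stand-in for a search engine: query -> ranked page texts.
fake_index = {
    "running shoes": [
        "trail running shoes with cushioning",
        "lightweight running shoes cushioning review",
        "shoes",
    ],
    "marathon": ["marathon training plan cushioning"],
}
suggestions = suggest_keywords(
    ["running shoes", "marathon"],
    search=lambda q: fake_index.get(q, []),
    stopwords=frozenset({"with"}),
)
print(suggestions)
```

Passing the search engine in as a callable keeps the sketch independent of any particular engine or API, so the same selection logic can be tried against different retrieval back ends.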

    Identifying topically-related phrases in a browsing sequence
    10.
    Granted patent
    Identifying topically-related phrases in a browsing sequence (legal status: in force)

    Publication number: US08655648B2

    Publication date: 2014-02-18

    Application number: US12873660

    Application date: 2010-09-01

    IPC classification: G06F17/27

    CPC classification: G06F17/30884

    Abstract: Browsing sequence phrase identification technique embodiments are presented that generally extract topically-related phrases from the pages visited by a user in a browsing session. The topically-related phrases can be used for a variety of purposes, including aiding a user in re-finding previously visited sites. This phrase identification task is performed by considering not just the pages of a user's browsing sequence individually, but also pages visited immediately before and immediately after each page. In this way, phrases found in a page can be analyzed in the context in which the page was viewed, rather than in isolation. The identified phrases are further filtered by picking those that appear on a pre-populated topic list, and then clustering to find the most informative ones.