USER QUERY MINING FOR ADVERTISING MATCHING
    11.
    Type: Patent application
    Status: In force

    Publication No.: US20090063461A1

    Publication date: 2009-03-05

    Application No.: US11849136

    Filing date: 2007-08-31

    IPC classification: G06F7/06 G06F17/30

    Abstract: Systems and methods to determine relevant keywords from a user's search query sessions are disclosed. The described method includes identifying search session logs of a user and segmenting the search session logs into one or more search sessions. After the segmentation, the search sessions are analyzed to compose a list of semantically relevant keyword sets including at least a first keyword set and a second keyword set. The described method further includes determining a semantic relevance between the first and second keyword sets according to the frequency at which the first and second keyword sets are reported in the query results, and displaying one or more highly semantically relevant keyword sets after filtering by a threshold.
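The segmentation and relevance steps in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the 30-minute gap rule, the function names `segment_sessions` and `related_pairs`, and the use of same-session co-occurrence counts as a stand-in for "frequency reported in the query results" are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

def segment_sessions(log, max_gap=1800):
    """Split (timestamp_seconds, query) records into sessions at gaps > max_gap."""
    sessions, current, last_ts = [], [], None
    for ts, query in sorted(log):
        if last_ts is not None and ts - last_ts > max_gap:
            sessions.append(current)
            current = []
        current.append(query)
        last_ts = ts
    if current:
        sessions.append(current)
    return sessions

def related_pairs(sessions, threshold=2):
    """Count how often two distinct queries share a session; keep pairs >= threshold."""
    counts = Counter()
    for session in sessions:
        for a, b in combinations(sorted(set(session)), 2):
            counts[(a, b)] += 1
    return {pair: n for pair, n in counts.items() if n >= threshold}

# Hypothetical log: two sessions contain both travel queries, one does not.
log = [
    (0, "cheap flights"), (60, "hotel deals"),
    (10_000, "cheap flights"), (10_100, "hotel deals"), (10_200, "car rental"),
    (50_000, "car rental"),
]
sessions = segment_sessions(log)
pairs = related_pairs(sessions)
```

Only the pair that co-occurs in at least `threshold` sessions survives the filter; a production system would likely use a richer relevance measure than raw counts.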

    IDENTIFICATION OF EVENTS OF SEARCH QUERIES
    12.
    Type: Patent application
    Status: In force

    Publication No.: US20090006294A1

    Publication date: 2009-01-01

    Application No.: US11770423

    Filing date: 2007-06-28

    IPC classification: G06N5/00

    CPC classification: G06F17/30864 G06Q30/02

    Abstract: Techniques for analyzing and modeling the frequency of queries are provided by a query analysis system. The query analysis system analyzes frequencies of a query over time to determine whether the query is time-dependent or time-independent. The query analysis system forecasts the frequency of time-dependent queries based on their periodicities. The query analysis system forecasts the frequency of time-independent queries based on causal relationships with other queries. To forecast the frequency of time-independent queries, the query analysis system analyzes the frequency of a query over time to identify significant increases in the frequency, which are referred to as “query events” or “events.” The query analysis system forecasts frequencies of time-independent queries based on queries with events that tend to causally precede events of the query to be forecasted.
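As a rough illustration of "identifying significant increases in the frequency", the sketch below flags a day as a query event when its count exceeds the trailing-window mean by several standard deviations. The abstract does not state the detection rule; the window size, the multiplier `k`, the floor of 1.0 on the deviation, and the name `find_events` are assumptions.

```python
from statistics import mean, pstdev

def find_events(counts, window=7, k=3.0):
    """Return indices whose count exceeds the trailing-window mean by k deviations."""
    events = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu, sigma = mean(history), pstdev(history)
        # Floor sigma at 1.0 so a perfectly flat history does not flag tiny changes.
        if counts[i] > mu + k * max(sigma, 1.0):
            events.append(i)
    return events

# Hypothetical daily query counts with one spike on day 7.
daily = [10, 12, 11, 9, 10, 11, 10, 48, 12, 10]
events = find_events(daily)
```

Event indices found this way for several queries could then be compared to see which queries' events tend to precede others, which is the causal step the abstract describes.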

    CATEGORIZATION OF DOCUMENTS USING PART-OF-SPEECH SMOOTHING
    13.
    Type: Patent application
    Status: Under examination (published)

    Publication No.: US20080249762A1

    Publication date: 2008-10-09

    Application No.: US11697112

    Filing date: 2007-04-05

    IPC classification: G06F17/20

    CPC classification: G06F17/2785

    Abstract: A method and system are provided for classifying documents based on the subjectivity of the content of the documents using a part-of-speech analysis to help account for unseen words. A classification system trains a classifier using the parts of speech of training documents so that the classifier can classify unseen words based on the part of speech of the unseen word. The classification system then trains a part-of-speech model using the parts of speech of the n-grams of training data and labels of the training documents, and trains a term model using the term unigrams and labels. To classify a target document, the classification system applies the part-of-speech model to the part-of-speech n-grams of the target document and the term model to term n-grams of the target document.
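The idea of falling back on part of speech for unseen words can be sketched as a small naive-Bayes-style classifier. This is a simplification, not the method in the application: it uses unigrams only, backs off to the part-of-speech model only when a word is unseen (rather than combining both models), and relies on a hypothetical toy lexicon `POS` in place of a real tagger.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy lexicon standing in for a real part-of-speech tagger.
POS = {
    "i": "PRON", "this": "PRON", "the": "DET", "a": "DET",
    "love": "VERB", "hate": "VERB", "has": "VERB", "contains": "VERB",
    "awesome": "ADJ", "terrible": "ADJ", "horrible": "ADJ",
}

def tag(word):
    """Look up a tag, defaulting to NOUN for words outside the toy lexicon."""
    return POS.get(word, "NOUN")

def train(docs):
    """Count terms and part-of-speech tags per label."""
    term_counts, pos_counts = defaultdict(Counter), defaultdict(Counter)
    for text, label in docs:
        for word in text.lower().split():
            term_counts[label][word] += 1
            pos_counts[label][tag(word)] += 1
    return term_counts, pos_counts

def classify(text, term_counts, pos_counts):
    """Score each label with add-one smoothing; unseen words use the tag model."""
    vocab = {w for counts in term_counts.values() for w in counts}
    tags = {t for counts in pos_counts.values() for t in counts}
    scores = {}
    for label in term_counts:
        term_total = sum(term_counts[label].values())
        pos_total = sum(pos_counts[label].values())
        score = 0.0
        for word in text.lower().split():
            if word in vocab:
                p = (term_counts[label][word] + 1) / (term_total + len(vocab))
            else:
                p = (pos_counts[label][tag(word)] + 1) / (pos_total + len(tags))
            score += math.log(p)
        scores[label] = score
    return max(scores, key=scores.get)

docs = [
    ("i love this awesome phone", "subjective"),
    ("i hate this terrible screen", "subjective"),
    ("the phone has a screen", "objective"),
    ("the box contains a charger", "objective"),
]
term_counts, pos_counts = train(docs)
unseen_label = classify("horrible", term_counts, pos_counts)
seen_label = classify("the charger", term_counts, pos_counts)
```

The word "horrible" never appears in training, so its label comes entirely from how often adjectives occur under each label.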

    Efficient Retrieval Algorithm by Query Term Discrimination
    14.
    Type: Patent application
    Status: In force

    Publication No.: US20080215574A1

    Publication date: 2008-09-04

    Application No.: US12038652

    Filing date: 2008-02-27

    IPC classification: G06F17/30

    CPC classification: G06F17/30675 G06Q10/10

    Abstract: An exemplary method for use in information retrieval includes, for each of a plurality of terms, selecting a predetermined number of top scoring documents for the term to form a corresponding document set for the term; receiving a plurality of terms, optionally as a query; ranking the plurality of terms for importance based at least in part on the document sets for the plurality of terms where the ranking comprises using an inverse document frequency algorithm; selecting a number of ranked terms based on importance where each selected, ranked term comprises its corresponding document set wherein each document in a respective document set comprises a document identification number; forming a union set based on the document sets associated with the selected number of ranked terms; and, for a document identification number in the union set, scanning a document set corresponding to an unselected term for a matching document identification number. Various other exemplary systems, methods, devices, etc. are also disclosed.
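The sequence of steps in this abstract can be walked through with a short Python sketch. The data layout (per-term dictionaries of document ID to score), the additive scoring, and the names `build_top_sets` and `retrieve` are assumptions for illustration; the abstract does not specify how scores are combined.

```python
import math

def build_top_sets(scores_by_term, n):
    """For each term, keep the n highest-scoring documents as {doc_id: score}."""
    return {
        term: dict(sorted(docs.items(), key=lambda kv: kv[1], reverse=True)[:n])
        for term, docs in scores_by_term.items()
    }

def retrieve(query_terms, top_sets, doc_freq, num_docs, k):
    """Rank terms by IDF, union the top-k terms' document sets, then score candidates."""
    ranked = sorted(query_terms,
                    key=lambda t: math.log(num_docs / doc_freq[t]),
                    reverse=True)
    selected, unselected = ranked[:k], ranked[k:]
    # Candidate documents come only from the selected (most discriminating) terms.
    candidates = set().union(*(top_sets[t] for t in selected))
    totals = {}
    for doc_id in candidates:
        total = sum(top_sets[t].get(doc_id, 0.0) for t in selected)
        # Scan the unselected terms' sets for the same document ID.
        total += sum(top_sets[t].get(doc_id, 0.0) for t in unselected)
        totals[doc_id] = total
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical per-term scores over a five-document collection.
scores_by_term = {
    "the": {1: 0.1, 2: 0.1, 3: 0.1, 4: 0.1},
    "quantum": {2: 0.9, 5: 0.7},
    "computing": {2: 0.5, 3: 0.6, 5: 0.4},
}
doc_freq = {"the": 4, "quantum": 2, "computing": 3}
top_sets = build_top_sets(scores_by_term, n=2)
results = retrieve(["the", "quantum", "computing"], top_sets, doc_freq,
                   num_docs=5, k=2)
```

In this example the low-IDF term "the" is left unselected, so document 1, which appears only in its set, never becomes a candidate; that pruning is where the efficiency gain would come from.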

    GUI BASED WEB SEARCH
    15.
    Type: Patent application
    Status: In force

    Publication No.: US20080208819A1

    Publication date: 2008-08-28

    Application No.: US12038588

    Filing date: 2008-02-27

    IPC classification: G06F17/30

    Abstract: An exemplary computer-implemented graphics-based Web search system includes a search input control and a results presentation control where the search input control is configured to receive user input to establish a relationship between a query and one or more information tags associated with search results provided by a search engine in response to the query and wherein the results presentation control is configured to re-order the search results in response to the relationship. Such a system allows a user to define and refine search intent and enhance the user's search experience. Various other exemplary systems, methods, devices, etc. are also disclosed.

    INTERACTIVELY CRAWLING DATA RECORDS ON WEB PAGES
    16.
    Type: Patent application
    Status: Lapsed

    Publication No.: US20080016087A1

    Publication date: 2008-01-17

    Application No.: US11456753

    Filing date: 2006-07-11

    IPC classification: G06F7/00

    Abstract: The invention provides a method of interactively crawling data records on a web page. Users may select various data records of interest on a web page to generate templates to search for similar data items on the same web page or on different web pages. A tree matching algorithm may be used to compare and extract data matching the generated template.
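The abstract does not say which tree matching algorithm is used. As one possibility, the sketch below applies the well-known Simple Tree Matching dynamic program to compare a template subtree against every subtree of a page. The `(tag, children)` tuple representation, the 0.8 similarity ratio, and the name `find_records` are assumptions.

```python
def simple_tree_match(a, b):
    """Size of the largest order-preserving match between trees (tag, [children])."""
    if a[0] != b[0]:
        return 0
    ka, kb = a[1], b[1]
    m = [[0] * (len(kb) + 1) for _ in range(len(ka) + 1)]
    for i in range(1, len(ka) + 1):
        for j in range(1, len(kb) + 1):
            m[i][j] = max(
                m[i - 1][j],
                m[i][j - 1],
                m[i - 1][j - 1] + simple_tree_match(ka[i - 1], kb[j - 1]),
            )
    return m[len(ka)][len(kb)] + 1  # +1 for the matched roots

def size(tree):
    """Number of nodes in the tree."""
    return 1 + sum(size(child) for child in tree[1])

def find_records(template, page, min_ratio=0.8):
    """Collect subtrees of page that match at least min_ratio of the template's nodes."""
    hits = []

    def walk(node):
        if simple_tree_match(template, node) / size(template) >= min_ratio:
            hits.append(node)
        for child in node[1]:
            walk(child)

    walk(page)
    return hits

# Hypothetical template built from a user-selected record, and a page to search.
template = ("li", [("a", []), ("span", [])])
page = ("ul", [
    ("li", [("a", []), ("span", [])]),
    ("li", [("a", []), ("span", []), ("em", [])]),
    ("li", [("img", [])]),
])
records = find_records(template, page)
```

The second list item matches despite its extra `em` child, while the image-only item falls below the ratio and is skipped.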

    Advertiser monetization modeling
    18.
    Type: Granted patent
    Status: In force

    Publication No.: US08117050B2

    Publication date: 2012-02-14

    Application No.: US12131124

    Filing date: 2008-06-02

    IPC classification: G06Q40/00 G06Q30/00 G01C21/34

    Abstract: Embodiments of the claimed subject matter provide a method and system for modeling advertiser monetization. The claimed subject matter provides a method and system by which an advertisement may be evaluated according to various metrics to determine a quality relative to other advertisements. The relative quality considers the content of the advertisement, the performance of the advertisement and the history of the advertiser's bidding behavior. One embodiment of the claimed subject matter is implemented as a method for advertiser monetization modeling. One or more advertisements are received from one or more advertisers. The quality of the advertisement(s) is defined according to certain metrics, such as the quality of the content of the advertisement, the quality of the past and estimated future performance of the advertisement and the history of bidding behavior of the advertiser. After the respective quality of the advertisement(s) is determined, the advertisement(s) is ranked with other advertisements according to the determined quality.
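One simple way to combine the three metrics named in this abstract is a weighted sum, sketched below. The abstract does not give a formula; the `Ad` fields, the normalized 0-to-1 scores, and the weights are assumptions chosen only to make the ranking step concrete.

```python
from dataclasses import dataclass

@dataclass
class Ad:
    name: str
    content: float      # content quality, assumed normalized to [0, 1]
    performance: float  # past and estimated future performance, in [0, 1]
    bidding: float      # score for the advertiser's bidding history, in [0, 1]

def quality(ad, weights=(0.4, 0.4, 0.2)):
    """Weighted sum of the three metrics; the weights are illustrative only."""
    w_content, w_perf, w_bid = weights
    return w_content * ad.content + w_perf * ad.performance + w_bid * ad.bidding

def rank_ads(ads):
    """Order advertisements from highest to lowest quality."""
    return sorted(ads, key=quality, reverse=True)

# Hypothetical advertisements.
ads = [
    Ad("A", content=0.9, performance=0.5, bidding=0.5),
    Ad("B", content=0.6, performance=0.9, bidding=0.8),
    Ad("C", content=0.3, performance=0.4, bidding=0.9),
]
ranking = [ad.name for ad in rank_ads(ads)]
```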

    Clustering aggregator for RSS feeds
    19.
    Type: Granted patent
    Status: In force

    Publication No.: US07958125B2

    Publication date: 2011-06-07

    Application No.: US12146481

    Filing date: 2008-06-26

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30705

    Abstract: A method for merging Really Simple Syndication (RSS) feeds. Stories containing one or more terms may be merged into one or more clusters based on one or more links between the stories. A cluster frequency with which the terms occur in each cluster may be determined. A diameter for each cluster may be determined. A cluster that is most similar to one of the clusters may be determined based on the cluster frequency. The most similar cluster with the one of the clusters may be determined based on each diameter, and each cluster frequency.
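The abstract names two quantities, cluster frequency and cluster diameter, without defining them. The sketch below takes one plausible reading: cluster frequency is a term-count vector over the cluster's stories, similarity is cosine similarity between those vectors, and diameter is the largest pairwise cosine distance between stories. These definitions, the `max_diameter` limit, and all function names are assumptions.

```python
import math
from collections import Counter
from itertools import combinations

def term_freq(stories):
    """Count how often each term occurs across the stories of a cluster."""
    total = Counter()
    for story in stories:
        total.update(story.lower().split())
    return total

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def diameter(stories):
    """Largest pairwise cosine distance between stories in a cluster."""
    return max(
        (1 - cosine(term_freq([x]), term_freq([y]))
         for x, y in combinations(stories, 2)),
        default=0.0,
    )

def most_similar(target, clusters):
    """Name of the other cluster whose term frequencies are closest to target's."""
    tf = term_freq(clusters[target])
    others = [name for name in clusters if name != target]
    return max(others, key=lambda name: cosine(tf, term_freq(clusters[name])))

def should_merge(a, b, max_diameter=0.9):
    """Allow a merge only if the combined cluster stays reasonably tight."""
    return diameter(a + b) <= max_diameter

# Hypothetical clusters of feed headlines.
clusters = {
    "a": ["mars rover lands", "rover sends mars photos"],
    "b": ["mars mission rover update"],
    "c": ["stock market falls"],
}
closest = most_similar("a", clusters)
merge_ab = should_merge(clusters["a"], clusters["b"])
merge_ac = should_merge(clusters["a"], clusters["c"])
```

Using the diameter as a guard keeps unrelated stories from being pulled into a cluster just because it happens to be the nearest one available.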

    Efficient retrieval algorithm by query term discrimination
    20.
    Type: Granted patent
    Status: In force

    Publication No.: US07925644B2

    Publication date: 2011-04-12

    Application No.: US12038652

    Filing date: 2008-02-27

    IPC classification: G06F17/30 G06F7/00

    CPC classification: G06F17/30675 G06Q10/10

    Abstract: A method and system for use in information retrieval includes, for each of a plurality of terms, selecting a predetermined number of top scoring documents for the term to form a corresponding document set for the term. When a plurality of terms are received, optionally as a query, the system ranks, using an inverse document frequency algorithm, the plurality of terms for importance based on the document sets for the plurality of terms. Then a number of ranked terms are selected based on importance and a union set is formed based on the document sets associated with the selected number of ranked terms.
