-
Publication No.: US20090063461A1
Publication date: 2009-03-05
Application No.: US11849136
Filing date: 2007-08-31
Applicant: Jian Wang, Hua Li, HuaJun Zeng, Jian Hu, Zheng Chen
Inventor: Jian Wang, Hua Li, HuaJun Zeng, Jian Hu, Zheng Chen
CPC classification: G06F17/30861, G06F17/30672, G06Q30/02
Abstract: Systems and methods to determine relevant keywords from a user's search query sessions are disclosed. The described method includes identifying search session logs of a user and segmenting the search session logs into one or more search sessions. After the segmentation, the search sessions are analyzed to compose a list of semantically relevant keyword sets including at least a first keyword set and a second keyword set. The described method further includes determining a semantic relevance between the first and second keyword sets according to the frequency at which the first and second keyword sets are reported in the query results, and displaying one or more highly semantically relevant keyword sets after filtering by a threshold.
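The steps in this abstract (segment the log into sessions, count how often keyword sets appear together, filter by a threshold) can be sketched as follows. This is a minimal illustration, not the patented method: the abstract does not give the relevance formula, so the sketch substitutes a plain per-session co-occurrence count, and the names `segment_sessions`, `relevant_pairs`, `max_gap`, and `threshold` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def segment_sessions(log, max_gap=1800):
    """Split a time-ordered list of (timestamp_seconds, query) into sessions.

    Assumption: a gap longer than max_gap seconds starts a new session.
    """
    sessions, current, last_ts = [], [], None
    for ts, query in log:
        if last_ts is not None and ts - last_ts > max_gap:
            sessions.append(current)
            current = []
        current.append(query)
        last_ts = ts
    if current:
        sessions.append(current)
    return sessions


def relevant_pairs(sessions, threshold=2):
    """Count the sessions in which two distinct queries co-occur and keep
    the pairs whose count reaches the threshold (a stand-in relevance score)."""
    counts = Counter()
    for session in sessions:
        unique = sorted({q.lower() for q in session})
        for a, b in combinations(unique, 2):
            counts[(a, b)] += 1
    return {pair: n for pair, n in counts.items() if n >= threshold}


log = [
    (0, "digital camera"), (60, "canon eos"),
    (5000, "digital camera"), (5100, "canon eos"), (5200, "tripod"),
    (20000, "pizza"),
]
sessions = segment_sessions(log)
pairs = relevant_pairs(sessions)
print(sessions)
print(pairs)
```

In this toy log only the pair ("canon eos", "digital camera") appears in two sessions, so it is the only pair that passes the threshold.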
-
Publication No.: US20090006294A1
Publication date: 2009-01-01
Application No.: US11770423
Filing date: 2007-06-28
Applicant: Ning Liu, Jun Yan, Benyu Zhang, Zheng Chen, Jian Wang
Inventor: Ning Liu, Jun Yan, Benyu Zhang, Zheng Chen, Jian Wang
IPC classification: G06N5/00
CPC classification: G06F17/30864, G06Q30/02
Abstract: Techniques for analyzing and modeling the frequency of queries are provided by a query analysis system. A query analysis system analyzes frequencies of a query over time to determine whether the query is time-dependent or time-independent. The query analysis system forecasts the frequency of time-dependent queries based on their periodicities. The query analysis system forecasts the frequency of time-independent queries based on causal relationships with other queries. To forecast the frequency of time-independent queries, the query analysis system analyzes the frequency of a query over time to identify significant increases in the frequency, which are referred to as “query events” or “events.” The query analysis system forecasts frequencies of time-independent queries based on queries with events that tend to causally precede events of the query to be forecasted.
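The idea of a "query event" (a significant increase in a query's frequency) and of one query's events preceding another's can be illustrated with a short sketch. The abstract does not say how "significant" is measured, so the rule below (frequency at least `ratio` times the mean of the preceding `window` values) is an assumption, as are the names `find_query_events` and `precedence_count`.

```python
def find_query_events(freqs, window=3, ratio=2.0):
    """Return the indices where the frequency is at least `ratio` times
    the mean of the preceding `window` values (assumed event rule)."""
    events = []
    for i in range(window, len(freqs)):
        baseline = sum(freqs[i - window:i]) / window
        if baseline > 0 and freqs[i] >= ratio * baseline:
            events.append(i)
    return events


def precedence_count(events_a, events_b, max_lag=2):
    """Count the events of query B that follow some event of query A
    by between 1 and max_lag time steps."""
    return sum(
        any(0 < b - a <= max_lag for a in events_a) for b in events_b
    )


daily_a = [10, 12, 11, 40, 38, 12, 11, 10]
daily_b = [5, 5, 6, 5, 20, 6, 5, 5]
events_a = find_query_events(daily_a)
events_b = find_query_events(daily_b)
print(events_a, events_b, precedence_count(events_a, events_b))
```

A query whose events are often preceded by events of another query would be a candidate for forecasting from that other query; the forecasting model itself is outside this sketch.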
-
Publication No.: US20080249762A1
Publication date: 2008-10-09
Application No.: US11697112
Filing date: 2007-04-05
Applicant: Jian Wang, Jian-Tao Sun, Shen Huang, Zheng Chen
Inventor: Jian Wang, Jian-Tao Sun, Shen Huang, Zheng Chen
IPC classification: G06F17/20
CPC classification: G06F17/2785
Abstract: A method and system are provided for classifying documents based on the subjectivity of the content of the documents using a part-of-speech analysis to help account for unseen words. A classification system trains a classifier using the parts of speech of training documents so that the classifier can classify unseen words based on the part of speech of the unseen word. The classification system then trains a part-of-speech model using the parts of speech of the n-grams of training data and labels of the training documents, and trains a term model using the term unigrams and labels. To classify a target document, the classification system applies the part-of-speech model to the part-of-speech n-grams of the target document and the term model to term n-grams of the target document.
-
Publication No.: US20080215574A1
Publication date: 2008-09-04
Application No.: US12038652
Filing date: 2008-02-27
Applicant: Chenxi Lin, Lei Ji, HuaJun Zeng, Benyu Zhang, Zheng Chen, Jian Wang
Inventor: Chenxi Lin, Lei Ji, HuaJun Zeng, Benyu Zhang, Zheng Chen, Jian Wang
IPC classification: G06F17/30
CPC classification: G06F17/30675, G06Q10/10
Abstract: An exemplary method for use in information retrieval includes, for each of a plurality of terms, selecting a predetermined number of top scoring documents for the term to form a corresponding document set for the term; receiving a plurality of terms, optionally as a query; ranking the plurality of terms for importance based at least in part on the document sets for the plurality of terms where the ranking comprises using an inverse document frequency algorithm; selecting a number of ranked terms based on importance where each selected, ranked term comprises its corresponding document set wherein each document in a respective document set comprises a document identification number; forming a union set based on the document sets associated with the selected number of ranked terms; and, for a document identification number in the union set, scanning a document set corresponding to an unselected term for a matching document identification number. Various other exemplary systems, methods, devices, etc. are also disclosed.
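The flow described in this abstract (rank query terms by inverse document frequency, select the most important ones, take the union of their document sets, then scan the unselected terms' sets for matching document IDs) can be sketched as below. This is an illustrative reading under stated assumptions: the IDF formula is the common log(N / df), and `top_docs`, `doc_freq`, `total_docs`, and `num_selected` are hypothetical inputs, not names from the patent.

```python
import math


def rank_terms_by_idf(terms, doc_freq, total_docs):
    """Order terms from most to least important using log(N / df)."""
    return sorted(
        terms,
        key=lambda t: math.log(total_docs / doc_freq[t]),
        reverse=True,
    )


def candidate_documents(query_terms, top_docs, doc_freq, total_docs,
                        num_selected=2):
    """Return the selected terms, the union of their document sets, and,
    for each document ID in the union, the unselected terms whose
    document sets also contain that ID."""
    ranked = rank_terms_by_idf(query_terms, doc_freq, total_docs)
    selected, unselected = ranked[:num_selected], ranked[num_selected:]
    union = set()
    for term in selected:
        union |= top_docs[term]
    matches = {
        doc_id: [t for t in unselected if doc_id in top_docs[t]]
        for doc_id in sorted(union)
    }
    return selected, union, matches


# Hypothetical per-term top-scoring document sets (document ID numbers).
top_docs = {"the": {1, 2, 3}, "quantum": {2, 7}, "lattice": {7, 9}}
doc_freq = {"the": 900, "quantum": 12, "lattice": 5}

selected, union, matches = candidate_documents(
    ["the", "quantum", "lattice"], top_docs, doc_freq, total_docs=1000
)
print(selected, union, matches)
```

Restricting the union to the rarest terms keeps the candidate set small; the common term "the" is only consulted to check whether an already-selected document also appears in its set.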
-
Publication No.: US20080208819A1
Publication date: 2008-08-28
Application No.: US12038588
Filing date: 2008-02-27
Applicant: Min Wang, Zheng Chen, Jian-Tao Sun, Shen Huang, Jian Wang
Inventor: Min Wang, Zheng Chen, Jian-Tao Sun, Shen Huang, Jian Wang
IPC classification: G06F17/30
CPC classification: G06F17/30696, G06F17/30643, G06F17/30864
Abstract: An exemplary computer-implemented graphics-based Web search system includes a search input control and a results presentation control where the search input control is configured to receive user input to establish a relationship between a query and one or more information tags associated with search results provided by a search engine in response to the query and wherein the results presentation control is configured to re-order the search results in response to the relationship. Such a system allows a user to define and refine search intent and enhance the user's search experience. Various other exemplary systems, methods, devices, etc. are also disclosed.
-
Publication No.: US20080016087A1
Publication date: 2008-01-17
Application No.: US11456753
Filing date: 2006-07-11
Applicant: Benyu Zhang, Chenxi Lin, Hua-Jun Zeng, Jian Wang, Ke Tang, Zheng Chen
Inventor: Benyu Zhang, Chenxi Lin, Hua-Jun Zeng, Jian Wang, Ke Tang, Zheng Chen
IPC classification: G06F7/00
CPC classification: G06F17/30864, Y10S707/99935
Abstract: The invention provides a method of interactively crawling data records on a web page. Users may select various data records of interest on a web page to generate templates to search for similar data items on the same web page or on different web pages. A tree matching algorithm may be used to compare and extract data matching the generated template.
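The abstract does not name the tree matching algorithm. As one possible illustration, the sketch below uses the classic Simple Tree Matching dynamic program, which counts the largest number of node pairs that can be matched between two ordered, labeled trees. Trees are written as `(label, [children])` tuples; that representation and the example tag names are assumptions made for the sketch.

```python
def simple_tree_match(a, b):
    """Return the size of the maximum top-down, order-preserving matching
    between trees a and b, each given as (label, [children])."""
    if a[0] != b[0]:
        return 0
    ca, cb = a[1], b[1]
    # m[i][j]: best matching between the first i children of a
    # and the first j children of b.
    m = [[0] * (len(cb) + 1) for _ in range(len(ca) + 1)]
    for i in range(1, len(ca) + 1):
        for j in range(1, len(cb) + 1):
            m[i][j] = max(
                m[i][j - 1],
                m[i - 1][j],
                m[i - 1][j - 1] + simple_tree_match(ca[i - 1], cb[j - 1]),
            )
    return m[len(ca)][len(cb)] + 1  # +1 for the matched roots


def tree_size(t):
    """Number of nodes in the tree."""
    return 1 + sum(tree_size(c) for c in t[1])


# Template built from a user-selected record, and two candidate subtrees.
template = ("tr", [("td", [("b", [])]), ("td", [])])
candidate = ("tr", [("td", [("b", [])]), ("td", []), ("td", [])])
other = ("tr", [("th", [])])

score = simple_tree_match(template, candidate)
print(score, score / tree_size(template))
print(simple_tree_match(template, other))
```

A candidate whose score covers most of the template's nodes would be treated as a similar data record; where to set that cutoff is a design choice the abstract leaves open.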
-
Publication No.: US20120078715A1
Publication date: 2012-03-29
Application No.: US13310103
Filing date: 2011-12-02
Applicant: Li Li, Tarek Najm, Ying Li, Zheng Chen, Hua-Jun Zeng, Ke Tang, Zhifeng Yang, FengPing Zeng, Xianfang Wang, Xiafeng Dai, Benyu Zang, Jian Wang
Inventor: Li Li, Tarek Najm, Ying Li, Zheng Chen, Hua-Jun Zeng, Ke Tang, Zhifeng Yang, FengPing Zeng, Xianfang Wang, Xiafeng Dai, Benyu Zang, Jian Wang
IPC classification: G06Q30/02
CPC classification: G06Q30/0241, G06F16/9535, G06Q30/02, G06Q30/0254
Abstract: A system and method are disclosed for providing documents related to a search request. The search request may include a search query of one or more keywords, or the search request may be a demographic search query including one or more demographic attributes. An index can be created containing data crawled from publishers' websites, demographic information of registered users, and the search history of the registered users. Once a search request is received, the search request can be compared to the information stored in the index, and one or more documents related to the request can be provided.
-
Publication No.: US08117050B2
Publication date: 2012-02-14
Application No.: US12131124
Filing date: 2008-06-02
Applicant: Hua Li, Zheng Chen, Jian Wang
Inventor: Hua Li, Zheng Chen, Jian Wang
CPC classification: G06Q30/02, G06Q10/025, G06Q30/0207, G06Q30/0277, G06Q40/08
Abstract: Embodiments of the claimed subject matter provide a method and system for modeling advertiser monetization. The claimed subject matter provides a method and system from which an advertisement may be evaluated according to various metrics to determine a quality relative to other advertisements. The relative quality considers the content of the advertisement, the performance of the advertisement and the history of the advertiser's bidding behavior. One embodiment of the claimed subject matter is implemented as a method for advertiser monetization modeling. One or more advertisements are received from one or more advertisers. The quality of the advertisement(s) is defined according to certain metrics, such as the quality of the content of the advertisement, the quality of the past and estimated future performance of the advertisement and the history of bidding behavior of the advertiser. After the respective quality of the advertisement(s) is determined, the advertisement(s) is ranked with other advertisements according to the determined quality.
-
Publication No.: US07958125B2
Publication date: 2011-06-07
Application No.: US12146481
Filing date: 2008-06-26
Applicant: Jun Yan, Ning Liu, Lei Ji, Zheng Chen, Jian Wang
Inventor: Jun Yan, Ning Liu, Lei Ji, Zheng Chen, Jian Wang
CPC classification: G06F17/30705
Abstract: A method for merging Really Simple Syndication (RSS) feeds. Stories containing one or more terms may be merged into one or more clusters based on one or more links between the stories. A cluster frequency with which the terms occur in each cluster may be determined. A diameter for each cluster may be determined. A cluster that is most similar to one of the clusters may be determined based on the cluster frequency. The most similar cluster with the one of the clusters may be determined based on each diameter and each cluster frequency.
-
Publication No.: US07925644B2
Publication date: 2011-04-12
Application No.: US12038652
Filing date: 2008-02-27
Applicant: Chenxi Lin, Lei Ji, HuaJun Zeng, Benyu Zhang, Zheng Chen, Jian Wang
Inventor: Chenxi Lin, Lei Ji, HuaJun Zeng, Benyu Zhang, Zheng Chen, Jian Wang
CPC classification: G06F17/30675, G06Q10/10
Abstract: A method and system for use in information retrieval includes, for each of a plurality of terms, selecting a predetermined number of top scoring documents for the term to form a corresponding document set for the term. When a plurality of terms are received, optionally as a query, the system ranks, using an inverse document frequency algorithm, the plurality of terms for importance based on the document sets for the plurality of terms. Then a number of ranked terms are selected based on importance and a union set is formed based on the document sets associated with the selected number of ranked terms.