-
Publication No.: US08285745B2
Publication Date: 2012-10-09
Application No.: US11849136
Filing Date: 2007-08-31
Applicants: Hua Li, HuaJun Zeng, Jian Hu, Zheng Chen, Jian Wang
Inventors: Hua Li, HuaJun Zeng, Jian Hu, Zheng Chen, Jian Wang
IPC Classification: G06F17/30
CPC Classification: G06F17/30861, G06F17/30672, G06Q30/02
Abstract: Systems and methods to determine relevant keywords from a user's search query sessions are disclosed. The described method includes identifying search session logs of a user, segmenting the search session logs into one or more search sessions. After the segmentation, the search sessions are analyzed to compose a list of semantically relevant keyword sets including at least a first keyword set and a second keyword set. The described method further includes determining a semantic relevance between the first and second keyword sets according to the frequency at which the first and second keyword sets are reported in the query results and displaying one or more semantically high relevant keyword sets after being filtered by a threshold.
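The session segmentation and frequency-based relevance filtering described in this abstract can be sketched roughly as follows. This is an illustrative sketch only, not the patented method: the 30-minute inactivity gap, the Jaccard-style co-occurrence score, and the names `segment_sessions` and `relevant_pairs` are all assumptions introduced for the example, and each query stands in for a keyword set.

```python
from collections import Counter
from itertools import combinations


def segment_sessions(log, gap_seconds=1800):
    """Split (timestamp, query) pairs into sessions at long inactivity gaps.

    The 30-minute default gap is an assumption, not taken from the patent.
    """
    sessions, current, last_ts = [], [], None
    for ts, query in sorted(log):
        if last_ts is not None and ts - last_ts > gap_seconds:
            sessions.append(current)
            current = []
        current.append(query)
        last_ts = ts
    if current:
        sessions.append(current)
    return sessions


def relevant_pairs(sessions, threshold=0.5):
    """Score query pairs by how often they share a session, then filter.

    Score = sessions containing both / sessions containing either
    (a Jaccard-style ratio chosen for illustration).
    """
    single, pair = Counter(), Counter()
    for session in sessions:
        queries = sorted(set(session))
        single.update(queries)
        pair.update(combinations(queries, 2))
    scored = {}
    for (a, b), n_ab in pair.items():
        score = n_ab / (single[a] + single[b] - n_ab)
        if score >= threshold:
            scored[(a, b)] = score
    return scored


log = [
    (0, "jaguar"), (60, "jaguar car"),
    (10000, "jaguar"), (10050, "jaguar car"),
    (20000, "jaguar"), (20030, "big cats"),
]
sessions = segment_sessions(log)
pairs = relevant_pairs(sessions)
```

With this toy log, only the pair that co-occurs in most sessions survives the threshold; a production system would likely normalize queries and use a tuned score.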
-
Publication No.: US20090063461A1
Publication Date: 2009-03-05
Application No.: US11849136
Filing Date: 2007-08-31
Applicants: Jian Wang, Hua Li, HuaJun Zeng, Jian Hu, Zheng Chen
Inventors: Jian Wang, Hua Li, HuaJun Zeng, Jian Hu, Zheng Chen
CPC Classification: G06F17/30861, G06F17/30672, G06Q30/02
Abstract: Systems and methods to determine relevant keywords from a user's search query sessions are disclosed. The described method includes identifying search session logs of a user, segmenting the search session logs into one or more search sessions. After the segmentation, the search sessions are analyzed to compose a list of semantically relevant keyword sets including at least a first keyword set and a second keyword set. The described method further includes determining a semantic relevance between the first and second keyword sets according to the frequency at which the first and second keyword sets are reported in the query results and displaying one or more semantically high relevant keyword sets after being filtered by a threshold.
-
Publication No.: US08321448B2
Publication Date: 2012-11-27
Application No.: US11870359
Filing Date: 2007-10-10
Applicants: HuaJun Zeng, Jian Hu, Hua Li, Zeng Chen, Jian Wang
Inventors: HuaJun Zeng, Jian Hu, Hua Li, Zeng Chen, Jian Wang
CPC Classification: G06F17/30648, G06F17/30672, G06F17/30864
Abstract: Click-through log mining is described. Raw search click-through log data is processed to generate ordered query keywords, utilizing an algorithm to expand user-submitted keywords to include high frequency user queries, managing the keywords for a keyword expansion file, analyzing the algorithm performance on a bidding criteria, and identifying related phrases with similar page-click behaviors for advertisements.
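One plausible way to identify "related phrases with similar page-click behaviors" is to compare per-query click-count vectors. The sketch below uses cosine similarity for that comparison; the abstract does not name a similarity measure, so cosine similarity, the 0.5 cutoff, and the function names are assumptions made for illustration.

```python
import math
from collections import defaultdict


def click_vectors(click_log):
    """Build query -> {url: click count} from (query, clicked_url) pairs."""
    vectors = defaultdict(lambda: defaultdict(int))
    for query, url in click_log:
        vectors[query][url] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse click-count vectors."""
    dot = sum(count * v.get(url, 0) for url, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def related_phrases(click_log, keyword, min_similarity=0.5):
    """Return (phrase, similarity) pairs whose clicks resemble the keyword's."""
    vectors = click_vectors(click_log)
    target = vectors.get(keyword, {})
    scored = [
        (other, cosine(target, clicks))
        for other, clicks in vectors.items()
        if other != keyword
    ]
    kept = [(phrase, sim) for phrase, sim in scored if sim >= min_similarity]
    return sorted(kept, key=lambda item: (-item[1], item[0]))


click_log = [
    ("cheap flights", "a.com"), ("cheap flights", "b.com"),
    ("low cost airfare", "a.com"), ("low cost airfare", "b.com"),
    ("hiking boots", "c.com"),
]
related = related_phrases(click_log, "cheap flights")
```

Here the two airfare queries lead to the same pages and are reported as related, while the unrelated query shares no clicks and is dropped.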
-
Publication No.: US07925644B2
Publication Date: 2011-04-12
Application No.: US12038652
Filing Date: 2008-02-27
Applicants: Chenxi Lin, Lei Ji, HuaJun Zeng, Benyu Zhang, Zheng Chen, Jian Wang
Inventors: Chenxi Lin, Lei Ji, HuaJun Zeng, Benyu Zhang, Zheng Chen, Jian Wang
CPC Classification: G06F17/30675, G06Q10/10
Abstract: A method and system for use in information retrieval includes, for each of a plurality of terms, selecting a predetermined number of top scoring documents for the term to form a corresponding document set for the term. When a plurality of terms are received, optionally as a query, the system ranks, using an inverse document frequency algorithm, the plurality of terms for importance based on the document sets for the plurality of terms. Then a number of ranked terms are selected based on importance and a union set is formed based on the document sets associated with the selected number of ranked terms.
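The steps in this abstract (per-term top-N document sets, IDF-based term ranking, and a union over the selected terms) might be sketched as below. The index layout (`term -> {doc_id: score}`), the use of full posting-list length as document frequency, and the names `top_documents` and `candidate_union` are assumptions for the example rather than details from the patent.

```python
import math


def top_documents(postings, n):
    """Keep the ids of the n highest-scoring documents for one term."""
    ranked = sorted(postings.items(), key=lambda item: (-item[1], item[0]))
    return {doc_id for doc_id, _ in ranked[:n]}


def candidate_union(index, query_terms, total_docs, n_docs=2, k_terms=2):
    """Rank query terms by IDF, keep the top k, and union their document sets."""
    doc_sets = {t: top_documents(index.get(t, {}), n_docs) for t in query_terms}

    def idf(term):
        # Assumption: document frequency is the full posting-list length.
        df = len(index.get(term, {}))
        return math.log(total_docs / df) if df else 0.0

    ranked_terms = sorted(query_terms, key=idf, reverse=True)
    selected = ranked_terms[:k_terms]
    union = set().union(*(doc_sets[t] for t in selected))
    return selected, union


index = {
    "the": {1: 0.1, 2: 0.2, 3: 0.1, 4: 0.3},
    "patent": {2: 0.9, 3: 0.5},
    "keyword": {4: 0.8},
}
selected, union = candidate_union(index, ["the", "patent", "keyword"], total_docs=4)
```

In this toy index the common term "the" has the lowest IDF and is left out, so only the document sets of the two rarer terms contribute to the union.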
-
Publication No.: US20080215997A1
Publication Date: 2008-09-04
Application No.: US12038687
Filing Date: 2008-02-27
Applicants: Min Wu, Chenxi Lin, Benyu Zhang, HuaJun Zeng, Zheng Chen, Jian Wang
Inventors: Min Wu, Chenxi Lin, Benyu Zhang, HuaJun Zeng, Zheng Chen, Jian Wang
IPC Classification: G06F3/048
CPC Classification: G06F3/0481
Abstract: An exemplary web browser system includes a selection module for selecting a webpage block and recording information about a selected webpage block; a tracking module for tracking changes to a selected webpage block based at least in part on the recorded information for that webpage block; and a display module for displaying a selected webpage block wherein the tracking module updates the display module as to changes to the selected webpage block. Various other exemplary systems, methods, devices are also disclosed.
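A minimal sketch of the select-then-track idea follows, assuming a page can be represented as a mapping from a block identifier to its text and that the "recorded information" is a content fingerprint. Both assumptions, along with the `TrackedBlock` class and the SHA-256 choice, are hypothetical and not drawn from the patent.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class TrackedBlock:
    block_id: str      # hypothetical identifier for the selected block
    fingerprint: str   # hash of the block text when last seen


def fingerprint(text):
    """Hash the block text after collapsing whitespace differences."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def select_block(page, block_id):
    """Selection step: record what the chosen block looks like now."""
    return TrackedBlock(block_id, fingerprint(page[block_id]))


def check_for_change(page, tracked):
    """Tracking step: return the new text if the block changed, else None."""
    text = page.get(tracked.block_id)
    if text is None:
        return None
    new_fp = fingerprint(text)
    if new_fp == tracked.fingerprint:
        return None
    tracked.fingerprint = new_fp
    return text


tracked = select_block({"scores": "Home 1 - 0 Away"}, "scores")
unchanged = check_for_change({"scores": "Home  1 - 0   Away"}, tracked)
changed = check_for_change({"scores": "Home 2 - 0 Away"}, tracked)
```

A display component would then redraw the block whenever `check_for_change` returns new text; whitespace-only edits are deliberately ignored in this sketch.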
-
Publication No.: US08280877B2
Publication Date: 2012-10-02
Application No.: US11859461
Filing Date: 2007-09-21
Applicants: Benyu Zhang, Jilin Chen, Zheng Chen, HuaJun Zeng, Jian Wang
Inventors: Benyu Zhang, Jilin Chen, Zheng Chen, HuaJun Zeng, Jian Wang
IPC Classification: G06F17/30
CPC Classification: G06F17/2745, G06F17/278
Abstract: Systems and methods for implementing diverse topic phrase extraction are disclosed. According to one implementation, multiple word candidate phrases are extracted from a corpus and weighed. One or more documents are re-weighed to identify less obvious candidate topics using latent semantic analysis (LSA). Phrase diversification is then used to remove redundancy and select informative and distinct topic phrases.
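The final diversification step could look something like the greedy selection below, which penalizes a candidate phrase for sharing words with phrases already chosen. This covers only the redundancy-removal idea; the LSA re-weighting is not shown, and the word-overlap measure, the 0.7 penalty, and the function names are assumptions for illustration.

```python
def word_overlap(a, b):
    """Jaccard overlap between the word sets of two phrases."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    combined = words_a | words_b
    return len(words_a & words_b) / len(combined) if combined else 0.0


def diversify(weighted_phrases, k, penalty=0.7):
    """Greedily pick k phrases, discounting those similar to earlier picks."""
    remaining = dict(weighted_phrases)
    chosen = []
    while remaining and len(chosen) < k:

        def adjusted(phrase):
            redundancy = max(
                (word_overlap(phrase, c) for c in chosen), default=0.0
            )
            return remaining[phrase] - penalty * redundancy

        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=adjusted)
        chosen.append(best)
        del remaining[best]
    return chosen


weights = {"search engine": 1.0, "web search engine": 0.9, "click log": 0.6}
topics = diversify(weights, k=2)
```

Although "web search engine" has the second-highest weight, it overlaps heavily with the first pick, so the more distinct "click log" is selected instead.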
-
Publication No.: US20080208840A1
Publication Date: 2008-08-28
Application No.: US11859461
Filing Date: 2007-09-21
Applicants: Benyu Zhang, Jilin Chen, Zheng Chen, HuaJun Zeng, Jian Wang
Inventors: Benyu Zhang, Jilin Chen, Zheng Chen, HuaJun Zeng, Jian Wang
IPC Classification: G06F7/10
CPC Classification: G06F17/2745, G06F17/278
Abstract: Systems and methods for implementing diverse topic phrase extraction are disclosed. According to one implementation, multiple word candidate phrases are extracted from a corpus and weighed. One or more documents are re-weighed to identify less obvious candidate topics using latent semantic analysis (LSA). Phrase diversification is then used to remove redundancy and select informative and distinct topic phrases.
-
Publication No.: US20080215574A1
Publication Date: 2008-09-04
Application No.: US12038652
Filing Date: 2008-02-27
Applicants: Chenxi Lin, Lei Ji, HuaJun Zeng, Benyu Zhang, Zheng Chen, Jian Wang
Inventors: Chenxi Lin, Lei Ji, HuaJun Zeng, Benyu Zhang, Zheng Chen, Jian Wang
IPC Classification: G06F17/30
CPC Classification: G06F17/30675, G06Q10/10
Abstract: An exemplary method for use in information retrieval includes, for each of a plurality of terms, selecting a predetermined number of top scoring documents for the term to form a corresponding document set for the term; receiving a plurality of terms, optionally as a query; ranking the plurality of terms for importance based at least in part on the document sets for the plurality of terms where the ranking comprises using an inverse document frequency algorithm; selecting a number of ranked terms based on importance where each selected, ranked term comprises its corresponding document set wherein each document in a respective document set comprises a document identification number; forming a union set based on the document sets associated with the selected number of ranked terms; and, for a document identification number in the union set, scanning a document set corresponding to an unselected term for a matching document identification number. Various other exemplary systems, methods, devices, etc. are also disclosed.
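The last step of this abstract, scanning an unselected term's document set for identification numbers that also appear in the union set, can be sketched as a merge-style scan. The sketch assumes both collections are held as ascending lists of integer document IDs, which the abstract does not state; `matching_ids` is a hypothetical helper name.

```python
def matching_ids(union_ids, unselected_doc_ids):
    """Return document ids present in both ascending-sorted id lists.

    Two pointers advance through the lists in step, so each list is
    read once (assumes both inputs are sorted in ascending order).
    """
    matches = []
    i = j = 0
    while i < len(union_ids) and j < len(unselected_doc_ids):
        if union_ids[i] == unselected_doc_ids[j]:
            matches.append(union_ids[i])
            i += 1
            j += 1
        elif union_ids[i] < unselected_doc_ids[j]:
            i += 1
        else:
            j += 1
    return matches


found = matching_ids([2, 3, 4, 9], [1, 3, 9, 12])
```

If the document sets were stored unsorted, a hash-set membership test would be a simpler alternative to the two-pointer scan.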