-
Publication No.: US20070294223A1
Publication date: 2007-12-20
Application No.: US11424589
Filing date: 2006-06-16
IPC classification: G06F17/30
CPC classification: G06F16/353
Abstract: A system and method for categorizing documents with the aid of an external knowledge database. In an exemplary embodiment of the invention, an external knowledge database provides concepts related to the documents of a categorized database and to an input document, improving the ability to categorize input documents correctly. The system and method can also be used to search for documents related to an input document.
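The idea in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the toy knowledge base, the `concept:` feature prefix, the cosine similarity measure, and the function names `enrich`, `categorize`, and `rank_related` are all assumptions made for illustration. Concepts from the external knowledge base are added as features to both the categorized documents and the input document, so two texts that share no words can still match through a shared concept.

```python
import math
from collections import Counter

# Hypothetical external knowledge base: concept -> words that evoke it.
KNOWLEDGE_BASE = {
    "finance": {"bank", "loan", "stock", "interest"},
    "sports": {"goal", "match", "team", "league"},
}

def enrich(text):
    """Bag of words plus one 'concept:<name>' feature per matched concept."""
    words = text.lower().split()
    features = Counter(words)
    for concept, vocabulary in KNOWLEDGE_BASE.items():
        hits = sum(1 for word in words if word in vocabulary)
        if hits:
            features["concept:" + concept] = hits
    return features

def cosine(a, b):
    """Cosine similarity between two sparse feature vectors."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def categorize(input_text, categorized_docs):
    """Return the label of the most similar (text, label) pair."""
    query = enrich(input_text)
    best = max(categorized_docs, key=lambda pair: cosine(query, enrich(pair[0])))
    return best[1]

def rank_related(input_text, documents):
    """Order documents from most to least similar to the input text."""
    query = enrich(input_text)
    return sorted(documents, key=lambda doc: cosine(query, enrich(doc)), reverse=True)
```

In this sketch, the input "stock interest climbed" shares no word with "the bank raised the loan rate", yet both receive the hypothetical `concept:finance` feature, which is what lets the categorizer pair them.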
-
Publication No.: US08108204B2
Publication date: 2012-01-31
Application No.: US11457341
Filing date: 2006-07-13
CPC classification: G06F17/27 , G06F17/30616
Abstract: A method of providing weighted concepts related to a sequence of one or more words, including: providing on a computer an encyclopedia with concepts and a document explaining each concept; forming, for each word in the encyclopedia, a vector that contains the frequency of the word for each concept; arranging the vector according to the frequency of appearance of the word for each concept; selecting from the vector the concepts with the highest frequencies for each word; truncating the rest of the vector; and inducing a feature generator using the truncated vectors; wherein the feature generator is adapted to receive as input one or more words and to provide a list of weighted concepts that are most related to the one or more words provided as input.
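The steps listed in this abstract can be sketched as follows. This is an illustrative toy under stated assumptions, not the claimed method: the three-entry `ENCYCLOPEDIA`, the use of raw word counts as weights, the `keep` cutoff, and the function names are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy encyclopedia: concept -> document explaining the concept.
ENCYCLOPEDIA = {
    "Jaguar (animal)": "the jaguar is a large cat the cat hunts in the jungle",
    "Jaguar (car)": "jaguar is a car maker the car has a fast engine",
    "Engine": "an engine converts fuel into motion engine engine",
}

def build_truncated_vectors(encyclopedia, keep=2):
    """For each word, keep only the `keep` concepts where it is most frequent."""
    word_vectors = defaultdict(dict)
    for concept, document in encyclopedia.items():
        for word, freq in Counter(document.lower().split()).items():
            word_vectors[word][concept] = freq
    truncated = {}
    for word, vector in word_vectors.items():
        # Arrange by descending frequency (ties broken by concept name),
        # then truncate the rest of the vector.
        ranked = sorted(vector.items(), key=lambda item: (-item[1], item[0]))
        truncated[word] = ranked[:keep]
    return truncated

def make_feature_generator(truncated):
    """Return a function mapping input words to a list of weighted concepts."""
    def generate(text):
        scores = Counter()
        for word in text.lower().split():
            for concept, freq in truncated.get(word, []):
                scores[concept] += freq
        return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return generate
```

Calling `make_feature_generator(build_truncated_vectors(ENCYCLOPEDIA))` yields a generator that can be applied to a string such as "jaguar engine". A practical system would probably replace raw counts with a normalized weight, but the abstract speaks only of frequencies, so the sketch stays with counts.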
-
Publication No.: US20080004864A1
Publication date: 2008-01-03
Application No.: US11457341
Filing date: 2006-07-13
IPC classification: G06F17/27
CPC classification: G06F17/27 , G06F17/30616
Abstract: A method of providing weighted concepts related to a sequence of one or more words, including: providing on a computer an encyclopedia with concepts and a document explaining each concept; forming, for each word in the encyclopedia, a vector that contains the frequency of the word for each concept; arranging the vector according to the frequency of appearance of the word for each concept; selecting from the vector the concepts with the highest frequencies for each word; truncating the rest of the vector; and inducing a feature generator using the truncated vectors; wherein the feature generator is adapted to receive as input one or more words and to provide a list of weighted concepts that are most related to the one or more words provided as input.
-
Publication No.: US08868591B1
Publication date: 2014-10-21
Application No.: US13239236
Filing date: 2011-09-21
Applicant: Lev Finkelstein , Artiom Myaskouvskey , Shaul Markovitch , Tomer Shmiel , Eran Ofek , Isaac Elias
Inventor: Lev Finkelstein , Artiom Myaskouvskey , Shaul Markovitch , Tomer Shmiel , Eran Ofek , Isaac Elias
IPC classification: G06F17/30
CPC classification: G06F17/3064
Abstract: The present invention relates to the identification of alternative suggestions which potentially improve on a given query suggestion, without being perceived by a user as being offensively different from the user's query. The alternative suggestions may, for example, be different query formulations that relate to the same topic as the given query suggestion. The technology disclosed uses similarity screening of the given query suggestion against unique queries which do not include the given query suggestion as a prefix, in conjunction with query utility scores representing prior user response to the unique queries.
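The screening described in this abstract might look roughly like the sketch below. The abstract does not specify a similarity measure, thresholds, or how utility is compared, so the Jaccard token overlap, the `min_similarity` and `limit` parameters, the rule that an alternative must have a higher utility than the given suggestion, and the function names are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Token-set overlap between two queries, in the range 0.0 to 1.0."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0

def alternative_suggestions(suggestion, query_utility, min_similarity=0.5, limit=3):
    """Pick similar, higher-utility queries that are not plain completions.

    query_utility maps each unique query to a score standing in for prior
    user response to that query (hypothetical scale).
    """
    base_utility = query_utility.get(suggestion, 0.0)
    candidates = []
    for query, utility in query_utility.items():
        if query.startswith(suggestion):
            continue  # has the suggestion as a prefix (or is the suggestion itself)
        if jaccard(suggestion, query) < min_similarity:
            continue  # too different from what the user appears to want
        if utility > base_utility:
            candidates.append((query, utility))
    # Highest utility first; ties broken alphabetically for determinism.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [query for query, _ in candidates[:limit]]
```

The prefix check is what separates these alternatives from ordinary autocomplete results, which by construction extend the text the user has already typed.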
-