-
Publication No.: US09081852B2
Publication date: 2015-07-14
Application No.: US12243050
Filing date: 2008-10-01
Applicant: David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich, Albert Reinhardt, Yannis Labrou
Inventor: David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich, Albert Reinhardt, Yannis Labrou
CPC classification: G06F17/30672, G06F17/30616, G06F17/30646, G06F17/30864
Abstract: In one embodiment, a set of target search terms for a search is received. Candidate terms are selected, where a candidate term is selected to reduce an ontology space of the search. The candidate terms are sent to a computer to recommend the candidate terms as search terms. In another embodiment, a document stored in one or more tangible media is accessed. A set of target tags for the document is received. Terms are selected, where a term is selected to reduce an ontology space of the document. The terms are sent to a computer to recommend the terms as tags.
-
Publication No.: US20090094020A1
Publication date: 2009-04-09
Application No.: US12243050
Filing date: 2008-10-01
Applicant: David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich, Albert Reinhardt, Yannis Labrou
Inventor: David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich, Albert Reinhardt, Yannis Labrou
CPC classification: G06F17/30672, G06F17/30616, G06F17/30646, G06F17/30864
Abstract: In one embodiment, a set of target search terms for a search is received. Candidate terms are selected, where a candidate term is selected to reduce an ontology space of the search. The candidate terms are sent to a computer to recommend the candidate terms as search terms. In another embodiment, a document stored in one or more tangible media is accessed. A set of target tags for the document is received. Terms are selected, where a term is selected to reduce an ontology space of the document. The terms are sent to a computer to recommend the terms as tags.
-
Publication No.: US20090094231A1
Publication date: 2009-04-09
Application No.: US12242984
Filing date: 2008-10-01
Applicant: David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich, Yannis Labrou
Inventor: David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich, Yannis Labrou
CPC classification: G06F17/30616, G06F17/218, G06F17/2735, G06F17/277
Abstract: In one embodiment, assigning tags to a document includes accessing the document, where the document comprises text units that include words. The following is performed for each text unit: a subset of words of a text unit is selected as candidate tags, relatedness is established among the candidate tags, and certain candidate tags are selected according to the established relatedness to yield a candidate tag set for the text unit. Relatedness between the candidate tags of each candidate tag set and the candidate tags of other candidate tag sets is determined. At least one candidate tag is assigned to the document according to the determined relatedness.
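The tagging flow in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how relatedness is measured or how many tags are kept, so everything concrete here is an assumption: the symmetric lookup `table`, the stopword filter used to pick candidate words, the mean-relatedness ranking inside a text unit, and the `keep` and `n_tags` parameters are all hypothetical.

```python
def relatedness(a, b, table):
    """Look up a symmetric relatedness score (assumed given; unknown pairs score 0)."""
    if a == b:
        return 1.0
    return table.get(frozenset((a, b)), 0.0)


def unit_tag_set(words, table, stopwords, keep=2):
    """Pick the `keep` candidate tags most related to the other candidates in one text unit."""
    candidates = sorted({w for w in words if w not in stopwords})

    def mean_rel(word):
        others = [c for c in candidates if c != word]
        if not others:
            return 0.0
        return sum(relatedness(word, c, table) for c in others) / len(others)

    ranked = sorted(candidates, key=lambda w: (-mean_rel(w), w))
    return ranked[:keep]


def assign_tags(text_units, table, stopwords=frozenset(), keep=2, n_tags=1):
    """Score each candidate tag against the tag sets of the other text units."""
    tag_sets = [unit_tag_set(unit, table, stopwords, keep) for unit in text_units]
    scores = {}
    for i, tags in enumerate(tag_sets):
        for tag in tags:
            cross = sum(
                relatedness(tag, other, table)
                for j, other_tags in enumerate(tag_sets) if j != i
                for other in other_tags
            )
            scores[tag] = scores.get(tag, 0.0) + cross
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:n_tags]


# Illustrative data only; the scores are made up.
table = {
    frozenset(("tree", "forest")): 0.9,
    frozenset(("forest", "leaf")): 0.8,
    frozenset(("tree", "leaf")): 0.7,
    frozenset(("river", "bank")): 0.3,
}
units = [["the", "tree", "forest", "river"], ["the", "forest", "leaf", "bank"]]
print(assign_tags(units, table, stopwords={"the"}))
```

Ties are broken alphabetically so that the result is deterministic; a real system would likely use a corpus-derived relatedness measure instead of a hand-written table.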
-
Publication No.: US20090094233A1
Publication date: 2009-04-09
Application No.: US12243267
Filing date: 2008-10-01
Applicant: David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich, Yannis Labrou
Inventor: David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich, Yannis Labrou
IPC classification: G06F17/30
CPC classification: G06F17/3071, G06F17/30616
Abstract: In one embodiment, modeling topics includes accessing a corpus comprising documents that include words. Words of a document are selected as keywords of the document. The documents are clustered according to the keywords to yield clusters, where each cluster corresponds to a topic. A statistical distribution is generated for a cluster from words of the documents of the cluster. A topic is modeled using the statistical distribution generated for the cluster corresponding to the topic.
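As a rough illustration of the steps in this abstract (keyword selection, clustering by keywords, one word distribution per cluster), here is a small Python sketch. The abstract does not specify how keywords are chosen or how clustering is done, so the choices below are assumptions: keywords are the `k` most frequent words of a document, clustering is a greedy grouping of documents that share at least one keyword, and the distribution is the relative word frequency over a cluster. All function names are hypothetical.

```python
from collections import Counter


def keywords(doc_words, k=2):
    """Assumed keyword rule: the k most frequent words, ties broken alphabetically."""
    counts = Counter(doc_words)
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return set(ranked[:k])


def cluster_by_keywords(docs, k=2):
    """Greedy clustering: a document joins the first cluster sharing a keyword with it."""
    clusters = []  # list of (keyword set, list of document indices)
    for index, words in enumerate(docs):
        doc_keywords = keywords(words, k)
        for cluster_keywords, members in clusters:
            if doc_keywords & cluster_keywords:
                members.append(index)
                cluster_keywords.update(doc_keywords)
                break
        else:
            clusters.append((doc_keywords, [index]))
    return [members for _, members in clusters]


def topic_distribution(docs, members):
    """Relative frequency of every word across the documents of one cluster."""
    counts = Counter(word for index in members for word in docs[index])
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


# Illustrative corpus only.
docs = [
    ["engine", "car", "engine", "road"],
    ["car", "car", "wheel"],
    ["piano", "music", "piano"],
    ["music", "music", "violin"],
]
for members in cluster_by_keywords(docs):
    print(members, topic_distribution(docs, members))
```

Each resulting distribution stands in for one topic. Greedy grouping is order-dependent and is used here only because it keeps the sketch short.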
-
Publication No.: US09317593B2
Publication date: 2016-04-19
Application No.: US12243267
Filing date: 2008-10-01
Applicant: David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich, Yannis Labrou
Inventor: David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich, Yannis Labrou
IPC classification: G06F17/30
CPC classification: G06F17/3071, G06F17/30616
Abstract: In one embodiment, modeling topics includes accessing a corpus comprising documents that include words. Words of a document are selected as keywords of the document. The documents are clustered according to the keywords to yield clusters, where each cluster corresponds to a topic. A statistical distribution is generated for a cluster from words of the documents of the cluster. A topic is modeled using the statistical distribution generated for the cluster corresponding to the topic.
-
Publication No.: US08280892B2
Publication date: 2012-10-02
Application No.: US12242984
Filing date: 2008-10-01
Applicant: David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich, Yannis Labrou
Inventor: David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich, Yannis Labrou
IPC classification: G06F17/30
CPC classification: G06F17/30616, G06F17/218, G06F17/2735, G06F17/277
Abstract: In one embodiment, assigning tags to a document includes accessing the document, where the document comprises text units that include words. The following is performed for each text unit: a subset of words of a text unit is selected as candidate tags, relatedness is established among the candidate tags, and certain candidate tags are selected according to the established relatedness to yield a candidate tag set for the text unit. Relatedness between the candidate tags of each candidate tag set and the candidate tags of other candidate tag sets is determined. At least one candidate tag is assigned to the document according to the determined relatedness.
-
Publication No.: US20090204609A1
Publication date: 2009-08-13
Application No.: US12368689
Filing date: 2009-02-10
CPC classification: G06F17/3064
Abstract: In one embodiment, display of a user entry window of a graphical user interface is initiated. Search terms entered into the user entry window to initiate a first search are received. One or more first search results from a corpus of documents are determined according to the search terms. Display of the search terms at a current search terms window of the graphical user interface is initiated. Display of the first search results at a search results window of the graphical user interface is initiated. Display of the first search suggestions at a search suggestion window of the graphical user interface is initiated.
-
Publication No.: US08280886B2
Publication date: 2012-10-02
Application No.: US12368689
Filing date: 2009-02-10
IPC classification: G06F7/00
CPC classification: G06F17/3064
Abstract: A predetermined number of temporary terms are obtained that have the highest differential affinity to each of a number of candidate terms. Each temporary term and the associated differential affinity is placed into a set of temporary terms. An average differential affinity is calculated for each temporary term of the set of temporary terms, the average differential affinity representing an average of differential affinities from the each temporary term to every term of the initial set of terms. One or more terms with an average differential affinity that fails to satisfy a predetermined threshold are removed from the temporary set. One or more terms of the temporary set with differential affinities above the threshold are placed into the set of candidate terms. One or more terms of the set of candidate terms are selected and output to a user.
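The expansion procedure described in this abstract can be sketched in Python as follows. The abstract does not define how differential affinity is computed, so the sketch takes the scores as given input: `diff_aff` is a hypothetical dictionary mapping an ordered pair `(from_term, to_term)` to a score, with missing pairs treated as 0. The function name and the `per_term` and `threshold` parameters are likewise illustrative assumptions.

```python
def expand_candidates(initial_terms, diff_aff, per_term=2, threshold=0.3):
    """Grow a candidate term set using precomputed differential affinities."""

    def da(from_term, to_term):
        return diff_aff.get((from_term, to_term), 0.0)

    vocabulary = {term for pair in diff_aff for term in pair}
    candidates = list(initial_terms)

    # Steps 1-2: for each candidate, collect the `per_term` terms with the
    # highest differential affinity to it, together with that affinity.
    temporary = {}
    for cand in initial_terms:
        others = [t for t in vocabulary if t not in initial_terms]
        others.sort(key=lambda t: (-da(t, cand), t))
        for term in others[:per_term]:
            temporary[term] = max(da(term, cand), temporary.get(term, float("-inf")))

    # Steps 3-5: average each temporary term's affinity to every initial term,
    # drop terms below the threshold, and promote the rest to candidates.
    for term in sorted(temporary):
        average = sum(da(term, init) for init in initial_terms) / len(initial_terms)
        if average >= threshold:
            candidates.append(term)
    return candidates


# Illustrative scores only.
diff_aff = {
    ("car", "jaguar"): 0.8,
    ("car", "speed"): 0.6,
    ("cat", "jaguar"): 0.7,
    ("cat", "speed"): 0.1,
    ("banana", "jaguar"): 0.05,
}
print(expand_candidates(["jaguar", "speed"], diff_aff, per_term=2, threshold=0.5))
```

In this example "cat" is strongly tied to "jaguar" but not to "speed", so its average falls below the threshold and it is dropped, while "car" is promoted.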
-
Publication No.: US08171029B2
Publication date: 2012-05-01
Application No.: US12242950
Filing date: 2008-10-01
IPC classification: G06F17/30
CPC classification: G06F17/30616, G06F17/2735, G06F17/277, G06F17/30622, G06F17/30734
Abstract: In one embodiment, generating an ontology includes accessing an inverted index that comprises inverted index lists for words of a language. An inverted index list corresponding to a word indicates pages that include the word. A word pair comprises a first word and a second word. A first inverted index list and a second inverted index list are searched, where the first inverted index list corresponds to the first word and the second inverted index list corresponds to the second word. An affinity between the first word and the second word is calculated according to the first inverted index list and the second inverted index list. The affinity describes a quantitative relationship between the first word and the second word. The affinity is recorded in an affinity matrix, and the affinity matrix is reported.
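To make the inverted-index step concrete, the following Python sketch computes a pairwise affinity matrix from inverted index lists. The abstract does not state the affinity formula, so the measure used here is an assumption: the number of pages containing both words divided by the number of pages containing either word. The function names and the dictionary-of-dictionaries matrix layout are also hypothetical.

```python
def affinity(list_a, list_b):
    """Assumed measure: pages containing both words / pages containing either word."""
    pages_a, pages_b = set(list_a), set(list_b)
    either = pages_a | pages_b
    if not either:
        return 0.0
    return len(pages_a & pages_b) / len(either)


def affinity_matrix(inverted_index):
    """Record the affinity of every word pair; inverted_index maps word -> page ids."""
    words = sorted(inverted_index)
    return {
        first: {
            second: affinity(inverted_index[first], inverted_index[second])
            for second in words
        }
        for first in words
    }


# Illustrative index only: each word maps to the ids of the pages that contain it.
index = {"dog": [1, 2, 3, 4], "cat": [3, 4, 5], "tree": [6]}
matrix = affinity_matrix(index)
for word, row in matrix.items():
    print(word, row)
```

Because this particular measure is symmetric, only half of the matrix would need to be computed in practice; the full matrix is built here for readability.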
-
Publication No.: US20090094262A1
Publication date: 2009-04-09
Application No.: US12242950
Filing date: 2008-10-01
IPC classification: G06F17/00
CPC classification: G06F17/30616, G06F17/2735, G06F17/277, G06F17/30622, G06F17/30734
Abstract: In one embodiment, generating an ontology includes accessing an inverted index that comprises inverted index lists for words of a language. An inverted index list corresponding to a word indicates pages that include the word. A word pair comprises a first word and a second word. A first inverted index list and a second inverted index list are searched, where the first inverted index list corresponds to the first word and the second inverted index list corresponds to the second word. An affinity between the first word and the second word is calculated according to the first inverted index list and the second inverted index list. The affinity describes a quantitative relationship between the first word and the second word. The affinity is recorded in an affinity matrix, and the affinity matrix is reported.