-
Publication No.: US09317593B2
Publication date: 2016-04-19
Application No.: US12243267
Filing date: 2008-10-01
Applicant(s): David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich, Yannis Labrou
Inventor(s): David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich, Yannis Labrou
IPC classification: G06F17/30
CPC classification: G06F17/3071, G06F17/30616
Abstract: In one embodiment, modeling topics includes accessing a corpus comprising documents that include words. Words of a document are selected as keywords of the document. The documents are clustered according to the keywords to yield clusters, where each cluster corresponds to a topic. A statistical distribution is generated for a cluster from words of the documents of the cluster. A topic is modeled using the statistical distribution generated for the cluster corresponding to the topic.
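Illustrative sketch (not part of the patent record): the abstract above names three steps, namely keyword selection, clustering by keywords, and a per-cluster word distribution. The Python below is one minimal way to chain them. The abstract does not say how keywords are chosen or how documents are clustered, so the frequency-based `keywords` function, the greedy shared-keyword rule in `cluster_by_keywords`, and the sample documents are all assumptions made for illustration.

```python
from collections import Counter


def keywords(doc, k=2):
    """Assumed rule: the k most frequent words (ties broken alphabetically)."""
    counts = Counter(doc.lower().split())
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return set(ranked[:k])


def cluster_by_keywords(docs, k=2):
    """Assumed rule: a document joins the first cluster sharing a keyword with it."""
    clusters = []  # each entry: [keyword set of the cluster, member indices]
    for i, doc in enumerate(docs):
        kw = keywords(doc, k)
        for entry in clusters:
            if entry[0] & kw:
                entry[0] |= kw
                entry[1].append(i)
                break
        else:
            clusters.append([set(kw), [i]])
    return [members for _, members in clusters]


def topic_distribution(docs, members):
    """Relative word frequencies over all documents of one cluster."""
    counts = Counter(w for i in members for w in docs[i].lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


docs = [
    "cat cat dog pet",
    "dog dog cat pet",
    "stock stock bond market",
    "bond bond stock market",
]
clusters = cluster_by_keywords(docs)
topics = [topic_distribution(docs, m) for m in clusters]
```

Each entry of `topics` is a word-to-probability mapping for one cluster, which is the sense in which a topic is "modeled" by a distribution here.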
-
Publication No.: US08332439B2
Publication date: 2012-12-11
Application No.: US12242965
Filing date: 2008-10-01
CPC classification: G06F17/30616, G06F17/30675, G06F17/3071
Abstract: In certain embodiments, generating a hierarchy of terms includes accessing a corpus comprising terms. The following is performed for one or more terms to yield parent-child relationships: one or more parent terms of a term are identified according to directional affinity; and one or more parent-child relationships are established from the parent terms and each term. A hierarchical graph is automatically generated from the parent-child relationships.
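Illustrative sketch (not part of the patent record): the abstract above does not define directional affinity. The Python below assumes one plausible reading, namely the share of documents containing term `a` that also contain term `b`, and treats `p` as a parent of `t` when `t` strongly implies `p` but not the reverse. The function names, the `threshold` value, and the sample corpus are hypothetical.

```python
from itertools import permutations


def directional_affinity(docs, a, b):
    """Assumed definition: fraction of documents containing a that also contain b."""
    with_a = [d for d in docs if a in d]
    if not with_a:
        return 0.0
    return sum(1 for d in with_a if b in d) / len(with_a)


def build_hierarchy(docs, threshold=0.8):
    """Return {term: [parent terms]} from asymmetric affinities."""
    terms = sorted(set().union(*docs))
    parents = {t: [] for t in terms}
    for t, p in permutations(terms, 2):
        forward = directional_affinity(docs, t, p)
        backward = directional_affinity(docs, p, t)
        if forward >= threshold and backward < threshold:
            parents[t].append(p)  # edge child t -> parent p
    return parents


# Each document is modeled as the set of terms it contains.
docs = [
    {"animal", "dog", "poodle"},
    {"animal", "dog"},
    {"animal", "cat"},
    {"animal"},
]
hierarchy = build_hierarchy(docs)
```

The resulting mapping is an adjacency list of child-to-parent edges, which can be read as the hierarchical graph the abstract mentions.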
-
3.
Publication No.: US08280892B2
Publication date: 2012-10-02
Application No.: US12242984
Filing date: 2008-10-01
Applicant(s): David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich, Yannis Labrou
Inventor(s): David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich, Yannis Labrou
IPC classification: G06F17/30
CPC classification: G06F17/30616, G06F17/218, G06F17/2735, G06F17/277
Abstract: In one embodiment, assigning tags to a document includes accessing the document, where the document comprises text units that include words. The following is performed for each text unit: a subset of words of a text unit is selected as candidate tags, relatedness is established among the candidate tags, and certain candidate tags are selected according to the established relatedness to yield a candidate tag set for the text unit. Relatedness between the candidate tags of each candidate tag set and the candidate tags of other candidate tag sets is determined. At least one candidate tag is assigned to the document according to the determined relatedness.
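Illustrative sketch (not part of the patent record): the abstract above describes a two-level procedure, first building a candidate tag set per text unit and then scoring candidates against the other units' sets. The Python below follows that outline under stated assumptions: the `RELATED` table of pairwise scores, the `min_score` cut-off, the additive scoring rule, and the sample text are all hypothetical, since the abstract does not say how relatedness is measured.

```python
RELATED = {  # hypothetical symmetric relatedness scores in [0, 1]
    frozenset({"engine", "car"}): 0.9,
    frozenset({"engine", "fuel"}): 0.8,
    frozenset({"car", "fuel"}): 0.7,
    frozenset({"car", "road"}): 0.6,
}


def related(a, b):
    """Look up the assumed relatedness of two words (0.0 if unknown)."""
    return RELATED.get(frozenset({a, b}), 0.0)


def candidate_tag_set(unit, vocabulary, min_score=0.5):
    """Keep candidates of one text unit that relate to another candidate of it."""
    candidates = sorted({w for w in unit.lower().split() if w in vocabulary})
    return {
        c for c in candidates
        if any(related(c, o) >= min_score for o in candidates if o != c)
    }


def assign_tags(units, vocabulary, n_tags=1):
    """Score each candidate against the candidate sets of the other units."""
    tag_sets = [candidate_tag_set(u, vocabulary) for u in units]
    scores = {}
    for i, tags in enumerate(tag_sets):
        others = set().union(*(s for j, s in enumerate(tag_sets) if j != i))
        for tag in tags:
            gain = sum(related(tag, o) for o in others if o != tag)
            scores[tag] = scores.get(tag, 0.0) + gain
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:n_tags]


units = [
    "the car has a new engine and a banana",
    "fuel for the engine",
    "the car left the road",
]
vocabulary = {"car", "engine", "fuel", "road", "banana"}
tags = assign_tags(units, vocabulary)
```

In this toy input, "banana" is dropped at the per-unit stage because it relates to no other candidate in its unit, while "car" wins overall because it relates to candidates in every other unit.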
-
Publication No.: US20090094020A1
Publication date: 2009-04-09
Application No.: US12243050
Filing date: 2008-10-01
Applicant(s): David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich, Albert Reinhardt, Yannis Labrou
Inventor(s): David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich, Albert Reinhardt, Yannis Labrou
CPC classification: G06F17/30672, G06F17/30616, G06F17/30646, G06F17/30864
Abstract: In one embodiment, a set of target search terms for a search is received. Candidate terms are selected, where a candidate term is selected to reduce an ontology space of the search. The candidate terms are sent to a computer to recommend the candidate terms as search terms. In another embodiment, a document stored in one or more tangible media is accessed. A set of target tags for the document is received. Terms are selected, where a term is selected to reduce an ontology space of the document. The terms are sent to a computer to recommend the terms as tags.
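Illustrative sketch (not part of the patent record): the abstract above does not define "ontology space". The Python below assumes it is the set of concepts still compatible with every chosen term, and recommends the unused terms that shrink that set the most without emptying it. The `index` mapping, the function names, and the sample terms are hypothetical.

```python
def ontology_space(index, terms):
    """Concepts compatible with every term; index maps term -> set of concepts."""
    sets = [index[t] for t in terms]
    if not sets:
        return set().union(*index.values())
    return set.intersection(*sets)


def recommend(index, target_terms, n=2):
    """Rank unused terms by the size of the space left after adding them."""
    current = ontology_space(index, target_terms)
    scored = []
    for term, concepts in index.items():
        if term in target_terms:
            continue
        remaining = current & concepts
        # Keep only terms that narrow the space but leave something to find.
        if remaining and len(remaining) < len(current):
            scored.append((len(remaining), term))
    return [term for _, term in sorted(scored)[:n]]


index = {
    "jaguar": {"animal", "car", "os"},
    "speed": {"animal", "car"},
    "engine": {"car"},
    "apple": {"os", "fruit"},
}
suggestions = recommend(index, ["jaguar"])
```

The same routine could be applied to tag recommendation by treating a document's existing tags as `target_terms`, which mirrors the second embodiment in the abstract.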
-
5.
Publication No.: US20090094231A1
Publication date: 2009-04-09
Application No.: US12242984
Filing date: 2008-10-01
Applicant(s): David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich, Yannis Labrou
Inventor(s): David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich, Yannis Labrou
CPC classification: G06F17/30616, G06F17/218, G06F17/2735, G06F17/277
Abstract: In one embodiment, assigning tags to a document includes accessing the document, where the document comprises text units that include words. The following is performed for each text unit: a subset of words of a text unit is selected as candidate tags, relatedness is established among the candidate tags, and certain candidate tags are selected according to the established relatedness to yield a candidate tag set for the text unit. Relatedness between the candidate tags of each candidate tag set and the candidate tags of other candidate tag sets is determined. At least one candidate tag is assigned to the document according to the determined relatedness.
-
Publication No.: US08108392B2
Publication date: 2012-01-31
Application No.: US12242957
Filing date: 2008-10-01
Applicant(s): David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich
Inventor(s): David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich
IPC classification: G06F17/30
CPC classification: G06F17/30734
Abstract: In one embodiment, identifying clusters of words includes accessing a record that records affinities. An affinity between a first and second word describes a quantitative relationship between the first and second word. Clusters of words are identified according to the affinities. A cluster comprises words that are sufficiently affine with each other. A first word is sufficiently affine with a second word if the affinity between the first and second word satisfies one or more affinity criteria. A clustering analysis is performed using the clusters.
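Illustrative sketch (not part of the patent record): the abstract above says a cluster holds words that are "sufficiently affine with each other". The Python below reads that as a pairwise requirement and uses a simple greedy pass: a word joins the first cluster whose every member meets the affinity criterion. The single `threshold` criterion, the `record` layout, and the sample affinities are assumptions, since the abstract leaves the criteria open.

```python
def affinity(record, a, b):
    """Look up the recorded affinity of an unordered word pair (0.0 if absent)."""
    return record.get(frozenset({a, b}), 0.0)


def cluster_words(words, record, threshold=0.5):
    """Greedy clustering: join the first cluster whose members are all affine."""
    clusters = []
    for w in words:
        for cluster in clusters:
            if all(affinity(record, w, m) >= threshold for m in cluster):
                cluster.append(w)
                break
        else:
            clusters.append([w])  # no compatible cluster, so start a new one
    return clusters


record = {
    frozenset({"kidney", "renal"}): 0.9,
    frozenset({"kidney", "dialysis"}): 0.7,
    frozenset({"renal", "dialysis"}): 0.6,
    frozenset({"kidney", "guitar"}): 0.1,
    frozenset({"guitar", "chord"}): 0.8,
}
clusters = cluster_words(
    ["kidney", "renal", "dialysis", "guitar", "chord"], record
)
```

Because the pass is greedy, the result can depend on the order of `words`; an exhaustive clique search would avoid that at a higher cost.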
-
Publication No.: US20090094207A1
Publication date: 2009-04-09
Application No.: US12242957
Filing date: 2008-10-01
Applicant(s): David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich
Inventor(s): David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich
IPC classification: G06F17/30
CPC classification: G06F17/30734
Abstract: In one embodiment, identifying clusters of words includes accessing a record that records affinities. An affinity between a first and second word describes a quantitative relationship between the first and second word. Clusters of words are identified according to the affinities. A cluster comprises words that are sufficiently affine with each other. A first word is sufficiently affine with a second word if the affinity between the first and second word satisfies one or more affinity criteria. A clustering analysis is performed using the clusters.
-
Publication No.: US20140067801A1
Publication date: 2014-03-06
Application No.: US13601706
Filing date: 2012-08-31
Applicant(s): David L. MARVIT, Jawahar JAIN, Ajay CHANDER, Alex GILMAN
Inventor(s): David L. MARVIT, Jawahar JAIN, Ajay CHANDER, Alex GILMAN
IPC classification: G06F17/30
CPC classification: G06F16/29
Abstract: A method of geotagging based on specified criteria is described. The method may include analyzing a data stream indicating a variable parameter associated with an object to determine data within the data stream satisfying a specified criteria. The method may also include obtaining geospatial information for the object or another object corresponding to a time the data was generated. Relevant data collected at the time the data satisfies the specified criteria may be tagged with the geospatial information. Related systems are also described.
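Illustrative sketch (not part of the patent record): the abstract above describes filtering a data stream by a criterion and tagging the matching samples with location data for the same moment. The Python below assumes the stream is a list of `(time, value)` pairs, the geospatial information is a time-sorted track of `(time, lat, lon)` fixes, and the location for a sample is the latest fix at or before it. The heart-rate interpretation, the threshold of 150, and all coordinates are hypothetical.

```python
from bisect import bisect_right


def position_at(track, t):
    """Latest (lat, lon) fix at or before time t; track must be sorted by time."""
    times = [fix[0] for fix in track]
    i = bisect_right(times, t)
    return track[i - 1][1:] if i else None


def geotag(stream, track, criterion):
    """Tag every sample that satisfies the criterion with its location."""
    return [
        {"time": t, "value": v, "location": position_at(track, t)}
        for t, v in stream
        if criterion(v)
    ]


# Hypothetical samples, e.g. heart rate readings taken every 10 seconds.
stream = [(0, 72), (10, 95), (20, 160), (30, 88), (40, 171)]
track = [(0, 37.77, -122.42), (15, 37.78, -122.41), (35, 37.80, -122.40)]
tagged = geotag(stream, track, lambda v: v > 150)
```

Passing the criterion as a function keeps the filter separate from the location lookup, so a different parameter or threshold can be swapped in without touching `position_at`.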
-
Publication No.: US20090094233A1
Publication date: 2009-04-09
Application No.: US12243267
Filing date: 2008-10-01
Applicant(s): David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich, Yannis Labrou
Inventor(s): David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich, Yannis Labrou
IPC classification: G06F17/30
CPC classification: G06F17/3071, G06F17/30616
Abstract: In one embodiment, modeling topics includes accessing a corpus comprising documents that include words. Words of a document are selected as keywords of the document. The documents are clustered according to the keywords to yield clusters, where each cluster corresponds to a topic. A statistical distribution is generated for a cluster from words of the documents of the cluster. A topic is modeled using the statistical distribution generated for the cluster corresponding to the topic.
-
Publication No.: US09081852B2
Publication date: 2015-07-14
Application No.: US12243050
Filing date: 2008-10-01
Applicant(s): David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich, Albert Reinhardt, Yannis Labrou
Inventor(s): David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich, Albert Reinhardt, Yannis Labrou
CPC classification: G06F17/30672, G06F17/30616, G06F17/30646, G06F17/30864
Abstract: In one embodiment, a set of target search terms for a search is received. Candidate terms are selected, where a candidate term is selected to reduce an ontology space of the search. The candidate terms are sent to a computer to recommend the candidate terms as search terms. In another embodiment, a document stored in one or more tangible media is accessed. A set of target tags for the document is received. Terms are selected, where a term is selected to reduce an ontology space of the document. The terms are sent to a computer to recommend the terms as tags.