Selecting Tags For A Document By Analyzing Paragraphs Of The Document
    3.
    Patent application
    Status: in force

    Publication No.: US20090094231A1

    Publication date: 2009-04-09

    Application No.: US12242984

    Filing date: 2008-10-01

    IPC classification: G06F17/30 G06F17/27

    Abstract: In one embodiment, assigning tags to a document includes accessing the document, where the document comprises text units that include words. The following is performed for each text unit: a subset of words of a text unit is selected as candidate tags, relatedness is established among the candidate tags, and certain candidate tags are selected according to the established relatedness to yield a candidate tag set for the text unit. Relatedness between the candidate tags of each candidate tag set and the candidate tags of other candidate tag sets is determined. At least one candidate tag is assigned to the document according to the determined relatedness.

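The per-text-unit procedure in the abstract above can be illustrated with a minimal Python sketch. The abstract does not say how candidate words are chosen or how relatedness is measured, so everything specific here is an assumption made for illustration: paragraphs separated by blank lines serve as text units, the most frequent non-stopwords serve as candidate tags, and relatedness is the Jaccard overlap of the text units in which two words occur. The names `tokenize`, `relatedness`, `candidate_tag_set`, and `select_tags` are hypothetical and do not come from the patent.

```python
import re
from collections import Counter

# Assumed stopword list; the abstract does not specify one.
STOPWORDS = {"a", "an", "and", "are", "for", "in", "is", "of", "on", "the", "to", "with"}

def tokenize(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def relatedness(a, b, unit_words):
    """Assumed measure: Jaccard overlap of the text units containing a and b."""
    units_a = {i for i, words in enumerate(unit_words) if a in words}
    units_b = {i for i, words in enumerate(unit_words) if b in words}
    union = units_a | units_b
    return len(units_a & units_b) / len(union) if union else 0.0

def candidate_tag_set(tokens, unit_words, n_candidates=5, n_keep=3):
    """Pick frequent words as candidates, then keep the most mutually related."""
    counts = Counter(tokens)
    candidates = sorted(counts, key=lambda w: (-counts[w], w))[:n_candidates]

    def score(word):
        # Mean relatedness of this candidate to the other candidates.
        others = [c for c in candidates if c != word]
        if not others:
            return 0.0
        return sum(relatedness(word, c, unit_words) for c in others) / len(others)

    return sorted(candidates, key=lambda w: (-score(w), w))[:n_keep]

def select_tags(document, n_tags=2):
    """Build one candidate tag set per text unit, then rank tags across sets."""
    units = [p for p in document.split("\n\n") if p.strip()]
    unit_tokens = [tokenize(u) for u in units]
    unit_words = [set(t) for t in unit_tokens]
    tag_sets = [candidate_tag_set(t, unit_words) for t in unit_tokens]

    totals = Counter()
    for i, tags in enumerate(tag_sets):
        for tag in tags:
            totals[tag] += 0.0  # make sure every candidate gets an entry
            for j, other in enumerate(tag_sets):
                if i != j:
                    # Relatedness of this tag to the candidates of another set.
                    totals[tag] += sum(relatedness(tag, o, unit_words) for o in other)
    return sorted(totals, key=lambda t: (-totals[t], t))[:n_tags]
```

Calling `select_tags(text, n_tags=3)` on a document whose paragraphs are separated by blank lines returns up to three words, favoring words whose candidate sets agree across paragraphs. Ties are broken alphabetically, which is a choice of this sketch only.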

    Selecting tags for a document by analyzing paragraphs of the document
    6.
    Granted patent
    Status: in force

    Publication No.: US08280892B2

    Publication date: 2012-10-02

    Application No.: US12242984

    Filing date: 2008-10-01

    IPC classification: G06F17/30

    Abstract: In one embodiment, assigning tags to a document includes accessing the document, where the document comprises text units that include words. The following is performed for each text unit: a subset of words of a text unit is selected as candidate tags, relatedness is established among the candidate tags, and certain candidate tags are selected according to the established relatedness to yield a candidate tag set for the text unit. Relatedness between the candidate tags of each candidate tag set and the candidate tags of other candidate tag sets is determined. At least one candidate tag is assigned to the document according to the determined relatedness.


    Determining Words Related To A Given Set Of Words
    7.
    Patent application
    Status: in force

    Publication No.: US20090204609A1

    Publication date: 2009-08-13

    Application No.: US12368689

    Filing date: 2009-02-10

    IPC classification: G06F17/30 G06F7/10

    CPC classification: G06F17/3064

    Abstract: In one embodiment, display of a user entry window of a graphical user interface is initiated. Search terms entered into the user entry window to initiate a first search are received. One or more first search results from a corpus of documents are determined according to the search terms. Display of the search terms at a current search terms window of the graphical user interface is initiated. Display of the first search results at a search results window of the graphical user interface is initiated. Display of the first search suggestions at a search suggestion window of the graphical user interface is initiated.


    Determining candidate terms related to terms of a query
    8.
    Granted patent
    Status: in force

    Publication No.: US08280886B2

    Publication date: 2012-10-02

    Application No.: US12368689

    Filing date: 2009-02-10

    IPC classification: G06F7/00

    CPC classification: G06F17/3064

    Abstract: A predetermined number of temporary terms are obtained that have the highest differential affinity to each of a number of candidate terms. Each temporary term and the associated differential affinity is placed into a set of temporary terms. An average differential affinity is calculated for each temporary term of the set of temporary terms, the average differential affinity representing an average of differential affinities from each temporary term to every term of the initial set of terms. One or more terms with an average differential affinity that fails to satisfy a predetermined threshold are removed from the temporary set. One or more terms of the temporary set with differential affinities above the threshold are placed into the set of candidate terms. One or more terms of the set of candidate terms are selected and output to a user.

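The filtering steps in the abstract above can be sketched in Python as follows. The abstract does not define differential affinity, so this sketch takes it as a precomputed input: `diff_aff[a][b]` is assumed to hold the differential affinity from term `a` to term `b`. The function name `expand_terms`, the parameters `k` and `threshold`, the exclusion of initial terms from the temporary set, and the use of `>=` for "satisfies the threshold" are all illustrative assumptions, not details taken from the patent.

```python
def expand_terms(initial_terms, diff_aff, k=2, threshold=0.5):
    """Return related terms ranked by average differential affinity.

    diff_aff[a][b] is an assumed, precomputed differential affinity
    from term a to term b; missing entries count as 0.0.
    """
    initial = list(initial_terms)
    if not initial:
        return []

    def aff(a, b):
        return diff_aff.get(a, {}).get(b, 0.0)

    # Step 1: for each initial term, collect the k terms with the
    # highest differential affinity to it (initial terms excluded).
    temporary = set()
    for target in initial:
        others = [t for t in diff_aff if t not in initial]
        ranked = sorted(others, key=lambda t: (-aff(t, target), t))
        temporary.update(ranked[:k])

    # Step 2: average differential affinity from each temporary term
    # to every term of the initial set.
    averages = {
        t: sum(aff(t, target) for target in initial) / len(initial)
        for t in temporary
    }

    # Step 3: drop temporary terms whose average misses the threshold,
    # then rank the survivors, highest average first.
    kept = [t for t in temporary if averages[t] >= threshold]
    return sorted(kept, key=lambda t: (-averages[t], t))
```

With initial terms `["tree", "leaf"]`, a term that scores highly against only one of them is collected in step 1 but can still be removed in step 3, because its average over both initial terms falls below the threshold.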

    Automatic generation of ontologies using word affinities
    9.
    Granted patent
    Status: in force

    Publication No.: US08171029B2

    Publication date: 2012-05-01

    Application No.: US12242950

    Filing date: 2008-10-01

    IPC classification: G06F17/30

    Abstract: In one embodiment, generating an ontology includes accessing an inverted index that comprises inverted index lists for words of a language. An inverted index list corresponding to a word indicates pages that include the word. A word pair comprises a first word and a second word. A first inverted index list and a second inverted index list are searched, where the first inverted index list corresponds to the first word and the second inverted index list corresponds to the second word. An affinity between the first word and the second word is calculated according to the first inverted index list and the second inverted index list. The affinity describes a quantitative relationship between the first word and the second word. The affinity is recorded in an affinity matrix, and the affinity matrix is reported.

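As a rough illustration of the abstract above, the following Python sketch builds an inverted index from a few pages and fills an affinity matrix from pairs of inverted index lists. The abstract does not give the affinity formula, so the one used here is an assumption: the number of pages containing both words divided by the number of pages containing either word. The function names and the dictionary-of-dictionaries matrix layout are likewise hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def build_inverted_index(pages):
    """Map each word to the set of page ids whose text contains it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return dict(index)

def affinity(list_a, list_b):
    """Assumed formula: pages with both words / pages with either word."""
    union = list_a | list_b
    return len(list_a & list_b) / len(union) if union else 0.0

def affinity_matrix(index):
    """Compute the affinity of every word pair from its two index lists."""
    words = sorted(index)
    matrix = {w: {w: 1.0} for w in words}
    for first, second in combinations(words, 2):
        value = affinity(index[first], index[second])
        matrix[first][second] = value  # the assumed measure is symmetric,
        matrix[second][first] = value  # so both cells get the same value
    return matrix
```

A directional measure, for example the share of pages containing the first word that also contain the second, would need the two cells filled separately. The symmetric form is used here only to keep the sketch short.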

    Automatic Generation Of Ontologies Using Word Affinities
    10.
    Patent application
    Status: in force

    Publication No.: US20090094262A1

    Publication date: 2009-04-09

    Application No.: US12242950

    Filing date: 2008-10-01

    IPC classification: G06F17/00

    Abstract: In one embodiment, generating an ontology includes accessing an inverted index that comprises inverted index lists for words of a language. An inverted index list corresponding to a word indicates pages that include the word. A word pair comprises a first word and a second word. A first inverted index list and a second inverted index list are searched, where the first inverted index list corresponds to the first word and the second inverted index list corresponds to the second word. An affinity between the first word and the second word is calculated according to the first inverted index list and the second inverted index list. The affinity describes a quantitative relationship between the first word and the second word. The affinity is recorded in an affinity matrix, and the affinity matrix is reported.
