Inferring semantic relations
    1.
    Granted patent
    Status: lapsed

    Publication No.: US6138085A

    Publication date: 2000-10-24

    Application No.: US904226

    Filing date: 1997-07-31

    IPC classification: G06F17/27 G06F17/30

    CPC classification: G06F17/2785

    Abstract: The present invention provides a facility for determining, for a semantic relation that does not occur in a lexical knowledge base, whether this semantic relation should be inferred despite its absence from the lexical knowledge base. This semantic relation to be inferred is preferably made up of a first word, a second word, and a relation type relating the meanings of the first and second words. In a preferred embodiment, the facility identifies a salient semantic relation having the relation type of the semantic relation to be inferred and relating the first word to an intermediate word other than the second word. The facility then generates a quantitative measure of the similarity in meaning between the intermediate word and the second word. The facility further generates a confidence weight for the semantic relation to be inferred based upon the generated measure of similarity in meaning between the intermediate word and the second word. The facility may also generate a confidence weight for the semantic relation to be inferred based upon the weights of one or more paths connecting the first and second words.
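
The inference step in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the triple representation, the lookup table of similarity scores, the function name `infer_confidence`, and the rule that multiplies a relation's weight by the similarity score are all assumptions made for illustration, and the toy data is invented.

```python
from typing import Dict, FrozenSet, Optional, Tuple

Triple = Tuple[str, str, str]  # (first word, relation type, second word)


def infer_confidence(
    lkb: Dict[Triple, float],
    similarity: Dict[FrozenSet[str], float],
    first: str,
    relation: str,
    second: str,
) -> Optional[Tuple[str, float]]:
    """Return (intermediate word, confidence) for the best supporting relation.

    Returns None when the knowledge base holds no relation of the requested
    type that links `first` to a word other than `second`.
    """
    best = None
    for (w1, rel, intermediate), weight in lkb.items():
        if w1 != first or rel != relation or intermediate == second:
            continue
        sim = similarity.get(frozenset((intermediate, second)), 0.0)
        confidence = weight * sim  # assumed combination rule, not from the source
        if best is None or confidence > best[1]:
            best = (intermediate, confidence)
    return best


# Toy knowledge base and similarity scores (invented for illustration).
lkb = {
    ("observe", "Means", "binoculars"): 0.9,
    ("observe", "Means", "microscope"): 0.6,
    ("observe", "Object", "bird"): 0.8,
}
similarity = {
    frozenset(("binoculars", "telescope")): 0.8,
    frozenset(("microscope", "telescope")): 0.5,
}

result = infer_confidence(lkb, similarity, "observe", "Means", "telescope")
missing = infer_confidence(lkb, similarity, "observe", "Location", "telescope")
```

Here `result` names "binoculars" as the intermediate word because its weighted similarity to "telescope" is the highest, while `missing` is `None` because no relation of type "Location" starts at "observe".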

    MACHINE LANGUAGE TRANSLATION WITH TRANSFER MAPPINGS HAVING VARYING CONTEXT
    2.
    Patent application
    Status: in force

    Publication No.: US20100223049A1

    Publication date: 2010-09-02

    Application No.: US12773328

    Filing date: 2010-05-04

    IPC classification: G06F17/28

    CPC classification: G06F17/2827

    Abstract: A computer-implemented machine translation system translates text from a first language to a second language. The system includes a plurality of mappings, each mapping indicative of associating a dependency structure of the first language with a dependency structure of the second language, wherein at least some of the mappings correspond to dependency structures of the first language having varying context with some common elements, and associated dependency structures of the second language to the dependency structures of the first language. A module receives input text in a first language and outputs output text in a second language based on accessing the plurality of mappings.
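
A small Python sketch can show what "mappings having varying context with some common elements" might look like in practice. Everything here is an assumption for illustration: dependency structures are reduced to sets of (head, relation, dependent) edges, the target side is a plain string rather than a dependency structure, and the rule of preferring the matching mapping with the most context is a guess, not something the abstract states. The Spanish/English toy data is invented.

```python
from typing import FrozenSet, List, Optional, Tuple

Edge = Tuple[str, str, str]            # (head, relation, dependent)
Mapping = Tuple[FrozenSet[Edge], str]  # (source structure, target-language text)


def best_mapping(
    input_edges: FrozenSet[Edge], mappings: List[Mapping]
) -> Optional[Mapping]:
    """Pick the mapping whose source structure is contained in the input
    and that carries the most context (largest number of edges)."""
    candidates = [m for m in mappings if m[0] <= input_edges]
    if not candidates:
        return None
    return max(candidates, key=lambda m: len(m[0]))


# Two mappings share a common core edge but differ in how much context they carry.
core = ("hacer", "Dobj", "clic")
mappings = [
    (frozenset({core}), "click"),
    (frozenset({core, ("hacer", "en", "enlace")}), "click the link"),
]

with_context = frozenset({core, ("hacer", "en", "enlace")})
without_context = frozenset({core, ("hacer", "en", "icono")})

chosen_specific = best_mapping(with_context, mappings)
chosen_general = best_mapping(without_context, mappings)
```

When the input contains the extra context edge, the larger mapping is chosen; otherwise the lookup falls back to the mapping that holds only the common core.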

    Information retrieval utilizing semantic representation of text by identifying hypernyms and indexing multiple tokenized semantic structures to a same passage of text
    3.
    Granted patent
    Status: in force

    Publication No.: US6161084A

    Publication date: 2000-12-12

    Application No.: US366499

    Filing date: 1999-08-03

    IPC classification: G06F17/27 G06F17/30

    Abstract: The present invention is directed to performing information retrieval utilizing semantic representation of text. In a preferred embodiment, a tokenizer generates from an input string information retrieval tokens that characterize the semantic relationship expressed in the input string. The tokenizer first creates from the input string a primary logical form characterizing a semantic relationship between selected words in the input string. The tokenizer then identifies hypernyms that each have an "is a" relationship with one of the selected words in the input string. The tokenizer then constructs from the primary logical form one or more alternative logical forms. The tokenizer constructs each alternative logical form by, for each of one or more of the selected words in the input string, replacing the selected word in the primary logical form with an identified hypernym of the selected word. Finally, the tokenizer generates tokens representing both the primary logical form and the alternative logical forms. The tokenizer is preferably used to generate tokens for both constructing an index representing target documents and processing a query against that index.
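
The expansion of a primary logical form into alternative logical forms can be sketched in Python as follows. This is an illustrative sketch under assumptions: a logical form is reduced to a tuple of words, the hypernym table is a hand-written dictionary rather than a lexical resource, and the underscore-joined token encoding is hypothetical. Parsing the input string into a logical form is out of scope here.

```python
from itertools import product
from typing import Dict, List, Set, Tuple

LogicalForm = Tuple[str, ...]  # e.g. (deep subject, verb, deep object)


def expand_logical_form(
    primary: LogicalForm, hypernyms: Dict[str, List[str]]
) -> List[LogicalForm]:
    """Return the primary form first, followed by every alternative form
    obtained by replacing one or more words with one of their hypernyms."""
    options = [[word] + hypernyms.get(word, []) for word in primary]
    return [tuple(combo) for combo in product(*options)]


def to_tokens(forms: List[LogicalForm]) -> List[str]:
    """Encode each logical form as a single index token (assumed encoding)."""
    return ["_".join(form) for form in forms]


# Toy hypernym table (invented for illustration).
hypernyms = {"father": ["parent"], "hold": ["touch"], "baby": ["child", "person"]}

forms = expand_logical_form(("father", "hold", "baby"), hypernyms)
tokens = to_tokens(forms)

# Index every token to the same passage, then match a more general query.
index: Dict[str, Set[str]] = {}
for token in tokens:
    index.setdefault(token, set()).add("passage-1")

query_tokens = to_tokens(expand_logical_form(("parent", "touch", "child"), {}))
hits = set().union(*(index.get(t, set()) for t in query_tokens))
```

Because the passage is indexed under both its primary and its alternative forms, the more general query "parent touch child" still retrieves it.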

    Determining similarity between words
    4.
    Granted patent
    Status: lapsed

    Publication No.: US6098033A

    Publication date: 2000-08-01

    Application No.: US904223

    Filing date: 1997-07-31

    IPC classification: G06F17/27

    CPC classification: G06F17/277 G06F17/2785

    Abstract: The present invention provides a facility for determining similarity between two input words utilizing the frequencies with which path patterns occurring between the words occur between words known to be synonyms. A preferred embodiment of the facility utilizes a training phase and a similarity determination phase. In the training phase, the facility first identifies, for a number of pairs of synonyms, the most salient semantic relation paths between each pair of synonyms. The facility then extracts from these semantic relation paths their path patterns, which each comprise a series of directional relation types. The number of times that each path pattern occurs between pairs of synonyms, called the frequency of the path pattern, is counted. In the similarity determination phase, the facility identifies the most salient semantic relation paths between the input words, and extracts their path patterns. The facility then averages the frequencies counted in the training phase for the path patterns extracted for the input words in order to obtain a quantitative measure of the similarity between the input words.
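
The two phases described in this abstract can be sketched in Python. The sketch assumes that the most salient path patterns have already been extracted for each word pair, since path extraction from the knowledge base is not shown; the function names `train` and `similarity`, the relation-type labels, and the toy data are all hypothetical.

```python
from collections import Counter
from typing import Sequence, Tuple

PathPattern = Tuple[str, ...]  # a series of directional relation types


def train(patterns_per_synonym_pair: Sequence[Sequence[PathPattern]]) -> Counter:
    """Training phase: count how often each path pattern occurs
    between pairs of known synonyms."""
    frequencies: Counter = Counter()
    for patterns in patterns_per_synonym_pair:
        frequencies.update(patterns)
    return frequencies


def similarity(frequencies: Counter, input_patterns: Sequence[PathPattern]) -> float:
    """Similarity determination phase: average the trained frequencies of the
    path patterns found between the two input words."""
    if not input_patterns:
        return 0.0
    return sum(frequencies[p] for p in input_patterns) / len(input_patterns)


# Path patterns for three synonym pairs (invented for illustration).
training_data = [
    [("Hypernym", "HypernymOf"), ("Synonym",)],
    [("Hypernym", "HypernymOf")],
    [("Hypernym", "HypernymOf"), ("Part", "PartOf")],
]
frequencies = train(training_data)

likely_similar = similarity(frequencies, [("Hypernym", "HypernymOf"), ("Synonym",)])
unlikely_similar = similarity(frequencies, [("TypicalObject", "TypicalSubjectOf")])
```

A pattern never seen between synonyms contributes a frequency of zero, so word pairs connected only by such patterns receive a low score.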

    Adaptive machine translation
    6.
    Granted patent
    Status: lapsed

    Publication No.: US07295963B2

    Publication date: 2007-11-13

    Application No.: US10626925

    Filing date: 2003-07-25

    IPC classification: G06F17/28

    CPC classification: G06F17/2836

    Abstract: A computer-implemented method for providing information to an automatic machine translation system to improve translation accuracy is disclosed. The method includes receiving a collection of source text. An attempted translation that corresponds to the collection of source text is received from the automatic machine translation system. A correction input, which is configured to effectuate a correction of at least one error in the attempted translation, is also received. Finally, information is provided to the automatic machine translation system to reduce the likelihood that the error will be repeated in subsequent translations generated by the automatic machine translation system.

    Machine language translation with transfer mappings having varying context
    7.
    Granted patent
    Status: in force

    Publication No.: US08275605B2

    Publication date: 2012-09-25

    Application No.: US12773328

    Filing date: 2010-05-04

    IPC classification: G06F17/28

    CPC classification: G06F17/2827

    Abstract: A computer-implemented machine translation system translates text from a first language to a second language. The system includes a plurality of mappings, each mapping indicative of associating a dependency structure of the first language with a dependency structure of the second language, wherein at least some of the mappings correspond to dependency structures of the first language having varying context with some common elements, and associated dependency structures of the second language to the dependency structures of the first language. A module receives input text in a first language and outputs output text in a second language based on accessing the plurality of mappings.

    Automatic extraction of transfer mappings from bilingual corpora
    8.
    Granted patent
    Status: in force

    Publication No.: US07734459B2

    Publication date: 2010-06-08

    Application No.: US09899554

    Filing date: 2001-07-05

    IPC classification: G06F11/28

    CPC classification: G06F17/2827

    Abstract: A method of aligning nodes of dependency structures obtained from a bilingual corpus includes a two-phase approach wherein a first phase comprises associating nodes of the dependency structures to form tentative correspondences. The nodes of the dependency structures are then aligned as a function of the tentative correspondences and structural considerations. Mappings are obtained from the aligned dependency structures. The mappings can be expanded with varying types and amounts of local context in order that a more fluent translation can be obtained when translation is performed.

    System and method for matching a textual input to a lexical knowledge based and for utilizing results of that match
    9.
    Granted patent
    Status: lapsed

    Publication No.: US07013264B2

    Publication date: 2006-03-14

    Application No.: US10977910

    Filing date: 2004-10-29

    IPC classification: G06F17/30

    Abstract: The present invention can be used in a natural language processing system to determine a relationship (such as similarity in meaning) between two textual segments. The relationship can be identified or determined based on logical graphs generated from the textual segments. A relationship between first and second logical graphs is determined. This is accomplished regardless of whether there is an exact match between the first and second logical graphs. In one embodiment, the first graph represents an input textual discourse unit. The second graph, in one embodiment, represents information in a lexical knowledge base (LKB). The input graph can be matched against the second graph, if they have similar meaning, even if the two differ lexically or structurally.

    System and method for machine learning a confidence metric for machine translation
    10.
    Granted patent
    Status: in force

    Publication No.: US07496496B2

    Publication date: 2009-02-24

    Application No.: US11725435

    Filing date: 2007-03-19

    IPC classification: G06F17/28 G06F17/20 G10L11/00

    CPC classification: G06F17/28

    Abstract: A machine translation system is trained to generate confidence scores indicative of a quality of a translation result. A source string is translated with a machine translator to generate a target string. Features indicative of translation operations performed are extracted from the machine translator. A trusted entity-assigned translation score is obtained and is indicative of a trusted entity-assigned translation quality of the translated string. A relationship between a subset of the extracted features and the trusted entity-assigned translation score is identified.
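
One way to picture "identifying a relationship between a subset of the extracted features and the trusted entity-assigned translation score" is a simple regression. The Python sketch below is an assumption-laden illustration, not the method of the patent: it selects the single feature most correlated with the scores and fits a least-squares line to it. The function names, the two example features, and the data are invented, and the fit assumes the selected feature is not constant.

```python
from typing import Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def fit_confidence(
    features: Sequence[Sequence[float]], scores: Sequence[float]
) -> Tuple[int, float, float]:
    """Select the feature most correlated with the trusted scores and fit
    score ~ slope * feature + intercept by least squares."""
    columns = list(zip(*features))
    best = max(range(len(columns)), key=lambda j: abs(pearson(columns[j], scores)))
    xs = columns[best]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(scores) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return best, slope, mean_y - slope * mean_x


def predict(model: Tuple[int, float, float], feature_vector: Sequence[float]) -> float:
    """Confidence score for a new translation from its extracted features."""
    index, slope, intercept = model
    return slope * feature_vector[index] + intercept


# Each row: [share of the sentence covered by learned mappings, sentence length].
features = [[0.25, 12.0], [0.5, 7.0], [0.75, 15.0], [1.0, 9.0]]
scores = [1.0, 2.0, 3.0, 4.0]  # scores assigned by a trusted evaluator (toy data)

model = fit_confidence(features, scores)
confidence = predict(model, [0.5, 20.0])
```

In this toy data the first feature tracks the trusted scores exactly, so it is the one selected, and sentence length is ignored when predicting the confidence of a new translation.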
