Bi-dimensional rewriting rules for natural language processing
    1.
    Granted patent
    Bi-dimensional rewriting rules for natural language processing (In force)

    Publication No.: US07822597B2

    Publication date: 2010-10-26

    Application No.: US11018892

    Filing date: 2004-12-21

    IPC classification: G06F17/27

    Abstract: A linguistic rewriting rule for use in linguistic processing of an ordered sequence of linguistic tokens includes a token pattern recognition rule that matches the ordered sequence of linguistic tokens with a syntactical pattern. The token pattern recognition rule incorporates a character pattern recognition rule to match characters contained in an ambiguous portion of the ordered sequence of linguistic tokens with a character pattern defining a corresponding portion of the syntactical pattern.

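    The two levels of matching in this abstract can be illustrated with a minimal sketch. It is not the patented rule formalism: the rule format, `DATE_RULE`, `match_rule`, and `find_matches` are all hypothetical names chosen for illustration. A rule is a list with one element per token, and an element is either a literal token or a regular expression applied to that token's characters.

```python
import re

# Hypothetical rule format: one element per token. A plain string must equal
# the token; a compiled regex is the embedded character-level pattern that is
# matched against the characters of that single token.
DATE_RULE = ["on", re.compile(r"\d{1,2}(st|nd|rd|th)"), "of", re.compile(r"[A-Z][a-z]+")]


def match_rule(rule, tokens, start=0):
    """Return True if `rule` matches `tokens` beginning at index `start`."""
    if start + len(rule) > len(tokens):
        return False
    for element, token in zip(rule, tokens[start:]):
        if isinstance(element, str):
            if element != token:          # token-level (sequence) constraint
                return False
        elif element.fullmatch(token) is None:  # character-level constraint
            return False
    return True


def find_matches(rule, tokens):
    """Return every start index at which the rule matches the token sequence."""
    return [i for i in range(len(tokens)) if match_rule(rule, tokens, i)]
```

    With the tokens of "We met on 3rd of May in Paris", `find_matches(DATE_RULE, tokens)` reports a single match starting at the token "on".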

    Linguistically enhanced email detector
    2.
    Granted patent
    Linguistically enhanced email detector (In force)

    Publication No.: US08429141B2

    Publication date: 2013-04-23

    Application No.: US13037450

    Filing date: 2011-03-01

    IPC classification: G06F17/30

    CPC classification: G06F17/2765

    Abstract: A computer-implemented system and method are provided for warning a user of a missing attachment to an email. The method may include automatically recognizing a natural language of text of an email and selecting a keyword list from a plurality of keyword lists, based on the recognized natural language. Each keyword list is associated with a respective natural language and includes at least one keyword. At least one of the keyword lists includes a multi-sense keyword having a plurality of senses. A first of the plurality of senses is recognized as referring to an attachment and a second of the plurality of senses is recognized as not referring to an attachment. The text of the email is processed to identify an instance, where present, of a keyword that is in the selected keyword list and, for a keyword which is a multi-sense keyword, at least one sense-related rule is applied to a portion of the text which includes the instance of the multi-sense keyword. Based on the application of the at least one sense-related rule, where the email lacks an attachment, a notification is provided to the user.

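    A rough sketch of the pipeline described in this abstract follows. Everything in it is an assumption made for illustration: the keyword lists, the stopword-based language guess, and the single sense rule for "attached" are hypothetical stand-ins, not the patent's actual resources or rules.

```python
import re

# Hypothetical per-language keyword lists and stopwords (illustrative only).
KEYWORDS = {"en": {"attached", "attachment", "enclosed"},
            "fr": {"ci-joint", "pièce jointe"}}
STOPWORDS = {"en": {"the", "is", "please", "find"},
             "fr": {"le", "la", "veuillez", "est"}}
# Hypothetical sense rule: "attached to <someone>" is the non-attachment sense.
NON_ATTACHMENT_SENSE = {
    "attached": re.compile(r"\battached to (you|him|her|them|this idea)\b"),
}


def guess_language(text):
    """Pick the language whose stopwords overlap most with the text."""
    words = set(re.findall(r"\w+", text.lower()))
    return max(STOPWORDS, key=lambda lang: len(words & STOPWORDS[lang]))


def refers_to_attachment(text):
    """True if some keyword instance is used in its attachment sense."""
    lowered = text.lower()
    for keyword in KEYWORDS[guess_language(text)]:
        for hit in re.finditer(r"\b" + re.escape(keyword) + r"\b", lowered):
            rule = NON_ATTACHMENT_SENSE.get(keyword)
            window = lowered[hit.start():hit.start() + 40]
            # Single-sense keywords count directly; multi-sense keywords
            # count only when the non-attachment rule does not fire.
            if rule is None or rule.match(window) is None:
                return True
    return False


def should_warn(text, has_attachment):
    """Warn only when the text mentions an attachment and none is present."""
    return not has_attachment and refers_to_attachment(text)
```

    Under these assumptions, "Please find the report attached." triggers a warning when no file is attached, while "She is very attached to her old laptop." does not.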

    Linguistically enhanced email detector
    3.
    Patent application
    Linguistically enhanced email detector (In force)

    Publication No.: US20120226707A1

    Publication date: 2012-09-06

    Application No.: US13037450

    Filing date: 2011-03-01

    IPC classification: G06F17/30 G06F17/27

    CPC classification: G06F17/2765

    Abstract: A computer-implemented system and method are provided for warning a user of a missing attachment to an email. The method may include automatically recognizing a natural language of text of an email and selecting a keyword list from a plurality of keyword lists, based on the recognized natural language. Each keyword list is associated with a respective natural language and includes at least one keyword. At least one of the keyword lists includes a multi-sense keyword having a plurality of senses. A first of the plurality of senses is recognized as referring to an attachment and a second of the plurality of senses is recognized as not referring to an attachment. The text of the email is processed to identify an instance, where present, of a keyword that is in the selected keyword list and, for a keyword which is a multi-sense keyword, at least one sense-related rule is applied to a portion of the text which includes the instance of the multi-sense keyword. Based on the application of the at least one sense-related rule, where the email lacks an attachment, a notification is provided to the user.


    Labeling of work of art titles in text for natural language processing
    4.
    Granted patent
    Labeling of work of art titles in text for natural language processing (In force)

    Publication No.: US07788084B2

    Publication date: 2010-08-31

    Application No.: US11524230

    Filing date: 2006-09-19

    IPC classification: G06F17/28

    Abstract: A parser for parsing text includes a tokenizing module which divides the text into an ordered sequence of linguistic tokens. A morphological module associates parts of speech with the linguistic tokens. A detection module identifies candidate titles of creative works, such as works of art. A filtering module filters the candidate titles of works to exclude citations of direct speech from the candidate titles of works. A comparison module compares any remaining candidate titles of works with titles of works in an associated knowledge base. The comparison module annotates the text when a match is found.

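    The detect, filter, and compare stages of this abstract can be sketched in a heavily simplified form. The sketch assumes candidate titles are double-quoted spans and that direct speech is signalled by a speech verb immediately before the quote; it skips the tokenizing and morphological modules entirely. `KNOWN_TITLES`, `SPEECH_VERBS`, and `label_titles` are hypothetical names.

```python
import re

KNOWN_TITLES = {"the starry night", "hamlet"}         # hypothetical knowledge base
SPEECH_VERBS = {"said", "says", "replied", "asked"}   # hypothetical cue list


def label_titles(text):
    """Return (offset, title) pairs for quoted spans found in the knowledge base."""
    labeled = []
    for hit in re.finditer(r'"([^"]+)"', text):        # detection: quoted spans
        preceding = re.findall(r"\w+", text[:hit.start()])
        if preceding and preceding[-1].lower() in SPEECH_VERBS:
            continue                                   # filtering: direct speech
        if hit.group(1).lower() in KNOWN_TITLES:       # comparison with the KB
            labeled.append((hit.start(1), hit.group(1)))
    return labeled
```

    In 'He said "Hamlet" while we admired "The Starry Night"', only the second quoted span is labeled, because the first directly follows a speech verb.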

    Corpus-based system and method for acquiring polar adjectives
    5.
    Granted patent
    Corpus-based system and method for acquiring polar adjectives (In force)

    Publication No.: US08532981B2

    Publication date: 2013-09-10

    Application No.: US13052686

    Filing date: 2011-03-21

    Applicant: Caroline Brun

    Inventor: Caroline Brun

    IPC classification: G06F17/21

    CPC classification: G06F17/2735

    Abstract: A system, method, and computer program product for generating a polar vocabulary are provided. The method includes extracting textual content from each review in a corpus of reviews. Each of the reviews includes an author's rating, e.g., of a specific product or service to which the textual content relates. A set of frequent nouns is identified from the textual content of the reviews. Adjectival terms are extracted from the textual content of the reviews. Each adjectival term is associated in the textual content with one of the frequent nouns. A polar vocabulary including at least some of the extracted adjectival terms is generated. A polarity measure is associated with each adjectival term in the vocabulary which is based on the ratings of those reviews from which the adjectival term was extracted.

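    One way to picture the method in this abstract is the sketch below. It assumes the adjective-noun pairs have already been extracted by a parser, and it uses the mean review rating as the polarity measure; the abstract only says the measure is based on the ratings, so the mean is an assumption. `polar_vocabulary` and `min_noun_count` are hypothetical names.

```python
from collections import Counter, defaultdict


def polar_vocabulary(reviews, min_noun_count=2):
    """Build {adjective: polarity} from rated reviews.

    reviews: list of (rating, [(adjective, noun), ...]) where the pairs are
    assumed to come from an upstream parser.
    """
    # Step 1: identify frequent nouns across the whole corpus.
    noun_counts = Counter(noun for _, pairs in reviews for _, noun in pairs)
    frequent = {noun for noun, count in noun_counts.items() if count >= min_noun_count}

    # Step 2: keep adjectives attached to a frequent noun, with the review rating.
    ratings = defaultdict(list)
    for rating, pairs in reviews:
        for adjective, noun in pairs:
            if noun in frequent:
                ratings[adjective].append(rating)

    # Step 3: polarity measure, here simply the mean rating (an assumption).
    return {adj: sum(values) / len(values) for adj, values in ratings.items()}
```

    Adjectives that only modify rare nouns are dropped, which keeps the vocabulary focused on the aspects reviewers mention most often.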

    SYSTEM AND METHOD FOR SUGGESTION MINING
    6.
    Patent application
    SYSTEM AND METHOD FOR SUGGESTION MINING (In force)

    Publication No.: US20130096909A1

    Publication date: 2013-04-18

    Application No.: US13272553

    Filing date: 2011-10-13

    IPC classification: G06F17/27

    Abstract: A system and method for extraction of suggestions for improvement from a corpus of documents, such as customer reviews, are disclosed. A structured terminology provided for a topic includes a set of semantic classes, each including a set of terms. A thesaurus of terms relating to suggestions of improvement is provided. Text elements of text strings in the documents which are instances of terms in the structured terminology are labeled with the corresponding semantic class and text elements which are instances of terms in the thesaurus are also labeled. A set of patterns is applied to the labeled text strings to identify suggestions of improvement expressions. The patterns define syntactic relations between text elements, some of which are required to be instances of one of the terms in a particular semantic class or thesaurus. A set of suggestions for improvements is output based on the identified suggestions of improvement expressions.

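    The labeling and pattern steps of this abstract can be approximated as follows. The sketch replaces the syntactic relations of the abstract with simple word order, which is a deliberate simplification. The terminology, the thesaurus, and the single pattern are hypothetical examples.

```python
TERMINOLOGY = {"PART": {"tray", "cartridge", "display"}}        # hypothetical
THESAURUS = {"SUGGEST": {"should", "could", "wish", "needs"}}   # hypothetical


def label(tokens):
    """Label each token with its semantic class or thesaurus class, else None."""
    classes = {**TERMINOLOGY, **THESAURUS}
    labels = []
    for token in tokens:
        low = token.lower()
        labels.append(next((name for name, terms in classes.items() if low in terms), None))
    return labels


def is_suggestion(sentence):
    """Hypothetical pattern: a PART term followed later by a SUGGEST term."""
    labels = label(sentence.rstrip(".!?").split())
    return any(tag == "PART" and "SUGGEST" in labels[i + 1:]
               for i, tag in enumerate(labels))


def mine_suggestions(sentences):
    """Return the sentences that match the suggestion pattern."""
    return [s for s in sentences if is_suggestion(s)]
```

    A real implementation would match over parser output (for example subject and modal relations) rather than token positions.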

    LINGUISTICALLY-ADAPTED STRUCTURAL QUERY ANNOTATION

    Publication No.: US20130080152A1

    Publication date: 2013-03-28

    Application No.: US13245147

    Filing date: 2011-09-26

    IPC classification: G06F17/27

    Abstract: A system and method for natural language processing of queries are provided. A lexicon includes text elements that are recognized as being a proper noun when capitalized. A natural language query includes a sequence of text elements including words. The query is processed. The processing includes a preprocessing step, in which part of speech features are assigned to the text elements in the query. This includes identifying, from a lexicon, a text element in the query which starts with a lowercase letter and assigning recapitalization information to the text element in the query, based on the lexicon. This information includes a part of speech feature of the capitalized form of the text element. Then parts of speech for the text elements in the query are disambiguated, which includes applying rules for recapitalizing text elements based on the recapitalization information.

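    The two phases in this abstract (attach recapitalization information during preprocessing, then apply recapitalization rules during disambiguation) can be sketched as below. The lexicon and the single context rule, which recapitalizes only after a locative preposition, are assumptions made for illustration.

```python
# Hypothetical lexicon: lowercase form -> part of speech of its capitalized form.
PROPER_NOUNS = {"paris": "PROPER_NOUN", "apple": "PROPER_NOUN"}
# Hypothetical context rule: recapitalize only after one of these words.
LOCATIVE_PREPOSITIONS = {"in", "to", "from", "near"}


def preprocess(query):
    """Attach recapitalization information to lowercase tokens found in the lexicon."""
    annotated = []
    for token in query.split():
        recap = PROPER_NOUNS.get(token) if token[:1].islower() else None
        annotated.append({"text": token, "recap_pos": recap})
    return annotated


def recapitalize(annotated):
    """Apply the context rule using the recapitalization information."""
    output = []
    for i, item in enumerate(annotated):
        previous = annotated[i - 1]["text"].lower() if i > 0 else ""
        if item["recap_pos"] and previous in LOCATIVE_PREPOSITIONS:
            output.append(item["text"].capitalize())
        else:
            output.append(item["text"])
    return output
```

    Keeping the information as an annotation instead of capitalizing immediately lets a later rule decide, so "apple pie recipe" stays lowercase while "cheap hotels in paris" becomes "cheap hotels in Paris".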

    Bilingual authoring assistant for the “tip of the tongue” problem
    8.
    Granted patent
    Bilingual authoring assistant for the “tip of the tongue” problem (In force)

    Publication No.: US07827026B2

    Publication date: 2010-11-02

    Application No.: US11018758

    Filing date: 2004-12-21

    IPC classification: G06F17/20

    CPC classification: G06F17/289

    Abstract: A bilingual authoring apparatus includes a user interface (20) for inputting partially translated text including a text portion in a source language and surrounding or adjacent text in a target language. A bilingual dictionary (34) associates words and phrases in the target language and words and phrases in a source language. A context sensitive translation tool (30, 32, 38) communicates with the user interface, receives the partially translated text, and provides at least one proposed translation in the target language of the text portion in the source language. The at least one proposed translation in the target language is derived from the bilingual dictionary based on contextual analysis of at least a portion of the partially translated text.

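    A toy version of the context-sensitive lookup in this abstract is sketched below. It assumes a French source word embedded in English text and ranks dictionary candidates by overlap between hand-written cue words and the surrounding words. The dictionary contents, the cue-word scoring, and `propose` are hypothetical; the abstract does not specify how the contextual analysis is performed.

```python
# Hypothetical bilingual dictionary: source word -> {target candidate: cue words}.
DICTIONARY = {
    "avocat": {
        "lawyer": {"court", "legal", "hired"},
        "avocado": {"salad", "ate", "ripe"},
    },
}


def propose(text, source_word):
    """Rank target-language candidates for `source_word` by context overlap."""
    # Context = the surrounding target-language words, lowercased, without punctuation.
    context = {word.strip(".,!?").lower() for word in text.split()} - {source_word}
    candidates = DICTIONARY.get(source_word, {})
    return sorted(candidates,
                  key=lambda cand: len(candidates[cand] & context),
                  reverse=True)
```

    For "I hired an avocat for the court hearing." the first proposal is "lawyer"; for "She ate a ripe avocat." it is "avocado".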

    Labeling of work of art titles in text for natural language processing
    9.
    Patent application
    Labeling of work of art titles in text for natural language processing (In force)

    Publication No.: US20080071519A1

    Publication date: 2008-03-20

    Application No.: US11524230

    Filing date: 2006-09-19

    IPC classification: G06F17/27

    Abstract: A parser for parsing text includes a tokenizing module which divides the text into an ordered sequence of linguistic tokens. A morphological module associates parts of speech with the linguistic tokens. A detection module identifies candidate titles of creative works, such as works of art. A filtering module filters the candidate titles of works to exclude citations of direct speech from the candidate titles of works. A comparison module compares any remaining candidate titles of works with titles of works in an associated knowledge base. The comparison module annotates the text when a match is found.


    Content-based dynamic email prioritizer
    10.
    Patent application
    Content-based dynamic email prioritizer (Under examination, published)

    Publication No.: US20070168430A1

    Publication date: 2007-07-19

    Application No.: US11287170

    Filing date: 2005-11-23

    IPC classification: G06F15/16

    CPC classification: G06Q10/107

    Abstract: An email organizer operates in conjunction with an email system (20) and a natural language processor (42, 44). An action deadline detector (50) detects action deadlines contained in email messages (30) based on syntactic information about the email messages provided by the natural language processor. A scorer (56) assigns priority scores to the email messages based at least on the action deadlines and a current date (58).

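    The detector and scorer in this abstract can be illustrated with the sketch below. It swaps the natural language processor for a plain regular expression over ISO-formatted dates, and the scoring formula is an arbitrary choice that grows as the deadline approaches; both are assumptions, as are the names `DEADLINE`, `find_deadline`, and `priority`.

```python
import re
from datetime import date

# Assumption: deadlines appear as "by/before/due YYYY-MM-DD". A regex stands in
# for the syntactic analysis that the abstract attributes to an NLP component.
DEADLINE = re.compile(r"\b(?:by|before|due)\s+(\d{4})-(\d{2})-(\d{2})\b")


def find_deadline(body):
    """Return the first action deadline found in the email body, or None."""
    hit = DEADLINE.search(body)
    return date(*map(int, hit.groups())) if hit else None


def priority(body, today):
    """Score in [0, 1]: higher as the deadline nears, 1.0 once it is due or past."""
    deadline = find_deadline(body)
    if deadline is None:
        return 0.0
    days_left = (deadline - today).days
    return 1.0 if days_left <= 0 else 1.0 / (1 + days_left)
```

    Because the score depends on the current date, re-running `priority` each day reorders the inbox without changing the messages themselves, which matches the "dynamic" aspect of the title.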