Patent search: ap:("Robert J. Munro" OR "Rob Voigt" OR "Schuyler D. Erle" OR "Brendan D. Callahan" OR "Gary C. King" OR "Jessica D. Long" OR "Jason Brenier" OR "Tripti Saxena" OR "Stefan Krawczyk") AND inv:"Rob Voigt" (page 1)

1.

Invention application
INTELLIGENT SYSTEM THAT DYNAMICALLY IMPROVES ITS KNOWLEDGE AND CODE-BASE FOR NATURAL LANGUAGE UNDERSTANDING (status: pending, published)

Publication number: US20190205377A1

Publication date: 2019-07-04

Application number: US16056263

Filing date: 2018-08-06

Applicants: Robert J. Munro, Rob Voigt, Schuyler D. Erle, Brendan D. Callahan, Gary C. King, Jessica D. Long, Jason Brenier, Tripti Saxena, Stefan Krawczyk

Inventors: Robert J. Munro, Rob Voigt, Schuyler D. Erle, Brendan D. Callahan, Gary C. King, Jessica D. Long, Jason Brenier, Tripti Saxena, Stefan Krawczyk

IPC classification: G06F17/27

CPC classification: G06F17/277, G06F17/2715, G06F17/2785

Abstract: Systems, methods, and apparatuses are presented for a novel natural language tokenizer and tagger. In some embodiments, a method for tokenizing text for natural language processing comprises: generating from a pool of documents, a set of statistical models comprising one or more entries each indicating a likelihood of appearance of a character/letter sequence in the pool of documents; receiving a set of rules comprising rules that identify character/letter sequences as valid tokens; transforming one or more entries in the statistical models into new rules that are added to the set of rules when the entries indicate a high likelihood; receiving a document to be processed; dividing the document to be processed into tokens based on the set of statistical models and the set of rules, wherein the statistical models are applied where the rules fail to unambiguously tokenize the document; and outputting the divided tokens for natural language processing.
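
The method summarized in this abstract (statistical models built from a document pool, high-likelihood entries promoted to rules, and the models used only where the rules are ambiguous) can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name here (`build_model`, `promote`, `rule_segmentations`, `statistical_split`, `tokenize`), the whitespace-based unigram model, the 0.15 threshold, and the per-character penalty for unseen sequences are assumptions chosen for illustration.

```python
import math
from collections import Counter

def build_model(pool):
    """Statistical model: relative frequency of each whitespace-delimited
    sequence in the document pool (an assumed, very simple model)."""
    counts = Counter(word for doc in pool for word in doc.split())
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}

def promote(model, rules, threshold):
    """Turn high-likelihood model entries into new rules."""
    return set(rules) | {seq for seq, p in model.items() if p >= threshold}

def rule_segmentations(chunk, rules):
    """All ways to split chunk into rule tokens (exponential in the worst
    case; acceptable for a sketch)."""
    if not chunk:
        return [[]]
    found = []
    for i in range(1, len(chunk) + 1):
        head = chunk[:i]
        if head in rules:
            found += [[head] + rest
                      for rest in rule_segmentations(chunk[i:], rules)]
    return found

def statistical_split(chunk, model, floor=1e-6):
    """Most likely segmentation under the model, by dynamic programming.
    Unseen pieces pay an assumed penalty of log(floor) per character."""
    best = [(0.0, [])]  # best[i] = (log score, tokens) for chunk[:i]
    for end in range(1, len(chunk) + 1):
        candidates = []
        for start in range(end):
            piece = chunk[start:end]
            if piece in model:
                score = math.log(model[piece])
            else:
                score = len(piece) * math.log(floor)
            prev_score, prev_tokens = best[start]
            candidates.append((prev_score + score, prev_tokens + [piece]))
        best.append(max(candidates, key=lambda c: c[0]))
    return best[-1][1]

def tokenize(document, rules, model):
    """Apply the rules first; fall back to the model when the rules give
    zero or several segmentations for a chunk."""
    tokens = []
    for chunk in document.split():
        options = rule_segmentations(chunk, rules)
        if len(options) == 1:
            tokens += options[0]
        else:
            tokens += statistical_split(chunk, model)
    return tokens

# Hypothetical pool and hand-written rules, for illustration only.
pool = ["i cannot go", "you cannot stay", "we can go", "do not stay"]
model = build_model(pool)
rules = promote(model, {"can", "not"}, threshold=0.15)

print(tokenize("cannot go", rules, model))
print(tokenize("cannotstay wego", rules, model))
```

In this toy setup, "cannot" is ambiguous under the rules (it matches both "cannot" and "can" + "not"), so the model decides; "go" matches exactly one rule segmentation and never reaches the model.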

2.

Invention application
INTELLIGENT SYSTEM THAT DYNAMICALLY IMPROVES ITS KNOWLEDGE AND CODE-BASE FOR NATURAL LANGUAGE UNDERSTANDING (status: granted)

Publication number: US20160162466A1

Publication date: 2016-06-09

Application number: US14964512

Filing date: 2015-12-09

Applicants: Robert J. Munro, Rob Voigt, Schuyler D. Erle, Brendan D. Callahan, Gary C. King, Jessica D. Long, Jason Brenier, Tripti Saxena, Stefan Krawczyk

Inventors: Robert J. Munro, Rob Voigt, Schuyler D. Erle, Brendan D. Callahan, Gary C. King, Jessica D. Long, Jason Brenier, Tripti Saxena, Stefan Krawczyk

IPC classification: G06F17/27

CPC classification: G06F17/277, G06F17/2715, G06F17/2785

Abstract: Systems, methods, and apparatuses are presented for a novel natural language tokenizer and tagger. In some embodiments, a method for tokenizing text for natural language processing comprises: generating from a pool of documents, a set of statistical models comprising one or more entries each indicating a likelihood of appearance of a character/letter sequence in the pool of documents; receiving a set of rules comprising rules that identify character/letter sequences as valid tokens; transforming one or more entries in the statistical models into new rules that are added to the set of rules when the entries indicate a high likelihood; receiving a document to be processed; dividing the document to be processed into tokens based on the set of statistical models and the set of rules, wherein the statistical models are applied where the rules fail to unambiguously tokenize the document; and outputting the divided tokens for natural language processing.

3.

Invention application
INTELLIGENT SYSTEM THAT DYNAMICALLY IMPROVES ITS KNOWLEDGE AND CODE-BASE FOR NATURAL LANGUAGE UNDERSTANDING (status: pending, published)

Publication number: US20180095946A1

Publication date: 2018-04-05

Application number: US15596855

Filing date: 2017-05-16

Applicants: Robert Munro, Rob Voigt, Schuyler D. Erle, Brendan D. Callahan, Gary C. King, Jessica D. Long, Jason Brenier, Tripti Saxena, Stefan Krawczyk

Inventors: Robert Munro, Rob Voigt, Schuyler D. Erle, Brendan D. Callahan, Gary C. King, Jessica D. Long, Jason Brenier, Tripti Saxena, Stefan Krawczyk

IPC classification: G06F17/27

CPC classification: G06F17/277, G06F17/2715, G06F17/2785

Abstract: Systems, methods, and apparatuses are presented for a novel natural language tokenizer and tagger. In some embodiments, a method for tokenizing text for natural language processing comprises: generating from a pool of documents, a set of statistical models comprising one or more entries each indicating a likelihood of appearance of a character/letter sequence in the pool of documents; receiving a set of rules comprising rules that identify character/letter sequences as valid tokens; transforming one or more entries in the statistical models into new rules that are added to the set of rules when the entries indicate a high likelihood; receiving a document to be processed; dividing the document to be processed into tokens based on the set of statistical models and the set of rules, wherein the statistical models are applied where the rules fail to unambiguously tokenize the document; and outputting the divided tokens for natural language processing.
