Publication No.: US20160162456A1
Publication date: 2016-06-09
Application No.: US14964517
Filing date: 2015-12-09
Applicants: Robert J. Munro, Schuyler D. Erle, Christopher Walker, Sarah K. Luger, Jason Brenier, Gary C. King, Paul A. Tepper, Ross Mechanic, Andrew Gilchrist-Scott, Jessica D. Long, James B. Robinson, Brendan D. Callahan, Michelle Casbon, Ujjwal Sarin, Aneesh Nair, Veena Basavaraj, Tripti Saxena, Edgar Nunez, Martha G. Hinrichs, Haley Most, Tyler J. Schnoebelen
Inventors: Robert J. Munro, Schuyler D. Erle, Christopher Walker, Sarah K. Luger, Jason Brenier, Gary C. King, Paul A. Tepper, Ross Mechanic, Andrew Gilchrist-Scott, Jessica D. Long, James B. Robinson, Brendan D. Callahan, Michelle Casbon, Ujjwal Sarin, Aneesh Nair, Veena Basavaraj, Tripti Saxena, Edgar Nunez, Martha G. Hinrichs, Haley Most, Tyler J. Schnoebelen
CPC classification: G06F17/30598, G06F3/0482, G06F17/2241, G06F17/241, G06F17/272, G06F17/2785, G06F17/28, G06F17/2809, G06F17/30011, G06F17/30401, G06F17/30445, G06F17/30604, G06F17/30654, G06F17/30705, G06F17/30734, G06F17/30864, G06Q50/01
Abstract: Methods are presented for generating a natural language model. The method may comprise: ingesting training data representative of documents to be analyzed by the natural language model; generating a hierarchical data structure comprising at least two topical nodes into which the training data is to be subdivided by the natural language model; selecting a plurality of documents among the training data to be annotated; generating an annotation prompt for each document, configured to elicit an annotation indicating which of the at least two topical nodes said document is to be classified into; receiving the annotation based on the annotation prompt; and generating the natural language model using an adaptive machine learning process configured to determine patterns among the annotations for how the documents in the training data are to be subdivided according to the at least two topical nodes of the hierarchical data structure.
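The steps named in the abstract (ingest, build a topic hierarchy, select documents, prompt for annotations, receive them, train) can be sketched as follows. This is only an illustrative sketch, not the method claimed in the application: every name here (`TopicNode`, `select_for_annotation`, `make_prompt`, `train`, `classify`) is hypothetical, the selection rule simply takes the first `k` documents, and a small naive Bayes classifier stands in for the unspecified "adaptive machine learning process".

```python
import math
from collections import Counter, defaultdict
from dataclasses import dataclass, field
from typing import List


@dataclass
class TopicNode:
    """One node of the hierarchical data structure (hypothetical shape)."""
    name: str
    children: List["TopicNode"] = field(default_factory=list)

    def leaves(self):
        # Leaf names are the topics an annotator can choose from.
        if not self.children:
            return [self.name]
        return [leaf for child in self.children for leaf in child.leaves()]


def tokenize(text):
    return text.lower().split()


def select_for_annotation(documents, k):
    # Assumed selection rule: the first k documents. The abstract does not
    # say how documents are chosen.
    return documents[:k]


def make_prompt(document, root):
    # Build a prompt asking which topical node the document belongs to.
    options = ", ".join(root.leaves())
    return f"Which topic fits this document? Options: {options}\n---\n{document}"


def train(annotated):
    """Fit a naive Bayes model from (document, label) pairs."""
    label_counts = Counter()
    token_counts = defaultdict(Counter)
    vocab = set()
    for document, label in annotated:
        label_counts[label] += 1
        for token in tokenize(document):
            token_counts[label][token] += 1
            vocab.add(token)
    return label_counts, token_counts, vocab


def classify(model, document):
    """Return the most probable label, using add-one smoothing."""
    label_counts, token_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in label_counts.items():
        score = math.log(count / total)
        denom = sum(token_counts[label].values()) + len(vocab)
        for token in tokenize(document):
            if token in vocab:  # ignore tokens never seen in training
                score += math.log((token_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Illustrative data only; the annotations dict stands in for human answers.
root = TopicNode("root", [TopicNode("hardware"), TopicNode("software")])
corpus = [
    "battery drains too fast",
    "screen cracked after a drop",
    "app crashes on login",
    "update broke the login page",
]
to_annotate = select_for_annotation(corpus, 4)
prompts = [make_prompt(doc, root) for doc in to_annotate]
annotations = dict(zip(to_annotate, ["hardware", "hardware", "software", "software"]))
model = train(list(annotations.items()))
print(classify(model, "login crashes after update"))
```

In a real system the selection step would more plausibly prioritize documents the current model is least certain about, and the annotations would come from people answering the prompts rather than from a hard-coded list.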