Abstract:
Methods are presented for generating a natural language model. A method may comprise: ingesting training data representative of documents to be analyzed by the natural language model; generating a hierarchical data structure comprising at least two topical nodes into which the training data is to be subdivided by the natural language model; selecting a plurality of documents among the training data to be annotated; generating an annotation prompt for each document, configured to elicit an annotation indicating which node among the at least two topical nodes said document is to be classified into; receiving the annotation based on the annotation prompt; and generating the natural language model using an adaptive machine learning process configured to determine patterns among the annotations for how the documents in the training data are to be subdivided according to the at least two topical nodes of the hierarchical data structure.
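The steps in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the claimed method: `TopicNode`, `make_prompt`, `train`, and `classify` are hypothetical names, and a simple per-node word count stands in for the adaptive machine learning process, which the abstract does not specify.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field


@dataclass
class TopicNode:
    """One topical node in the hierarchy (hypothetical structure)."""
    name: str
    children: list = field(default_factory=list)


def leaf_names(node):
    """Collect the names of all leaf nodes under `node`."""
    if not node.children:
        return [node.name]
    names = []
    for child in node.children:
        names.extend(leaf_names(child))
    return names


def make_prompt(document, root):
    """Build an annotation prompt asking which node the document belongs to."""
    options = ", ".join(leaf_names(root))
    return f"Which topic best fits this document? Options: {options}\n---\n{document}"


def train(annotated):
    """Count words per annotated node (assumed stand-in for the adaptive learner)."""
    counts = defaultdict(Counter)
    for document, node_name in annotated:
        counts[node_name].update(document.lower().split())
    return counts


def classify(model, document):
    """Pick the node whose annotated documents share the most words with `document`."""
    words = document.lower().split()
    return max(model, key=lambda node: sum(model[node][w] for w in words))


# Example: a two-node hierarchy, two annotated documents, one prediction.
root = TopicNode("root", [TopicNode("billing"), TopicNode("shipping")])
annotated = [("invoice charged twice", "billing"),
             ("package arrived late", "shipping")]
model = train(annotated)
predicted = classify(model, "my invoice is wrong")
```

In practice the annotations would come from human annotators answering the prompt produced by `make_prompt`; here they are hard-coded for illustration.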
Abstract:
Methods and systems are disclosed for creating and linking a series of interfaces configured to display information and receive confirmation of classifications made by a natural language modeling engine to improve organization of a collection of documents into a hierarchical structure. In some embodiments, the interfaces may display to an annotator a plurality of labels of potential classifications for a document as identified by a natural language modeling engine, collect annotated responses from the annotator, aggregate the annotated responses across other annotators, analyze the accuracy of the natural language modeling engine based on the aggregated annotated responses, and predict accuracies of the natural language modeling engine's classifications of the documents.
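The aggregation and accuracy-analysis steps might look like the following sketch. The abstract does not say how responses are aggregated, so majority voting is an assumption here, and `aggregate` and `accuracy` are hypothetical helper names.

```python
from collections import Counter


def aggregate(responses):
    """Majority-vote the labels chosen by several annotators for each document.

    `responses` maps a document id to the list of labels annotators chose.
    Ties resolve to the label encountered first.
    """
    return {doc_id: Counter(labels).most_common(1)[0][0]
            for doc_id, labels in responses.items()}


def accuracy(predictions, consensus):
    """Fraction of documents where the engine's label matches the consensus."""
    shared = [d for d in predictions if d in consensus]
    if not shared:
        return 0.0
    hits = sum(predictions[d] == consensus[d] for d in shared)
    return hits / len(shared)


# Example: three annotators on d1, two on d2, compared with engine output.
responses = {"d1": ["billing", "billing", "shipping"],
             "d2": ["shipping", "shipping"]}
consensus = aggregate(responses)
engine_predictions = {"d1": "billing", "d2": "billing"}
score = accuracy(engine_predictions, consensus)
```

A score computed this way only covers documents that were actually annotated; predicting accuracy on unannotated documents, as the abstract mentions, would need a further estimation step that is not sketched here.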
Abstract:
Systems and methods are presented for the automatic placement of rules applied to topics in a logical hierarchy when conducting natural language processing. In some embodiments, a method includes: accessing, at a child node in a logical hierarchy, at least one rule associated with the child node; identifying a percolation criterion associated with a parent node of the child node, said percolation criterion indicating that the at least one rule associated with the child node is also to be associated with the parent node; associating the at least one rule with the parent node such that the at least one rule defines a second factor for determining whether a document is also to be classified into the parent node; accessing the document for natural language processing; and determining whether the document is to be classified into the parent node or the child node based on the at least one rule.
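Rule percolation as described above can be illustrated with a small tree. This sketch assumes rules are plain keyword strings and that the percolation criterion is a boolean flag on the parent; both are simplifications, and `Node`, `percolate_rules`, and `classify` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    rules: list = field(default_factory=list)     # keyword rules (assumed form)
    children: list = field(default_factory=list)
    percolate: bool = False  # assumed criterion: adopt the children's rules


def percolate_rules(node):
    """Copy each child's rules up to `node` when its percolation flag is set."""
    for child in node.children:
        percolate_rules(child)  # resolve deeper levels first
        if node.percolate:
            for rule in child.rules:
                if rule not in node.rules:
                    node.rules.append(rule)


def matches(node, document):
    """A document matches a node if any of the node's keywords appears in it."""
    text = document.lower()
    return any(rule in text for rule in node.rules)


def classify(root, document):
    """Return the name of the deepest matching node, or None if nothing matches."""
    if not matches(root, document):
        return None
    for child in root.children:
        found = classify(child, document)
        if found is not None:
            return found
    return root.name


# Example: the child's "refund" rule percolates up to its parent.
refunds = Node("refunds", rules=["refund"])
billing = Node("billing", rules=["invoice"], children=[refunds], percolate=True)
before = classify(billing, "Please refund me")
percolate_rules(billing)
after = classify(billing, "Please refund me")
```

Before percolation the parent has no rule that matches the refund request, so the document is never routed down to the child; after percolation the parent matches it and the child claims it.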