Abstract:
Methods and systems are disclosed for creating and linking a series of interfaces configured to display information and receive confirmation of classifications made by a natural language modeling engine, in order to improve the organization of a collection of documents into a hierarchical structure. In some embodiments, the interfaces may display to an annotator a plurality of labels of potential classifications for a document as identified by the natural language modeling engine, collect annotated responses from the annotator, aggregate the annotated responses with those of other annotators, analyze the accuracy of the natural language modeling engine based on the aggregated annotated responses, and predict accuracies of the natural language modeling engine's classifications of the documents.
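The aggregation and accuracy-analysis steps described above can be sketched as follows. This is a minimal illustration only: the abstract does not say how responses are aggregated, so majority voting is assumed here, and the function names `aggregate_annotations` and `model_accuracy` and the dictionary layouts are hypothetical.

```python
from collections import Counter


def aggregate_annotations(responses):
    """Combine labels from several annotators into one label per document.

    responses maps a document id to the list of labels chosen by annotators.
    Majority voting is an assumption; the abstract does not name a scheme.
    """
    aggregated = {}
    for doc_id, labels in responses.items():
        if labels:  # skip documents that no annotator has labeled
            aggregated[doc_id] = Counter(labels).most_common(1)[0][0]
    return aggregated


def model_accuracy(predictions, aggregated):
    """Fraction of engine predictions that match the aggregated labels.

    Only documents present in both mappings are scored; returns None
    when there is nothing to score.
    """
    scored = [doc_id for doc_id in aggregated if doc_id in predictions]
    if not scored:
        return None
    correct = sum(1 for doc_id in scored if predictions[doc_id] == aggregated[doc_id])
    return correct / len(scored)
```

For example, given annotator responses for two documents and the engine's predicted label for each, `model_accuracy(predictions, aggregate_annotations(responses))` gives the share of documents on which the engine agrees with the annotator consensus.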
Abstract:
Methods are presented for generating a natural language model. A method may comprise: ingesting training data representative of documents to be analyzed by the natural language model; generating a hierarchical data structure comprising at least two topical nodes into which the training data is to be subdivided by the natural language model; selecting a plurality of documents among the training data to be annotated; generating an annotation prompt for each document, configured to elicit an annotation about said document indicating which node among the at least two topical nodes said document is to be classified into; receiving the annotation based on the annotation prompt; and generating the natural language model using an adaptive machine learning process configured to determine patterns among the annotations for how the documents in the training data are to be subdivided according to the at least two topical nodes of the hierarchical data structure.
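The sequence of steps above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the claimed method: the abstract does not specify the adaptive machine learning process, so a simple per-node word-frequency classifier stands in for it, and `TopicNode`, `make_prompt`, `train`, and `classify` are hypothetical names.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field


@dataclass
class TopicNode:
    """One node of the hierarchical data structure (hypothetical layout)."""
    name: str
    children: list = field(default_factory=list)


def make_prompt(document, nodes):
    """Build an annotation prompt asking which topical node fits the document."""
    options = ", ".join(node.name for node in nodes)
    return f"Which topic best fits this document? Options: {options}\n\n{document}"


def train(annotated):
    """Learn word counts per node from (document_text, node_name) pairs.

    A stand-in for the adaptive machine learning process, which the
    abstract leaves unspecified.
    """
    counts = defaultdict(Counter)
    for text, node_name in annotated:
        counts[node_name].update(text.lower().split())
    return counts


def classify(model, text):
    """Assign the node whose training vocabulary overlaps the text most."""
    words = text.lower().split()
    scores = {name: sum(c[w] for w in words) for name, c in model.items()}
    return max(scores, key=scores.get)
```

In use, a root `TopicNode` with at least two children defines the hierarchy, `make_prompt` is shown to an annotator for each selected document, and the returned annotations are passed to `train` as (text, node name) pairs before `classify` is applied to unlabeled documents.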