Methods and apparatus for creating domain-specific intended-meaning natural language processing pipelines
Abstract:
A method includes receiving a dataset that includes a plurality of input texts. Each input text from the plurality of input texts is associated with a content category from a plurality of content categories based on a comparison between that input text and an intended meaning, the intended meaning being the same for every comparison. For each model in a plurality of models, and for each content category from the plurality of content categories, that model is executed on each input text from the plurality of input texts to generate an average similarity/dissimilarity score for that content category. Based on the average similarity/dissimilarity score for each content category and for each model in the plurality of models, at least one model from the plurality of models is selected to determine whether an input text is similar or dissimilar to the intended meaning.
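The selection procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation. Everything concrete in it is an assumption made for illustration: the two toy scoring functions (`jaccard_words`, `jaccard_trigrams`) stand in for the plurality of models, the category labels "similar" and "dissimilar" stand in for the content categories, the sample texts are invented, and the selection rule (largest gap between the two category averages) is only one plausible reading of "selected based on the average similarity score".

```python
from collections import defaultdict
from statistics import mean


def jaccard_words(text, meaning):
    """Toy model (assumption): Jaccard overlap of lowercase word sets."""
    a, b = set(text.lower().split()), set(meaning.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def jaccard_trigrams(text, meaning):
    """Toy model (assumption): Jaccard overlap of character trigram sets."""
    def grams(s):
        s = s.lower()
        return {s[i:i + 3] for i in range(len(s) - 2)}
    a, b = grams(text), grams(meaning)
    return len(a & b) / len(a | b) if a | b else 0.0


def average_scores(models, dataset, meaning):
    """Run every model on every input text; return {model: {category: mean score}}."""
    result = {}
    for name, model in models.items():
        by_category = defaultdict(list)
        for text, category in dataset:
            by_category[category].append(model(text, meaning))
        result[name] = {c: mean(s) for c, s in by_category.items()}
    return result


def select_model(averages, similar="similar", dissimilar="dissimilar"):
    """Assumed criterion: pick the model whose category averages are furthest apart."""
    return max(averages, key=lambda n: averages[n][similar] - averages[n][dissimilar])


# Hypothetical data: one intended meaning shared by every comparison,
# and input texts already associated with a content category.
INTENDED_MEANING = "cancel my subscription"
DATASET = [
    ("please cancel my subscription", "similar"),
    ("cancel the subscription now", "similar"),
    ("what is the weather today", "dissimilar"),
    ("book a flight to paris", "dissimilar"),
]
MODELS = {"jaccard_words": jaccard_words, "jaccard_trigrams": jaccard_trigrams}

averages = average_scores(MODELS, DATASET, INTENDED_MEANING)
best = select_model(averages)
print(averages)
print("selected model:", best)
```

In practice the candidate models would more likely be embedding-based or other learned similarity models, and the selection rule could weigh more than two categories; the sketch only shows the shape of the per-model, per-category averaging followed by a selection step.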