SEMI-SUPERVISED METHOD AND APPARATUS FOR PUBLIC OPINION TEXT ANALYSIS

    Publication number: US20230351212A1

    Publication date: 2023-11-02

    Application number: US17837233

    Application date: 2022-06-10

    Applicant: ZHEJIANG LAB

    CPC classification number: G06N5/022

    Abstract: The disclosure provides a semi-supervised method and apparatus for public opinion text analysis. The method includes: acquiring a public opinion data set and preprocessing it; applying a data augmentation algorithm to the preprocessed samples to generate data-augmented samples; generating category labels for the unlabeled samples in the data set by unsupervised extraction and clustering; calculating similarities in the latent semantic space of the word vectors and performing linear interpolation to generate similarity interpolation samples from the result; constructing a final training sample set; training a pre-trained language model on the final training sample set in a semi-supervised manner to obtain a classification model; and predicting on the test set with the classification model to obtain a classification result.
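
    The similarity-and-interpolation step named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so everything here is an assumption made for illustration: cosine similarity as the similarity measure, pairing each sample with its single most similar neighbour, a fixed mixing weight `lam`, and the hypothetical function name `interpolate_most_similar`. It is a mixup-style example, not the method claimed in the patent.

```python
import numpy as np


def interpolate_most_similar(embeddings, labels, lam=0.7):
    """Mix each sample with its most similar neighbour (illustrative only).

    embeddings: (n, d) array of sentence/word vectors (assumed given).
    labels:     (n, k) array of one-hot or soft category labels.
    lam:        mixing weight in [0, 1]; an assumed hyperparameter.
    """
    # Cosine similarity between every pair of samples.
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = normed @ normed.T

    # Exclude self-matches, then pick the nearest neighbour of each sample.
    np.fill_diagonal(sim, -np.inf)
    partner = sim.argmax(axis=1)

    # Linear interpolation of both the vectors and their labels.
    mixed_x = lam * embeddings + (1.0 - lam) * embeddings[partner]
    mixed_y = lam * labels + (1.0 - lam) * labels[partner]
    return mixed_x, mixed_y, partner


# Toy data: two near-identical samples of class 0 and one of class 1.
toy_x = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
toy_y = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
mixed_x, mixed_y, partner = interpolate_most_similar(toy_x, toy_y, lam=0.7)
```

    In this sketch the interpolated vectors and soft labels would simply be appended to the labeled, pseudo-labeled, and augmented samples when building the final training set; how the patent actually combines them is not stated in the abstract.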
