METHOD AND APPARATUS FOR SEQUENCE LABELING ON ENTITY TEXT, AND NON-TRANSITORY COMPUTER-READABLE RECORDING MEDIUM

    Publication No.: US20220164536A1

    Publication Date: 2022-05-26

    Application No.: US17455967

    Filing Date: 2021-11-22

    IPC Classification: G06F40/295

    Abstract: A method and an apparatus for sequence labeling on an entity text, and a non-transitory computer-readable recording medium are provided. In the method, a start position of an entity text within a target text is determined. Then, a first matrix is generated based on the start position of the entity text. Elements in the first matrix indicate focusable weights of each word with respect to the other words in the target text. Then, a named entity recognition model is generated using the first matrix. The named entity recognition model is obtained by training on first training data, which includes word embeddings corresponding to respective texts in a training text set, the texts being texts whose entity labels have already been annotated. Then, the target text is input to the named entity recognition model, and a probability distribution of the entity label is output.
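
The abstract does not say how the first matrix is computed from the start position, so the following is only a minimal sketch under stated assumptions: every word gets a base weight of 1.0, and words at an entity start position get a hypothetical `boost` factor. The names `build_focus_matrix`, `weighted_attention`, and `boost` are illustrative and do not come from the publication.

```python
import math

def build_focus_matrix(num_words, start_positions, boost=2.0):
    """Return an n x n matrix; entry [i][j] is how strongly word i may focus on word j.

    Assumption: columns at entity start positions get `boost`, all others 1.0.
    """
    starts = set(start_positions)
    return [
        [boost if j in starts else 1.0 for j in range(num_words)]
        for _ in range(num_words)
    ]

def weighted_attention(scores, focus):
    """Row-wise softmax of raw scores, with each term rescaled by its focus weight."""
    result = []
    for score_row, focus_row in zip(scores, focus):
        terms = [f * math.exp(s) for s, f in zip(score_row, focus_row)]
        total = sum(terms)
        result.append([t / total for t in terms])
    return result

words = ["Alice", "visited", "New", "York"]
# Hypothetical entity starts: "Alice" (index 0) and "New" (index 2).
focus = build_focus_matrix(len(words), start_positions=[0, 2])
uniform_scores = [[0.0] * len(words) for _ in words]
attention = weighted_attention(uniform_scores, focus)
```

With uniform raw scores, each row of `attention` still sums to 1, but more of the mass falls on the entity start positions. A real model would learn the raw scores and might inject the matrix differently.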

    METHOD AND APPARATUS FOR RECOGNIZING INTENTION, AND NON-TRANSITORY COMPUTER-READABLE RECORDING MEDIUM

    Publication No.: US20200242486A1

    Publication Date: 2020-07-30

    Application No.: US16739311

    Filing Date: 2020-01-10

    Abstract: A method and an apparatus for recognizing an intention, and a non-transitory computer-readable recording medium are provided. The method includes learning vectors of knowledge base elements in corpus samples, and converting the corpus samples into row vectors composed of the vectors of the knowledge base elements in a knowledge base; extracting feature vectors from respective pooling windows in the corpus samples by hierarchical pooling, determining weights positively correlated with similarities between texts within the respective pooling windows and the respective corpus samples, performing weighting on the extracted feature vectors to obtain feature vectors of the respective pooling windows, and obtaining feature vectors of the respective corpus samples composed of the feature vectors of the pooling windows; training a vector-based intention recognition classifier, based on the feature vectors of the corpus samples; and recognizing an intention in querying a corpus, using the trained intention recognition classifier.
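
The weighted pooling step can be sketched as follows. This is an illustration under assumptions, not the claimed implementation: windows are averaged, each window's weight is its cosine similarity to the whole-sample mean rescaled to [0, 1], and the sample feature is the element-wise maximum over the weighted windows. The function names and the choice of mean/max pooling are hypothetical; the abstract only requires that the weight be positively correlated with the similarity. Classifier training is omitted.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def weighted_hierarchical_pooling(vectors, window=2):
    """Average-pool each sliding window, weight it, then max-pool across windows."""
    sample_mean = mean_vector(vectors)
    weighted = []
    for start in range(len(vectors) - window + 1):
        feature = mean_vector(vectors[start:start + window])
        # Assumed weight: similarity to the whole sample, mapped from [-1, 1] to [0, 1].
        weight = (1.0 + cosine(feature, sample_mean)) / 2.0
        weighted.append([weight * x for x in feature])
    return [max(col) for col in zip(*weighted)]

# Toy sample: three knowledge base element vectors of dimension 2.
sample = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
feature = weighted_hierarchical_pooling(sample, window=2)
```

The resulting `feature` has the same dimension as the element vectors and could be passed to any vector-based classifier.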

    METHOD AND APPARATUS FOR NAMED ENTITY RECOGNITION, AND NON-TRANSITORY COMPUTER-READABLE RECORDING MEDIUM

    Publication No.: US20230394240A1

    Publication Date: 2023-12-07

    Application No.: US18326292

    Filing Date: 2023-05-31

    IPC Classification: G06F40/295 G06F40/40

    CPC Classification: G06F40/295 G06F40/40

    Abstract: A method and an apparatus for named entity recognition, and a non-transitory computer-readable recording medium are provided. In the method, text elements are traversed according to a text span to obtain candidate entity words. Then, a class to which the candidate entity word belongs is recognized. The recognizing of the class includes generating a prompt template corresponding to the candidate entity word, and concatenating the text to be recognized and the prompt template to obtain a concatenated text; generating vector representations of the text elements in the concatenated text; generating the vector representation of the candidate entity word according to the vector representations of the text elements of each candidate entity word in the concatenated text, and the vector representation of the text element of the mask word; and classifying the vector representation of the candidate entity word to obtain the class of the candidate entity word.
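
The span traversal and prompt concatenation steps can be sketched as below. The template wording, the `[MASK]` and `[SEP]` tokens, and the names `enumerate_candidates` and `build_prompted_input` are assumptions made for illustration; the abstract does not specify them. Encoding the concatenated text and classifying the candidate's vector representation are left out.

```python
def enumerate_candidates(tokens, max_span=2):
    """Return (start, end, phrase) for every span of 1..max_span tokens."""
    candidates = []
    for start in range(len(tokens)):
        for length in range(1, max_span + 1):
            end = start + length
            if end > len(tokens):
                break
            candidates.append((start, end, " ".join(tokens[start:end])))
    return candidates

def build_prompted_input(text, candidate, mask_token="[MASK]", sep_token="[SEP]"):
    """Concatenate the text with a (hypothetical) prompt template for one candidate."""
    prompt = f"{candidate} is a {mask_token} entity."
    return f"{text} {sep_token} {prompt}"

tokens = ["Alice", "visited", "Paris"]
text = " ".join(tokens)
candidates = enumerate_candidates(tokens, max_span=2)
inputs = [build_prompted_input(text, phrase) for _, _, phrase in candidates]
```

Each string in `inputs` pairs the full text with one candidate's prompt. An encoder would then produce vectors for the candidate's tokens and the mask token, from which the candidate's class is predicted.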