-
Publication No.: US20230342667A1
Publication date: 2023-10-26
Application No.: US18179266
Filing date: 2023-03-06
Inventor: Zenan LIN, Huapeng QIN, Min ZHAO, Guoxin ZHANG, Yajuan LV
IPC: G06N20/00
CPC classification number: G06N20/00
Abstract: A semantic classification model training method includes: acquiring a sample query template and a label category for each of at least one category to be predicted in the sample query template, where the sample query template is constructed from a sample query statement and the number of categories to be predicted; inputting the sample query template into a pre-constructed semantic classification model to obtain a sample semantic category for each category to be predicted; and training the semantic classification model according to the sample semantic categories and the label categories of the at least one category to be predicted.
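The template construction and training signal described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `[MASK]` placeholder, the `category N:` slot layout, and the helper names `build_query_template` and `mean_slot_loss` are all assumptions made for illustration, and the loss shown is an ordinary mean negative log-likelihood swapped in as one plausible way to compare predicted and label categories.

```python
import math
from typing import Sequence

MASK_TOKEN = "[MASK]"  # assumed placeholder; the abstract does not name one


def build_query_template(query: str, num_categories: int) -> str:
    """Append one masked slot per category to be predicted (hypothetical layout)."""
    if num_categories < 1:
        raise ValueError("at least one category to be predicted is required")
    slots = " ".join(
        f"category {i + 1}: {MASK_TOKEN}" for i in range(num_categories)
    )
    return f"{query} {slots}"


def mean_slot_loss(
    slot_probs: Sequence[Sequence[float]], label_ids: Sequence[int]
) -> float:
    """Mean negative log-likelihood of the label category at each slot.

    slot_probs[i] is the model's probability distribution over categories
    for slot i; label_ids[i] is the index of that slot's label category.
    """
    if len(slot_probs) != len(label_ids):
        raise ValueError("one label is required per predicted slot")
    total = sum(math.log(probs[y]) for probs, y in zip(slot_probs, label_ids))
    return -total / len(label_ids)


# Usage: a query with two categories to be predicted.
template = build_query_template("cheap flights to Paris", 2)
loss = mean_slot_loss([[0.5, 0.5], [0.25, 0.75]], [0, 1])
```

In a real system the probabilities would come from the semantic classification model applied to the template, and the loss would drive a gradient update; both steps are outside what the abstract specifies.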
-
Publication No.: US20230023789A1
Publication date: 2023-01-26
Application No.: US17956558
Filing date: 2022-09-29
Inventor: Huapeng QIN, Min ZHAO, Guoxin ZHANG
Abstract: A method for identifying noise samples includes: obtaining an original sample set; obtaining a target sample set by adding masks to original training corpora in the original sample set using a preset adjustment rule; performing mask prediction on a plurality of target training corpora in the target sample set using a pre-trained language model to obtain a first mask prediction character corresponding to each target training corpus; matching the first mask prediction character corresponding to each target training corpus with a preset condition; and determining, as noise samples, the original training corpora in the original sample set that correspond to target training corpora whose first mask prediction characters do not match the preset condition.
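The steps in this abstract can be sketched as a small filtering loop. This is only an illustration under stated assumptions: the adjustment rule (`add_mask`, which appends a `Label: [MASK]` slot), the preset condition (the predicted string must equal the sample's label), and `toy_predictor` (a keyword rule standing in for a pre-trained language model) are all hypothetical and are not taken from the patent.

```python
from typing import Callable, List, Tuple

MASK_TOKEN = "[MASK]"  # assumed placeholder; the abstract does not name one

Sample = Tuple[str, str]  # (original training corpus text, label)


def add_mask(text: str) -> str:
    """Hypothetical adjustment rule: append a masked slot to the corpus text."""
    return f"{text} Label: {MASK_TOKEN}"


def find_noise_samples(
    samples: List[Sample], predict_mask: Callable[[str], str]
) -> List[Sample]:
    """Return the original samples whose mask prediction fails the condition.

    Assumed preset condition: the predicted mask string equals the label.
    """
    noise = []
    for text, label in samples:
        predicted = predict_mask(add_mask(text))
        if predicted != label:
            noise.append((text, label))
    return noise


def toy_predictor(masked_text: str) -> str:
    """Stand-in for a pre-trained language model (keyword rule, illustration only)."""
    return "positive" if "great" in masked_text else "negative"


# Usage: the second sample's label disagrees with the prediction.
samples = [
    ("great movie", "positive"),
    ("great food", "negative"),
    ("boring plot", "negative"),
]
noise = find_noise_samples(samples, toy_predictor)
```

Passing the predictor in as a callable keeps the sketch independent of any particular model library; a real masked language model could be wrapped in a function with the same signature.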
-