Text processing model training method, and text processing method and apparatus

    Publication No.: US12182507B2

    Publication date: 2024-12-31

    Application No.: US17682145

    Filing date: 2022-02-28

    Abstract: A text processing model training method, and a text processing method and apparatus, in the natural language processing field within artificial intelligence are disclosed. The training method includes: obtaining training text; separately inputting the training text into a teacher model and a student model to obtain sample data output by the teacher model and prediction data output by the student model, where the sample data includes a sample semantic feature and a sample label, the prediction data includes a prediction semantic feature and a prediction label, and the teacher model is a pre-trained language model used for text classification; and training a model parameter of the student model based on the sample data and the prediction data, to obtain a target student model. The method enables effective knowledge transfer to the student model, thereby improving the accuracy of the student model's text processing results.
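
The training objective described in this abstract can be sketched as a loss that compares the teacher's sample data with the student's prediction data. The abstract does not name the distance measures, so this sketch assumes mean squared error for the semantic features and soft-label cross-entropy for the labels; the `alpha` and `temperature` parameters and the dictionary keys `feature` and `logits` are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into a probability distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def feature_loss(sample_feature, prediction_feature):
    """Mean squared error between teacher and student features (assumed metric)."""
    pairs = list(zip(sample_feature, prediction_feature))
    return sum((s - p) ** 2 for s, p in pairs) / len(pairs)

def label_loss(sample_logits, prediction_logits, temperature=1.0):
    """Cross-entropy of the student against the teacher's soft label (assumed metric)."""
    teacher = softmax(sample_logits, temperature)
    student = softmax(prediction_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher, student))

def distillation_loss(sample, prediction, alpha=0.5, temperature=1.0):
    """Weighted sum of the feature term and the label term (alpha is hypothetical)."""
    return (alpha * feature_loss(sample["feature"], prediction["feature"])
            + (1 - alpha) * label_loss(sample["logits"], prediction["logits"], temperature))
```

In a real training loop this scalar would be minimized with respect to the student's parameters only, with the teacher held fixed.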

MODEL DISTILLATION METHOD AND RELATED DEVICE

    Document type: Invention publication

    Publication No.: US20240185086A1

    Publication date: 2024-06-06

    Application No.: US18443052

    Filing date: 2024-02-15

    CPC classification number: G06N3/096 G06N3/045

    Abstract: This disclosure relates to the field of artificial intelligence, and provides model distillation methods and apparatuses. In an implementation, a method includes: obtaining first input data and second input data from a second computing node, where the first input data is output data of a third sub-model, and the second input data is output data processed by a fourth sub-model; processing the first input data by using a first sub-model, to obtain a first intermediate output; processing the second input data by using a second sub-model, to obtain a second intermediate output, where the first intermediate output and the second intermediate output are used to determine a first gradient; and distilling the first sub-model based on the first gradient, to obtain an updated first sub-model.

    Paraphrase sentence generation method and apparatus

    Publication No.: US11586814B2

    Publication date: 2023-02-21

    Application No.: US16856450

    Filing date: 2020-04-23

    Abstract: A paraphrase sentence generation method and apparatus relating to the research field of natural language processing include generating m second sentences based on a first sentence and a paraphrase generation model, determining a matching degree between each of the m second sentences and the first sentence based on a paraphrase matching model, and determining n second sentences from the m second sentences based on the matching degrees between the m second sentences and the first sentence, where the paraphrase generation model is obtained through reinforcement learning-based training based on a reward of the paraphrase matching model.
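
The inference-time flow in this abstract (generate m candidates, score each against the first sentence, keep n) can be sketched as follows. The generator and matcher here are toy stand-ins, not the patent's models: `toy_generate` applies fixed rewrites and `toy_match` uses token-set overlap. All function names are hypothetical, and the reinforcement learning-based training of the generator is not covered.

```python
def select_paraphrases(first_sentence, generate, match, m, n):
    """Generate m candidates and keep the n with the highest matching degree."""
    candidates = generate(first_sentence, m)
    scored = [(match(first_sentence, c), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [candidate for _, candidate in scored[:n]]

def toy_generate(sentence, m):
    """Stand-in for a paraphrase generation model: a few fixed rewrites."""
    words = sentence.split()
    variants = [
        " ".join(words),               # unchanged
        " ".join(reversed(words)),     # word order reversed
        " ".join(words[:-1]),          # last word dropped
        "please " + " ".join(words),   # one extra word
    ]
    return variants[:m]

def toy_match(first, second):
    """Stand-in for a paraphrase matching model: Jaccard overlap of token sets."""
    a, b = set(first.split()), set(second.split())
    return len(a & b) / len(a | b)

best = select_paraphrases("how far is the moon", toy_generate, toy_match, m=4, n=2)
```

Passing the generator and matcher as arguments keeps the selection step independent of how either model is implemented.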

VOICE INTERACTION METHOD AND ELECTRONIC DEVICE

    Publication No.: US20230017274A1

    Publication date: 2023-01-19

    Application No.: US17952401

    Filing date: 2022-09-26

    Abstract: Embodiments of this application provide a voice interaction method and an electronic device, and relate to the field of artificial intelligence (AI) technologies and the field of voice processing technologies. A specific solution includes the following: an electronic device may receive first voice information sent by a second user and, in response, recognize the first voice information, where the first voice information is used to request a voice conversation with a first user. When the electronic device recognizes that the first voice information is voice information of the second user, the electronic device may have a voice conversation with the second user by imitating a voice of the first user and in a mode in which the first user has a voice conversation with the second user.

    Sentence paraphrase method and apparatus, and method and apparatus for training sentence paraphrase model

    Publication No.: US12175188B2

    Publication date: 2024-12-24

    Application No.: US17701775

    Filing date: 2022-03-23

    Abstract: This disclosure relates to a natural language processing technology, and provides a sentence paraphrase method and apparatus. The method includes: paraphrasing an input sentence by using a sentence paraphrase model, to generate a plurality of candidate paraphrased sentences; and determining a similarity between each of the plurality of candidate paraphrased sentences and the input sentence, to obtain an output sentence whose similarity to the input sentence is greater than or equal to a preset threshold, where each of a plurality of paraphrased sentence generators in the sentence paraphrase model includes one neural network, the plurality of paraphrased sentence generators are trained by using source information and similarity information as a first reward, and the paraphrased sentence is obtained by paraphrasing the training sentence by using the plurality of paraphrased sentence generators. The sentence paraphrase method can improve both the diversity and the quality of paraphrased sentences.

    Data Processing Method and Related Device

    Publication No.: US20240386274A1

    Publication date: 2024-11-21

    Application No.: US18787328

    Filing date: 2024-07-29

    Abstract: A data processing method includes processing target data through a target neural network to obtain a data processing result, where a target header of the target neural network is used to process, through a first transformation matrix, a first vector corresponding to first subdata, and process, through a second transformation matrix, a second vector corresponding to the first subdata, where the first vector corresponds to position information of the first subdata in the target data, and the second vector corresponds to semantic information of the first subdata.
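
The core idea in this abstract, that position information and semantic information of the same piece of subdata pass through separate transformation matrices, can be sketched in a few lines. The abstract does not say how the two projections are combined afterwards; the element-wise sum below is an assumption made only for illustration, and all names are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def project_subdata(position_vec, semantic_vec, w_position, w_semantic):
    """Apply a separate transformation matrix to each vector of one subdata item."""
    projected_position = matvec(w_position, position_vec)   # first matrix, first vector
    projected_semantic = matvec(w_semantic, semantic_vec)   # second matrix, second vector
    # Assumption: the two projections are summed; the abstract leaves this open.
    combined = [p + s for p, s in zip(projected_position, projected_semantic)]
    return projected_position, projected_semantic, combined

pos, sem, both = project_subdata(
    [3.0, 4.0], [5.0, 6.0],
    w_position=[[1.0, 0.0], [0.0, 2.0]],
    w_semantic=[[0.0, 1.0], [1.0, 0.0]],
)
```

Keeping the two matrices separate lets position and content be weighted independently, in contrast to designs that add a position embedding to the token embedding before a single shared projection.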

DATA PROCESSING METHOD AND RELATED DEVICE

    Document type: Invention publication

    Publication No.: US20230229898A1

    Publication date: 2023-07-20

    Application No.: US18186942

    Filing date: 2023-03-20

    CPC classification number: G06N3/0499 G06N3/08

    Abstract: A data processing method includes: obtaining to-be-processed data and a target neural network model, where the target neural network model includes a first transformer layer, the first transformer layer includes a first residual branch and a second residual branch, the first residual branch includes a first attention head, and the second residual branch includes a target feed-forward network (FFN) layer; and performing target task related processing on the to-be-processed data based on the target neural network model, to obtain a data processing result, where the target neural network model is for performing a target operation on an output of the first attention head and a first weight value to obtain an output of the first residual branch, and/or the target neural network model is for performing a target operation on an output of the target FFN and a second weight value to obtain an output of the second residual branch.
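
The weighted residual branches in this abstract can be sketched as a layer whose attention and FFN outputs are each combined with a scalar weight before being added back to the input. The abstract leaves the "target operation" unspecified; this sketch assumes it is multiplication, and the stand-in `attention_head` and `ffn` callables are hypothetical.

```python
def scale(vector, weight):
    """Multiply every element by a scalar weight (the assumed target operation)."""
    return [weight * v for v in vector]

def add(a, b):
    """Element-wise sum, used for the residual connections."""
    return [x + y for x, y in zip(a, b)]

def weighted_layer(x, attention_head, ffn, head_weight, ffn_weight):
    """One transformer-style layer with a weight on each residual branch."""
    h = add(x, scale(attention_head(x), head_weight))  # first residual branch
    return add(h, scale(ffn(h), ffn_weight))           # second residual branch

# Toy stand-ins for the real sub-networks.
double = lambda v: [2.0 * e for e in v]
plus_one = lambda v: [e + 1.0 for e in v]

unchanged = weighted_layer([1.0, 2.0], double, plus_one, head_weight=0.0, ffn_weight=0.0)
full = weighted_layer([1.0, 2.0], double, plus_one, head_weight=1.0, ffn_weight=1.0)
```

Under this assumption, a weight of zero reduces the corresponding branch to the identity path, so the weights indicate how much each head or FFN contributes.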

    Learning-to-rank method based on reinforcement learning and server

    Publication No.: US11500954B2

    Publication date: 2022-11-15

    Application No.: US16538174

    Filing date: 2019-08-12

    Abstract: A learning-to-rank method based on reinforcement learning includes: obtaining, by a server, a historical search word, and obtaining M documents corresponding to the historical search word; ranking, by the server, the M documents to obtain a target document ranking list; obtaining, by the server, a ranking effect evaluation value of the target document ranking list; and using, by the server, the historical search word, the M documents, the target document ranking list, and the ranking effect evaluation value as a training sample, and adding the training sample into a training sample set.
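
The sample-collection steps in this abstract can be sketched as a function that ranks the documents, scores the resulting list, and stores the four pieces together. The ranking and evaluation functions below are toy stand-ins (term count and reciprocal rank); the abstract does not specify either, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class TrainingSample:
    search_word: str
    documents: Tuple[str, ...]
    ranking: Tuple[str, ...]
    evaluation: float

def collect_sample(search_word, documents, rank, evaluate, sample_set):
    """Rank the documents, evaluate the ranking, and add the sample to the set."""
    ranking = rank(search_word, documents)
    evaluation = evaluate(ranking)
    sample = TrainingSample(search_word, tuple(documents), tuple(ranking), evaluation)
    sample_set.append(sample)
    return sample

def toy_rank(search_word, documents):
    """Stand-in ranker: order documents by how often the search word occurs."""
    return sorted(documents, key=lambda doc: doc.split().count(search_word), reverse=True)

def toy_evaluate(ranking, relevant_document):
    """Stand-in ranking effect evaluation: reciprocal rank of one relevant document."""
    return 1.0 / (ranking.index(relevant_document) + 1)

samples = []
docs = ["tea guide", "coffee beans coffee", "coffee shop"]
collect_sample("coffee", docs, toy_rank, lambda r: toy_evaluate(r, "coffee shop"), samples)
```

The abstract stops at adding the sample to the set; how the set is then used to train the ranking model is outside this sketch.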
