Invention Grant
- Patent Title: Method and apparatus for processing dataset
- Application No.: US17133869
- Application Date: 2020-12-24
- Publication No.: US11663258B2
- Publication Date: 2023-05-30
- Inventor: Zhe Hu, Cheng Peng, Xuefeng Luo
- Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO LTD
- Applicant Address: CN Beijing
- Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
- Current Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
- Current Assignee Address: CN Beijing
- Agency: Lippes Mathias LLP
- Priority: CN 2010430339.0 2020.05.20
- Main IPC: G06F16/35
- IPC: G06F16/35; G06F16/242; G06F16/22; G06F16/2455; G06V30/414; G06F18/214

Abstract:
Disclosed are a method and an apparatus for processing a dataset. The method includes: obtaining, from multiple text blocks provided by a target user, a first text set that meets a preset similarity matching condition with a target text; obtaining a second text set from the first text set, in which no text in the second text set belongs to the same text block as the target text; generating a negative sample set of the target text based on the content of the candidate text block to which each text in the second text set belongs; generating a positive sample set of the target text based on the content of the target text block to which the target text belongs; and generating a dataset of the target user based on the negative sample set and the positive sample set, and training a matching model based on the dataset.
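
The sampling steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not specify the "preset similarity matching condition", so token-level Jaccard similarity with a fixed threshold is assumed here. The names `jaccard`, `build_dataset`, and `threshold`, and the choice to pair the target with every text in a block, are hypothetical and chosen only for illustration. Training the matching model is left out.

```python
from typing import Callable, List, Tuple

Sample = Tuple[str, str, int]  # (target text, other text, label)


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity (assumed stand-in for the preset condition)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def build_dataset(
    blocks: List[List[str]],
    target: str,
    threshold: float = 0.3,  # assumed value; the abstract gives no number
    similarity: Callable[[str, str], float] = jaccard,
) -> List[Sample]:
    # Locate the target text block, i.e. the block that contains the target text.
    target_idx = next((i for i, b in enumerate(blocks) if target in b), None)
    if target_idx is None:
        raise ValueError("target text not found in any text block")

    # First text set: texts that meet the similarity condition with the target.
    first_set = [
        (i, text)
        for i, block in enumerate(blocks)
        for text in block
        if text != target and similarity(text, target) >= threshold
    ]

    # Second text set: keep only texts from a different block than the target.
    second_set = [(i, text) for i, text in first_set if i != target_idx]

    # Negative samples: content of each candidate block that holds such a text.
    candidate_ids = sorted({i for i, _ in second_set})
    negatives = [(target, text, 0) for i in candidate_ids for text in blocks[i]]

    # Positive samples: content of the target's own block.
    positives = [(target, text, 1) for text in blocks[target_idx] if text != target]

    return positives + negatives


# Hypothetical input: each inner list is one text block from the target user.
blocks = [
    ["how do I reset my password", "I forgot my password", "password reset help"],
    ["how do I reset my router", "router keeps rebooting"],
    ["what are your opening hours"],
]
target = "how do I reset my password"
dataset = build_dataset(blocks, target)
for sample in dataset:
    print(sample)
```

In this sketch, negatives come only from blocks that contain a text similar to the target but that are not the target's own block, so unrelated blocks (the opening-hours block here) contribute nothing. The resulting labeled pairs are what a matching model would then be trained on.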
Public/Granted literature
- US20210365444A1 METHOD AND APPARATUS FOR PROCESSING DATASET (Public/Granted day: 2021-11-25)