Publication No.: US11151179B2
Publication Date: 2021-10-19
Application No.: US16362584
Filing Date: 2019-03-22
Inventors: Shuangjie Li, Yabing Shi, Haijin Liang, Yang Zhang, Yong Zhu
IPC: G06F16/335
Abstract: Provided are a method, an apparatus, and an electronic device for determining a knowledge sample data set. The method includes: acquiring a preset number of SPO (subject-predicate-object) triplet formats and source texts; acquiring, according to the SPO triplet formats, n SPO entries corresponding to those formats; searching the source texts for m first texts that match the n SPO entries, and generating a first knowledge sample data set; determining, from the m first texts, k second texts that meet the SPO triplet formats, and generating a second knowledge sample data set; and generating a target knowledge sample data set according to the first knowledge sample data set and the second knowledge sample data set. In the embodiments, the knowledge sample data set is generated automatically, so samples are produced quickly and in bulk, at low cost, and at a scale large enough to meet training requirements.
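The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not define how a text "matches" an entry or "meets" a format, so the sketch assumes that a first text is one containing both the subject and the object of an entry, that a second text is a first text that also contains the predicate phrase, and that the target set is the first set with each sample flagged by whether it is also in the second set. All names (`SPOFormat`, `SPOEntry`, `acquire_entries`, `first_sample_set`, `second_sample_set`, `target_sample_set`) and the toy data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SPOFormat:
    """A triplet format, e.g. (person, born in, place). Hypothetical type."""
    subject_type: str
    predicate: str
    object_type: str


@dataclass(frozen=True)
class SPOEntry:
    """A concrete triplet taken from a knowledge base. Hypothetical type."""
    subject: str
    predicate: str
    obj: str


def acquire_entries(formats, knowledge_base):
    # Assumption: an entry corresponds to a format when the predicates agree.
    predicates = {f.predicate for f in formats}
    return [e for e in knowledge_base if e.predicate in predicates]


def first_sample_set(entries, source_texts):
    # Assumption: a text "matches" an entry if it mentions subject and object.
    return [(t, e) for t in source_texts for e in entries
            if e.subject in t and e.obj in t]


def second_sample_set(first_set):
    # Assumption: a text "meets the format" if it also states the predicate.
    return [(t, e) for t, e in first_set if e.predicate in t]


def target_sample_set(first_set, second_set):
    # Assumption: the target set keeps every first sample and flags the ones
    # that were confirmed by the second set.
    confirmed = set(second_set)
    return [(t, e, (t, e) in confirmed) for t, e in first_set]


# Toy data, invented for illustration only.
formats = [SPOFormat("person", "born in", "place")]
knowledge_base = [
    SPOEntry("Marie Curie", "born in", "Warsaw"),
    SPOEntry("Paris", "capital of", "France"),
]
source_texts = [
    "Marie Curie was born in Warsaw in 1867.",
    "Marie Curie left Warsaw for Paris in 1891.",
    "Paris is the capital of France.",
]

entries = acquire_entries(formats, knowledge_base)   # n entries
first = first_sample_set(entries, source_texts)      # m first texts
second = second_sample_set(first)                    # k second texts
target = target_sample_set(first, second)
```

With this toy data, n = 1, m = 2, and k = 1: both Marie Curie sentences mention the subject and the object, but only the first states the predicate. A real system would likely replace the substring tests with entity linking and a learned or rule-based relation check.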