-
Publication No.: US20250036885A1
Publication Date: 2025-01-30
Application No.: US18758669
Application Date: 2024-06-28
Applicant: Beijing Volcano Engine Technology Co., Ltd.
Inventor: Guolong SONG, Xuan LUO
IPC: G06F40/40
Abstract: The present application discloses a language model processing method and apparatus. The method includes constructing N storage structures to store an N-gram language model. For the ith storage structure, if i is greater than or equal to 1 and less than or equal to N−1, the structure includes a plurality of first nodes, each of which carries information about the ith-order word in a first gram. If i is equal to N, the structure includes a plurality of second nodes, each of which carries information about a second gram, which is an N-gram; that information includes an identifier of the Nth-order word in the second gram and the N-gram probability of the second gram.
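The layered layout described in this abstract can be illustrated with a small sketch. It is not the patented implementation: the abstract does not say what the first nodes hold beyond "information about an ith-order word", so the `child_start` offset, the sorted-array layout, the binary search, and the names `InnerNode`, `LeafNode`, `build_levels`, and `lookup` are all assumptions made for illustration.

```python
from typing import NamedTuple


class InnerNode(NamedTuple):
    """Node in levels 1..N-1. The child_start field is an assumed layout."""
    word_id: int
    child_start: int  # index of this node's first child in the next level


class LeafNode(NamedTuple):
    """Node in level N: identifier of the Nth word plus the N-gram probability."""
    word_id: int
    prob: float


def build_levels(ngrams, n):
    """Build N sorted arrays from a dict {tuple of n word ids: probability}."""
    levels = [[] for _ in range(n)]
    prev = None
    for gram in sorted(ngrams):
        # Find the first position where this gram differs from the previous one.
        diverge = 0
        if prev is not None:
            while diverge < n and gram[diverge] == prev[diverge]:
                diverge += 1
        # Open a new inner node at every level from the divergence point on.
        for i in range(diverge, n - 1):
            levels[i].append(InnerNode(gram[i], len(levels[i + 1])))
        levels[n - 1].append(LeafNode(gram[n - 1], ngrams[gram]))
        prev = gram
    return levels


def _find(level, word_id, lo, hi):
    """Binary search for word_id among level[lo:hi]; returns insertion point."""
    while lo < hi:
        mid = (lo + hi) // 2
        if level[mid].word_id < word_id:
            lo = mid + 1
        else:
            hi = mid
    return lo


def lookup(levels, gram):
    """Return the stored probability of an N-gram, or None if it is absent."""
    if len(gram) != len(levels):
        return None
    lo, hi = 0, len(levels[0])
    for i, word_id in enumerate(gram):
        level = levels[i]
        pos = _find(level, word_id, lo, hi)
        if pos == hi or level[pos].word_id != word_id:
            return None
        if i == len(levels) - 1:
            return level[pos].prob
        # The children of node pos end where the next node's children begin.
        lo = level[pos].child_start
        hi = level[pos + 1].child_start if pos + 1 < len(level) else len(levels[i + 1])
    return None
```

Under these assumptions, a lookup walks one level per word and never stores a probability above level N, which matches the abstract's split between word-carrying first nodes and probability-carrying second nodes.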
-
Publication No.: US20250021878A1
Publication Date: 2025-01-16
Application No.: US18764952
Application Date: 2024-07-05
Applicant: Beijing Volcano Engine Technology Co., Ltd.
Inventor: Guolong SONG, Xuan LUO
IPC: G06N20/00
Abstract: The present invention relates to the technical field of data processing, and discloses a method, apparatus, computer device, and storage medium for generating an augmented sample. The method includes: obtaining a reference sample set to be augmented, and selecting at least two parent samples from the reference sample set; generating a new sample based on the at least two parent samples; updating feature values of a first type of features based on first statistical data of the first type of features in the reference sample set, and updating classification options of a second type of features based on second statistical data of the second type of features in the reference sample set; and generating an augmented sample based on the updated feature values of the first type of features and the updated classification options of the second type of features.
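The steps in this abstract can be sketched as follows. The abstract does not define the two feature types or the statistics used, so this sketch assumes that the first type is numeric, that its "first statistical data" is the column's standard deviation, that the second type is categorical, and that its "second statistical data" is the observed frequency of each option. The crossover rule, the `noise` and `resample_p` parameters, and the function name `augment` are likewise hypothetical.

```python
import random
import statistics
from collections import Counter


def augment(samples, numeric_keys, categorical_keys, rng, noise=0.1, resample_p=0.2):
    """Return one augmented sample (a dict) built from a list of dict samples."""
    keys = list(numeric_keys) + list(categorical_keys)

    # Select two parent samples and mix them feature by feature
    # (an assumed crossover; the abstract only says "based on the parents").
    parents = rng.sample(samples, 2)
    child = {k: rng.choice(parents)[k] for k in keys}

    # First type (assumed numeric): jitter each value with Gaussian noise
    # scaled by the standard deviation of that feature in the reference set.
    for k in numeric_keys:
        std = statistics.pstdev([s[k] for s in samples])
        child[k] = child[k] + rng.gauss(0.0, noise * std)

    # Second type (assumed categorical): occasionally redraw the option
    # according to how often each option occurs in the reference set.
    for k in categorical_keys:
        if rng.random() < resample_p:
            counts = Counter(s[k] for s in samples)
            options, weights = zip(*counts.items())
            child[k] = rng.choices(options, weights=weights, k=1)[0]
    return child


reference_set = [
    {"height": 170, "weight": 65, "color": "red"},
    {"height": 180, "weight": 80, "color": "blue"},
    {"height": 165, "weight": 55, "color": "red"},
]
new_sample = augment(reference_set, ["height", "weight"], ["color"], random.Random(0))
```

Passing an explicit `random.Random` instance keeps the sketch reproducible; a real system would pick the statistics and update rules that suit its data.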
-