-
Publication number: US20230080904A1
Publication date: 2023-03-16
Application number: US18054608
Filing date: 2022-11-11
Inventor: Yaqian HAN , Shuohuan WANG , Yu SUN
Abstract: A method for generating a cross-lingual textual semantic model includes: acquiring a set of training data that includes pieces of monolingual non-parallel text and pieces of bilingual parallel text; determining a semantic vector of each piece of text in the set of training data by inputting each piece of text into an initial textual semantic model; determining a distance between semantic vectors of each two pieces of text in the set of training data based on the semantic vector of each piece of text in the set of training data; determining a gradient modification based on a parallel relationship between each two pieces of text in the set of training data and the distance between the semantic vectors of each two pieces of text in the set of training data; and acquiring a modified textual semantic model by modifying the initial textual semantic model based on the gradient modification.
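The abstract describes pulling parallel text pairs together and separating non-parallel ones based on the distances between their semantic vectors. A minimal sketch of that idea, assuming a contrastive-style objective (the loss form, the `margin` parameter, and Euclidean distance are illustrative assumptions, not taken from the patent):

```python
import math

def distance(u, v):
    # Euclidean distance between two semantic vectors
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def contrastive_loss(vectors, parallel, margin=1.0):
    """Hypothetical objective: parallel pairs are pulled together,
    non-parallel pairs are pushed apart up to a margin."""
    loss = 0.0
    n = len(vectors)
    for i in range(n):
        for j in range(i + 1, n):
            d = distance(vectors[i], vectors[j])
            if (i, j) in parallel:
                loss += d ** 2                      # parallel text: minimize distance
            else:
                loss += max(0.0, margin - d) ** 2   # non-parallel: enforce a margin
    return loss
```

In the patent's terms, the gradient of such a loss with respect to the model parameters would play the role of the "gradient modification" used to update the initial textual semantic model.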
-
Publication number: US20220300697A1
Publication date: 2022-09-22
Application number: US17835717
Filing date: 2022-06-08
Inventor: Yukun LI , Han ZHANG , Weichong YIN , Dongling XIAO , Yu SUN , Hao TIAN
Abstract: A method for generating a target object is provided. A first discrete encoded sequence corresponding to an original object is generated by performing discrete encoding on the original object. The original object is of an image type, a text type, or a text-image-combined type. A second discrete encoded sequence is obtained by inputting the first discrete encoded sequence into a generative model. A target object is generated based on the second discrete encoded sequence. The target object is of an image type or a text type. When the original object is of the image type, the target object is of the text type. When the original object is of the text type, the target object is of the image type.
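The abstract's pipeline has three stages: discrete-encode the original object, run the generative model, decode to the target modality. A minimal sketch of that shape, with purely illustrative stand-ins for the encoder, model, and decoder (none of these functions come from the patent):

```python
def generate_target(original, encoder, model, decoder):
    """Pipeline from the abstract: discrete-encode the original object,
    run the generative model, decode into the target modality."""
    first_seq = encoder(original)    # first discrete encoded sequence
    second_seq = model(first_seq)    # second discrete encoded sequence
    return decoder(second_seq)       # target object (image <-> text)

# Toy stand-ins, purely illustrative:
to_ids = lambda text: [ord(c) for c in text]          # "tokenizer"
identity_model = lambda seq: list(seq)                # "generative model"
to_text = lambda seq: "".join(chr(i) for i in seq)    # "decoder"
```

In the actual method the encoder and decoder would bridge modalities (image codes to text tokens or vice versa) rather than round-trip the same data.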
-
Publication number: US20220129753A1
Publication date: 2022-04-28
Application number: US17572921
Filing date: 2022-01-11
Inventor: Yuxiang LU , Jiaxiang LIU , Xuyi CHEN , Shikun FENG , Shuohuan WANG , Yu SUN , Shiwei HUANG , Jingzhou HE
Abstract: A pre-training method for a neural network model, an electronic device, and a medium are provided. Pre-training data is input into an initial neural network model, which is pre-trained in a first training mode in which a plurality of hidden layers share one hidden layer parameter, and a loss value of the initial neural network model is obtained. If the loss value is less than a preset threshold, the initial neural network model continues to be pre-trained in a second training mode, in which each of the plurality of hidden layers has its own hidden layer parameter.
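The two-mode schedule can be sketched as a small control loop: train with tied hidden-layer parameters until the loss crosses the threshold, then untie the layers and continue. The `StubModel` class and its method names (`tie_hidden_layers`, `untie_hidden_layers`, `train_epoch`) are assumptions invented for illustration, not an API from the patent:

```python
class StubModel:
    """Minimal stand-in for the neural network model (assumed API).
    Yields a fixed sequence of loss values to simulate training."""
    def __init__(self, losses):
        self._losses = iter(losses)
        self.shared = True          # whether hidden layers share parameters

    def tie_hidden_layers(self):
        self.shared = True

    def untie_hidden_layers(self):
        self.shared = False

    def train_epoch(self, data):
        return next(self._losses)   # loss after one epoch

def pretrain(model, data, threshold):
    """Two training modes from the abstract: mode 1 shares one hidden layer
    parameter across layers until the loss drops below the threshold,
    then mode 2 gives each hidden layer its own parameter."""
    model.tie_hidden_layers()               # first training mode
    loss = model.train_epoch(data)
    while loss >= threshold:
        loss = model.train_epoch(data)
    model.untie_hidden_layers()             # second training mode
    return model.train_epoch(data)
```

The tied first phase reduces the number of trainable parameters early on; untying afterwards lets each layer specialize once a reasonable starting point has been reached.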
-
Publication number: US20250094534A1
Publication date: 2025-03-20
Application number: US18968798
Filing date: 2024-12-04
Inventor: Linhao ZHANG , Yilong CHEN , Junyuan SHANG , Yinqi YANG , Shuohuan WANG , Yu SUN
IPC: G06F17/16
Abstract: A task execution method for a large model relates to the fields of artificial intelligence, deep learning, and large model technologies, and includes executing attention tasks in a task group to be fused using a target computing unit to obtain attention features, where each attention task corresponds to a weighted matrix to be fused, and the weighted matrix to be fused is obtained by weighting a matrix to be fused using a weight; obtaining a processing result according to the attention features; determining loss information according to the processing result; and weighting and fusing matrices to be fused using the target computing unit according to weights for the task group to be fused if the loss information converges, to obtain a fusion matrix for a target task group, where a target task in the target task group is executed by the target computing unit according to the fusion matrix.
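The fusion step described above amounts to a weighted combination of the matrices to be fused into a single fusion matrix. A minimal sketch of that operation on plain nested lists (the element-wise weighted sum is an assumption about the fusion; the patent does not spell out the exact operator here):

```python
def fuse_matrices(matrices, weights):
    """Weighted fusion: element-wise weighted sum of the matrices
    to be fused, producing one fusion matrix for the task group."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    fused = [[0.0] * cols for _ in range(rows)]
    for m, w in zip(matrices, weights):
        for i in range(rows):
            for j in range(cols):
                fused[i][j] += w * m[i][j]
    return fused
```

Executing one fused matrix for the whole task group, rather than one matrix per task, is what lets the target computing unit serve several attention tasks with a single set of weights.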
-
Publication number: US20240028909A1
Publication date: 2024-01-25
Application number: US18478833
Filing date: 2023-09-29
IPC: G06N3/096
CPC classification number: G06N3/096
Abstract: A data generation method based on a deep learning model and a training method is provided. The data generation method includes: determining an initial input of the deep learning model based on input data; obtaining a first output of the model, where in response to the model determining that generating a reply based on the initial input requires calling a first functional component different from the deep learning model, the first output includes a first token for calling the first functional component and a first intermediate inquiry determined based on the initial input and recognizable by the first functional component; obtaining a first intermediate result determined by the first functional component based on the first intermediate inquiry; determining a second input for the model based on the initial input and the first intermediate result; and obtaining a second output of the model for generating a reply to the initial input.
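The control flow in the abstract is a call-and-return loop: the model emits a token requesting an external functional component, the component's intermediate result is appended to the input, and the model is run again to produce the final reply. A minimal sketch under stated assumptions (the `CALL_TOKEN` marker, the toy model, and the calculator component are all hypothetical, not taken from the patent):

```python
CALL_TOKEN = "<call>"  # hypothetical marker for a functional-component call

def generate_reply(model, component, user_input):
    """Loop from the abstract: if the first output requests an external
    functional component, run it and feed the intermediate result back."""
    output = model(user_input)                     # first output
    if output.startswith(CALL_TOKEN):
        query = output[len(CALL_TOKEN):]           # first intermediate inquiry
        intermediate = component(query)            # first intermediate result
        second_input = user_input + "\n" + intermediate
        output = model(second_input)               # second output: the reply
    return output

# Toy stand-ins: a "model" that first requests a calculation, then answers.
def toy_model(prompt):
    if "\n" not in prompt:
        return CALL_TOKEN + "2+2"
    return "The answer is " + prompt.split("\n")[-1]

def calculator(query):
    return str(eval(query))  # toy functional component; not for untrusted input
```

A production system would of course loop over multiple calls and use a structured protocol rather than string concatenation; the sketch only shows the single round trip the abstract describes.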
-
Publication number: US20230206080A1
Publication date: 2023-06-29
Application number: US18118339
Filing date: 2023-03-07
Inventor: Shuohuan WANG , Weibao GONG , Zhihua WU , Yu SUN , Siyu DING , Yaqian HAN , Yanbin ZHAO , Yuang LIU , Dianhai YU
Abstract: A model training system includes at least one first cluster and a second cluster communicating with the at least one first cluster. The at least one first cluster is configured to acquire a sample data set, generate training data according to the sample data set, and send the training data to the second cluster; and the second cluster is configured to train a pre-trained model according to the training data sent by the at least one first cluster.
-
Publication number: US20210312139A1
Publication date: 2021-10-07
Application number: US17353884
Filing date: 2021-06-22
Inventor: Shuohuan WANG , Siyu DING , Junyuan SHANG , Yu SUN
Abstract: A method and apparatus of generating a semantic feature, a method and apparatus of training a model, an electronic device, and a storage medium are provided. The method of generating the semantic feature includes: segmenting a target document to obtain a segment sequence of the target document; generating a semantic feature of each document segment in the segment sequence of the target document by using a pre-trained bidirectional semantic encoding model; and acquiring the semantic feature of the target document based on the semantic feature of each document segment in the segment sequence of the target document. The present disclosure further provides a method of training a bidirectional semantic encoding model.
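The method above is a segment-encode-aggregate pipeline: split the document into a segment sequence, encode each segment, then combine the segment features into one document-level feature. A minimal sketch with the encoder and aggregation passed in as parameters (fixed-length slicing and the toy `len`/`sum` stand-ins are illustrative assumptions; the patent does not specify the segmentation rule or the aggregation):

```python
def document_feature(document, segment_len, encode, aggregate):
    """Pipeline from the abstract: segment the target document, encode
    each segment, aggregate into one document-level semantic feature."""
    segments = [document[i:i + segment_len]
                for i in range(0, len(document), segment_len)]
    features = [encode(seg) for seg in segments]   # per-segment features
    return aggregate(features)                     # document-level feature
```

In the patented method, `encode` would be the pre-trained bidirectional semantic encoding model and each feature a vector rather than a scalar; the pipeline shape is the same.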
-