-
Publication No.: US20250094792A1
Publication Date: 2025-03-20
Application No.: US18968790
Filing Date: 2024-12-04
Inventor: Bo KE , Xuyi CHEN , Zhengjie HUANG , Shikun FENG , Weibin LI , Shiwei HUANG
IPC: G06N3/0495 , G06N3/0475 , G06N3/0499 , G06N3/09
Abstract: A task execution method for a large model, an electronic device, and a storage medium are provided, which relate to the field of artificial intelligence technology, particularly to the fields of deep learning technology and large model technology. The method includes: executing a modality routing task by using a target computing unit based on a target feature to be processed to obtain a modality recognition result; executing a field routing task by using the target computing unit based on the target feature to be processed and a target field gating model parameter to obtain a field recognition result; and executing a feedforward task by using the target computing unit based on the target feature to be processed and a target feedforward task model parameter to obtain a task execution result.
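The three-stage pipeline in the abstract (modality routing, then field routing, then a feedforward task on the selected parameters) can be sketched as follows. The helper names, the linear-gate-plus-argmax routing, and the ReLU feedforward are illustrative assumptions, not the patented implementation.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dot(w_row, x):
    return sum(wi * xi for wi, xi in zip(w_row, x))

def route(feature, gating_params):
    """Score each option with a linear gate and pick the argmax."""
    scores = [dot(row, feature) for row in gating_params]
    probs = softmax(scores)
    return max(range(len(probs)), key=lambda i: probs[i])

def execute_task(feature, modality_gate, field_gates, ffn_params):
    # Step 1: modality routing -> modality recognition result
    modality = route(feature, modality_gate)
    # Step 2: field routing with the gating parameters for that modality
    field = route(feature, field_gates[modality])
    # Step 3: feedforward task with the parameters selected for that field
    w = ffn_params[modality][field]
    return [max(0.0, dot(row, feature)) for row in w]  # ReLU feedforward
```

Here the feedforward parameters are indexed by both routing results, which mirrors how the abstract ties the task execution result to the modality and field recognition results.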
-
Publication No.: US20230030471A1
Publication Date: 2023-02-02
Application No.: US17698242
Filing Date: 2022-03-18
Inventor: Jiaxiang LIU , Shikun FENG
IPC: G06F40/284
Abstract: The present disclosure provides a text processing method and apparatus, an electronic device and a storage medium, and relates to the field of artificial intelligence technologies such as deep learning and natural language processing. The method may include: configuring, for a to-be-processed text, respective attention patterns for the heads of a Transformer model using a multi-head attention mechanism, wherein at least one head is assigned an attention pattern different from those of the other N−1 heads, and N denotes the number of heads and is a positive integer greater than 1; and processing the text by using the Transformer model. Model performance and the corresponding text processing effect can be improved by using the solutions according to the present disclosure.
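Per-head attention patterns as described here can be represented as boolean masks that gate the score matrix before the softmax; a minimal sketch, assuming "full" and sliding-"window" patterns as two example pattern kinds (the actual patterns are not specified in the abstract):

```python
import math

def make_pattern(n, kind, window=1):
    """Build a boolean attention pattern: 'full' or sliding-'window'."""
    if kind == "full":
        return [[True] * n for _ in range(n)]
    return [[abs(i - j) <= window for j in range(n)] for i in range(n)]

def attention_scores(q, k, mask):
    """Masked scaled dot-product scores for one head."""
    d = len(q[0])
    out = []
    for i, qi in enumerate(q):
        row = []
        for j, kj in enumerate(k):
            s = sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d)
            # disallowed positions get -inf, so softmax assigns them zero weight
            row.append(s if mask[i][j] else float("-inf"))
        out.append(row)
    return out

# Assign each head its own pattern; at least one differs from the other N-1.
n_heads, seq_len = 4, 5
patterns = [make_pattern(seq_len, "full")] * (n_heads - 1) + \
           [make_pattern(seq_len, "window")]
```

Each head would then run its usual attention computation but with its own mask applied to the scores.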
-
Publication No.: US20230177821A1
Publication Date: 2023-06-08
Application No.: US18063564
Filing Date: 2022-12-08
Inventor: Qiming PENG , Bin LUO , Yuhui CAO , Shikun FENG , Yongfeng CHEN
CPC classification number: G06V10/82 , G06V30/19147 , G06V30/1444
Abstract: A neural network training method and a document image understanding method are provided. The neural network training method includes: acquiring text comprehensive features of a plurality of first texts in an original image; replacing at least one original region in the original image to obtain a sample image including a plurality of first regions and a ground truth label for indicating whether each first region is a replaced region; acquiring image comprehensive features of the plurality of first regions; inputting the text comprehensive features of the plurality of first texts and the image comprehensive features of the plurality of first regions into a neural network model together to obtain text representation features of the plurality of first texts; determining a predicted label based on the text representation features of the plurality of first texts; and training the neural network model based on the ground truth label and the predicted label.
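The sample-construction step (replace some regions, record a per-region ground-truth label) can be sketched as below; the probabilistic replacement rule and the donor-region pool are illustrative assumptions, since the abstract does not say how replacement regions are chosen.

```python
import random

def make_sample(regions, donor_regions, replace_prob, rng):
    """Replace some original regions with donor regions and record a
    binary ground-truth label per region (1 = replaced, 0 = original)."""
    sample, labels = [], []
    for region in regions:
        if rng.random() < replace_prob:
            sample.append(rng.choice(donor_regions))
            labels.append(1)
        else:
            sample.append(region)
            labels.append(0)
    return sample, labels
```

The model's predicted labels for the same regions would then be compared against `labels` to form the training loss.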
-
Publication No.: US20230004774A1
Publication Date: 2023-01-05
Application No.: US17578683
Filing Date: 2022-01-19
Inventor: Weibin LI , Zhifan ZHU , Shikun FENG , Shiwei HUANG , Jingzhou HE
Abstract: The present disclosure provides a method and apparatus for generating a node representation, an electronic device and a readable storage medium, and relates to the field of deep learning technologies. The method for generating a node representation includes: acquiring a heterogeneous graph to be processed; performing a sampling operation in the heterogeneous graph to be processed according to a first meta path, so as to obtain at least one first walk path; obtaining an initial node representation of each node in the heterogeneous graph to be processed according to the at least one first walk path; and generating a final node representation of each node according to the initial node representation of each node and the initial node representations of neighbor nodes of each node. With the present disclosure, accuracy of the generated node representation may be improved.
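A meta-path-guided walk and a neighbor-aggregation step can be sketched as follows; treating the meta path as a sequence of node types and aggregating by averaging are illustrative assumptions.

```python
import random

def sample_walk(graph, node_types, start, meta_path, rng):
    """Random walk that only steps to neighbors whose node type matches
    the next type in the meta path (e.g. author -> paper -> author)."""
    walk = [start]
    cur = start
    for wanted in meta_path[1:]:
        nbrs = [n for n in graph.get(cur, []) if node_types[n] == wanted]
        if not nbrs:
            break  # walk ends early if no neighbor has the wanted type
        cur = rng.choice(nbrs)
        walk.append(cur)
    return walk

def final_representation(init, graph, node):
    """Aggregate the node's initial representation with its neighbors'
    (here: a plain average) to produce the final node representation."""
    vecs = [init[node]] + [init[n] for n in graph.get(node, [])]
    dim = len(init[node])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
```

In the described method the initial representations themselves would come from the sampled walk paths (for instance via a skip-gram-style embedding); here they are taken as given.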
-
Publication No.: US20220382991A1
Publication Date: 2022-12-01
Application No.: US17883908
Filing Date: 2022-08-09
Inventor: Qiming PENG , Bin LUO , Yuhui CAO , Shikun FENG , Yongfeng CHEN
IPC: G06F40/30 , G06V30/414 , G06V30/14
Abstract: The present disclosure provides a training method and apparatus for a document processing model, a device, a storage medium and a program, which relate to the field of artificial intelligence, and in particular, to technologies such as deep learning, natural language processing and text recognition. The specific implementation is: acquiring a first sample document; determining element features of a plurality of document elements in the first sample document and positions corresponding to M position types of each document element according to the first sample document; where the document element corresponds to a character or a document area in the first sample document; and performing training on a basic model according to the element features of the plurality of document elements and the positions corresponding to the M position types of each document element to obtain the document processing model.
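The notion of M position types per document element can be made concrete as below; taking M = 2 (a 1D reading-order position and a 2D bounding-box position) is an illustrative assumption, as the abstract leaves the position types unspecified.

```python
def element_position_features(elements):
    """For each document element (a character or document area, given as
    text plus a bounding box), derive positions for M = 2 assumed
    position types: 1D reading order and 2D box coordinates."""
    feats = []
    for order, (text, box) in enumerate(elements):
        feats.append({"text": text, "pos_1d": order, "pos_2d": box})
    return feats
```

The element features and these per-type positions would jointly form the input on which the basic model is trained.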
-
Publication No.: US20230177359A1
Publication Date: 2023-06-08
Application No.: US18063348
Filing Date: 2022-12-08
Inventor: Sijin WU , Han LIU , Teng HU , Shikun FENG , Yongfeng CHEN
IPC: G06N5/022 , G06F40/174 , G06F40/205
CPC classification number: G06N5/022 , G06F40/174 , G06F40/205 , G06F40/30
Abstract: The present disclosure provides a method and apparatus for training a document information extraction model and a method and apparatus for extracting document information, and relates to the field of artificial intelligence, and more particularly to the field of natural language processing. A specific implementation solution is: acquiring training data labeled with an answer corresponding to a preset question and a document information extraction model, wherein the training data includes layout document training data and streaming document training data; extracting at least one feature from the training data; fusing the at least one feature to obtain a fused feature; inputting the preset question, the fused feature and the training data into the document information extraction model to obtain a predicted result; and adjusting network parameters of the document information extraction model based on the predicted result and the answer.
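The feature-fusion step can be sketched as a weighted combination of per-source feature vectors; the weighted-sum form (rather than, say, concatenation or attention-based fusion) is an illustrative assumption.

```python
def fuse_features(features, weights=None):
    """Fuse equal-length feature vectors by weighted sum; when no
    weights are given, the sources are averaged uniformly."""
    n = len(features)
    weights = weights or [1.0 / n] * n
    dim = len(features[0])
    return [sum(w * f[i] for w, f in zip(weights, features))
            for i in range(dim)]
```

The fused vector would then be fed, together with the preset question and the raw training data, into the extraction model.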
-
Publication No.: US20230135536A1
Publication Date: 2023-05-04
Application No.: US18089395
Filing Date: 2022-12-27
Inventor: Chenhui LI , Teng HU , Shikun FENG , Yongfeng CHEN
Abstract: A method and an apparatus for processing a table are provided. The method includes: obtaining text information of cells in the table; obtaining structure information of the cells in the table; and inputting a query word, the text information, and the structure information of the table into a table information extraction model to obtain an answer output from the table information extraction model, wherein the output answer corresponds to the query word in the table.
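Combining the query word with cell text and cell structure into one model input can be sketched as a serialization step; the special-token markup below is an illustrative assumption, since the abstract does not specify the input encoding.

```python
def build_table_input(query, cells):
    """Serialize a query plus (row, col)-addressed cell texts into one
    token sequence, the form many table-QA models consume.

    cells: list of ((row, col), cell_text) pairs carrying both the
    text information and the structure information of the table."""
    tokens = ["[Q]"] + query.split()
    for (row, col), text in cells:
        tokens += [f"[CELL:{row},{col}]"] + text.split()
    return tokens
```

A table information extraction model would consume this sequence and return the answer span corresponding to the query word.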
-
Publication No.: US20220129753A1
Publication Date: 2022-04-28
Application No.: US17572921
Filing Date: 2022-01-11
Inventor: Yuxiang LU , Jiaxiang LIU , Xuyi CHEN , Shikun FENG , Shuohuan WANG , Yu SUN , Shiwei HUANG , Jingzhou HE
Abstract: A pre-training method for a neural network model, an electronic device, and a medium are provided. Pre-training data is input into an initial neural network model, which is pre-trained in a first training mode in which a plurality of hidden layers share one hidden layer parameter, and a loss value of the initial neural network model is obtained. If the loss value of the initial neural network model is less than a preset threshold, the initial neural network model continues to be pre-trained in a second training mode, in which each of the plurality of hidden layers has its own hidden layer parameter.
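The tied-then-untied schedule can be sketched as a toy training loop; the scalar "parameter", the stand-in updates, and the learning rate are all illustrative assumptions standing in for real gradient steps.

```python
def two_phase_pretrain(n_layers, losses, threshold, lr=0.1):
    """Toy two-phase schedule: phase 1 ties every hidden layer to one
    shared parameter; once the loss drops below the threshold, the
    shared value is copied so each layer trains its own parameter."""
    shared, params, phase = 0.0, None, 1
    for loss in losses:
        if phase == 1:
            shared += lr  # stand-in for one tied update to the shared parameter
            if loss < threshold:
                params = [shared] * n_layers  # untie: per-layer copies
                phase = 2
        else:
            # per-layer updates now diverge (stand-in gradients differ)
            params = [p + lr * (i + 1) for i, p in enumerate(params)]
    return phase, params
```

The point of the schedule is that phase 2 starts from the shared value learned in phase 1, so each layer's own parameter is a fine-tuned copy rather than a fresh initialization.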
-
Publication No.: US20250028958A1
Publication Date: 2025-01-23
Application No.: US18908380
Filing Date: 2024-10-07
Inventor: Xuyi CHEN , Bo KE , Chenhui LI , Zhengjie HUANG , Shiwei HUANG , Weibin LI , Shikun FENG
IPC: G06N3/08
Abstract: A data processing method, and a data processing model and a training method therefor are provided, and relate to the field of artificial intelligence, and specifically, to natural language processing, deep learning technologies, and large model technologies. An implementation solution includes: determining input data, where the input data includes a plurality of tokens; determining a correlation between each of the plurality of tokens and each of a plurality of expert networks based on a gating matrix, where the plurality of expert networks are used to reinforce the plurality of tokens; allocating the plurality of tokens to the plurality of expert networks in a uniform manner based on the correlation and a preset capacity of each expert network, to reinforce the plurality of tokens; and determining a data processing result based on the plurality of reinforced tokens.
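The capacity-aware allocation step can be sketched as below; the greedy in-order assignment (each token to its highest-scoring expert that still has room) is one illustrative way to keep the load uniform, not necessarily the patented rule.

```python
def allocate_tokens(scores, capacity):
    """Assign each token to an expert based on its gating scores while
    respecting a preset per-expert capacity.

    scores: scores[token][expert], e.g. rows of the gating matrix.
    Returns the chosen expert index per token (None if all are full)."""
    n_experts = len(scores[0])
    load = [0] * n_experts
    assign = []
    for row in scores:
        # prefer experts by descending correlation score
        order = sorted(range(n_experts), key=lambda e: row[e], reverse=True)
        chosen = next((e for e in order if load[e] < capacity), None)
        if chosen is not None:
            load[chosen] += 1
        assign.append(chosen)
    return assign
```

Each assigned expert network would then reinforce the tokens routed to it, and the reinforced tokens would be combined into the data processing result.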
-
Publication No.: US20220293092A1
Publication Date: 2022-09-15
Application No.: US17828773
Filing Date: 2022-05-31
Inventor: Siyu DING , Chao PANG , Shuohuan WANG , Yanbin ZHAO , Junyuan SHANG , Yu SUN , Shikun FENG , Hao TIAN , Hua WU , Haifeng WANG
Abstract: The present application provides a method of training a natural language processing model, which relates to a field of artificial intelligence, and in particular to a field of natural language processing. A specific implementation scheme includes: performing a semantic learning for multi-tasks on an input text, so as to obtain a semantic feature for the multi-tasks, wherein the multi-tasks include a plurality of branch tasks; performing a feature learning for each branch task based on the semantic feature, so as to obtain a first output result for each branch task; calculating a loss for each branch task according to the first output result for the branch task; and adjusting a parameter of the natural language processing model according to the loss for each branch task. The present application further provides a method of processing a natural language, an electronic device, and a storage medium.
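The shared-feature, per-branch-loss structure can be sketched as follows; the callable branch heads and the simple sum over branch losses are illustrative assumptions (the combination rule is not specified in the abstract).

```python
def multitask_loss(semantic_feature, branch_heads, targets, branch_loss):
    """One shared semantic feature feeds every branch task's head; a
    loss is computed per branch and combined for the parameter update."""
    losses = []
    for head, target in zip(branch_heads, targets):
        out = head(semantic_feature)          # feature learning per branch
        losses.append(branch_loss(out, target))  # loss per branch task
    return sum(losses), losses
```

The combined loss would drive one parameter update of the natural language processing model, so all branch tasks shape the shared semantic feature jointly.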