-
Publication No.: US20250094877A1
Publication Date: 2025-03-20
Application No.: US18969719
Filing Date: 2024-12-05
Inventor: Fan WANG , Hua WU , Yingzhan LIN , Zengfeng ZENG , Yufeng HU , Jianhui DING , Haifeng WANG
IPC: G06N20/00
Abstract: A large model-based method of generating a text, a method of training a text generation model, a device, and a medium are provided, which relate to a field of artificial intelligence technology, specifically to fields of deep learning, natural language processing and large model technologies. The large model-based method of generating a text includes: acquiring a memory state for a text to be processed, where the memory state is generated based on a previous text of the text to be processed; determining an embedding feature of the text to be processed as an initial hidden state, and processing the memory state and the initial hidden state by using a first attention mechanism to obtain an updated hidden state; and generating a subsequent text for the text to be processed based on the updated hidden state.
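As a rough illustration of the mechanism this abstract describes, the sketch below (pure Python, all names hypothetical, not the patent's actual architecture) treats an embedding as the initial hidden state and updates it with single-head dot-product attention over a memory state built from the previous text:

```python
import math

def attention_update(initial_hidden, memory_state):
    """Update the hidden state by attending over memory rows (toy sketch)."""
    d = len(initial_hidden)
    scores = [sum(q * m for q, m in zip(initial_hidden, row)) / math.sqrt(d)
              for row in memory_state]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # updated hidden state: attention-weighted sum of the memory rows
    return [sum(w * row[i] for w, row in zip(weights, memory_state))
            for i in range(len(memory_state[0]))]

memory_state = [[1.0, 0.0], [0.0, 1.0]]   # stands in for state from the previous text
initial_hidden = [2.0, 0.0]               # stands in for the embedding feature
updated_hidden = attention_update(initial_hidden, memory_state)
```

The updated hidden state would then condition generation of the subsequent text; the real model presumably uses learned projections rather than raw dot products.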
-
Publication No.: US20250094713A1
Publication Date: 2025-03-20
Application No.: US18967529
Filing Date: 2024-12-03
Inventor: Shuohuan WANG , Yekun CHAI , Siyu DING , Junyuan SHANG , Zhenyu ZHANG , Yu SUN , Hao TIAN , Hua WU , Haifeng WANG
IPC: G06F40/284 , G06F16/3329
Abstract: A multimodal data generation method is provided. The method includes: inputting a query data sequence into a multimodal model, to obtain a plurality of tokens in a response data sequence, where a current token is generated through the following operations: inputting the query data sequence and a current response data sequence into the multimodal model, so that the multimodal model generates the current token based on the query data sequence and the current response data sequence, in response to determining that the current token belongs to a first data modality; or inputting the query data sequence and a current response data sequence into the multimodal model, so that the multimodal model denoises an initial token sequence based on the query data sequence and the current response data sequence, to generate a result token sequence, in response to determining that the current token belongs to a second data modality.
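The core idea is a per-token dispatch on modality: autoregressive generation for one modality, iterative denoising of an initial token sequence for the other. A minimal sketch, with all names and the routing rule invented for illustration:

```python
def generate_next(query_seq, response_seq, modality_of):
    """Route generation by the modality of the next token (illustrative only)."""
    modality = modality_of(query_seq, response_seq)
    if modality == "first":                     # e.g. text: one autoregressive token
        return f"<tok{len(response_seq)}>"
    # second modality (e.g. image): denoise an initial token sequence stepwise
    seq = ["<noise>"] * 4
    for i in range(len(seq)):                   # each pass refines the sequence
        seq[i] = f"<img{i}>"
    return seq                                  # result token sequence

# toy router: alternate modalities by response length (a stand-in for the model's decision)
route = lambda q, r: "first" if len(r) % 2 == 0 else "second"
text_token = generate_next(["q"], [], route)
image_tokens = generate_next(["q"], ["<tok0>"], route)
```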
-
Publication No.: US20250004771A1
Publication Date: 2025-01-02
Application No.: US18755148
Filing Date: 2024-06-26
Inventor: Haifeng WANG , Hua WU , Dai DAI , Jing LIU , Hongyu LI , Gangqiang HU
Abstract: A method, apparatus, device, and medium for generating instruction data are provided. The method includes: obtaining a natural language-based reference instruction to direct a large model to generate response data meeting multiple first requirements; obtaining a structured disassembly result of the reference instruction to derive several reference slots and slot values corresponding to these requirements; determining multiple sample slots and sample slot values based on the reference slots, slot values, and a predetermined rule; and generating a natural language-based sample instruction from these sample slots and values, which directs the large model to generate response data that fulfills multiple second requirements.
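The slot-based pipeline above can be sketched in a few lines. Everything here (the extractor, the rule, the slot names) is a hypothetical stand-in for whatever the patent's actual components are:

```python
def disassemble(reference_instruction, extract):
    # structured disassembly: derive reference slots and slot values
    return extract(reference_instruction)

def make_sample_slots(reference_slots, rule):
    # apply a predetermined rule to derive sample slots and sample slot values
    return {slot: rule(slot, value) for slot, value in reference_slots.items()}

reference = "Write a Python function in at most 50 lines."
slots = disassemble(reference, lambda s: {"language": "Python", "max_lines": 50})
# toy rule: double numeric constraints, swap the language
sample_slots = make_sample_slots(
    slots, lambda k, v: 2 * v if isinstance(v, int) else "Java")
sample_instruction = "Write a {language} function in at most {max_lines} lines.".format(
    **sample_slots)
```

The sample instruction then directs the large model toward a new set of requirements without hand-writing it.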
-
-
Publication No.: US20240338564A1
Publication Date: 2024-10-10
Application No.: US18744501
Filing Date: 2024-06-14
Inventor: Zhifan FENG , Hua WU , Qiaoqiao SHE , Tian WU
IPC: G06N3/08
CPC classification number: G06N3/08
Abstract: A large model optimization training method in artificial intelligence fields such as large models, deep learning, and natural language processing may include: taking, as candidate queries, queries that are collected from a predetermined data source and can serve as input to a large model, in response to determining that an optimization triggering condition is met; screening out target queries from the candidate queries, the target queries being queries which cannot be correctly processed by the large model; and constructing corresponding training samples according to the target queries, the training samples being used to carry out optimization training on the large model.
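The screening-and-sampling flow reads as straightforward hard-example mining. A minimal sketch, with the model, correctness check, and reference answers all faked for illustration:

```python
def screen_target_queries(candidate_queries, run_model, is_correct):
    """Keep only the queries the large model cannot process correctly."""
    return [q for q in candidate_queries if not is_correct(q, run_model(q))]

def build_training_samples(target_queries, reference_answer):
    # one training sample per target query, paired with a corrected answer
    return [{"query": q, "answer": reference_answer(q)} for q in target_queries]

candidates = ["2+2?", "capital of France?"]
fake_model = lambda q: "4" if q == "2+2?" else "Berlin"   # gets one query wrong
gold = {"2+2?": "4", "capital of France?": "Paris"}
targets = screen_target_queries(candidates, fake_model, lambda q, a: a == gold[q])
samples = build_training_samples(targets, lambda q: gold[q])
```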
-
Publication No.: US20230140997A1
Publication Date: 2023-05-11
Application No.: US18089392
Filing Date: 2022-12-27
Inventor: Ruiqing ZHANG , Xiyang WANG , Zhongjun HE , Zhi LI , Hua WU
IPC: G06F40/58
CPC classification number: G06F40/58
Abstract: A method and apparatus for selecting a sample corpus used to optimize a translation model, an electronic device, a computer-readable storage medium, and a computer program product are provided. The method includes: after acquiring a first corpus, translating the first corpus into a different language by using a to-be-optimized translation model to acquire a second corpus; translating the second corpus back by using the to-be-optimized translation model to acquire a third corpus; determining a difficulty level of the first corpus based on a similarity between the first corpus and the third corpus; and determining the first corpus as a sample corpus used to perform optimization training on the to-be-optimized translation model, in response to the difficulty level satisfying a difficulty level threshold.
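The round-trip selection criterion can be sketched as below. The Jaccard word overlap is my stand-in for the patent's (unspecified) similarity measure, and the round-trip function stands in for the two model translation passes:

```python
def jaccard(a, b):
    """Word-overlap similarity (a stand-in for the patent's similarity measure)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def select_sample_corpus(first_corpus, round_trip, threshold=0.5):
    # round_trip: first -> second (other language) -> third, via the same model
    third_corpus = round_trip(first_corpus)
    difficulty = 1.0 - jaccard(first_corpus, third_corpus)
    # keep the corpus for optimization training only if it is hard enough
    return first_corpus if difficulty >= threshold else None

easy_round_trip = lambda s: s                      # model reproduces the input
hard_round_trip = lambda s: "totally different words here"
kept = select_sample_corpus("the quick brown fox", hard_round_trip)
dropped = select_sample_corpus("the quick brown fox", easy_round_trip)
```

The intuition: sentences that survive a round trip intact are already easy for the model, so training effort concentrates on the ones that degrade.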
-
Publication No.: US20230008897A1
Publication Date: 2023-01-12
Application No.: US17932598
Filing Date: 2022-09-15
Inventor: Wenbin JIANG , Yajuan LYU , Yong ZHU , Hua WU , Haifeng WANG
IPC: G06F16/735
Abstract: An information search method includes: obtaining search words at least including a question to be searched and obtaining an initial text vector representation of the search words; obtaining a video corresponding to the search words, and obtaining multi-modality vector representations of the video; starting from the initial text vector representation, performing N rounds of interaction between the video and the search words based on the multi-modality vector representations and a text vector representation of the search words of a current round, to generate a target fusion vector representation, where N is an integer greater than or equal to 1; and obtaining target video frames matching the question to be searched by annotating the video based on the target fusion vector representation.
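A toy rendering of the N-round interaction and annotation steps, with averaging and dot products standing in for whatever learned fusion and matching the patent actually uses:

```python
def fuse(text_vec, frame_vecs, n_rounds):
    """N interaction rounds between search words and video frames (toy fusion)."""
    for _ in range(n_rounds):
        # one round: mix the current text vector with the mean frame vector
        mean_frame = [sum(col) / len(frame_vecs) for col in zip(*frame_vecs)]
        text_vec = [(t + m) / 2.0 for t, m in zip(text_vec, mean_frame)]
    return text_vec                      # target fusion vector representation

def annotate(frame_vecs, fusion_vec):
    # pick the frame most aligned with the fusion vector (dot product stand-in)
    dots = [sum(f * v for f, v in zip(frame, fusion_vec)) for frame in frame_vecs]
    return dots.index(max(dots))

frames = [[1.0, 0.0], [0.0, 1.0]]        # multi-modality vectors, one per frame
fusion = fuse([0.0, 4.0], frames, n_rounds=2)
best_frame = annotate(frames, fusion)
```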
-
Publication No.: US20250061311A1
Publication Date: 2025-02-20
Application No.: US18746532
Filing Date: 2024-06-18
Inventor: Zeyang LEI , Siqi BAO , Hua WU , Haifeng WANG
IPC: G06N3/0475 , G06N3/08
Abstract: A data generation method is provided. The data generation method includes: generating first answer data based on first question data from a user; determining, in response to receiving negative feedback from the user for the first answer data, a first reflection result for the first answer data based on the first answer data and the negative feedback, wherein the first reflection result indicates a diagnosis reason why feedback from the user for the first answer data is negative; and generating second answer data for the first question data based on the first question data and the first reflection result.
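The control flow of this reflect-then-retry loop is simple to sketch. All three callbacks below are toy stand-ins for the model components the abstract describes:

```python
def answer_with_reflection(question, answer_fn, feedback_fn, reflect_fn):
    first_answer = answer_fn(question, reflection=None)
    feedback = feedback_fn(first_answer)
    if feedback != "negative":
        return first_answer
    # first reflection result: diagnose why the user's feedback was negative
    reflection = reflect_fn(first_answer, feedback)
    # second answer: regenerate conditioned on the question and the reflection
    return answer_fn(question, reflection=reflection)

def toy_answer(question, reflection):
    return "detailed answer" if reflection else "terse answer"

second = answer_with_reflection(
    "How do transformers work?",
    toy_answer,
    lambda ans: "negative" if ans == "terse answer" else "positive",
    lambda ans, fb: "the answer was too terse",
)
```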
-
-
Publication No.: US20240338530A1
Publication Date: 2024-10-10
Application No.: US18745550
Filing Date: 2024-06-17
Inventor: Zhen GUO , Wenquan WU , Hua WU , Haifeng WANG
Abstract: A generative dialog model training method in artificial intelligence fields such as deep learning, natural language processing, and intelligent dialogs is disclosed. The generative dialog model training method may include: in response to determination of an update of a safety specification, taking the updated safety specification as a target safety specification, and determining a dialog input corresponding to a current optimization according to the target safety specification, the update being performed on a previous safety specification when the generative dialog model after the last optimization is determined not to meet a launch requirement; and optimizing the generative dialog model according to the dialog input and the principle that a reply generated by the generative dialog model conforms to the target safety specification, the generative dialog model being configured to generate the reply corresponding to the dialog input.
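The outer loop (optimize, check launch readiness, tighten the spec, repeat) can be sketched as below; the optimizer, launch check, and spec update are all invented placeholders:

```python
def train_until_launchable(spec, optimize, meets_launch, update_spec, max_rounds=10):
    """Optimize a dialog model against a safety spec, updating the spec as needed."""
    model = None
    for _ in range(max_rounds):
        dialog_input = f"dialog input derived from: {spec}"
        model = optimize(dialog_input, spec)       # replies must conform to the spec
        if meets_launch(model):
            return model, spec
        spec = update_spec(spec)                   # update triggers re-optimization
    return model, spec

model, final_spec = train_until_launchable(
    spec="v1",
    optimize=lambda inp, s: {"spec": s},           # toy "model" records its spec
    meets_launch=lambda m: m["spec"] == "v3",      # launchable only under spec v3
    update_spec=lambda s: f"v{int(s[1:]) + 1}",
)
```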
-
Publication No.: US20240028909A1
Publication Date: 2024-01-25
Application No.: US18478833
Filing Date: 2023-09-29
IPC: G06N3/096
CPC classification number: G06N3/096
Abstract: A data generation method based on a deep learning model and a training method are provided. The data generation method includes: determining an initial input of the deep learning model based on input data; obtaining a first output of the model, where in response to the model determining that generating a reply based on the initial input requires calling a first functional component different from the deep learning model, the first output includes a first token for calling the first functional component and a first intermediate inquiry determined based on the initial input and recognizable by the first functional component; obtaining a first intermediate result determined by the first functional component based on the first intermediate inquiry; determining a second input for the model based on the initial input and the first intermediate result; and obtaining a second output of the model for generating a reply to the initial input.
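This is the familiar tool-calling pattern: the model emits a call token plus an intermediate inquiry, the external component answers, and the result is fed back for a second pass. A toy sketch (the `<call:...>` token format, the component registry, and the stand-in model are all my assumptions):

```python
def toy_model(prompt):
    """Stand-in model: emits a call token + intermediate inquiry, or a final reply."""
    if "intermediate result:" in prompt:
        return "The answer is " + prompt.split("intermediate result:")[-1].strip()
    return "<call:calculator|2+3>"

def generate_reply(model, components, initial_input):
    first_output = model(initial_input)
    if first_output.startswith("<call:"):
        # first output: a call token plus an inquiry the component can recognize
        name, _, inquiry = first_output[len("<call:"):-1].partition("|")
        result = components[name](inquiry)         # first functional component runs
        second_input = f"{initial_input}\nintermediate result: {result}"
        return model(second_input)                 # second output: the actual reply
    return first_output

components = {"calculator": lambda expr: str(eval(expr))}  # toy only; never eval untrusted input
reply = generate_reply(toy_model, components, "What is 2+3?")
```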
-
Publication No.: US20230088445A1
Publication Date: 2023-03-23
Application No.: US18059386
Filing Date: 2022-11-28
Inventor: Zeming LIU , Hao LIU , Zhengyu NIU , Hua WU , Haifeng WANG , Hui XIONG
Abstract: A conversational recommendation method, a method of training a conversational recommendation model, an electronic device, and a storage medium are provided, which relate to the technical field of data processing, in particular to the technical fields of voice interaction, deep learning, artificial intelligence, and the like. The conversational recommendation method includes: acquiring historical conversation information; determining a target conversation object to be generated from a conversation target graph based on the historical conversation information, where the conversation target graph includes object nodes, each object node represents a conversation object, and the target conversation object is determined based on the object nodes; and generating target conversation information for recommendation based on the target conversation object.
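The graph-guided selection step can be sketched as below. The graph contents, node names, and first-successor policy are all illustrative assumptions, not the patent's actual traversal strategy:

```python
conversation_target_graph = {
    # object nodes; edges point at plausible next conversation objects
    "greeting": ["weather_chat", "movie_chat"],
    "movie_chat": ["movie_recommendation"],
    "movie_recommendation": [],
}

def next_target(history, graph):
    """Pick the next target conversation object from the current node's successors."""
    current = history[-1]                  # last object seen in the conversation history
    successors = graph.get(current, [])
    return successors[0] if successors else None

def generate_recommendation(target):
    # generate target conversation information based on the target conversation object
    return f"Speaking of that, have you tried {target}?" if target else "..."

history = ["greeting", "movie_chat"]
target = next_target(history, conversation_target_graph)
utterance = generate_recommendation(target)
```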