-
Publication No.: US20240411992A1
Publication Date: 2024-12-12
Application No.: US18335898
Filing Date: 2023-06-15
Applicant: Salesforce, Inc.
Inventor: Shiva Kumar Pentyala , Prafulla Kumar Choubey , Shashank Harinath , Sitaram Asur , Chien-Sheng Jason Wu , Zachary Alexander , Caiming Xiong
IPC: G06F40/284 , G06N3/08
Abstract: Embodiments described herein provide a training framework for generative NLP models. Specifically, a few spans of the training input, e.g., a sequence of tokens representing a user-agent dialogue, may be randomly masked; each span can be one or more tokens, words, sentences, or paragraphs. These masked spans are replaced with their embeddings generated by a pre-trained large language model, and the resulting input is then used to train the NLP model.
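The masking step can be pictured as an operation on the model's input embedding matrix. Below is a minimal sketch, assuming PyTorch; the function name, the span-sampling policy, and the premise that the frozen LLM's embeddings are already projected to the NLP model's hidden size are illustrative assumptions, not the patent's actual implementation.

    import random
    import torch

    def mask_spans_with_llm_embeddings(
        token_embeds: torch.Tensor,   # (seq_len, dim): input embeddings of the dialogue
        llm_embeds: torch.Tensor,     # (seq_len, dim): same tokens encoded by a frozen pre-trained LLM
        num_spans: int = 3,
        max_span_len: int = 5,
    ) -> torch.Tensor:
        """Overwrite a few random spans of the input with frozen-LLM embeddings.

        Assumes llm_embeds was already projected to the NLP model's hidden
        size; spans are sampled in token units and may overlap.
        """
        out = token_embeds.clone()
        seq_len = token_embeds.size(0)
        for _ in range(num_spans):
            span_len = random.randint(1, max_span_len)             # a span is one or more tokens
            start = random.randint(0, max(0, seq_len - span_len))
            out[start:start + span_len] = llm_embeds[start:start + span_len]
        return out

    # Toy usage: random tensors stand in for real dialogue and LLM embeddings.
    dialogue = torch.randn(32, 64)
    llm_view = torch.randn(32, 64)
    masked_input = mask_spans_with_llm_embeddings(dialogue, llm_view)

Sampling spans in token units is only one of the granularities the abstract mentions; word-, sentence-, or paragraph-level spans would change the sampling step but not the replacement itself.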
-
Publication No.: US20240411991A1
Publication Date: 2024-12-12
Application No.: US18330216
Filing Date: 2023-06-06
Applicant: Salesforce, Inc.
Inventor: Shiva Kumar Pentyala , Prafulla Kumar Choubey , Shashank Harinath , Sitaram Asur , Chien-Sheng Jason Wu , Zachary Alexander , Caiming Xiong
IPC: G06F40/284
Abstract: Embodiments described herein provide a training framework for generative NLP models that operates on knowledge previously learned by pretrained large language models. Specifically, to train an NLP model to generate a response to a user utterance (e.g., “resolve login issue”), document embeddings of IT support documents encoded by a pretrained LLM are fed to an NLP decoder together with a training dialogue (e.g., a dialogue between a user and the chat agent on how to “resolve login issue”). The NLP decoder can thus be trained with a causal language modeling loss computed from the predicted next token and the ground-truth token from the training dialogue.
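One way to picture this setup is a transformer decoder that cross-attends to the frozen document embeddings while being trained on shifted next-token targets. The sketch below assumes PyTorch; PrefixedDecoder, its sizes, and the random doc_embeds tensor are hypothetical stand-ins (real embeddings would come from the pretrained LLM), so it illustrates the causal language modeling loss rather than the patent's actual architecture.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class PrefixedDecoder(nn.Module):
        """Toy causal decoder that cross-attends over frozen document
        embeddings supplied as memory (hypothetical, for illustration)."""
        def __init__(self, vocab_size: int, dim: int):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, dim)
            layer = nn.TransformerDecoderLayer(d_model=dim, nhead=4, batch_first=True)
            self.decoder = nn.TransformerDecoder(layer, num_layers=2)
            self.lm_head = nn.Linear(dim, vocab_size)

        def forward(self, input_ids, doc_embeds):
            x = self.embed(input_ids)                                  # (B, T, dim)
            causal = nn.Transformer.generate_square_subsequent_mask(x.size(1))
            h = self.decoder(x, memory=doc_embeds, tgt_mask=causal)    # attend to doc embeddings
            return self.lm_head(h)                                     # (B, T, vocab)

    # Causal LM loss: predict token t+1 from tokens <= t, conditioned on documents.
    vocab, dim = 1000, 64
    model = PrefixedDecoder(vocab, dim)
    doc_embeds = torch.randn(1, 8, dim)            # stand-in for frozen-LLM document embeddings
    dialogue = torch.randint(0, vocab, (1, 32))    # token ids of a training dialogue
    logits = model(dialogue[:, :-1], doc_embeds)
    loss = F.cross_entropy(logits.reshape(-1, vocab), dialogue[:, 1:].reshape(-1))

The cross-entropy between the logits at position t and the ground-truth token at position t+1 is the causal language modeling loss the abstract describes.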
-