-
Publication No.: US20240346290A1
Publication Date: 2024-10-17
Application No.: US18299841
Filing Date: 2023-04-13
Applicant: Google LLC
Inventors: Zhe Dong, Jianmo Ni, Imed Zitouni, Enrique Alfonseca, Daniel Martin Bikel, Chen Qu
IPC Classification: G06N3/0455
CPC Classification: G06N3/0455
Abstract: Aspects of the technology provide systems and methods for implementing an asymmetric dual encoder architecture. The architecture includes a token embedder layer section having a first token embedding section associated with a first input and a second token embedding section associated with a second input, and an encoder layer section having a first encoder section receiving token embeddings from the first token embedding section and a second encoder section receiving token embeddings from the second token embedding section. A shared projection layer receives encodings from both the first and second encoder sections and generates a set of projections. An embedding space is configured, based on the set of projections, to generate a question embedding and an answer embedding, in which the question and answer embeddings are used in identifying a set of candidate answers to an input question.
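The data flow described in this abstract can be illustrated with a minimal, untrained sketch. It is not the patented implementation. The `Tower` class, the mean-pooling encoder, the `DIM` width, the random weights, and the cosine-similarity ranking are all assumptions made for illustration. The only property taken from the abstract is the layout: each input has its own token embedder and encoder, and both encodings pass through one shared projection layer.

```python
import math
import random

random.seed(0)
DIM = 8  # hypothetical embedding width


def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class Tower:
    """One side of the dual encoder, with its own token embedder and encoder."""

    def __init__(self, vocab):
        self.index = {tok: i for i, tok in enumerate(vocab)}
        self.table = rand_matrix(len(vocab), DIM)  # tower-specific token embeddings
        self.weights = rand_matrix(DIM, DIM)       # tower-specific encoder weights

    def encode(self, tokens):
        rows = [self.table[self.index[t]] for t in tokens if t in self.index]
        if not rows:
            return [0.0] * DIM
        # Assumed stand-in encoder: mean-pool the token embeddings, then one
        # dense layer with a tanh nonlinearity.
        pooled = [sum(col) / len(rows) for col in zip(*rows)]
        return [math.tanh(x) for x in matvec(self.weights, pooled)]


# A single projection matrix shared by both towers.
SHARED_PROJECTION = rand_matrix(DIM, DIM)


def project(encoding):
    """Map an encoding into the common embedding space and L2-normalize it."""
    out = matvec(SHARED_PROJECTION, encoding)
    norm = math.sqrt(sum(x * x for x in out)) or 1.0
    return [x / norm for x in out]


def rank_answers(question, candidates, q_tower, a_tower):
    """Score each candidate answer against the question, best first."""
    q = project(q_tower.encode(question.split()))
    scored = []
    for cand in candidates:
        a = project(a_tower.encode(cand.split()))
        scored.append((sum(x * y for x, y in zip(q, a)), cand))
    return sorted(scored, reverse=True)
```

Because the weights are random, the ranking carries no meaning until the towers and the projection are trained. The sketch only shows where parameters are separate (embedders, encoders) and where they are shared (projection).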
-
Publication No.: US20240273294A1
Publication Date: 2024-08-15
Application No.: US18166806
Filing Date: 2023-02-09
Applicant: Google LLC
Inventors: Siamak Shakeri, Cicero Nogueira dos Santos, Daniel Matthew Cer, Zhe Dong, Jianmo Ni, Yun-Hsuan Sung, John Nham
IPC Classification: G06F40/295, G06N3/0455, G06N3/084
CPC Classification: G06F40/295, G06N3/0455, G06N3/084
Abstract: The technology employs soft knowledge prompts (KPs) to inject relevant world knowledge into language models. This includes training KPs via self-supervised learning on data from one or more knowledge bases. KPs are task independent and can function as an external memory of the language models. KPs may be entity-centric, meaning that each prompt primarily encodes information about one entity from a given knowledge base. A method includes identifying a KP in response to a received input text, concatenating that KP to a sequence of word embeddings of the input text, applying the concatenated information to a trained language model, predicting an object entity name, computing a cross-entropy loss, and updating the identified KP based on the computed cross-entropy loss.
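The method steps listed in this abstract can be sketched as a toy training loop. It is an illustration under stated assumptions, not the patented implementation. The "language model" here is a frozen mean-pool plus linear layer. The names `KP_STORE`, `WORD_EMB`, `OUT_W`, `ENTITIES`, and `train_step` are hypothetical, as are the two-vector prompt length and the learning rate. What follows the abstract is the sequence of steps: look up the KP, concatenate it to the word embeddings, run the frozen model, predict an object entity, compute cross-entropy, and update only the KP.

```python
import math
import random

random.seed(0)
DIM = 4
ENTITIES = ["Paris", "Berlin", "Rome"]  # hypothetical object-entity names


def rand_vec():
    return [random.uniform(-0.5, 0.5) for _ in range(DIM)]


# Frozen toy model parameters: word embeddings and an output layer.
WORD_EMB = {w: rand_vec() for w in ["capital", "of", "France", "is"]}
OUT_W = [rand_vec() for _ in ENTITIES]  # one row of logit weights per entity

# Trainable knowledge prompts, one entry per entity (entity-centric).
KP_STORE = {"France": [rand_vec() for _ in range(2)]}


def forward(kp, words):
    """Concatenate the KP with the word embeddings and run the frozen model."""
    seq = kp + [WORD_EMB[w] for w in words]
    pooled = [sum(col) / len(seq) for col in zip(*seq)]
    logits = [sum(w * x for w, x in zip(row, pooled)) for row in OUT_W]
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps], len(seq)


def train_step(entity, words, target, lr=1.0):
    """One update of the identified KP; the model weights stay untouched."""
    kp = KP_STORE[entity]                            # identify the KP
    probs, seq_len = forward(kp, words)
    loss = -math.log(probs[ENTITIES.index(target)])  # cross-entropy loss
    # Gradient of the loss with respect to the logits: softmax minus one-hot.
    dlogits = [p - (1.0 if name == target else 0.0)
               for p, name in zip(probs, ENTITIES)]
    # Backpropagate through the linear layer and the mean pool. Every
    # sequence position receives the same gradient.
    dvec = [sum(d * row[j] for d, row in zip(dlogits, OUT_W)) / seq_len
            for j in range(DIM)]
    for vec in kp:
        for j in range(DIM):
            vec[j] -= lr * dvec[j]                   # update only the KP
    return loss
```

Calling `train_step("France", ["capital", "of", "France", "is"], "Paris")` repeatedly lowers the loss while `WORD_EMB` and `OUT_W` stay fixed, which mirrors the idea of the KP acting as an external memory for a model that is not itself updated.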
-