Publication No.: US20240419726A1
Publication Date: 2024-12-19
Application No.: US18210535
Filing Date: 2023-06-15
Applicant: Adobe Inc.
Inventor: Simon Jenni, Fabian David Caba Heilbron, Chun-Hsiao Yeh, Bryan Russell, Josef Sivic
IPC: G06F16/58, G06F16/535, G06F16/538
Abstract: Techniques for learning to personalize vision-language models through meta-personalization are described. In one embodiment, one or more processing devices lock a pre-trained vision-language model (VLM) during a training phase. The processing devices train the pre-trained VLM to augment a text encoder of the pre-trained VLM with a set of general named video instances to form a meta-personalized VLM, the meta-personalized VLM to include global category features. The processing devices test the meta-personalized VLM to adapt the text encoder with a set of personal named video instances to form a personal VLM, the personal VLM comprising the global category features personalized with a set of personal instance weights to form a personal instance token associated with the user. Other embodiments are described and claimed.
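Illustrative sketch (not part of the patent text): the abstract describes a personal instance token formed from shared global category features combined with per-user instance weights, while the pre-trained VLM stays locked. The Python below is a minimal toy version of that one idea, under stated assumptions. The function names (`combine`, `fit_instance_weights`), the squared-error objective, the plain gradient-descent loop, and the small hand-written feature vectors are all hypothetical choices made for illustration; the abstract does not specify the loss, the optimizer, or how the features are represented.

```python
# Toy sketch: a personal instance token as a weighted sum of frozen,
# shared category features. Only the per-user weights are fitted.
# All names and the squared-error objective are assumptions.

def combine(weights, category_features):
    """Return the weighted sum of the category feature vectors."""
    dim = len(category_features[0])
    return [
        sum(w * feat[j] for w, feat in zip(weights, category_features))
        for j in range(dim)
    ]


def fit_instance_weights(category_features, target, steps=500, lr=0.1):
    """Fit per-user weights by gradient descent on 0.5 * ||token - target||^2.

    category_features is treated as frozen: it is read but never modified.
    target stands in for an embedding of the user's named examples.
    """
    weights = [0.0] * len(category_features)
    for _ in range(steps):
        token = combine(weights, category_features)
        residual = [y - t for y, t in zip(token, target)]
        # Gradient w.r.t. each weight is the dot product of the residual
        # with the corresponding category feature vector.
        grads = [
            sum(r * f for r, f in zip(residual, feat))
            for feat in category_features
        ]
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights


# Two shared category features in a 3-dimensional embedding space (toy values).
GLOBAL_FEATURES = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]

# A target embedding that is, by construction, 0.3 * first + 0.7 * second.
TARGET = [0.3, 0.7, 0.5]

personal_weights = fit_instance_weights(GLOBAL_FEATURES, TARGET)
personal_token = combine(personal_weights, GLOBAL_FEATURES)
```

In this toy setup the fitted weights approach the mixing coefficients used to build the target, and the shared features are left unchanged, which mirrors the split the abstract draws between global category features and personal instance weights.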