Learning to Personalize Vision-Language Models through Meta-Personalization

    Publication No.: US20240419726A1

    Publication Date: 2024-12-19

    Application No.: US18210535

    Application Date: 2023-06-15

    Applicant: Adobe Inc.

    Abstract: Techniques for learning to personalize vision-language models through meta-personalization are described. In one embodiment, one or more processing devices lock a pre-trained vision-language model (VLM) during a training phase. The processing devices train the pre-trained VLM to augment a text encoder of the pre-trained VLM with a set of general named video instances to form a meta-personalized VLM, the meta-personalized VLM to include global category features. The processing devices test the meta-personalized VLM to adapt the text encoder with a set of personal named video instances to form a personal VLM, the personal VLM comprising the global category features personalized with a set of personal instance weights to form a personal instance token associated with the user. Other embodiments are described and claimed.
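The relationship between the global category features and the personal instance weights can be illustrated with a minimal sketch. It assumes, for illustration only, that the personal instance token is a weighted sum of the shared category features and that personalization fits only the weights by gradient descent while the features (and the locked VLM, which is not modeled here) stay fixed. The abstract does not specify the combination rule or the optimizer, and the function names `personal_instance_token` and `fit_instance_weights` are hypothetical.

```python
from typing import List

Vector = List[float]


def personal_instance_token(category_features: List[Vector],
                            instance_weights: Vector) -> Vector:
    """Combine shared category features into one token.

    Assumption: the token is a plain weighted sum of the features.
    """
    if len(category_features) != len(instance_weights):
        raise ValueError("one weight per category feature is required")
    dim = len(category_features[0])
    token = [0.0] * dim
    for feature, weight in zip(category_features, instance_weights):
        for j in range(dim):
            token[j] += weight * feature[j]
    return token


def fit_instance_weights(category_features: List[Vector],
                         target: Vector,
                         steps: int = 200,
                         lr: float = 0.1) -> Vector:
    """Fit only the instance weights; the category features stay fixed.

    Assumption: squared error against a target embedding stands in for
    whatever objective the personal named video instances provide.
    """
    weights = [0.0] * len(category_features)
    for _ in range(steps):
        token = personal_instance_token(category_features, weights)
        residual = [t - y for t, y in zip(token, target)]
        for i, feature in enumerate(category_features):
            # Gradient of the squared error with respect to weight i.
            grad = 2.0 * sum(r * f for r, f in zip(residual, feature))
            weights[i] -= lr * grad
    return weights


# Toy usage: two 2-dimensional category features, one target embedding.
features = [[1.0, 0.0], [0.0, 1.0]]
weights = fit_instance_weights(features, target=[0.3, 0.7])
token = personal_instance_token(features, weights)
```

In this toy setup the number of trainable values per personal instance equals the number of category features, which is the practical appeal of reusing shared features rather than learning a full token from scratch.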
