-
Publication No.: US20240152767A1
Publication Date: 2024-05-09
Application No.: US18497079
Filing Date: 2023-10-30
Applicant: NEC Laboratories America, Inc.
Inventor: Vijay Kumar Baikampady Gopalkrishna, Samuel Schulter, Xiang Yu, Zaid Khan, Manmohan Chandraker
Abstract: Systems and methods for training a visual question answer model include training a teacher model by performing image conditional visual question generation on a visual language model (VLM) and a targeted visual question answer dataset using images to generate question and answer pairs. Unlabeled images are pseudolabeled using the teacher model to decode synthetic question and answer pairs for the unlabeled images. The synthetic question and answer pairs for the unlabeled images are merged with real data from the targeted visual question answer dataset to generate a self-augmented training set. A student model is trained using the VLM and the self-augmented training set to return visual answers to text queries.
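The data flow in this abstract (teacher generates question and answer pairs, unlabeled images are pseudolabeled, synthetic pairs are merged with real data) can be illustrated with a minimal Python sketch. It is not the patented implementation: `QAExample`, `pseudolabel`, `build_self_augmented_set`, and `toy_teacher` are hypothetical names, and the teacher is a stand-in function rather than a fine-tuned VLM. Training of the teacher and the student is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class QAExample:
    """One visual question answering example (hypothetical structure)."""
    image_id: str
    question: str
    answer: str
    synthetic: bool = False  # True when produced by the teacher


# Assumed teacher interface: image id -> list of (question, answer) pairs.
Teacher = Callable[[str], List[Tuple[str, str]]]


def pseudolabel(teacher: Teacher, unlabeled_images: Iterable[str]) -> List[QAExample]:
    """Decode synthetic question and answer pairs for each unlabeled image."""
    examples = []
    for image_id in unlabeled_images:
        for question, answer in teacher(image_id):
            examples.append(QAExample(image_id, question, answer, synthetic=True))
    return examples


def build_self_augmented_set(
    real: Iterable[QAExample], synthetic: Iterable[QAExample]
) -> List[QAExample]:
    """Merge real and synthetic examples into one training set."""
    return list(real) + list(synthetic)


def toy_teacher(image_id: str) -> List[Tuple[str, str]]:
    """Placeholder for a trained teacher; returns one fixed pair per image."""
    return [(f"What is shown in {image_id}?", "unknown")]


real = [QAExample("img_001", "What color is the car?", "red")]
synthetic = pseudolabel(toy_teacher, ["img_100", "img_101"])
train_set = build_self_augmented_set(real, synthetic)
```

The `synthetic` flag is an illustrative design choice: it keeps the two sources distinguishable after merging, which the abstract does not require.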
-
Publication No.: US20250139527A1
Publication Date: 2025-05-01
Application No.: US18930402
Filing Date: 2024-10-29
Applicant: NEC Laboratories America, Inc.
IPC: G06N20/00
Abstract: Systems and methods are provided for a self-improving model for agentic visual program synthesis. An agent can be continuously trained using an optimal training tuple to perform a corrective action on a monitored entity, which in turn generates new input data for the training. To train the agent, an input question can be decomposed into vision model tasks to generate task outputs. The task outputs can be corrected based on feedback to obtain corrected task outputs. The optimal training tuple can be generated by comparing an optimal tuple threshold with a similarity score of the input image, the input question, and the corrected task outputs.
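The final step of this abstract, comparing a similarity score against an optimal tuple threshold, can be sketched as a simple filter. The abstract does not say how the score is computed or in which direction the comparison runs, so everything here is an assumption: `CandidateTuple`, `select_training_tuples`, and `toy_score` are hypothetical names, the score is a word-overlap (Jaccard) stand-in that ignores the image, and tuples are kept when the score is at or above the threshold.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class CandidateTuple:
    """Image, question, and corrected task outputs (hypothetical structure)."""
    image_id: str
    question: str
    corrected_outputs: Sequence[str]


def select_training_tuples(
    candidates: Sequence[CandidateTuple],
    score_fn: Callable[[CandidateTuple], float],
    threshold: float,
) -> List[CandidateTuple]:
    """Keep candidates whose score meets the threshold (assumed direction: >=)."""
    return [c for c in candidates if score_fn(c) >= threshold]


def toy_score(candidate: CandidateTuple) -> float:
    """Stand-in score: Jaccard overlap of question words and output words.

    A real system would presumably also use image features; this toy
    version does not look at the image at all.
    """
    q_words = set(candidate.question.lower().split())
    o_words = set(" ".join(candidate.corrected_outputs).lower().split())
    union = q_words | o_words
    if not union:
        return 0.0
    return len(q_words & o_words) / len(union)


a = CandidateTuple("img_1", "count the red cars", ["red cars 3"])
b = CandidateTuple("img_2", "count the red cars", ["blue sky"])
selected = select_training_tuples([a, b], toy_score, threshold=0.3)
```

Passing the scoring function as a parameter keeps the threshold logic independent of whichever similarity measure is actually used.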
-