-
Publication number: US12293577B2
Publication date: 2025-05-06
Application number: US17651771
Application date: 2022-02-18
Applicant: Adobe Inc.
Inventor: Seunghyun Yoon, Trung Huu Bui, Franck Dernoncourt, Hyounghun Kim, Doo Soon Kim
Abstract: Embodiments of the disclosure provide a machine learning model for generating a predicted executable command for an image. The learning model includes an interface configured to obtain an utterance indicating a request associated with the image, an utterance sub-model, a visual sub-model, an attention network, and a selection gate. The machine learning model generates a segment of the predicted executable command from weighted probabilities of each candidate token in a predetermined vocabulary determined based on the visual features, the concept features, current command features, and the utterance features extracted from the utterance or the image.
-
Publication number: US20230267726A1
Publication date: 2023-08-24
Application number: US17651771
Application date: 2022-02-18
Applicant: Adobe Inc.
Inventor: Seunghyun Yoon, Trung Huu Bui, Franck Dernoncourt, Hyounghun Kim, Doo Soon Kim
CPC classification number: G06V10/86, G06V10/82, G06V10/806, G06V10/7715, G06N3/088, G06N3/0445, G06F40/284
Abstract: Embodiments of the disclosure provide a machine learning model for generating a predicted executable command for an image. The learning model includes an interface configured to obtain an utterance indicating a request associated with the image, an utterance sub-model, a visual sub-model, an attention network, and a selection gate. The machine learning model generates a segment of the predicted executable command from weighted probabilities of each candidate token in a predetermined vocabulary determined based on the visual features, the concept features, current command features, and the utterance features extracted from the utterance or the image.
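Illustrative note: the abstract states that each command segment is generated from weighted probabilities of candidate tokens in a predetermined vocabulary, informed by visual, concept, current command, and utterance features. The abstract does not say how the selection gate computes or applies its weights, so the following is only a minimal sketch under assumptions: each feature source is assumed to yield its own score vector over the vocabulary, and the gate is assumed to be a softmax over one scalar score per source. The vocabulary, the scores, and the names `softmax` and `gated_token_distribution` are hypothetical and are not taken from the patent.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gated_token_distribution(source_logits, gate_scores):
    """Mix per-source token distributions with gate weights.

    Assumption: the gate is a softmax over one score per feature source.
    The patent abstract does not specify this form.
    """
    names = list(source_logits)
    gate = softmax([gate_scores[n] for n in names])
    vocab_size = len(source_logits[names[0]])
    mixed = [0.0] * vocab_size
    for weight, name in zip(gate, names):
        probs = softmax(source_logits[name])
        for i, p in enumerate(probs):
            mixed[i] += weight * p  # weighted probability of token i
    return mixed

# Hypothetical vocabulary and scores, for illustration only.
VOCAB = ["adjust", "brightness", "crop", "<eos>"]
source_logits = {
    "visual":    [0.1, 2.0, 0.3, 0.0],
    "concept":   [0.0, 1.5, 0.2, 0.1],
    "command":   [0.5, 1.0, 0.1, 0.2],
    "utterance": [0.2, 2.5, 0.1, 0.0],
}
gate_scores = {"visual": 0.5, "concept": 0.1, "command": 0.2, "utterance": 1.0}

mixed = gated_token_distribution(source_logits, gate_scores)
# Pick the most probable token as the next segment of the command.
next_token = VOCAB[max(range(len(VOCAB)), key=mixed.__getitem__)]
print(next_token)
```

Because the gate weights and each per-source distribution sum to 1, the mixed result is itself a valid probability distribution over the vocabulary; other gating schemes would fit the abstract's wording equally well.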
-