REWARD FEEDBACK FOR LEARNING CONTROL POLICIES USING NATURAL LANGUAGE AND VISION DATA

    Publication number: US20240028949A1

    Publication date: 2024-01-25

    Application number: US17869528

    Filing date: 2022-07-20

    Applicant: Hitachi, Ltd.

    CPC classification number: G06N20/00

    Abstract: Example implementations described herein involve systems and methods for providing a reward to a machine learning algorithm, which can include receiving an image and a task description defined in text; slicing the image into a plurality of sub-images; executing an embedding model to embed the text of the task description and the sub-images, generating a distribution over the sub-images based on their relevance to the task description; and generating the reward from that distribution.
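
The steps in the abstract (slice, embed, form a distribution, derive a reward) can be illustrated with a minimal sketch. It is not the method claimed in the application. The two `toy_embed_*` functions are hypothetical stand-ins for a real joint text-image embedding model. Cosine similarity, the softmax with a `temperature` parameter, and the choice of reward (the probability mass on a designated goal sub-image) are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def slice_image(image, rows, cols):
    """Split an H x W x C array into rows*cols equal tiles, in row-major order."""
    h, w = image.shape[0] // rows, image.shape[1] // cols
    return [image[r * h:(r + 1) * h, c * w:(c + 1) * w]
            for r in range(rows) for c in range(cols)]

def toy_embed_image(tile):
    """Hypothetical image encoder: the mean value of each colour channel."""
    return tile.reshape(-1, tile.shape[-1]).mean(axis=0)

def toy_embed_text(text):
    """Hypothetical text encoder: sums RGB vectors for known colour words."""
    vocab = {"red": [1.0, 0.0, 0.0], "green": [0.0, 1.0, 0.0], "blue": [0.0, 0.0, 1.0]}
    vec = np.zeros(3)
    for word in text.lower().split():
        vec += np.array(vocab.get(word, [0.0, 0.0, 0.0]))
    return vec

def cosine(a, b):
    """Cosine similarity with a small epsilon to tolerate zero vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def relevance_distribution(image, text, rows, cols, temperature=0.1):
    """Softmax over tile-to-text similarities (an assumed scoring rule)."""
    text_vec = toy_embed_text(text)
    sims = np.array([cosine(toy_embed_image(t), text_vec)
                     for t in slice_image(image, rows, cols)])
    logits = sims / temperature
    logits -= logits.max()  # subtract the max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

def reward_from_distribution(dist, goal_index):
    """Assumed reward: probability mass assigned to the goal sub-image."""
    return float(dist[goal_index])
```

For a 4 x 4 image whose top-right quadrant is red, sliced into a 2 x 2 grid, the task text "move to the red block" concentrates the distribution on tile index 1, so a goal index of 1 yields a reward close to 1. In a real system the toy encoders would be replaced by a pretrained model that embeds text and images in a shared space.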
