Publication number: US20240288870A1
Publication date: 2024-08-29
Application number: US18475442
Filing date: 2023-09-27
Inventors: Chiori Hori, Jonathan Le Roux, Devesh Jha, Siddarth Jain, Radu Ioan Corcodel, Diego Romeres, Puyuang Peng, Xinyu Liu, David Harwath
CPC classification: G05D1/0246, G06V10/82, G06V20/41, G06V20/46, G06V20/49, G10L15/02, G10L15/16, G10L15/1815
Abstract: A method, a system and a computer program product are provided for applying a neural network including an action sequence decoder for generating an action sequence for a robot to perform a task. The neural network is applied to generate the action sequence based on recordings demonstrating humans performing tasks. In an example, the method comprises collecting a recording and a sequence of captions describing scenes in the recording; extracting feature data from the recording; encoding the extracted feature data to produce a sequence of encoded features; and applying the action sequence decoder to produce a sequence of actions for the robot based on the sequence of encoded features having a semantic meaning corresponding to a semantic meaning of the sequence of captions. The feature data includes features of a video signal, an audio signal, and/or text transcription capturing a performance of the task.
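The data flow named in the abstract (collect a recording, extract features, encode them, decode an action sequence) can be illustrated with a minimal Python sketch. This is not the patented implementation: every name below (`Recording`, `extract_features`, `encode`, `decode_actions`, `generate_action_sequence`) is hypothetical, and the encoder and decoder are trivial arithmetic placeholders standing in for the neural network. The captions are carried only as a field; how they shape the encoded features during training is not shown.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Recording:
    """Hypothetical container for one human demonstration."""
    video_frames: List[List[float]]  # stand-ins for per-step video features
    audio_frames: List[List[float]]  # stand-ins for per-step audio features
    transcript: List[str]            # text transcription tokens
    captions: List[str]              # scene captions (unused in this sketch)


def extract_features(rec: Recording) -> List[List[float]]:
    """Join video and audio features per step and append a crude text feature."""
    steps = min(len(rec.video_frames), len(rec.audio_frames))
    text_feature = float(len(rec.transcript))  # assumption: token count only
    return [
        rec.video_frames[t] + rec.audio_frames[t] + [text_feature]
        for t in range(steps)
    ]


def encode(features: List[List[float]]) -> List[float]:
    """Placeholder encoder: reduce each feature vector to its mean."""
    return [sum(vector) / len(vector) for vector in features]


def decode_actions(encoded: List[float], vocabulary: List[str]) -> List[str]:
    """Placeholder decoder: bucket each encoded value into an action label."""
    return [vocabulary[int(abs(value)) % len(vocabulary)] for value in encoded]


def generate_action_sequence(rec: Recording, vocabulary: List[str]) -> List[str]:
    """Run the three stages in the order the abstract lists them."""
    return decode_actions(encode(extract_features(rec)), vocabulary)
```

In a real system each placeholder would be a learned model; the sketch only fixes the order of the stages and the shape of the data passed between them.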