NATURAL LANGUAGE OBJECT TRACKING
    1.
    Invention application
    Status: under examination (published)

    Publication No.: WO2018089158A1

    Publication date: 2018-05-17

    Application No.: PCT/US2017/056195

    Filing date: 2017-10-11

    Abstract: A method of tracking an object across a sequence of video frames using a natural language query includes receiving the natural language query and identifying an initial target in an initial frame of the sequence of video frames based on the natural language query. The method also includes adjusting the natural language query, for a subsequent frame, based on content of the subsequent frame and/or a likelihood of a semantic property of the initial target appearing in the subsequent frame. The method further includes identifying a text driven target and a visual driven target in the subsequent frame. The method still further includes combining the visual driven target with the text driven target to obtain a final target in the subsequent frame.
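
The per-frame loop described in the abstract can be sketched as follows. This is a minimal illustration, not the claimed method: the abstract does not say how the two targets are combined, so a weighted average of box coordinates is assumed here. The names `Box`, `combine_targets`, `track`, and the three callables (`locate_by_text`, `locate_by_appearance`, `adjust_query`) are hypothetical, and using the previous final target as the visual template is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned box: top-left corner (x, y), width w, height h."""
    x: float
    y: float
    w: float
    h: float


def combine_targets(text_box, visual_box, text_weight=0.5):
    """Fuse the text-driven and visual-driven boxes (assumed: weighted average)."""
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must be in [0, 1]")
    vis = 1.0 - text_weight
    return Box(
        x=text_weight * text_box.x + vis * visual_box.x,
        y=text_weight * text_box.y + vis * visual_box.y,
        w=text_weight * text_box.w + vis * visual_box.w,
        h=text_weight * text_box.h + vis * visual_box.h,
    )


def track(frames, query, locate_by_text, locate_by_appearance, adjust_query,
          text_weight=0.5):
    """Return one final target per frame.

    locate_by_text(frame, query) -> Box          (hypothetical text-driven locator)
    locate_by_appearance(frame, template) -> Box (hypothetical visual locator)
    adjust_query(query, frame) -> query          (hypothetical query adjustment)
    """
    if not frames:
        return []
    # The initial target comes from the natural language query alone.
    target = locate_by_text(frames[0], query)
    results = [target]
    for frame in frames[1:]:
        # Adjust the query for this frame before the text-driven search.
        query = adjust_query(query, frame)
        text_box = locate_by_text(frame, query)
        # Assumption: the previous final target serves as the visual template.
        visual_box = locate_by_appearance(frame, target)
        target = combine_targets(text_box, visual_box, text_weight)
        results.append(target)
    return results
```

Passing the locators in as callables keeps the sketch independent of any particular language or vision model.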

    VIDEO ANALYSIS WITH CONVOLUTIONAL ATTENTION RECURRENT NEURAL NETWORKS
    2.
    Invention application
    Status: under examination (published)

    Publication No.: WO2017155661A1

    Publication date: 2017-09-14

    Application No.: PCT/US2017/017188

    Filing date: 2017-02-09

    Abstract: A method of processing data within a convolutional attention recurrent neural network (RNN) includes generating a current multi-dimensional attention map. The current multi-dimensional attention map indicates areas of interest in a first frame from a sequence of spatio-temporal data. The method further includes receiving a multi-dimensional feature map. The method also includes convolving the current multi-dimensional attention map and the multi-dimensional feature map to obtain a multi-dimensional hidden state and a next multi-dimensional attention map. The method identifies a class of interest in the first frame based on the multi-dimensional hidden state and training data.
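
One recurrent step of the kind described in the abstract might look like the sketch below. It is a simplified, single-channel illustration under stated assumptions: the abstract does not give the update equations, so the tanh hidden-state update, the spatial softmax used to form the next attention map, and the names `conv2d_same`, `spatial_softmax`, and `attention_step` are all hypothetical. The kernels stand in for learned weights.

```python
import math


def conv2d_same(image, kernel):
    """Zero-padded 2D cross-correlation that preserves the input size."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    y, x = i + di - ph, j + dj - pw
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def spatial_softmax(scores):
    """Normalise a 2D score map so that its entries sum to 1."""
    peak = max(v for row in scores for v in row)
    exps = [[math.exp(v - peak) for v in row] for row in scores]
    total = sum(sum(row) for row in exps)
    return [[v / total for v in row] for row in exps]


def attention_step(attention, features, k_feat, k_att, k_next):
    """Convolve the attention and feature maps into a 2D hidden state,
    then derive the next attention map from that hidden state.

    Assumed update (not taken from the source):
        hidden   = tanh(conv(features, k_feat) + conv(attention, k_att))
        next_att = softmax(conv(hidden, k_next))
    """
    fc = conv2d_same(features, k_feat)
    ac = conv2d_same(attention, k_att)
    hidden = [[math.tanh(f + a) for f, a in zip(f_row, a_row)]
              for f_row, a_row in zip(fc, ac)]
    next_attention = spatial_softmax(conv2d_same(hidden, k_next))
    return hidden, next_attention
```

Because the hidden state stays a 2D map rather than a flat vector, the spatial layout of the frame is preserved from one step to the next.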

    RECURRENT NETWORKS WITH MOTION-BASED ATTENTION FOR VIDEO UNDERSTANDING
    3.
    Invention application
    Status: under examination (published)

    Publication No.: WO2017155663A1

    Publication date: 2017-09-14

    Application No.: PCT/US2017/017192

    Filing date: 2017-02-09

    Abstract: A method of predicting action labels for a video stream includes receiving the video stream and calculating an optical flow of consecutive frames of the video stream. An attention map is generated from the current frame of the video stream and the calculated optical flow. An action label is predicted for the current frame based on the optical flow, a previous hidden state and the attention map.
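
The data flow in this abstract (optical flow and current frame to attention map, then flow, previous hidden state, and attention map to an action label) can be sketched as below. This is a toy illustration only: the optical flow is assumed to be precomputed as per-pixel `(dx, dy)` pairs, the frame is assumed to hold non-negative intensities, the hidden state is reduced to a single scalar, and the names `flow_magnitude`, `motion_attention`, `predict_step`, and the weights `w_in`, `w_rec`, `class_weights` are hypothetical.

```python
import math


def flow_magnitude(flow):
    """Per-pixel speed from a 2D grid of (dx, dy) flow vectors."""
    return [[math.hypot(dx, dy) for dx, dy in row] for row in flow]


def motion_attention(frame, flow):
    """Attention map from the current frame and the optical flow.

    Assumption: attention is proportional to intensity times flow magnitude,
    normalised to sum to 1 (uniform when nothing moves).
    """
    mag = flow_magnitude(flow)
    scores = [[f * m for f, m in zip(f_row, m_row)]
              for f_row, m_row in zip(frame, mag)]
    total = sum(sum(row) for row in scores)
    if total == 0:
        n = len(frame) * len(frame[0])
        return [[1.0 / n for _ in row] for row in frame]
    return [[s / total for s in row] for row in scores]


def predict_step(flow, attention, prev_hidden, w_in, w_rec, class_weights):
    """Predict an action label for the current frame.

    class_weights is a list of (weight, bias) pairs, one per action class.
    Returns (label_index, new_hidden_state).
    """
    mag = flow_magnitude(flow)
    # Attention-weighted motion feature for this frame.
    attended = sum(a * m
                   for a_row, m_row in zip(attention, mag)
                   for a, m in zip(a_row, m_row))
    # Scalar recurrent update combining the new feature with the previous state.
    hidden = math.tanh(w_in * attended + w_rec * prev_hidden)
    scores = [w * hidden + b for w, b in class_weights]
    label = max(range(len(scores)), key=scores.__getitem__)
    return label, hidden
```

In practice the flow would come from a dedicated estimator and the hidden state would be a vector or feature map; the scalar version only shows how the three inputs named in the abstract feed the prediction.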

    ACTION LOCALIZATION IN SEQUENTIAL DATA WITH ATTENTION PROPOSALS FROM A RECURRENT NETWORK
    4.
    Invention application
    Status: under examination (published)

    Publication No.: WO2017155660A1

    Publication date: 2017-09-14

    Application No.: PCT/US2017/017185

    Filing date: 2017-02-09

    Abstract: A method generates bounding-boxes within frames of a sequence of frames. The bounding-boxes may be generated via a recurrent neural network (RNN) such as a long short-term memory (LSTM) network. The method includes receiving the sequence of frames and generating an attention feature map for each frame of the sequence of frames. Each attention feature map indicates at least one potential moving object. The method also includes up-sampling each attention feature map to determine an attention saliency for pixels in each frame of the sequence of frames. The method further includes generating a bounding-box within each frame based on the attention saliency and temporally smoothing multiple bounding-boxes along the sequence of frames to obtain a smooth sequence of bounding-boxes. The method still further includes localizing an action location within each frame based on the smooth sequence of bounding-boxes.
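
The post-processing steps named in the abstract (up-sampling, box generation from saliency, temporal smoothing) can be sketched as follows. The specific choices are assumptions made for illustration, since the abstract does not fix them: nearest-neighbour up-sampling, a box around pixels above a fraction of the peak saliency, and a centred moving average for smoothing. The function names are hypothetical, and the attention feature maps are taken as given rather than produced by an LSTM.

```python
def upsample_nearest(feature_map, factor):
    """Nearest-neighbour up-sampling of a 2D map by an integer factor."""
    return [[v for v in row for _ in range(factor)]
            for row in feature_map for _ in range(factor)]


def bounding_box(saliency, fraction=0.5):
    """Smallest box (x_min, y_min, x_max, y_max) around salient pixels.

    Assumes non-negative saliency. A pixel counts as salient when it reaches
    `fraction` of the peak value, so at least one pixel always qualifies.
    """
    peak = max(v for row in saliency for v in row)
    threshold = fraction * peak
    coords = [(x, y)
              for y, row in enumerate(saliency)
              for x, v in enumerate(row)
              if v >= threshold]
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


def smooth_boxes(boxes, window=3):
    """Centred moving average over each box coordinate along the sequence."""
    half = window // 2
    smoothed = []
    for i in range(len(boxes)):
        neighbours = boxes[max(0, i - half): i + half + 1]
        smoothed.append(tuple(sum(b[k] for b in neighbours) / len(neighbours)
                              for k in range(4)))
    return smoothed


def localize(attention_maps, factor, fraction=0.5, window=3):
    """One smoothed box per frame, from per-frame attention feature maps."""
    boxes = [bounding_box(upsample_nearest(m, factor), fraction)
             for m in attention_maps]
    return smooth_boxes(boxes, window)
```

The smoothed box for each frame then serves as the action location for that frame.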
