Visual tracking by colorization
    Granted invention patent

    Publication No.: US11335093B2

    Publication date: 2022-05-17

    Application No.: US16966102

    Filing date: 2019-06-12

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing visual tracking. In one aspect, a method comprises receiving: (i) one or more reference video frames, (ii) respective reference labels for each of a plurality of reference pixels in the reference video frames, and (iii) a target video frame. The reference video frames and the target video frame are processed using a colorization machine learning model to generate respective pixel similarity measures between each of (i) a plurality of target pixels in the target video frame, and (ii) the reference pixels in the reference video frames. A respective target label is determined for each target pixel in the target video frame, comprising: combining (i) the reference labels for the reference pixels in the reference video frames, and (ii) the pixel similarity measures.
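The label-combination step in this abstract can be illustrated with a minimal sketch. The abstract does not say how the similarity measures are computed or combined, so everything here is an assumption: per-pixel embeddings stand in for the colorization model's output, similarity is a softmax over inner products, and the target label is the similarity-weighted average of the reference labels. The names `softmax`, `propagate_labels`, and `temperature` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def propagate_labels(ref_embeddings, ref_labels, target_embeddings, temperature=1.0):
    """Assign each target pixel a label by weighting the reference labels.

    ref_embeddings:    one embedding per reference pixel (assumed model output)
    ref_labels:        one label vector per reference pixel (e.g. one-hot)
    target_embeddings: one embedding per target pixel (assumed model output)
    """
    num_classes = len(ref_labels[0])
    target_labels = []
    for target in target_embeddings:
        # Assumed similarity measure: scaled inner product, normalized by softmax.
        scores = [
            sum(a * b for a, b in zip(target, ref)) / temperature
            for ref in ref_embeddings
        ]
        weights = softmax(scores)
        # Combine the reference labels with the pixel similarity measures.
        label = [
            sum(w * lab[c] for w, lab in zip(weights, ref_labels))
            for c in range(num_classes)
        ]
        target_labels.append(label)
    return target_labels
```

With one-hot reference labels, each output row is a distribution over labels, and taking its largest entry gives a hard label for the target pixel.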

    Action localization in images and videos using relational features

    Publication No.: US11163989B2

    Publication date: 2021-11-02

    Application No.: US16637960

    Filing date: 2019-08-06

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing action localization in images and videos. In one aspect, a system comprises a data processing apparatus; a memory in data communication with the data processing apparatus and storing instructions that cause the data processing apparatus to perform image processing and video processing operations comprising: receiving an input comprising an image depicting a person; identifying a plurality of context positions from the image; determining respective feature representations of each of the context positions; providing a feature representation of the person and the feature representations of each of the context positions to a context neural network to obtain relational features, wherein the relational features represent relationships between the person and the context positions; and determining an action performed by the person using the feature representation of the person and the relational features.
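The data flow in this abstract (person feature plus context features into a context network, then action prediction from the person feature and the relational features) can be sketched roughly as follows. The abstract does not specify the network, so this is only an illustration under assumptions: the "context neural network" is replaced by a single linear layer with ReLU applied to each person/context pair, the results are average-pooled, and a linear classifier picks the action. All function names and weights are hypothetical.

```python
def linear(weights, bias, x):
    """Apply y = W x + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in x]


def relational_features(person, contexts, weights, bias):
    """Pair the person feature with each context feature, then average-pool.

    Stand-in for the context neural network: one linear layer plus ReLU
    over the concatenated [person, context] vector (an assumption).
    """
    per_position = [relu(linear(weights, bias, person + ctx)) for ctx in contexts]
    count = len(per_position)
    return [sum(column) / count for column in zip(*per_position)]


def classify_action(person, relational, cls_weights, cls_bias):
    """Return the index of the highest-scoring action.

    The classifier sees both the person feature and the relational features.
    """
    scores = linear(cls_weights, cls_bias, person + relational)
    return max(range(len(scores)), key=scores.__getitem__)
```

Pooling over context positions keeps the relational feature the same size regardless of how many positions are identified in the image, which is one simple way to feed a fixed-size classifier.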

    Action localization using relational features

    Publication No.: US20210166009A1

    Publication date: 2021-06-03

    Application No.: US16637960

    Filing date: 2019-08-06

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing action localization. In one aspect, a system comprises a data processing apparatus; a memory in data communication with the data processing apparatus and storing instructions that cause the data processing apparatus to perform operations comprising: receiving an input comprising an image depicting a person; identifying a plurality of context positions from the image; determining respective feature representations of each of the context positions; providing a feature representation of the person and the feature representations of each of the context positions to a context neural network to obtain relational features, wherein the relational features represent relationships between the person and the context positions; and determining an action performed by the person using the feature representation of the person and the relational features.

    Visual tracking by colorization
    Invention patent application

    Publication No.: US20210089777A1

    Publication date: 2021-03-25

    Application No.: US16966102

    Filing date: 2019-06-12

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing visual tracking. In one aspect, a method comprises receiving: (i) one or more reference video frames, (ii) respective reference labels for each of a plurality of reference pixels in the reference video frames, and (iii) a target video frame. The reference video frames and the target video frame are processed using a colorization machine learning model to generate respective pixel similarity measures between each of (i) a plurality of target pixels in the target video frame, and (ii) the reference pixels in the reference video frames. A respective target label is determined for each target pixel in the target video frame, comprising: combining (i) the reference labels for the reference pixels in the reference video frames, and (ii) the pixel similarity measures.
