Visual tracking by colorization
    1.
    Granted patent

    Publication No.: US11335093B2

    Publication date: 2022-05-17

    Application No.: US16966102

    Application date: 2019-06-12

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing visual tracking. In one aspect, a method comprises receiving: (i) one or more reference video frames, (ii) respective reference labels for each of a plurality of reference pixels in the reference video frames, and (iii) a target video frame. The reference video frames and the target video frame are processed using a colorization machine learning model to generate respective pixel similarity measures between each of (i) a plurality of target pixels in the target video frame, and (ii) the reference pixels in the reference video frames. A respective target label is determined for each target pixel in the target video frame, comprising: combining (i) the reference labels for the reference pixels in the reference video frames, and (ii) the pixel similarity measures.
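
The label-combination step in this abstract can be illustrated with a minimal sketch. The abstract does not say how the similarity measures are computed or normalized, so the code below assumes (hypothetically) that each pixel has an embedding vector, that similarity is a dot product, and that the similarities are turned into weights with a softmax. The function name `propagate_labels` and the `temperature` parameter are illustrative only and do not come from the patent.

```python
import math


def propagate_labels(ref_embeddings, ref_labels, target_embeddings, temperature=1.0):
    """Assign each target pixel a similarity-weighted mix of reference labels.

    ref_embeddings:    one feature vector per reference pixel (assumed input)
    ref_labels:        one label vector per reference pixel (e.g. one-hot)
    target_embeddings: one feature vector per target pixel (assumed input)
    """
    num_classes = len(ref_labels[0])
    target_labels = []
    for t in target_embeddings:
        # Assumed similarity measure: dot product scaled by a temperature.
        logits = [
            sum(a * b for a, b in zip(t, r)) / temperature
            for r in ref_embeddings
        ]
        # Softmax over reference pixels, shifted by the max for stability.
        peak = max(logits)
        weights = [math.exp(x - peak) for x in logits]
        total = sum(weights)
        # Combine the reference labels using the normalized similarities.
        label = [
            sum(w * lab[c] for w, lab in zip(weights, ref_labels)) / total
            for c in range(num_classes)
        ]
        target_labels.append(label)
    return target_labels
```

With one-hot reference labels, each output vector is a distribution over labels, so a hard label for a target pixel could be read off as the index of its largest entry.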

    Action localization in images and videos using relational features

    Publication No.: US11163989B2

    Publication date: 2021-11-02

    Application No.: US16637960

    Application date: 2019-08-06

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing action localization in images and videos. In one aspect, a system comprises a data processing apparatus; a memory in data communication with the data processing apparatus and storing instructions that cause the data processing apparatus to perform image processing and video processing operations comprising: receiving an input comprising an image depicting a person; identifying a plurality of context positions from the image; determining respective feature representations of each of the context positions; providing a feature representation of the person and the feature representations of each of the context positions to a context neural network to obtain relational features, wherein the relational features represent relationships between the person and the context positions; and determining an action performed by the person using the feature representation of the person and the relational features.

    Adaptive object tracking policy
    3.
    Granted patent

    Publication No.: US11688077B2

    Publication date: 2023-06-27

    Application No.: US16954153

    Application date: 2017-12-15

    Applicant: GOOGLE LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for implementing a machine-learned object tracking policy. One of the methods includes receiving a current video frame by a user device having a plurality of installed object trackers, wherein each object tracker is configured to perform a different object tracking procedure on the current video frame. The current video frame and one or more object tracks previously generated by the one or more object trackers are provided as input to a trained policy engine that implements a reinforcement learning model to generate a particular object tracking plan. A particular object tracking plan is selected based on the output of the reinforcement learning model, and the selected object tracking plan is performed on the current video frame to generate one or more updated object tracks for the current video frame.
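
The select-then-execute flow described in this abstract might look roughly like the sketch below. It is only an illustration under stated assumptions: a tracking plan is modeled as a tuple of tracker names, the policy is any callable that returns one score per plan, and the highest-scoring plan is chosen. None of these representations, nor the name `run_tracking_step`, comes from the patent, and the reinforcement learning model itself is not implemented here.

```python
def run_tracking_step(frame, previous_tracks, trackers, plans, policy):
    """Pick one tracking plan with a policy and run it on the current frame.

    trackers: dict mapping a tracker name to a callable
              (frame, previous_tracks) -> list of tracks   (assumed interface)
    plans:    list of tuples of tracker names              (assumed encoding)
    policy:   callable (frame, previous_tracks, plans) -> one score per plan;
              a stand-in for the trained policy engine
    """
    scores = policy(frame, previous_tracks, plans)
    # Assumed selection rule: take the plan with the highest score.
    best_index = max(range(len(plans)), key=lambda i: scores[i])
    plan = plans[best_index]

    # Run every tracker named in the selected plan on the current frame.
    updated_tracks = []
    for name in plan:
        updated_tracks.extend(trackers[name](frame, previous_tracks))
    return plan, updated_tracks
```

Passing the policy in as a callable keeps the sketch independent of how the model is trained; any scoring function with the same shape can be substituted for experimentation.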

    ADAPTIVE OBJECT TRACKING POLICY
    4.
    Patent application

    Publication No.: US20210166402A1

    Publication date: 2021-06-03

    Application No.: US16954153

    Application date: 2017-12-15

    Applicant: GOOGLE LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for implementing a machine-learned object tracking policy. One of the methods includes receiving a current video frame by a user device having a plurality of installed object trackers, wherein each object tracker is configured to perform a different object tracking procedure on the current video frame. The current video frame and one or more object tracks previously generated by the one or more object trackers are provided as input to a trained policy engine that implements a reinforcement learning model to generate a particular object tracking plan. A particular object tracking plan is selected based on the output of the reinforcement learning model, and the selected object tracking plan is performed on the current video frame to generate one or more updated object tracks for the current video frame.

    ACTION LOCALIZATION USING RELATIONAL FEATURES

    Publication No.: US20210166009A1

    Publication date: 2021-06-03

    Application No.: US16637960

    Application date: 2019-08-06

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing action localization. In one aspect, a system comprises a data processing apparatus; a memory in data communication with the data processing apparatus and storing instructions that cause the data processing apparatus to perform operations comprising: receiving an input comprising an image depicting a person; identifying a plurality of context positions from the image; determining respective feature representations of each of the context positions; providing a feature representation of the person and the feature representations of each of the context positions to a context neural network to obtain relational features, wherein the relational features represent relationships between the person and the context positions; and determining an action performed by the person using the feature representation of the person and the relational features.

    VISUAL TRACKING BY COLORIZATION
    8.
    Patent application

    Publication No.: US20210089777A1

    Publication date: 2021-03-25

    Application No.: US16966102

    Application date: 2019-06-12

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing visual tracking. In one aspect, a method comprises receiving: (i) one or more reference video frames, (ii) respective reference labels for each of a plurality of reference pixels in the reference video frames, and (iii) a target video frame. The reference video frames and the target video frame are processed using a colorization machine learning model to generate respective pixel similarity measures between each of (i) a plurality of target pixels in the target video frame, and (ii) the reference pixels in the reference video frames. A respective target label is determined for each target pixel in the target video frame, comprising: combining (i) the reference labels for the reference pixels in the reference video frames, and (ii) the pixel similarity measures.

    SEMANTICALLY-CONSISTENT IMAGE STYLE TRANSFER

    Publication No.: US20200342643A1

    Publication date: 2020-10-29

    Application No.: US16759689

    Application date: 2018-10-29

    Applicant: GOOGLE LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for semantically-consistent image style transfer. One of the methods includes: receiving an input source domain image; processing the source domain image using one or more source domain low-level encoder neural network layers to generate a low-level representation; processing the low-level representation using one or more high-level encoder neural network layers to generate an embedding of the input source domain image; processing the embedding using one or more high-level decoder neural network layers to generate a high-level feature representation of features of the input source domain image; and processing the high-level feature representation of the features of the input source domain image using one or more target domain low-level decoder neural network layers to generate an output target domain image that is from the target domain but that has similar semantics to the input source domain image.

    Neural architecture search using a performance prediction neural network

    Publication No.: US11087201B2

    Publication date: 2021-08-10

    Application No.: US16861491

    Application date: 2020-04-29

    Applicant: Google LLC

    Abstract: A method for determining an architecture for a task neural network configured to perform a particular machine learning task is described. The method includes obtaining data specifying a current set of candidate architectures for the task neural network; for each candidate architecture in the current set: processing the data specifying the candidate architecture using a performance prediction neural network having multiple performance prediction parameters, the performance prediction neural network being configured to process the data specifying the candidate architecture in accordance with current values of the performance prediction parameters to generate a performance prediction that characterizes how well a neural network having the candidate architecture would perform after being trained on the particular machine learning task; and generating an updated set of candidate architectures by selecting one or more of the candidate architectures in the current set based on the performance predictions for the candidate architectures in the current set.
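
The candidate-selection step in this abstract can be sketched in a few lines. The abstract only says that candidates are selected "based on the performance predictions"; keeping the top `keep` candidates by predicted score is an assumption made here for illustration. The predictor is passed in as a plain callable standing in for the performance prediction neural network, and the name `select_candidates` is hypothetical.

```python
def select_candidates(candidates, predict_performance, keep):
    """Return the `keep` candidates with the highest predicted performance.

    candidates:          current set of candidate architecture descriptions
    predict_performance: callable mapping a candidate to a numeric prediction;
                         a stand-in for the performance prediction network
    keep:                number of candidates to retain (assumed top-k rule)
    """
    # Score every candidate in the current set with the predictor.
    scored = [(predict_performance(c), c) for c in candidates]
    # Sort by predicted performance only, best first (stable for ties).
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [candidate for _, candidate in scored[:keep]]
```

In a full search loop, the returned set would typically be expanded or mutated into a new current set before the next round of prediction, but the abstract does not describe that step, so it is left out here.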
