-
Publication number: US20240095927A1
Publication date: 2024-03-21
Application number: US18255186
Filing date: 2021-03-04
Applicant: Google LLC
Inventor: Jonathan Chung-Kuan Huang, Vighnesh Nandan Birodkar, Siyang Li, Zhichao Lu, Vivek Rathod
IPC: G06T7/11, G06V10/77, G06V10/774, G06V10/82
CPC classification number: G06T7/11, G06V10/7715, G06V10/774, G06V10/82, G06T2207/20021, G06T2207/20081, G06T2207/20084, G06T2207/20132
Abstract: A computer-implemented method for partially supervised image segmentation having improved strong mask generalization includes obtaining, by a computing system including one or more computing devices, a machine-learned segmentation model, the machine-learned segmentation model including an anchor-free detector model and a deep mask head network, the deep mask head network including an encoder-decoder structure having a plurality of layers. The computer-implemented method includes obtaining, by the computing system, input data including tensor data. The computer-implemented method includes providing, by the computing system, the input data as input to the machine-learned segmentation model. The computer-implemented method includes receiving, by the computing system, output data from the machine-learned segmentation model, the output data including a segmentation of the tensor data, the segmentation including one or more instance masks.
-
Publication number: US20230419538A1
Publication date: 2023-12-28
Application number: US18464912
Filing date: 2023-09-11
Applicant: Google LLC
Inventor: Yinxiao Li, Zhichao Lu, Xuehan Xiong, Jonathan Huang
IPC: G06T7/73
CPC classification number: G06T7/73, G06T2207/20081, G06T2207/30196, G06T2207/20084, G06T2207/10016
Abstract: A method includes receiving video data that includes a series of frames of image data. Here, the video data is representative of an actor performing an activity. The method also includes processing the video data to generate a spatial input stream including a series of spatial images representative of spatial features of the actor performing the activity, a temporal input stream representative of motion of the actor performing the activity, and a pose input stream including a series of images representative of a pose of the actor performing the activity. Using at least one neural network, the method also includes processing the temporal input stream, the spatial input stream, and the pose input stream. The method also includes classifying, by the at least one neural network, the activity based on the temporal input stream, the spatial input stream, and the pose input stream.
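The abstract above classifies an activity from three input streams but does not say how the per-stream results are combined. As a minimal sketch, assuming simple late fusion (averaging per-class scores from each stream, then taking the top class), the final step could look like the following. The function name `fuse_and_classify`, the score lists, and the averaging rule are illustrative assumptions, not taken from the patent.

```python
def fuse_and_classify(spatial, temporal, pose, labels):
    """Average per-class scores from three streams and pick the best class.

    Each of `spatial`, `temporal`, and `pose` is a list of per-class scores
    (hypothetical outputs of three stream networks), aligned with `labels`.
    Averaging is an assumed fusion rule, not one stated in the abstract.
    """
    n = len(labels)
    if not (len(spatial) == len(temporal) == len(pose) == n):
        raise ValueError("every stream needs one score per label")
    # Late fusion: element-wise mean of the three score vectors.
    fused = [(s + t + p) / 3.0 for s, t, p in zip(spatial, temporal, pose)]
    # The predicted activity is the label with the highest fused score.
    best = max(range(n), key=fused.__getitem__)
    return labels[best], fused


if __name__ == "__main__":
    label, scores = fuse_and_classify(
        spatial=[0.2, 0.8],
        temporal=[0.6, 0.4],
        pose=[0.1, 0.9],
        labels=["walk", "run"],
    )
    print(label, scores)
```

Averaging treats the three streams as equally reliable; a weighted sum or a learned fusion layer would be natural alternatives under the same interface.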
-
Publication number: US11776156B2
Publication date: 2023-10-03
Application number: US17303969
Filing date: 2021-06-11
Applicant: Google LLC
Inventor: Yinxiao Li, Zhichao Lu, Xuehan Xiong, Jonathan Huang
IPC: G06T7/73
CPC classification number: G06T7/73, G06T2207/10016, G06T2207/20081, G06T2207/20084, G06T2207/30196
Abstract: A method includes receiving video data that includes a series of frames of image data. Here, the video data is representative of an actor performing an activity. The method also includes processing the video data to generate a spatial input stream including a series of spatial images representative of spatial features of the actor performing the activity, a temporal input stream representative of motion of the actor performing the activity, and a pose input stream including a series of images representative of a pose of the actor performing the activity. Using at least one neural network, the method also includes processing the temporal input stream, the spatial input stream, and the pose input stream. The method also includes classifying, by the at least one neural network, the activity based on the temporal input stream, the spatial input stream, and the pose input stream.
-
Publication number: US20230274527A1
Publication date: 2023-08-31
Application number: US18015301
Filing date: 2020-10-06
Applicant: Google LLC
Inventor: Huizhong Chen, Zhichao Lu, Jonathan Zwi Ben-Meshulam
IPC: G06V10/764, G06V10/776
CPC classification number: G06V10/764, G06V10/776
Abstract: Systems and methods of the present disclosure are directed to a computer-implemented method for training a machine-learned multi-class object classification model with partially labeled training data. The method can include obtaining image data depicting objects and ground truth data comprising a subset of object class annotations respectively associated with a subset of object classes of a plurality of object classes. The method can include processing the image data with the machine-learned multi-class object classification model to obtain object classification data. The method can include evaluating a loss function that evaluates a multi-class classification loss and adjusting one or more parameters of the multi-class object classification model based on the loss function.
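The abstract above trains a classifier when only a subset of object classes is annotated, without spelling out the loss. One common way to handle this, shown here as a sketch and not necessarily the loss the application claims, is to compute a per-class binary cross-entropy and skip every class that has no annotation. The name `masked_bce_loss` and the `annotated` mask are hypothetical.

```python
import math


def masked_bce_loss(probs, labels, annotated):
    """Mean binary cross-entropy over annotated classes only.

    probs:     predicted per-class probabilities
    labels:    0/1 ground truth per class (ignored where unannotated)
    annotated: booleans, True where the class carries an annotation
    Masking unannotated classes is an assumed strategy for partial labels.
    """
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    count = 0
    for p, y, known in zip(probs, labels, annotated):
        if not known:
            continue  # no annotation: this class contributes nothing
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        count += 1
    # With no annotated classes there is nothing to penalize.
    return total / count if count else 0.0


if __name__ == "__main__":
    # Only the first class is annotated; the second is ignored.
    print(masked_bce_loss([0.5, 0.9], [1, 0], [True, False]))
```

Because unannotated classes are skipped, a missing label is never treated as a negative, so the model is not penalized for detecting an object class that simply was not labeled in that image.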
-
Publication number: US20210390733A1
Publication date: 2021-12-16
Application number: US17303969
Filing date: 2021-06-11
Applicant: Google LLC
Inventor: Yinxiao Li, Zhichao Lu, Xuehan Xiong, Jonathan Huang
IPC: G06T7/73
Abstract: A method includes receiving video data that includes a series of frames of image data. Here, the video data is representative of an actor performing an activity. The method also includes processing the video data to generate a spatial input stream including a series of spatial images representative of spatial features of the actor performing the activity, a temporal input stream representative of motion of the actor performing the activity, and a pose input stream including a series of images representative of a pose of the actor performing the activity. Using at least one neural network, the method also includes processing the temporal input stream, the spatial input stream, and the pose input stream. The method also includes classifying, by the at least one neural network, the activity based on the temporal input stream, the spatial input stream, and the pose input stream.
-