-
Publication No.: US11100646B2
Publication date: 2021-08-24
Application No.: US16562819
Filing date: 2019-09-06
Applicant: Google LLC
Inventor: Suhani Vora, Reza Mahjourian, Soeren Pirk, Anelia Angelova
Abstract: A method for generating a predicted segmentation map for potential objects in a future scene depicted in a future image is described. The method includes receiving input images that depict a same scene; processing a current input image to generate a segmentation map for potential objects in the current input image and a respective depth map; generating a point cloud for the current input image; processing the input images to generate, for each pair of two input images in the sequence, a respective ego-motion output that characterizes motion of the camera between the two input images; processing the ego-motion outputs to generate a future ego-motion output; processing the point cloud of the current input image and the future ego-motion output to generate a future point cloud; and processing the future point cloud to generate the predicted segmentation map for potential objects in the future scene depicted in the future image.
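The geometric steps named in this abstract (depth map to point cloud, ego-motion applied to the cloud, future cloud to segmentation map) can be sketched as below. This is only an illustration under stated assumptions: a pinhole camera with known intrinsics, a future ego-motion that is already given as a rotation matrix and translation, and a plain nearest-point projection standing in for whatever model the patent uses to turn the future point cloud into a segmentation map. All function and variable names are hypothetical.

```python
# Sketch only: pinhole camera, known intrinsics, rigid ego-motion (R, t).
# All names here are hypothetical and not taken from the patent.

def unproject(depth, labels, fx, fy, cx, cy):
    """Turn a depth map into a labelled point cloud [(X, Y, Z, label), ...]."""
    cloud = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:  # skip pixels with no valid depth
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            cloud.append((x, y, z, labels[v][u]))
    return cloud


def apply_ego_motion(cloud, R, t):
    """Express each point in the future camera frame: p' = R p + t."""
    moved = []
    for x, y, z, label in cloud:
        p = (x, y, z)
        q = [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]
        moved.append((q[0], q[1], q[2], label))
    return moved


def project_labels(cloud, fx, fy, cx, cy, height, width, background=0):
    """Project a labelled cloud to a segmentation map; the nearest point wins."""
    seg = [[background] * width for _ in range(height)]
    zbuf = [[float("inf")] * width for _ in range(height)]
    for x, y, z, label in cloud:
        if z <= 0:  # behind the camera
            continue
        u = int(round(fx * x / z + cx))
        v = int(round(fy * y / z + cy))
        if 0 <= u < width and 0 <= v < height and z < zbuf[v][u]:
            zbuf[v][u] = z
            seg[v][u] = label
    return seg


# Toy 3x3 scene: flat depth of 2 units, one object pixel in the centre.
depth = [[2.0, 2.0, 2.0], [2.0, 2.0, 2.0], [2.0, 2.0, 2.0]]
labels = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
shift = [2.0, 0.0, 0.0]  # points move +2 units in x in the future camera frame

cloud = unproject(depth, labels, fx=1.0, fy=1.0, cx=1.0, cy=1.0)
future_cloud = apply_ego_motion(cloud, identity, shift)
future_seg = project_labels(future_cloud, 1.0, 1.0, 1.0, 1.0, 3, 3)
```

In this toy scene the object pixel moves one column to the right in the predicted map, and the left column, which no point lands on, falls back to the background label.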
-
Publication No.: US20210233265A1
Publication date: 2021-07-29
Application No.: US17150291
Filing date: 2021-01-15
Applicant: Google LLC
Inventor: Anelia Angelova, Martin Wicke, Reza Mahjourian
Abstract: A system includes an image depth prediction neural network implemented by one or more computers. The image depth prediction neural network is a recurrent neural network that is configured to receive a sequence of images and, for each image in the sequence: process the image in accordance with a current internal state of the recurrent neural network to (i) update the current internal state and (ii) generate a depth output that characterizes a predicted depth of a future image in the sequence.
-
Publication No.: US12260576B2
Publication date: 2025-03-25
Application No.: US18367888
Filing date: 2023-09-13
Applicant: Google LLC
Inventor: Vincent Michael Casser, Soeren Pirk, Reza Mahjourian, Anelia Angelova
Abstract: A system for generating a depth output for an image is described. The system receives input images that depict the same scene, each input image including one or more potential objects. The system generates, for each input image, a respective background image and processes the background images to generate a camera motion output that characterizes the motion of the camera between the input images. For each potential object, the system generates a respective object motion output for the potential object based on the input images and the camera motion output. The system processes a particular input image of the input images using a depth prediction neural network (NN) to generate a depth output for the particular input image, and updates the current values of parameters of the depth prediction NN based on the particular depth output, the camera motion output, and the object motion outputs for the potential objects.
-
Publication No.: US12136262B2
Publication date: 2024-11-05
Application No.: US18379532
Filing date: 2023-10-12
Applicant: Google LLC
Inventor: Weicheng Kuo, Anelia Angelova, Tsung-Yi Lin
IPC: G06V10/82, G06T7/10, G06V10/25, G06V10/26, G06V10/44, G06V10/764, G06V10/77, G06V10/774, G06V20/10
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing instance segmentation by detecting and segmenting individual objects in an image. In one aspect, a method comprises: processing an image to generate data identifying a region of the image that depicts a particular object; obtaining data defining a plurality of example object segmentations; generating a respective weight value for each of the example object segmentations; for each of a plurality of pixels in the region of the image, determining a score characterizing a likelihood that the pixel is included in the particular object depicted in the region of the image using: (i) the example object segmentations, and (ii) the weight values for the example object segmentations; and generating a segmentation of the particular object depicted in the region of the image using the scores for the pixels in the region of the image.
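The per-pixel scoring step in this abstract, which combines example object segmentations using one weight per example, can be illustrated roughly as follows. The softmax normalisation of the weights, the fixed 0.5 threshold, and every name in the code are assumptions made for the illustration; the abstract does not say how the weights are produced or how the scores are turned into a final segmentation.

```python
import math


def segment_with_examples(examples, weights, threshold=0.5):
    """Blend example masks with softmax-normalised weights, then threshold.

    examples: list of HxW masks with values in [0, 1]
    weights:  one raw score per example (assumed to come from some model)
    """
    peak = max(weights)
    exps = [math.exp(w - peak) for w in weights]  # subtract max for stability
    total = sum(exps)
    norm = [e / total for e in exps]

    height, width = len(examples[0]), len(examples[0][0])
    # Per-pixel score: weighted sum of the example masks at that pixel.
    scores = [[sum(n * ex[r][c] for n, ex in zip(norm, examples))
               for c in range(width)] for r in range(height)]
    mask = [[1 if s >= threshold else 0 for s in row] for row in scores]
    return scores, mask


# Two hypothetical example shapes for a 3x3 region: a tall bar and a wide bar.
tall = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
wide = [[0, 0, 0], [1, 1, 1], [0, 0, 0]]
scores, mask = segment_with_examples([tall, wide], weights=[2.0, 0.0])
```

Because the first example carries the larger weight, the thresholded mask follows the tall bar, while pixels covered only by the wide bar score below the threshold.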
-
Publication No.: US20230409899A1
Publication date: 2023-12-21
Application No.: US17845753
Filing date: 2022-06-21
Applicant: Google LLC
Inventor: Michael Sahngwon Ryoo, Anthony Jacob Piergiovanni, Anelia Angelova, Anurag Arnab, Mostafa Dehghani
IPC: G06N3/08
CPC classification number: G06N3/08
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing a network input using a computer vision neural network with learned tokenization.
-
Publication No.: US20220366257A1
Publication date: 2022-11-17
Application No.: US17620451
Filing date: 2020-09-16
Applicant: Google LLC
Inventor: Anthony J. Piergiovanni, Anelia Angelova, Michael Sahngwon Ryoo
Abstract: Generally, the present disclosure is directed to a neural architecture search process for finding small and fast video processing networks for understanding of video data. The neural architecture search process can automatically design networks that provide comparable video processing performance at a fraction of the computational and storage cost of larger existing models, thereby conserving computing resources such as memory and processor usage.
-
Publication No.: US20210374453A1
Publication date: 2021-12-02
Application No.: US17290814
Filing date: 2019-08-14
Applicant: Google LLC
Inventor: Weicheng Kuo, Anelia Angelova, Tsung-Yi Lin
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing instance segmentation by detecting and segmenting individual objects in an image. In one aspect, a method comprises: processing an image to generate data identifying a region of the image that depicts a particular object; obtaining data defining a plurality of example object segmentations; generating a respective weight value for each of the example object segmentations; for each of a plurality of pixels in the region of the image, determining a score characterizing a likelihood that the pixel is included in the particular object depicted in the region of the image using: (i) the example object segmentations, and (ii) the weight values for the example object segmentations; and generating a segmentation of the particular object depicted in the region of the image using the scores for the pixels in the region of the image.
-
Publication No.: US20200027002A1
Publication date: 2020-01-23
Application No.: US16511637
Filing date: 2019-07-15
Applicant: Google LLC
Inventor: Steven Hickson, Anelia Angelova, Irfan Aziz Essa, Rahul Sukthankar
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining a clustering of images into a plurality of semantic categories. In one aspect, a method comprises: training a categorization neural network, comprising, at each of a plurality of iterations: processing an image depicting an object using the categorization neural network to generate (i) a current prediction for whether the image depicts an object or a background region, and (ii) a current embedding of the image; determining a plurality of current cluster centers based on the current values of the categorization neural network parameters, wherein each cluster center represents a respective semantic category; and determining a gradient of an objective function that includes a classification loss and a clustering loss, wherein the clustering loss depends on a similarity between the current embedding of the image and the current cluster centers.
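The objective in this abstract combines a classification loss with a clustering loss that depends on how close the image embedding is to the current cluster centers. A rough numeric sketch is given below. The specific choices (binary cross-entropy for the object-versus-background prediction, squared distance to the nearest center as the clustering term, and a `cluster_weight` mixing factor) are assumptions for illustration only, and the gradient step is omitted.

```python
import math


def classification_loss(p_object, is_object):
    """Binary cross-entropy for the object-vs-background prediction."""
    eps = 1e-12
    p = min(max(p_object, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -math.log(p) if is_object else -math.log(1.0 - p)


def clustering_loss(embedding, centers):
    """Squared distance from the embedding to its nearest cluster center."""
    return min(sum((e - c) ** 2 for e, c in zip(embedding, center))
               for center in centers)


def objective(p_object, is_object, embedding, centers, cluster_weight=1.0):
    """Classification loss plus a weighted clustering loss (assumed form)."""
    return (classification_loss(p_object, is_object)
            + cluster_weight * clustering_loss(embedding, centers))


centers = [[0.0, 0.0], [4.0, 4.0]]  # hypothetical current cluster centers
near = objective(0.9, True, [0.5, 0.0], centers)  # embedding close to a center
far = objective(0.9, True, [2.0, 2.0], centers)   # embedding between centers
```

With the same classification confidence, the embedding that sits between the two centers receives the larger objective value, which is the pressure that pulls embeddings toward a semantic category.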
-
Publication No.: US12046025B2
Publication date: 2024-07-23
Application No.: US17605783
Filing date: 2020-05-22
Applicant: Google LLC
Inventor: Michael Sahngwon Ryoo, Anthony Jacob Piergiovanni, Mingxing Tan, Anelia Angelova
IPC: G06V10/82, G06N3/045, G06T1/20, G06T3/4046, G06T7/207, G06V10/776
CPC classification number: G06V10/82, G06N3/045, G06T1/20, G06T3/4046, G06T7/207, G06V10/776, G06T2207/10016, G06T2207/20081, G06T2207/20084
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining one or more neural network architectures of a neural network for performing a video processing neural network task. In one aspect, a method comprises: at each of a plurality of iterations: selecting a parent neural network architecture from a set of neural network architectures; training a neural network having the parent neural network architecture to perform the video processing neural network task, comprising determining trained values of connection weight parameters of the parent neural network architecture; generating a new neural network architecture based at least in part on the trained values of the connection weight parameters of the parent neural network architecture; and adding the new neural network architecture to the set of neural network architectures.
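The iteration described in this abstract (select a parent architecture, train it, derive a new architecture from the trained connection weights, add it to the set) can be laid out as a toy loop. Everything concrete here is an assumption made for illustration: an architecture is modelled as a set of layer-to-layer connections, `train` is a random stand-in rather than real training, and the rule "drop the weakest connection, add a random new one" is just one possible way to use the trained weights.

```python
import random


def train(architecture, rng):
    """Stand-in for training: assign each connection a pseudo-trained weight."""
    return {conn: rng.random() for conn in architecture}


def mutate(architecture, weights, num_layers, rng):
    """Drop the weakest connection and add one new random connection."""
    child = set(architecture)
    if len(child) > 1:
        child.remove(min(child, key=lambda conn: weights[conn]))
    # Only forward connections (i -> j with i < j) that are not yet present.
    candidates = [(i, j) for i in range(num_layers)
                  for j in range(i + 1, num_layers) if (i, j) not in child]
    if candidates:
        child.add(rng.choice(candidates))
    return frozenset(child)


def search(initial, num_layers, iterations, seed=0):
    """Grow a set of architectures, one new child per iteration."""
    rng = random.Random(seed)
    population = [frozenset(initial)]
    for _ in range(iterations):
        parent = rng.choice(population)          # select a parent
        weights = train(parent, rng)             # "train" it
        population.append(mutate(parent, weights, num_layers, rng))
    return population


population = search({(0, 1), (1, 2)}, num_layers=4, iterations=5)
```

Each iteration adds exactly one architecture, so five iterations from a single starting architecture leave six candidates in the set.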
-
Publication No.: US20230394306A1
Publication date: 2023-12-07
Application No.: US18328464
Filing date: 2023-06-02
Applicant: Google LLC
Inventor: Anthony J. Piergiovanni, Wei-Cheng Kuo, Anelia Angelova
IPC: G06N3/08, G06N3/0464, G06N3/048, G06N3/0455
CPC classification number: G06N3/08, G06N3/0464, G06N3/048, G06N3/0455
Abstract: Provided is an efficient multi-modal processing model. The multi-modal processing model can process input data from multiple different domains to generate a prediction for a multi-modal processing task. A machine-learned multi-modal processing model can include an adaptive tokenization layer that is configured to adaptively tokenize features generated from the multi-modal inputs into sets of tokens. Specifically, the tokens may have a smaller data size relative to the features from the inputs, thereby enabling a reduced number of processing operations to be performed overall and improving the efficiency of the model.
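One way to picture a tokenization layer that turns many features into fewer tokens is attention-style pooling, sketched below. This is an assumed mechanism chosen for illustration, not necessarily the layer the patent describes: each output token is a softmax-weighted average of the input features, with weights taken from dot products against hypothetical learned query vectors.

```python
import math


def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]


def tokenize(features, queries):
    """Pool N feature vectors into len(queries) tokens.

    Each token is a softmax-weighted average of the features, with weights
    from the dot product between a query vector and every feature.
    """
    dim = len(features[0])
    tokens = []
    for q in queries:
        logits = [sum(a * b for a, b in zip(q, f)) for f in features]
        attn = softmax(logits)
        tokens.append([sum(w * f[d] for w, f in zip(attn, features))
                       for d in range(dim)])
    return tokens


# Four input features reduced to two tokens.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
queries = [[10.0, 0.0], [0.0, 10.0]]  # hypothetical learned query vectors
tokens = tokenize(features, queries)
```

Here four features are reduced to two tokens, so any later layer operates on half as many vectors. The pooling weights depend on the input features, which is the sense in which this sketch is adaptive.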