-
Publication No.: US11544498B2
Publication Date: 2023-01-03
Application No.: US17194090
Application Date: 2021-03-05
Applicant: Google LLC
Inventor: Ariel Gordon, Soeren Pirk, Anelia Angelova, Vincent Michael Casser, Yao Lu, Anthony Brohan, Zhao Chen, Jan Dlabal
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network using consistency measures. One of the methods includes processing a particular training example from a mediator training data set using a first neural network to generate a first output for a first machine learning task; processing the particular training example in the mediator training data set using each of one or more second neural networks, wherein each second neural network is configured to generate a second output for a respective second machine learning task; determining, for each second machine learning task, a consistency target output for the first machine learning task; determining, for each second machine learning task, an error between the first output and the consistency target output corresponding to the second machine learning task; and generating a parameter update for the first neural network from the determined errors.
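The steps in this abstract can be illustrated with a toy sketch. It is not the patented method: the one-parameter "first network" (`w * x`), the squared-error loss, the learning rate, and the `to_first_task` mapping functions are all assumptions chosen only to make the data flow concrete.

```python
def consistency_update(w, x, second_outputs, to_first_task, lr=0.1):
    """One gradient step on a toy first network y = w * x.

    second_outputs: outputs of the second networks for the same example.
    to_first_task:  hypothetical functions that turn each second output
                    into a consistency target for the first task.
    """
    first_output = w * x
    # One consistency target per second task.
    targets = [f(z) for f, z in zip(to_first_task, second_outputs)]
    # Error between the first output and each consistency target.
    errors = [first_output - t for t in targets]
    # Gradient of the mean squared error with respect to w.
    grad = sum(2.0 * e * x for e in errors) / len(errors)
    return w - lr * grad, errors


# Toy usage: a single second task whose output should be twice the first.
new_w, errors = consistency_update(
    w=1.0, x=2.0, second_outputs=[8.0], to_first_task=[lambda z: z / 2.0]
)
```

Only the first network's parameter is updated here; the second networks are treated as fixed sources of targets, which is one possible reading of the abstract.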
-
Publication No.: US10810752B2
Publication Date: 2020-10-20
Application No.: US16861441
Application Date: 2020-04-29
Applicant: Google LLC
Inventor: Anelia Angelova, Martin Wicke, Reza Mahjourian
Abstract: A system includes a neural network implemented by one or more computers, in which the neural network includes an image depth prediction neural network and a camera motion estimation neural network. The neural network is configured to receive a sequence of images. The neural network is configured to process each image in the sequence of images using the image depth prediction neural network to generate, for each image, a respective depth output that characterizes a depth of the image, and to process a subset of images in the sequence of images using the camera motion estimation neural network to generate a camera motion output that characterizes the motion of a camera between the images in the subset. The image depth prediction neural network and the camera motion estimation neural network have been jointly trained using an unsupervised learning technique.
-
Publication No.: US20190279383A1
Publication Date: 2019-09-12
Application No.: US16332991
Application Date: 2017-09-12
Applicant: Google LLC
Inventor: Anelia Angelova, Martin Wicke, Reza Mahjourian
Abstract: A system includes an image depth prediction neural network implemented by one or more computers. The image depth prediction neural network is a recurrent neural network that is configured to receive a sequence of images and, for each image in the sequence: process the image in accordance with a current internal state of the recurrent neural network to (i) update the current internal state and (ii) generate a depth output that characterizes a predicted depth of a future image in the sequence.
-
Publication No.: US10013640B1
Publication Date: 2018-07-03
Application No.: US14976147
Application Date: 2015-12-21
Applicant: Google LLC
Inventor: Anelia Angelova, Ivan Bogun
IPC: G06K9/62
CPC classification number: G06K9/00624, G06K9/4628, G06K9/6271
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for identifying an object from a video. One of the methods includes obtaining multiple frames from a video, where each frame of the multiple frames depicts an object to be recognized, and processing, using an object recognition model, the multiple frames to generate data that represents a classification of the object to be recognized.
-
Publication No.: US20240289981A1
Publication Date: 2024-08-29
Application No.: US18173557
Application Date: 2023-02-23
Applicant: Google LLC
Inventor: Wei-Cheng Kuo, Fred Bertsch, Wei Li, Anthony J. Piergiovanni, Mohammad Taghi Saffar, Anelia Angelova
IPC: G06T7/73, G06F40/126, G06F40/40, G06V10/77, G06V10/80
CPC classification number: G06T7/73, G06F40/126, G06F40/40, G06V10/7715, G06V10/806
Abstract: Generally, the disclosure is directed to generalized object location, where the located object is in accordance with a natural language (NL) query. More specifically, the embodiments include a unified generalized visual localization architecture. The architecture achieves enhanced performance on the following three tasks: referring expression comprehension, object localization, and object detection. The embodiments employ machine-learned NL models and/or image models. The architecture is enabled to understand and answer natural localization questions about an image, to output multiple boxes, to provide no output if the object is not present (e.g., a null result), and to solve general detection tasks.
-
Publication No.: US20230419521A1
Publication Date: 2023-12-28
Application No.: US18367888
Application Date: 2023-09-13
Applicant: Google LLC
Inventor: Vincent Michael Casser, Soeren Pirk, Reza Mahjourian, Anelia Angelova
CPC classification number: G06T7/55, G06T7/248, G06N3/088, G06T3/0093, G06N3/045, G06T2207/20081, G06T2207/20084
Abstract: A system for generating a depth output for an image is described. The system receives input images that depict the same scene, each input image including one or more potential objects. The system generates, for each input image, a respective background image and processes the background images to generate a camera motion output that characterizes the motion of the camera between the input images. For each potential object, the system generates a respective object motion output for the potential object based on the input images and the camera motion output. The system processes a particular input image of the input images using a depth prediction neural network (NN) to generate a depth output for the particular input image, and updates the current values of parameters of the depth prediction NN based on the particular depth output, the camera motion output, and the object motion outputs for the potential objects.
-
Publication No.: US11769269B2
Publication Date: 2023-09-26
Application No.: US17878535
Application Date: 2022-08-01
Applicant: Google LLC
Inventor: Guy Satat, Michael Quinlan, Sean Kirmani, Anelia Angelova, Ariel Gordon
CPC classification number: G06T7/593, B25J13/089, G05D1/0231, G06T3/20, H04N13/128, G06T2207/10028, H04N2013/0081
Abstract: A method includes receiving a first depth map that includes a plurality of first pixel depths and a second depth map that includes a plurality of second pixel depths. The first depth map corresponds to a reference depth scale and the second depth map corresponds to a relative depth scale. The method includes aligning the second pixel depths with the first pixel depths. The method includes transforming the aligned region of the second pixel depths such that transformed second edge pixel depths of the aligned region are coextensive with first edge pixel depths surrounding the corresponding region of the first pixel depths. The method includes generating a third depth map. The third depth map includes a first region corresponding to the first pixel depths and a second region corresponding to the transformed and aligned region of the second pixel depths.
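The idea of bringing a relative-scale depth map onto a reference scale and then merging the two can be sketched as follows. This is an illustrative simplification, not the claimed method: the least-squares scale-and-shift fit, the flattened 1-D pixel lists, the use of `None` for missing reference pixels, and the function names are all assumptions.

```python
def fit_scale_shift(reference, relative):
    """Least-squares (a, b) so that a * relative + b approximates reference.

    Only pixels whose reference depth is known (not None) are used.
    """
    pairs = [(q, r) for q, r in zip(relative, reference) if r is not None]
    if len(pairs) < 2:
        raise ValueError("need at least two overlapping pixels")
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    if var_x == 0:
        raise ValueError("relative depths are constant; scale is undefined")
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


def merge_depth(reference, relative):
    """Keep reference depths where present; fill gaps with rescaled relative depths."""
    a, b = fit_scale_shift(reference, relative)
    return [r if r is not None else a * q + b for q, r in zip(relative, reference)]


# Toy usage: the second pixel is missing from the reference map.
merged = merge_depth([2.0, None, 6.0, 8.0], [1.0, 2.0, 3.0, 4.0])
```

A global fit like this does not guarantee that the filled region meets its surrounding edge pixels exactly, which the abstract describes as a separate transformation step.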
-
Publication No.: US11734847B2
Publication Date: 2023-08-22
Application No.: US17150291
Application Date: 2021-01-15
Applicant: Google LLC
Inventor: Anelia Angelova, Martin Wicke, Reza Mahjourian
CPC classification number: G06T7/55, G06N3/044, G06N3/045, G06N3/08, G06T3/40, G06T15/205, G06T7/579, G06T2207/10016, G06T2207/10028, G06T2207/20084, G06T2207/30244
Abstract: A system includes an image depth prediction neural network implemented by one or more computers. The image depth prediction neural network is a recurrent neural network that is configured to receive a sequence of images and, for each image in the sequence: process the image in accordance with a current internal state of the recurrent neural network to (i) update the current internal state and (ii) generate a depth output that characterizes a predicted depth of a future image in the sequence.
-
Publication No.: US20230114556A1
Publication Date: 2023-04-13
Application No.: US17909581
Application Date: 2021-07-14
Applicant: Google LLC
Inventor: Michael Sahngwon Ryoo, Anthony Jacob Piergiovanni, Anelia Angelova
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing a network input using a neural network to generate a network output. In one aspect, a method comprises processing a network input using a neural network to generate a network output, where the neural network has multiple blocks, wherein each block is configured to process a block input to generate a block output, the method comprising, for each target block of the neural network: generating attention-weighted representations of multiple first block outputs, comprising, for each first block output: processing multiple second block outputs to generate attention factors; and generating the attention-weighted representation of each first block output by applying the respective attention factors to the corresponding first block output; and generating the target block input from the attention-weighted representations; and processing the target block input using the target block to generate a target block output.
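A rough sketch of forming a target block input from attention-weighted block outputs is given below. The details are assumptions, not taken from the patent: each block output is a plain list of floats, one scalar attention factor is computed per block output, the score is simply the mean of that output, and the weighted outputs are summed.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def target_block_input(block_outputs):
    """Fuse equally sized block outputs into one target block input.

    Returns the fused vector and the attention factors that were used.
    """
    # Hypothetical scoring rule: the mean activation of each block output.
    scores = [sum(v) / len(v) for v in block_outputs]
    factors = softmax(scores)
    # Apply each attention factor to its corresponding block output.
    weighted = [[f * x for x in v] for f, v in zip(factors, block_outputs)]
    # Sum the attention-weighted representations element by element.
    fused = [sum(col) for col in zip(*weighted)]
    return fused, factors


# Toy usage: two identical block outputs receive equal attention.
fused, factors = target_block_input([[1.0, 3.0], [1.0, 3.0]])
```

In a trained network the attention factors would come from learned parameters rather than a fixed scoring rule; the fixed rule is used here only so the example runs on its own.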
-
Publication No.: US20220305647A1
Publication Date: 2022-09-29
Application No.: US17638469
Application Date: 2019-08-27
Applicant: Google LLC
Inventor: Anthony Jacob Piergiovanni, Anelia Angelova, Alexander Toshev, Michael Ryoo
Abstract: Techniques are disclosed that enable the generation of predicted sequences of terminals using a generator model portion of a prediction model. Various implementations include controlling actuators of a robot based on the predicted sequences of terminals. Additional or alternative implementations include jointly training the generator model portion of the prediction model using a discriminator model portion of the prediction model using, for example, stochastic adversarial based sampling.