-
Publication No.: US20220180549A1
Publication Date: 2022-06-09
Application No.: US17545987
Filing Date: 2021-12-08
Applicant: Waymo LLC
Inventor: Longlong Jing , Ruichi Yu , Jiyang Gao , Henrik Kretzschmar , Kang Li , Ruizhongtai Qi , Hang Zhao , Alper Ayvaci , Xu Chen , Dillon Cower , Congcong Li
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for predicting three-dimensional object locations from images. One of the methods includes obtaining a sequence of images that comprises, at each of a plurality of time steps, a respective image that was captured by a camera at the time step; generating, for each image in the sequence, respective pseudo-lidar features of a respective pseudo-lidar representation of a region in the image that has been determined to depict a first object; generating, for a particular image at a particular time step in the sequence, image patch features of the region in the particular image that has been determined to depict the first object; and generating, from the respective pseudo-lidar features and the image patch features, a prediction that characterizes a location of the first object in a three-dimensional coordinate system at the particular time step in the sequence.
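The fusion the abstract describes — per-time-step pseudo-lidar features combined with image-patch features from one frame to predict a 3D location — can be sketched as a toy example. This is illustrative only, not Waymo's implementation; the mean pooling, feature dimensions, and fixed linear head are all assumptions.

```python
# Illustrative sketch (not the patented method): fuse per-frame
# pseudo-lidar features with image-patch features from a single frame,
# then predict an (x, y, z) location with a stub linear head.

def mean_pool(feature_seq):
    """Average per-time-step feature vectors into one vector."""
    n = len(feature_seq)
    dim = len(feature_seq[0])
    return [sum(f[i] for f in feature_seq) / n for i in range(dim)]

def predict_3d_location(pseudo_lidar_seq, image_patch_feats, weights):
    """Concatenate pooled pseudo-lidar features with the image-patch
    features, then apply a linear head producing (x, y, z)."""
    fused = mean_pool(pseudo_lidar_seq) + list(image_patch_feats)
    return tuple(sum(w_i * f_i for w_i, f_i in zip(row, fused))
                 for row in weights)

# Usage: two time steps, 2-dim pseudo-lidar feats, 2-dim patch feats.
seq = [[1.0, 2.0], [3.0, 4.0]]
patch = [0.5, 0.5]
# Identity-like weights that select the first three fused entries.
w = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
print(predict_3d_location(seq, patch, w))  # (2.0, 3.0, 0.5)
```

In practice the pooling and head would be learned layers over high-dimensional features; the structure shown (temporal pseudo-lidar stream plus single-frame appearance stream, fused before the prediction head) is the point of the sketch.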
-
Publication No.: US11061406B2
Publication Date: 2021-07-13
Application No.: US16167007
Filing Date: 2018-10-22
Applicant: Waymo LLC
Inventor: Junhua Mao , Congcong Li , Alper Ayvaci , Chen Sun , Kevin Murphy , Ruichi Yu
IPC: G05D1/02 , B60W30/095 , G01S17/93 , G05D1/00 , G06K9/00 , G06K9/62 , G01S17/931
Abstract: Aspects of the disclosure relate to training and using a model for identifying actions of objects. For instance, LIDAR sensor data frames including an object bounding box corresponding to an object, as well as an action label for the bounding box, may be received. Each sensor frame is associated with a timestamp and is sequenced with respect to other sensor frames. Each given sensor data frame may be projected into a camera image of the object based on the timestamp associated with that frame in order to provide fused data. The model may be trained using the fused data such that, in response to receiving fused data, the model outputs an action label for each object bounding box of the fused data. This output may then be used to control a vehicle in an autonomous driving mode.
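The timestamp-based projection step — pairing each lidar frame with the camera image it best aligns with — can be sketched as nearest-timestamp matching. This is a simplified stand-in (real projection also involves calibration and geometry); the dict schema is an assumption for illustration.

```python
# Hypothetical sketch of the fusion step: each lidar frame (carrying an
# object box and an action label) is matched to the camera image whose
# timestamp is closest, yielding (box, label, image) training tuples.

def fuse_frames(lidar_frames, camera_images):
    """lidar_frames: dicts with 'ts', 'box', 'action'.
    camera_images: dicts with 'ts', 'image_id'."""
    fused = []
    for frame in lidar_frames:
        nearest = min(camera_images,
                      key=lambda im: abs(im["ts"] - frame["ts"]))
        fused.append({"box": frame["box"], "action": frame["action"],
                      "image_id": nearest["image_id"]})
    return fused

frames = [{"ts": 0.10, "box": (0, 0, 2, 2), "action": "crossing"}]
images = [{"ts": 0.08, "image_id": "a"}, {"ts": 0.30, "image_id": "b"}]
print(fuse_frames(frames, images))
```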
-
Publication No.: US20250103844A1
Publication Date: 2025-03-27
Application No.: US18973983
Filing Date: 2024-12-09
Applicant: Waymo LLC
Inventor: Justin Thorsen , Changchang Wu , Alper Ayvaci , Tiffany Chen , Lo Po Tsui , Zhinan Xu , Chen Wu , Sean Rafferty
IPC: G06K19/067 , G09F3/00
Abstract: Aspects of the disclosure provide for automatically generating labels for sensor data. For instance, first sensor data for a vehicle may be identified. This first sensor data may have been captured by a first sensor of the vehicle at a first location during a first point in time and may be associated with a first label for an object. Second sensor data for the vehicle may be identified. The second sensor data may have been captured by a second sensor of the vehicle at a second location at a second point in time different from the first point in time. The second location is different from the first location. A determination may be made as to whether the object is a static object. Based on the determination that the object is a static object, the first label may be used to automatically generate a second label for the second sensor data.
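The core logic — reuse a label across captures only when the object is static — can be sketched with a simple position-displacement check. The tolerance, coordinate convention, and staticness test are all assumptions for illustration, not the patent's criteria.

```python
# Illustrative sketch of static-object label propagation.

def is_static(pos_t1, pos_t2, tol=0.1):
    """Treat the object as static if it moved less than `tol` meters
    between the two capture times (Euclidean displacement)."""
    return sum((a - b) ** 2 for a, b in zip(pos_t1, pos_t2)) ** 0.5 < tol

def propagate_label(first_label, pos_t1, pos_t2):
    """Copy the first (human-verified) label onto the second sensor
    capture only for static objects; dynamic objects get no auto-label."""
    if is_static(pos_t1, pos_t2):
        return dict(first_label, source="auto")
    return None

# A stop sign barely moves between observations: label is propagated.
print(propagate_label({"category": "stop_sign"}, (1.0, 2.0), (1.02, 2.01)))
# A moving object: no automatic label.
print(propagate_label({"category": "stop_sign"}, (0.0, 0.0), (5.0, 0.0)))
```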
-
Publication No.: US20250078531A1
Publication Date: 2025-03-06
Application No.: US18242928
Filing Date: 2023-09-06
Applicant: Waymo LLC
Inventor: Hang Yan , Zhengyu Zhang , Yan Wang , Jingxiao Zheng , Dmitry Kalenichenko , Vasiliy Igorevich Karasev , Alper Ayvaci , Xu Chen
IPC: G06V20/56 , B60W60/00 , G01S13/89 , G06V10/764 , G06V10/80
Abstract: A method includes obtaining, by a processing device, input data derived from a set of sensors of an autonomous vehicle (AV), generating, by the processing device using a set of lane detection classifier heads, at least one heatmap based on a fused bird's eye view (BEV) feature generated from the input data, obtaining, by the processing device, a set of polylines using the at least one heatmap, wherein each polyline of the set of polylines corresponds to a respective track of a first set of tracks for a first frame, and generating, by the processing device, a second set of tracks for a second frame after the first frame by using a statistical filter based on a set of extrapolated tracks for the second frame and a set of track measurements for the second frame, wherein each track measurement of the set of track measurements corresponds to a respective updated polyline obtained for the second frame.
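The track-update step — combining extrapolated tracks with new polyline measurements via a statistical filter — can be sketched with a simple constant-gain blend standing in for the filter. The per-point blending and the gain value are assumptions; the patent does not specify this particular filter.

```python
# Illustrative sketch: update a lane track (a polyline) by blending its
# extrapolated position with a measured polyline, point by point. A
# fixed-gain blend stands in for the statistical filter.

def update_track(extrapolated, measurement, gain=0.5):
    """Move each extrapolated point `gain` of the way toward the
    corresponding measured point."""
    return [tuple(e + gain * (m - e) for e, m in zip(pe, pm))
            for pe, pm in zip(extrapolated, measurement)]

pred = [(0.0, 0.0), (1.0, 0.0)]   # extrapolated track for frame 2
meas = [(0.2, 0.0), (1.2, 0.2)]   # polyline measured from the heatmap
print(update_track(pred, meas))
```

A real tracker would use a measurement-noise-aware gain (e.g. a Kalman update) rather than a constant, but the data flow — extrapolate, measure a polyline, blend — matches the abstract's description.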
-
Publication No.: US12229972B2
Publication Date: 2025-02-18
Application No.: US17721288
Filing Date: 2022-04-14
Applicant: Waymo LLC
Inventor: Daniel Rudolf Maurer , Austin Charles Stone , Alper Ayvaci , Anelia Angelova , Rico Jonschkowski
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network to predict optical flow. One of the methods includes obtaining a batch of one or more training image pairs; for each of the pairs: processing the first training image and the second training image using the neural network to generate a final optical flow estimate; generating a cropped final optical flow estimate from the final optical flow estimate; and training the neural network using the cropped final optical flow estimate.
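The cropping step can be sketched directly: trim a border from the dense flow field before it enters the training loss. Representing the flow as nested lists and the fixed margin are simplifications for illustration.

```python
# Illustrative sketch: crop a border of `margin` pixels from an
# H x W flow field (each cell holds a (dx, dy) flow vector) so the
# training loss is computed only on interior pixels.

def crop_flow(flow, margin):
    """Remove `margin` rows/columns from each side of the flow field."""
    return [row[margin:len(row) - margin]
            for row in flow[margin:len(flow) - margin]]

# A 4x4 flow field whose cell (x, y) stores the vector (x, y).
flow = [[(x, y) for x in range(4)] for y in range(4)]
cropped = crop_flow(flow, 1)
print(len(cropped), len(cropped[0]))  # 2 2
```

Cropping before the loss is a common way to avoid penalizing border pixels, where flow estimates are unreliable because their true correspondences may lie outside the frame.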
-
Publication No.: US11756309B2
Publication Date: 2023-09-12
Application No.: US17148148
Filing Date: 2021-01-13
Applicant: Waymo LLC
Inventor: Alper Ayvaci , Feiyu Chen , Justin Yu Zheng , Bayram Safa Cicek , Vasiliy Igorevich Karasev
CPC classification number: G06V20/58 , B60W60/001 , G06N3/08 , B60W2420/52 , B60W2554/4049
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network using contrastive learning. One of the methods includes obtaining a network input representing an environment; processing the network input using a first subnetwork of the neural network to generate a respective embedding for each location in the environment; processing the embeddings for each location in the environment using a second subnetwork of the neural network to generate a respective object prediction for each location; determining, for each of a plurality of pairs of the plurality of locations in the environment, whether the respective object predictions of the pair of locations characterize the same possible object or different possible objects; computing a respective contrastive loss value for each of the plurality of pairs of locations; and updating values for a plurality of parameters of the first subnetwork using the computed contrastive loss values.
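The per-pair loss the abstract describes can be sketched with a standard margin-based contrastive loss over two location embeddings: same-object pairs are pulled together, different-object pairs pushed apart. The specific loss form and margin are assumptions; the patent abstract does not fix them.

```python
# Illustrative sketch of a pairwise contrastive loss on location
# embeddings: squared distance for same-object pairs, hinge on a
# margin for different-object pairs.

def contrastive_loss(emb_a, emb_b, same_object, margin=1.0):
    d = sum((a - b) ** 2 for a, b in zip(emb_a, emb_b)) ** 0.5
    if same_object:
        return d ** 2                      # pull together
    return max(0.0, margin - d) ** 2       # push apart up to the margin

print(contrastive_loss([0.0, 0.0], [0.0, 0.0], True))   # 0.0
print(contrastive_loss([0.0, 0.0], [0.0, 0.0], False))  # 1.0
```

Per the abstract, such losses would be summed over many location pairs and used to update only the first subnetwork (the embedding network), with the object predictions of the second subnetwork deciding which pairs count as "same object".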
-
Publication No.: US20230260266A1
Publication Date: 2023-08-17
Application No.: US18108749
Filing Date: 2023-02-13
Applicant: Waymo LLC
Inventor: Vasiliy Igorevich Karasev , Jiakai Zhang , Alper Ayvaci , Hang Yan , James Philbin
CPC classification number: G06V10/806 , G01S13/867 , G01S7/417 , G01S7/412 , G06V20/58 , G06V10/82 , G06V10/7715
Abstract: A method includes obtaining, by a processing device, input data derived from a set of sensors associated with an autonomous vehicle (AV), extracting, by the processing device from the input data, a plurality of sets of features, and generating, by the processing device using the plurality of sets of features, a fused bird's-eye view (BEV) grid. The fused BEV grid is generated based on a first BEV grid having a first scale and a second BEV grid having a second scale different from the first scale. The method further includes providing, by the processing device, the fused BEV grid for object detection.
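Fusing two BEV grids of different scales can be sketched as upsampling the coarse grid to the fine grid's resolution and combining cell-wise. Nearest-neighbour upsampling and additive fusion are assumptions here; real systems would typically fuse learned feature channels, not scalars.

```python
# Illustrative sketch: fuse a fine 2N x 2N BEV grid with a coarse
# N x N BEV grid by nearest-neighbour upsampling + cell-wise addition.

def fuse_bev_grids(fine, coarse):
    """Each fine cell (i, j) picks up the coarse cell (i//2, j//2)."""
    n = len(fine)
    return [[fine[i][j] + coarse[i // 2][j // 2] for j in range(n)]
            for i in range(n)]

fine = [[1, 1], [1, 1]]    # fine-scale grid (e.g. close-range detail)
coarse = [[2]]             # coarse-scale grid (e.g. long-range context)
print(fuse_bev_grids(fine, coarse))  # [[3, 3], [3, 3]]
```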
-
Publication No.: US20230035454A1
Publication Date: 2023-02-02
Application No.: US17384637
Filing Date: 2021-07-23
Applicant: Waymo LLC
Inventor: Daniel Rudolf Maurer , Alper Ayvaci , Robert William Anderson , Rico Jonschkowski , Austin Charles Stone , Anelia Angelova , Nichola Abdo , Christopher John Sweeney
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an optical flow label from a lidar point cloud. One of the methods includes obtaining data specifying a training example, including a first image of a scene in an environment captured at a first time point and a second image of the scene in the environment captured at a second time point. For each of a plurality of lidar points, a respective second corresponding pixel in the second image is obtained and a respective velocity estimate for the lidar point at the second time point is obtained. A respective first corresponding pixel in the first image is determined using the velocity estimate for the lidar point. A proxy optical flow ground truth for the training example is generated based on an estimate of optical flow of the pixel between the first and second images.
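The label-generation geometry — back out a lidar point's pixel at the first time point from its second-image pixel and velocity, then read off the flow — can be sketched in the image plane. Working with an image-plane velocity and constant motion over the interval are simplifications; the patent operates on 3D lidar points projected through camera models.

```python
# Illustrative sketch: given a point's pixel at t2 and its (assumed)
# image-plane velocity in pixels/sec, recover the pixel at t1 and the
# proxy optical flow from t1 to t2.

def proxy_flow(pixel_t2, velocity_px, dt):
    """Back-project by constant velocity, then return (pixel_t1, flow)."""
    pixel_t1 = (pixel_t2[0] - velocity_px[0] * dt,
                pixel_t2[1] - velocity_px[1] * dt)
    flow = (pixel_t2[0] - pixel_t1[0], pixel_t2[1] - pixel_t1[1])
    return pixel_t1, flow

p1, f = proxy_flow((10.0, 5.0), (20.0, -10.0), 0.1)
print(p1, f)  # (8.0, 6.0) (2.0, -1.0)
```

Repeating this for every lidar point with a camera correspondence yields a sparse proxy ground-truth flow field without any human flow annotation.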
-
Publication No.: US20210295555A1
Publication Date: 2021-09-23
Application No.: US17342434
Filing Date: 2021-06-08
Applicant: Waymo LLC
Inventor: Alper Ayvaci , Yu-Han Chen , Ruichi Yu , Chen Wu , Noha Waheed Ahmed Radwan , Jonathon Shlens
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating object interaction predictions using a neural network. One of the methods includes obtaining a sensor input derived from data generated by one or more sensors that characterizes a scene. The sensor input is provided to an object interaction neural network. The object interaction neural network is configured to process the sensor input to generate a plurality of object interaction outputs. Each respective object interaction output includes main object information and interacting object information. The respective object interaction outputs corresponding to the plurality of regions in the sensor input are received as output of the object interaction neural network.
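The shape of the network's output — one interaction output per region, each pairing main-object information with interacting-object information — can be sketched as a plain data-flow stub. The region/detection schema is hypothetical; the real network produces these outputs directly from sensor features.

```python
# Illustrative stub of the output structure: one interaction output per
# region, each holding main-object and interacting-object information.

def object_interaction_outputs(regions):
    """regions: dicts with an 'id' and at least two 'detections';
    the first detection is treated as the main object here."""
    outputs = []
    for region in regions:
        main, interacting = region["detections"][0], region["detections"][1]
        outputs.append({"region": region["id"],
                        "main": main, "interacting": interacting})
    return outputs

regions = [{"id": 0, "detections": ["vehicle", "pedestrian"]}]
print(object_interaction_outputs(regions))
```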
-
Publication No.: US20210150752A1
Publication Date: 2021-05-20
Application No.: US16686840
Filing Date: 2019-11-18
Applicant: Waymo LLC
Inventor: Alper Ayvaci , Yu-Han Chen , Ruichi Yu , Chen Wu , Noha Waheed Ahmed Radwan , Jonathon Shlens
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating object interaction predictions using a neural network. One of the methods includes obtaining a sensor input derived from data generated by one or more sensors that characterizes a scene. The sensor input is provided to an object interaction neural network. The object interaction neural network is configured to process the sensor input to generate a plurality of object interaction outputs. Each respective object interaction output includes main object information and interacting object information. The respective object interaction outputs corresponding to the plurality of regions in the sensor input are received as output of the object interaction neural network.