CROSS-ATTENTION PERCEPTION MODEL TRAINED TO USE SENSOR AND/OR MAP DATA
Abstract:
A transformer-based machine-learned model may use cross-attention between map data and sensor data and/or perception data, such as an object detection, to augment perception tasks. In particular, the transformer-based machine-learned model may comprise two or more encoders: a first encoder that may determine a first embedding from map data, and a second encoder that may determine a second embedding from sensor data and/or perception data. An encoder may determine a score that may be used to determine various outputs, which may improve partially occluded object detection, ground plane classification, and static object detection, and may suppress false positive object detections.
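As a rough illustration of the cross-attention step described in the abstract, the following minimal sketch computes attention scores between a sensor-derived embedding and a map-derived embedding. It is not the claimed implementation. Everything named here is an assumption made for illustration: the function `cross_attention`, the choice of sensor tokens as queries and map tokens as keys and values, the identity projection weights, the scaled dot-product form of the score, and the toy embedding values.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def softmax(row):
    """Numerically stable softmax over one list of logits."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(sensor_emb, map_emb, w_q, w_k, w_v):
    """Scaled dot-product cross-attention (hypothetical arrangement).

    Assumption: queries come from the sensor/perception embedding,
    keys and values come from the map embedding.
    Returns (scores, attended), where scores[i][j] is the weight that
    sensor token i places on map token j.
    """
    q = matmul(sensor_emb, w_q)
    k = matmul(map_emb, w_k)
    v = matmul(map_emb, w_v)
    scale = math.sqrt(len(q[0]))
    logits = matmul(q, transpose(k))
    scores = [softmax([x / scale for x in row]) for row in logits]
    attended = matmul(scores, v)
    return scores, attended

# Toy inputs (made up): one sensor token and three map tokens, dimension 2.
identity = [[1.0, 0.0], [0.0, 1.0]]
sensor_emb = [[1.0, 0.0]]
map_emb = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

scores, attended = cross_attention(sensor_emb, map_emb, identity, identity, identity)
```

In this toy setup, the sensor token places the most weight on the map token it is most aligned with, and the weights for each sensor token sum to one. A real model would use learned projection weights and, typically, multiple attention heads.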