Visual language models for perception
Abstract:
A method is provided that includes: receiving camera data from a perception system of an autonomous vehicle; and providing the camera data to a visual language model, where the visual language model includes a mapping of a corpus of images and a corpus of text to a common parameter space. The method further includes: receiving from the visual language model an output corresponding to one or more text tokens; accessing a configuration file comprising a plurality of text tokens representing a plurality of objects or events of interest to the autonomous vehicle; and identifying a respective object or event of interest in an environment of the autonomous vehicle by determining that a text token of the output matches a respective one of the plurality of text tokens in the configuration file. The autonomous vehicle can then be controlled based at least in part on the respective object or event of interest.
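For illustration, the matching step described above can be sketched with an off-the-shelf CLIP model standing in for the visual language model, since CLIP likewise maps images and text into a common embedding space. This is a minimal sketch, not the patented implementation: the configuration file contents, the token list, the `identify_objects_of_interest` function, and the similarity threshold are all hypothetical.

```python
# Minimal sketch: match a camera frame against configured text tokens in a
# shared image-text embedding space. CLIP is used here only as a stand-in
# for the visual language model described in the abstract.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Hypothetical configuration of text tokens for objects/events of interest;
# in practice this might be loaded from a JSON or YAML configuration file.
config = {
    "tokens": ["pedestrian", "stop sign", "emergency vehicle", "construction zone"],
}

def identify_objects_of_interest(image: Image.Image, threshold: float = 0.25):
    """Embed the camera frame and every configured text token in the common
    space, then return tokens whose similarity score clears the threshold."""
    inputs = processor(
        text=config["tokens"], images=image, return_tensors="pt", padding=True
    )
    with torch.no_grad():
        outputs = model(**inputs)
    # logits_per_image holds one image-text similarity score per token.
    probs = outputs.logits_per_image.softmax(dim=-1).squeeze(0)
    return [
        (token, prob.item())
        for token, prob in zip(config["tokens"], probs)
        if prob.item() >= threshold
    ]

# Example usage (threshold and frame are illustrative):
#   matches = identify_objects_of_interest(Image.open("frame.jpg"))
# Downstream vehicle-control logic would then act on the matched objects/events.
```

In this sketch the "match" between the model output and the configuration file is a thresholded similarity score rather than an exact token comparison; the abstract leaves the matching criterion unspecified.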