Invention Grant
- Patent Title: Open-vocabulary object detection in images
- Application No.: US18144045
- Application Date: 2023-05-05
- Publication No.: US11928854B2
- Publication Date: 2024-03-12
- Inventor: Matthias Johannes Lorenz Minderer, Alexey Alexeevich Gritsenko, Austin Charles Stone, Dirk Weissenborn, Alexey Dosovitskiy, Neil Matthew Tinmouth Houlsby
- Applicant: Google LLC
- Applicant Address: US CA Mountain View
- Assignee: Google LLC
- Current Assignee: Google LLC
- Current Assignee Address: US CA Mountain View
- Agency: Fish & Richardson P.C.
- Main IPC: G06K9/00
- IPC: G06K9/00; G06F40/40; G06V10/22; G06V10/74; G06V10/764; G06V10/774; G06V10/776; G06V10/82

Abstract:
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for object detection. In one aspect, a method comprises: obtaining: (i) an image, and (ii) a set of one or more query embeddings, wherein each query embedding represents a respective category of object; processing the image and the set of query embeddings using an object detection neural network to generate object detection data for the image, comprising: processing the image using an image encoding subnetwork of the object detection neural network to generate a set of object embeddings; processing each object embedding using a localization subnetwork to generate localization data defining a corresponding region of the image; and processing: (i) the set of object embeddings, and (ii) the set of query embeddings, using a classification subnetwork to generate, for each object embedding, a respective classification score distribution over the set of query embeddings.
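
The data flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The image encoding subnetwork is replaced by hand-written object embeddings, and the function names (classify, localize), the cosine-similarity-plus-softmax scoring, and the linear-plus-sigmoid box head are all assumptions chosen for illustration; the abstract only states that each object embedding receives a score distribution over the query embeddings and localization data defining a region of the image.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(v: Vector) -> List[float]:
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)


def softmax(logits: Sequence[float]) -> List[float]:
    # Subtract the max logit for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(object_embeddings: Sequence[Vector],
             query_embeddings: Sequence[Vector]) -> List[List[float]]:
    """Hypothetical classification step: for each object embedding, return
    a score distribution over the query embeddings (assumed here to be a
    softmax over cosine similarities)."""
    queries = [l2_normalize(q) for q in query_embeddings]
    scores = []
    for obj in object_embeddings:
        o = l2_normalize(obj)
        scores.append(softmax([dot(o, q) for q in queries]))
    return scores


def localize(object_embedding: Vector,
             weights: Sequence[Vector],
             bias: Sequence[float]) -> List[float]:
    """Hypothetical localization step: a linear layer followed by a sigmoid,
    producing four box values in (0, 1), e.g. (cx, cy, w, h)."""
    return [1.0 / (1.0 + math.exp(-(dot(row, object_embedding) + b)))
            for row, b in zip(weights, bias)]


if __name__ == "__main__":
    # Stand-ins for the image encoder output and for two category queries.
    objects = [[1.0, 0.0], [0.0, 2.0]]
    queries = [[1.0, 0.0], [0.0, 1.0]]
    for obj, dist in zip(objects, classify(objects, queries)):
        box = localize(obj, [[0.5, 0.5]] * 4, [0.0] * 4)
        print(dist, box)
```

Because the scores are computed against whatever query embeddings are supplied at inference time, the set of categories in this sketch is not fixed in advance, which is the sense of "open-vocabulary" in the title.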
Public/Granted literature
- US20230360365A1 OPEN-VOCABULARY OBJECT DETECTION IN IMAGES, Public/Granted day: 2023-11-09