SYSTEM FOR PROVIDING ENHANCED VISION TRANSFORMER BLOCKS FOR COMPUTER VISION

Invention Publication

US20240037931A1 SYSTEM FOR PROVIDING ENHANCED VISION TRANSFORMER BLOCKS FOR COMPUTER VISION Under examination - Published


Patent title: SYSTEM FOR PROVIDING ENHANCED VISION TRANSFORMER BLOCKS FOR COMPUTER VISION
Application number: US18359786

Filing date: 2023-07-26
Publication number: US20240037931A1

Publication date: 2024-02-01
Inventors: Abhishek Chaurasia, Shakti Nagnath Wadekar
Applicant: Micron Technology, Inc.
Applicant address: US ID Boise
Patentee: Micron Technology, Inc.
Current patentee: Micron Technology, Inc.
Current patentee address: US ID Boise
Main classification: G06V10/96
IPC classifications: G06V10/96; G06V10/764; G06T7/11; G06V10/82; G06V10/80


Abstract:

A system for providing an enhanced vision transformer block for mobile vision transformers to perform computer vision tasks, such as image classification, segmentation, and object detection, is disclosed. A local representation block of the enhanced block applies a depthwise-separable convolutional layer to vectors of an input image to facilitate creation of local representation outputs associated with the image. The local representation outputs are fed into a global representation block, which unfolds them, applies vision transformers, and folds the result to generate a global representation output associated with the image. The global representation output is fed to a fusion block, which concatenates the local representations with the global representations, applies a point-wise convolution to the concatenation to generate a fusion block output, and fuses input features of the image with the fusion block output to generate an output that facilitates performance of a computer vision task.
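
The data flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the claimed implementation. Several details are assumptions made only for illustration: the 3x3 depthwise kernel, the patch size `p`, a single-head self-attention layer standing in for the "vision transformers", and element-wise addition as the way the input features are fused with the fusion block output. The names `init_params`, `enhanced_block`, and the parameter keys are hypothetical.

```python
import numpy as np

def depthwise_separable_conv(x, dw, pw):
    """x: (H, W, C); dw: (3, 3, C) depthwise kernel; pw: (C, D) point-wise weights."""
    H, W, _ = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))  # zero padding keeps H and W
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            # Each channel is filtered with its own 3x3 kernel (depthwise step).
            out += padded[i:i + H, j:j + W, :] * dw[i, j, :]
    return out @ pw  # point-wise (1x1) step mixes channels: (H, W, D)

def unfold(x, p):
    """(H, W, D) -> (p*p, N, D), where N is the number of p x p patches."""
    H, W, D = x.shape
    x = x.reshape(H // p, p, W // p, p, D).transpose(1, 3, 0, 2, 4)
    return x.reshape(p * p, (H // p) * (W // p), D)

def fold(t, p, H, W):
    """Inverse of unfold: (p*p, N, D) -> (H, W, D)."""
    D = t.shape[-1]
    x = t.reshape(p, p, H // p, W // p, D).transpose(2, 0, 3, 1, 4)
    return x.reshape(H, W, D)

def self_attention(t, wq, wk, wv):
    """Single-head attention across patches (assumed stand-in for a transformer)."""
    q, k, v = t @ wq, t @ wk, t @ wv
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return t + weights @ v  # residual connection

def init_params(C, D, seed=0):
    """Random weights for illustration only (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    shapes = {"dw": (3, 3, C), "pw": (C, D), "wq": (D, D),
              "wk": (D, D), "wv": (D, D), "fuse": (2 * D, C)}
    return {name: 0.1 * rng.normal(size=shape) for name, shape in shapes.items()}

def enhanced_block(x, params, p=2):
    H, W, _ = x.shape
    # Local representation block.
    local = depthwise_separable_conv(x, params["dw"], params["pw"])
    # Global representation block: unfold, attend, fold.
    tokens = self_attention(unfold(local, p),
                            params["wq"], params["wk"], params["wv"])
    global_ = fold(tokens, p, H, W)
    # Fusion block: concatenate local and global, then point-wise convolution.
    fused = np.concatenate([local, global_], axis=-1) @ params["fuse"]
    # Assumption: the input features are fused by element-wise addition.
    return x + fused
```

With `x` of shape `(H, W, C)` where `H` and `W` are divisible by `p`, calling `enhanced_block(x, init_params(C, D))` returns an array with the same shape as `x`, so blocks of this kind can be stacked.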


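IPC classification:

G	Physics
G06	Computing; calculating or counting
G06V	Image or video recognition or understanding
G06V10/00	Arrangements for image or video recognition or understanding (character recognition in images or video G06V30/10)
G06V10/96	. Management of image or video recognition tasks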