Published Patent Application
- Patent title: SYSTEM FOR PROVIDING ENHANCED VISION TRANSFORMER BLOCKS FOR COMPUTER VISION
- Application number: US18359786 | Filing date: 2023-07-26
- Publication number: US20240037931A1 | Publication date: 2024-02-01
- Inventors: Abhishek Chaurasia, Shakti Nagnath Wadekar
- Applicant: Micron Technology, Inc.
- Applicant address: Boise, ID, US
- Patentee: Micron Technology, Inc.
- Current patentee: Micron Technology, Inc.
- Current patentee address: Boise, ID, US
- Main classification: G06V10/96
- IPC classification: G06V10/96; G06V10/764; G06T7/11; G06V10/82; G06V10/80
Abstract:
A system for providing an enhanced vision transformer block for mobile vision transformers to perform computer vision tasks, such as image classification, segmentation, and object detection, is disclosed. A local representation block of the enhanced block applies a depthwise-separable convolutional layer to vectors of an input image to create local representation outputs associated with the image. The local representation outputs are fed into a global representation block, which unfolds them, applies vision transformers, and folds the result to generate a global representation output associated with the image. The global representation output is fed to a fusion block, which concatenates the local representations with the global representations, applies a point-wise convolution to the concatenation to generate a fusion block output, and fuses input features of the image with the fusion block output to generate an output that facilitates performance of a computer vision task.
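The data flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the function names, the `params` dictionary, the patch size, the single-head attention standing in for the "vision transformers", the unchanged channel count, and the use of a residual addition for the final "fuse" step are all assumptions made for illustration.

```python
import numpy as np

def pointwise_conv(x, weights):
    """1x1 convolution: mixes channels at every pixel. x: (C_in, H, W), weights: (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weights, x)

def depthwise_separable_conv(x, dw_kernels, pw_weights):
    """Depthwise k x k conv (one kernel per channel) followed by a pointwise conv."""
    c, h, w = x.shape
    k = dw_kernels.shape[-1]
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((c, h, w), dtype=float)
    for i in range(k):
        for j in range(k):
            # Each channel is filtered only by its own kernel.
            out += dw_kernels[:, i, j, None, None] * xp[:, i:i + h, j:j + w]
    return pointwise_conv(out, pw_weights)

def unfold(x, p):
    """(C, H, W) -> (p*p, N, C): one token sequence per pixel position inside a patch."""
    c, h, w = x.shape
    x = x.reshape(c, h // p, p, w // p, p).transpose(2, 4, 1, 3, 0)
    return x.reshape(p * p, (h // p) * (w // p), c)

def fold(t, p, h, w):
    """Inverse of unfold: (p*p, N, C) -> (C, H, W)."""
    c = t.shape[-1]
    t = t.reshape(p, p, h // p, w // p, c).transpose(4, 2, 0, 3, 1)
    return t.reshape(c, h, w)

def self_attention(t, wq, wk, wv):
    """Single-head attention with a residual connection (assumed stand-in for a transformer layer)."""
    q, k, v = t @ wq, t @ wk, t @ wv
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)
    return t + attn @ v

def enhanced_block(x, params, patch=2):
    """Local representation -> global representation -> fusion, following the abstract."""
    c, h, w = x.shape
    local = depthwise_separable_conv(x, params["dw"], params["pw"])
    tokens = self_attention(unfold(local, patch), params["wq"], params["wk"], params["wv"])
    global_rep = fold(tokens, patch, h, w)
    # Fusion: concatenate local and global features, then apply a pointwise conv.
    fused = pointwise_conv(np.concatenate([local, global_rep], axis=0), params["fuse"])
    # Assumption: "fuses input features" is modeled as a residual addition.
    return fused + x

# Usage with random, untrained weights (hypothetical sizes).
rng = np.random.default_rng(0)
C, H, W = 4, 8, 8
x = rng.standard_normal((C, H, W))
params = {
    "dw": rng.standard_normal((C, 3, 3)),
    "pw": rng.standard_normal((C, C)),
    "wq": rng.standard_normal((C, C)),
    "wk": rng.standard_normal((C, C)),
    "wv": rng.standard_normal((C, C)),
    "fuse": rng.standard_normal((C, 2 * C)),
}
y = enhanced_block(x, params, patch=2)
print(y.shape)  # (4, 8, 8)
```

In this sketch the output has the same shape as the input, so blocks of this kind could be stacked; the height and width must be divisible by the patch size for `unfold` and `fold` to round-trip.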