Publication number: US11989956B2
Publication date: 2024-05-21
Application number: US17222879
Filing date: 2021-04-05
Applicant: Microsoft Technology Licensing, LLC
Inventor: Xiyang Dai, Yinpeng Chen, Bin Xiao, Dongdong Chen, Mengchen Liu, Lu Yuan, Lei Zhang
CPC classification number: G06V20/64, G02B27/0172, G06T3/06, G06T3/40
Abstract: Systems and methods for object detection generate a feature pyramid corresponding to image data and rescale the feature pyramid to a scale corresponding to a median level of the feature pyramid, wherein the rescaled feature pyramid is a four-dimensional (4D) tensor. The 4D tensor is reshaped into a three-dimensional (3D) tensor having individual perspectives including scale features, spatial features, and task features corresponding to different dimensions of the 3D tensor. The 3D tensor is used with a plurality of attention layers to update a plurality of feature maps associated with the image data. Object detection is performed on the image data using the updated plurality of feature maps.
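The tensor handling described in the abstract can be illustrated with a minimal NumPy sketch. It is not the patented implementation. It assumes a channels-last layout, nearest-neighbour resizing, taking the middle list entry as the "median level", and a simple sigmoid gate standing in for each attention layer. The function names (`resize_nearest`, `pyramid_to_3d`, `scale_attention`, `spatial_attention`, `task_attention`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def resize_nearest(fmap, out_h, out_w):
    """Nearest-neighbour resize of an (H, W, C) feature map (assumed resampler)."""
    h, w, _ = fmap.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return fmap[rows][:, cols]

def pyramid_to_3d(pyramid):
    """Rescale all levels to the middle level's size, stack to 4D, reshape to 3D."""
    median = pyramid[len(pyramid) // 2]  # assumption: middle entry is the median level
    h, w, c = median.shape
    stacked = np.stack([resize_nearest(f, h, w) for f in pyramid])  # 4D: (L, H, W, C)
    return stacked.reshape(len(pyramid), h * w, c), (h, w)          # 3D: (L, S, C)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Each gate below is an illustrative stand-in for an attention layer: it pools
# over the other two dimensions and reweights along exactly one dimension.
def scale_attention(t):
    return t * sigmoid(t.mean(axis=(1, 2), keepdims=True))  # one weight per level L

def spatial_attention(t):
    return t * sigmoid(t.mean(axis=(0, 2), keepdims=True))  # one weight per position S

def task_attention(t):
    return t * sigmoid(t.mean(axis=(0, 1), keepdims=True))  # one weight per channel C

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical three-level pyramid with 8 channels per level.
    pyramid = [rng.standard_normal((s, s, 8)) for s in (16, 8, 4)]
    tensor3d, (h, w) = pyramid_to_3d(pyramid)
    updated = task_attention(spatial_attention(scale_attention(tensor3d)))
    # Restore per-level feature maps for a downstream detection head.
    feature_maps = updated.reshape(len(pyramid), h, w, -1)
    print(tensor3d.shape, feature_maps.shape)
```

Under these assumptions the three dimensions of the 3D tensor (level, flattened spatial position, channel) correspond to the scale, spatial, and task perspectives named in the abstract, so each gate acts on a single axis. A real detector would presumably use learned layers rather than fixed mean pooling.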