-
Publication No.: US12125269B2
Publication Date: 2024-10-22
Application No.: US17584449
Filing Date: 2022-01-26
Inventors: Gaurab Banerjee, Vijay Nagasamy
IPC Classification: G06V10/00, G01S13/86, G01S13/89, G01S15/86, G01S15/89, G01S17/86, G01S17/89, G06V10/80, G06V10/82, G06V20/58, G01S13/931, G01S15/931, G01S17/931
CPC Classification: G06V10/803, G01S13/862, G01S13/865, G01S13/867, G01S13/89, G01S15/86, G01S15/89, G01S17/86, G01S17/89, G06V10/82, G06V20/58, G01S13/931, G01S15/931, G01S17/931
Abstract: A plurality of images can be acquired from a plurality of sensors, and a plurality of flattened patches can be extracted from the plurality of images. An image location in the plurality of images, and a sensor type token identifying the type of sensor used to acquire the image from which the respective flattened patch was extracted, can be added to each of the plurality of flattened patches. The flattened patches can be concatenated into a flat tensor, and a task token indicating a processing task can be added to the flat tensor, wherein the flat tensor is a one-dimensional array that includes two or more types of data. The flat tensor can be input to a first deep neural network that includes a plurality of encoder layers and a plurality of decoder layers and outputs transformer output. The transformer output can be input to a second deep neural network that determines an object prediction indicated by the task token, and the object predictions can be output.
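
The input-preparation steps in the abstract (patch extraction, location and sensor type tagging, concatenation, task token) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the token values in `SENSOR_TOKENS` and `TASK_TOKENS`, the non-overlapping square patches, the (row, column) encoding of the image location, and the choice to place the task token at the front of the flat tensor. The two neural networks are not modeled.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical token vocabularies; the abstract does not specify any encoding.
SENSOR_TOKENS: Dict[str, float] = {"camera": 1.0, "lidar": 2.0, "radar": 3.0}
TASK_TOKENS: Dict[str, float] = {"object_detection": 101.0, "depth_estimation": 102.0}

Image = Sequence[Sequence[float]]


def extract_flattened_patches(image: Image, patch: int) -> List[List[float]]:
    """Cut a 2D image into non-overlapping patch x patch tiles, flatten each
    tile row by row, and append the tile's top-left (row, col) location."""
    rows, cols = len(image), len(image[0])
    patches: List[List[float]] = []
    for r in range(0, rows - patch + 1, patch):
        for c in range(0, cols - patch + 1, patch):
            flat = [float(image[r + i][c + j])
                    for i in range(patch) for j in range(patch)]
            patches.append(flat + [float(r), float(c)])
    return patches


def build_flat_tensor(images: Sequence[Tuple[str, Image]],
                      task: str, patch: int = 2) -> List[float]:
    """Concatenate all tagged patches into one 1-D list.
    Assumption: the task token is placed first; the abstract only says
    it is added to the flat tensor."""
    flat_tensor: List[float] = [TASK_TOKENS[task]]
    for sensor, image in images:
        for tagged_patch in extract_flattened_patches(image, patch):
            flat_tensor.extend(tagged_patch)            # pixels + location
            flat_tensor.append(SENSOR_TOKENS[sensor])   # sensor type token
    return flat_tensor


# Toy inputs: a 4x4 "camera" image and a 2x2 "lidar" image.
camera = [[r * 4 + c for c in range(4)] for r in range(4)]
lidar = [[9, 8], [7, 6]]
flat = build_flat_tensor([("camera", camera), ("lidar", lidar)],
                         task="object_detection")
print(len(flat))
```

In this toy layout each patch contributes four pixel values, two location values, and one sensor type token, so the resulting one-dimensional list mixes pixel data with location and token data, in the spirit of the "two or more types of data" wording.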
-
Publication No.: US20230237783A1
Publication Date: 2023-07-27
Application No.: US17584449
Filing Date: 2022-01-26
Inventors: Gaurab Banerjee, Vijay Nagasamy
IPC Classification: G06V10/80, G01S17/86, G01S15/86, G01S13/86, G01S17/89, G01S15/89, G01S13/89, G06V10/82, G06V20/58
CPC Classification: G06V10/803, G01S17/86, G01S15/86, G01S13/865, G01S13/867, G01S13/862, G01S17/89, G01S15/89, G01S13/89, G06V10/82, G06V20/58, G01S17/931
Abstract: A plurality of images can be acquired from a plurality of sensors, and a plurality of flattened patches can be extracted from the plurality of images. An image location in the plurality of images, and a sensor type token identifying the type of sensor used to acquire the image from which the respective flattened patch was extracted, can be added to each of the plurality of flattened patches. The flattened patches can be concatenated into a flat tensor, and a task token indicating a processing task can be added to the flat tensor, wherein the flat tensor is a one-dimensional array that includes two or more types of data. The flat tensor can be input to a first deep neural network that includes a plurality of encoder layers and a plurality of decoder layers and outputs transformer output. The transformer output can be input to a second deep neural network that determines an object prediction indicated by the task token, and the object predictions can be output.
-