Publication No.: US12141981B2
Publication Date: 2024-11-12
Application No.: US17669040
Filing Date: 2022-02-10
Applicant: QUALCOMM Incorporated
Inventors: Shuai Zhang, Xiaowen Ying, Jiancheng Lyu, Yingyong Qi
Abstract: Systems and techniques are provided for performing semantic image segmentation using a machine learning system (e.g., including one or more cross-attention transformer layers). For instance, a process can include generating one or more input image features for a frame of image data and generating one or more input depth features for a frame of depth data. One or more fused image features can be determined, at least in part, by fusing the one or more input depth features with the one or more input image features, using a first cross-attention transformer network. One or more segmentation masks can be generated for the frame of image data based on the one or more fused image features.
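The fusion step described in the abstract can be illustrated with a minimal single-head cross-attention sketch in plain Python. This is not the patented implementation. It assumes that image features supply the queries and depth features supply the keys and values, that a residual connection is used, and that a per-token linear classifier produces the mask. The abstract specifies none of these details. All names (`cross_attention_fuse`, `segment`, the weight matrices) and the random toy features are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def cross_attention_fuse(image_feats, depth_feats, w_q, w_k, w_v):
    """Fuse depth features into image features with one cross-attention head.

    Assumption: queries come from the image tokens, keys and values from
    the depth tokens. The abstract does not state which modality plays
    which role.
    """
    q = matmul(image_feats, w_q)
    k = matmul(depth_feats, w_k)
    v = matmul(depth_feats, w_v)
    scale = 1.0 / math.sqrt(len(q[0]))
    scores = matmul(q, transpose(k))
    # One attention distribution over depth tokens per image token.
    attn = [softmax([s * scale for s in row]) for row in scores]
    attended = matmul(attn, v)
    # Residual connection (an assumption): keep the original image
    # features and add the depth information attended to.
    fused = [
        [i + a for i, a in zip(img_row, att_row)]
        for img_row, att_row in zip(image_feats, attended)
    ]
    return fused, attn


def segment(fused_feats, w_cls):
    """Assign each token the class with the highest linear-classifier logit."""
    logits = matmul(fused_feats, w_cls)
    return [max(range(len(row)), key=row.__getitem__) for row in logits]


# Toy sizes: 6 image tokens, 4 depth tokens, 8 channels, 3 classes.
rng = random.Random(0)
n_img, n_depth, dim, n_classes = 6, 4, 8, 3
image_feats = random_matrix(n_img, dim, rng)
depth_feats = random_matrix(n_depth, dim, rng)
w_q = random_matrix(dim, dim, rng)
w_k = random_matrix(dim, dim, rng)
w_v = random_matrix(dim, dim, rng)
w_cls = random_matrix(dim, n_classes, rng)

fused, attn = cross_attention_fuse(image_feats, depth_feats, w_q, w_k, w_v)
mask = segment(fused, w_cls)
print(mask)
```

In this sketch the fused output keeps one row per image token, so the resulting mask aligns with the image frame rather than the depth frame. A practical system would likely use learned weights, multiple heads, and a decoder that upsamples to full resolution.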