-
Publication No.: US12019726B2
Publication Date: 2024-06-25
Application No.: US17655506
Filing Date: 2022-03-18
Inventors: Debasmit Das, Sungrack Yun, Fatih Murat Porikli
Abstract: Certain aspects of the present disclosure provide techniques for improved domain adaptation in machine learning. A feature tensor is generated by processing input data using a feature extractor. A first set of logits is generated by processing the feature tensor using a domain-agnostic classifier, and a second set of logits is generated by processing the feature tensor using a domain-specific classifier. A loss is computed based at least in part on the first set of logits and the second set of logits, where the loss includes a divergence loss component. The feature extractor, the domain-agnostic classifier, and the domain-specific classifier are refined using the loss.
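The loss described in the abstract above can be illustrated with a minimal sketch. The abstract states only that the loss depends on both sets of logits and includes a divergence component. The use of cross-entropy for each classifier, KL divergence as the divergence term, the sign of that term, and the `weight` parameter are all assumptions made for illustration, and every function name is hypothetical.

```python
import math

def softmax(logits):
    # Subtract the maximum for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    # Negative log-probability assigned to the true class.
    return -math.log(softmax(logits)[label])

def kl_divergence(p_logits, q_logits):
    # KL(p || q) between the two predicted distributions (assumed choice of divergence).
    p = softmax(p_logits)
    q = softmax(q_logits)
    return sum(pi * math.log(pi / max(qi, 1e-12)) for pi, qi in zip(p, q) if pi > 0)

def combined_loss(agnostic_logits, specific_logits, label, weight=0.1):
    # Assumed form: classification loss for both heads plus a weighted divergence term.
    ce = cross_entropy(agnostic_logits, label) + cross_entropy(specific_logits, label)
    return ce + weight * kl_divergence(agnostic_logits, specific_logits)
```

In a real training loop the gradient of this scalar would be used to update the feature extractor and both classifiers; that step is omitted here.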
-
Publication No.: US11908155B2
Publication Date: 2024-02-20
Application No.: US17203607
Filing Date: 2021-03-16
IPC Classification: G06F18/213, G06N20/00, G06T7/70
CPC Classification: G06T7/70, G06F18/213, G06N20/00, G06T2207/20081
Abstract: Certain aspects of the present disclosure provide a method, including: processing input data with a feature extraction stage of a machine learning model to generate a feature map; applying an attention map to the feature map to generate an augmented feature map; processing the augmented feature map with a refinement stage of the machine learning model to generate a refined feature map; processing the refined feature map with a first regression stage of the machine learning model to generate multi-dimensional task output data; and processing the refined feature data with an attention stage of the machine learning model to generate an updated attention map.
-
Publication No.: US12100169B2
Publication Date: 2024-09-24
Application No.: US17481047
Filing Date: 2021-09-21
CPC Classification: G06T7/269, G01P13/00, G06T7/248, G06T2207/20081, G06T2207/20084
Abstract: Systems and techniques are described herein for performing optical flow estimation between one or more frames. For example, a process can include determining a subset of pixels of at least one of a first frame and a second frame, and generating a mask indicating the subset of pixels. The process can include determining, based on the mask, one or more features associated with the subset of pixels of at least the first frame and the second frame. The process can include determining optical flow vectors between the subset of pixels of the first frame and corresponding pixels of a second frame. The process can include generating an optical flow map for the second frame using the optical flow vectors.
-
Publication No.: US11640668B2
Publication Date: 2023-05-02
Application No.: US17344283
Filing Date: 2021-06-10
Abstract: Systems and techniques are described herein for performing optical flow estimation for one or more frames. For example, a process can include determining an optical flow prediction associated with a plurality of frames. The process can include determining a position of at least one feature associated with a first frame and determining, based on the position of the at least one feature in the first frame and the optical flow prediction, a position estimate of a search area for searching for the at least one feature in a second frame. The process can include determining, from within the search area, a position of the at least one feature in the second frame.
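The search-area step in the abstract above can be sketched as follows: the feature position in the first frame is shifted by the predicted flow, and a window is placed around the shifted position. The square window, the `radius` parameter, the clamping to frame bounds, and the function name are assumptions for illustration; the abstract does not specify the shape or size of the search area.

```python
def search_area(feature_pos, flow, radius, frame_size):
    """Return (left, top, right, bottom) of a search window in the second frame.

    feature_pos: (x, y) of the feature in the first frame.
    flow:        (dx, dy) predicted displacement for that feature.
    radius:      half-width of the square window (hypothetical parameter).
    frame_size:  (width, height) of the second frame.
    """
    x, y = feature_pos
    dx, dy = flow
    width, height = frame_size
    # Estimated centre of the search area, clamped to the frame.
    cx = min(max(round(x + dx), 0), width - 1)
    cy = min(max(round(y + dy), 0), height - 1)
    # Window bounds (inclusive), clamped to the frame.
    left = max(cx - radius, 0)
    top = max(cy - radius, 0)
    right = min(cx + radius, width - 1)
    bottom = min(cy + radius, height - 1)
    return (left, top, right, bottom)
```

Restricting the match to this window, rather than the whole second frame, is what lets the flow prediction reduce the search cost.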
-
Publication No.: US12067777B2
Publication Date: 2024-08-20
Application No.: US17654986
Filing Date: 2022-03-15
Inventors: Hanul Kim, Mihir Jain, Juntae Lee, Sungrack Yun, Fatih Murat Porikli
Abstract: Certain aspects of the present disclosure provide a method of processing video data. In one example, the method includes receiving input video data; sampling a first subset of clips from the input video data; providing the first subset of clips to a first component of a machine learning model to generate first output; sampling a second subset of clips from the input video data, wherein the second subset of clips comprises fewer clips than the first subset of clips; providing the second subset of clips to a second component of the machine learning model to generate a second output; aggregating the first output from the first component of the machine learning model with the second output from the second component of the machine learning model to generate aggregated output; and determining a characteristic of the input video data based on the aggregated output.
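The two-subset sampling and aggregation flow in the abstract above can be sketched in a few lines. Evenly spaced sampling, element-wise averaging as the aggregation, and taking the arg-max as the "characteristic" are all assumptions for illustration; the abstract does not state how clips are chosen or how outputs are combined. All function names are hypothetical, and the two model components are represented only by their output score lists.

```python
def sample_clips(num_frames, clip_len, num_clips):
    # Evenly spaced clip start indices (assumed sampling strategy).
    last_start = num_frames - clip_len
    if num_clips == 1:
        return [last_start // 2]
    step = last_start / (num_clips - 1)
    return [round(i * step) for i in range(num_clips)]

def aggregate(first_output, second_output):
    # Element-wise average of per-class scores (assumed aggregation).
    return [(a + b) / 2 for a, b in zip(first_output, second_output)]

def characteristic(aggregated):
    # Index of the highest aggregated score, e.g. a predicted class.
    return max(range(len(aggregated)), key=lambda i: aggregated[i])

# The first subset has more clips than the second, as the abstract requires.
first_subset = sample_clips(num_frames=100, clip_len=10, num_clips=4)
second_subset = sample_clips(num_frames=100, clip_len=10, num_clips=2)
```

A plausible reading is that the larger subset feeds a cheaper component and the smaller subset a costlier one, but the abstract does not say so.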
-
Publication No.: US12039742B2
Publication Date: 2024-07-16
Application No.: US17510763
Filing Date: 2021-10-26
CPC Classification: G06T7/248, G06N3/08, G06T3/60, G06T2207/10016, G06T2207/20081, G06T2207/20084
Abstract: Systems and techniques are described for performing supervised learning (e.g., semi-supervised learning, self-supervised learning, and/or mixed supervision learning) for optical flow estimation. For example, a method can include obtaining an image associated with a sequence of images and generating an occluded image. The occluded image can include at least one of the image with an occlusion applied to the image and a different image of the sequence of images with the occlusion applied. The method can include determining a matching map based at least on matching areas of the image and the occluded image and, based on the matching map, determining a loss term associated with an optical flow loss prediction associated with the image and the occluded image. The loss term may include a matched loss and/or other loss. Based on the loss term, the method can include training a network configured to determine an optical flow between images.
-
Publication No.: US20220301310A1
Publication Date: 2022-09-22
Application No.: US17654986
Filing Date: 2022-03-15
Inventors: Hanul Kim, Mihir Jain, Juntae Lee, Sungrack Yun, Fatih Murat Porikli
Abstract: Certain aspects of the present disclosure provide a method of processing video data. In one example, the method includes receiving input video data; sampling a first subset of clips from the input video data; providing the first subset of clips to a first component of a machine learning model to generate first output; sampling a second subset of clips from the input video data, wherein the second subset of clips comprises fewer clips than the first subset of clips; providing the second subset of clips to a second component of the machine learning model to generate a second output; aggregating the first output from the first component of the machine learning model with the second output from the second component of the machine learning model to generate aggregated output; and determining a characteristic of the input video data based on the aggregated output.
-
Publication No.: US12118810B2
Publication Date: 2024-10-15
Application No.: US17408779
Filing Date: 2021-08-23
IPC Classification: G06V20/40, G06F18/21, G06F18/22, G06F18/2411, G06N3/045, G06T1/00, G06T7/136, G06T7/174, G06T7/215, G06V10/40, G06V10/75, G06V10/94, G06V30/262
CPC Classification: G06V30/274, G06F18/21, G06F18/22, G06F18/2411, G06N3/045, G06T1/0007, G06T7/136, G06T7/174, G06T7/215, G06V10/40, G06V10/751, G06V10/95, G06V20/46, G06V20/49, G06T2207/10016, G06T2207/20084, G06V10/759
Abstract: Systems, methods, and non-transitory media are provided for providing spatiotemporal recycling networks (e.g., for video segmentation). For example, a method can include obtaining video data including a current frame and one or more reference frames. The method can include determining, based on a comparison of the current frame and the one or more reference frames, a difference between the current frame and the one or more reference frames. Based on the difference being below a threshold, the method can include performing semantic segmentation of the current frame using a first neural network. The semantic segmentation can be performed based on higher-spatial resolution features extracted from the current frame by the first neural network and lower-resolution features extracted from the one or more reference frames by a second neural network. The first neural network has a smaller structure and/or a lower processing cost than the second neural network.
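The threshold test in the abstract above can be sketched as a simple gate. Mean absolute pixel difference as the comparison, taking the largest difference over the reference frames, and falling back to the larger network when the threshold is not met are assumptions for illustration; the abstract specifies only the below-threshold case. Function names and the `"light"`/`"heavy"` labels are hypothetical.

```python
def mean_abs_difference(frame_a, frame_b):
    # Frames are equal-sized 2D lists of pixel intensities.
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count

def choose_path(current, references, threshold):
    # Use the largest difference to any reference frame (assumed comparison).
    diff = max(mean_abs_difference(current, ref) for ref in references)
    # Below the threshold: the smaller first network segments the current frame,
    # reusing features from the reference frames. Otherwise (assumption): fall
    # back to the larger second network.
    return "light" if diff < threshold else "heavy"
```

The intent is that slowly changing video spends most frames on the cheaper path.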
-
Publication No.: US12080086B2
Publication Date: 2024-09-03
Application No.: US17407046
Filing Date: 2021-08-19
IPC Classification: G06V30/192, G06F16/25, G06T7/45
CPC Classification: G06V30/195, G06F16/258, G06T7/45, G06T2207/30208
Abstract: Certain aspects of the present disclosure provide techniques for performing tabular convolution, including performing a tabularization operation on input data to generate a tabularized representation of the input data and performing a convolution operation using the tabularized representation of the input data to generate a convolution output.
-
Publication No.: US12022358B2
Publication Date: 2024-06-25
Application No.: US17229825
Filing Date: 2021-04-13
Inventors: Ilia Karmanov, Daniel Hendricus Franciscus Dijkman, Farhad Ghazvinian Zanjani, Ishaque Ashar Kadampot, Simone Merlin, Brian Michael Buesker, Vamsi Vegunta, Harshit Joshi, Fatih Murat Porikli, Joseph Binamira Soriaga, Bibhu Mohanty
CPC Classification: H04W4/029, G01S5/013, G01S5/0278, G06N20/00
Abstract: Disclosed are systems, methods, and non-transitory media for performing passive radio frequency (RF) location detection operations. In some aspects, RF data, such as RF signals including channel state information (CSI), can be received from a wireless device. The RF data can be provided to a self-supervised machine-learning architecture that is configured to perform object location estimation.