-
Publication No.: US20210073551A1
Publication date: 2021-03-11
Application No.: US16566179
Filing date: 2019-09-10
Applicant: Ruiwen LI, Peng DAI, Varshanth Ravindra RAO, Juwei LU, Wei LI, Jianpeng XU
Inventor: Ruiwen LI, Peng DAI, Varshanth Ravindra RAO, Juwei LU, Wei LI, Jianpeng XU
IPC classification: G06K9/00, G06F17/27, G11B27/30, G11B27/031
Abstract: Methods and systems for video segmentation and scene recognition are described. A video having a plurality of frames and a subtitle file associated with the video are received. Segmentation is performed on the video to generate a first set of video frames comprising one or more video frames based on a frame-by-frame comparison of features in the frames of the video. Each video frame in the first set includes a frame indicator which indicates at least a first start frame of the video frame. The subtitle file associated with the video is parsed to generate one or more subtitle segments based on a start and an end time of each dialogue in the subtitle file. A second set of video frames comprising one or more second video frames is generated based on the video frames of the first set of video frames and the one or more subtitle segments.
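The subtitle-parsing step in this abstract can be illustrated with a minimal sketch. It assumes the subtitle file uses SRT-style time ranges (`HH:MM:SS,mmm --> HH:MM:SS,mmm`); the abstract does not name a format, and the function name `subtitle_segments` is hypothetical.

```python
import re

# Assumption: SRT-style time ranges. The abstract does not specify the format.
TIME_RANGE = re.compile(
    r"(\d+):(\d+):(\d+),(\d+)\s*-->\s*(\d+):(\d+):(\d+),(\d+)"
)


def subtitle_segments(srt_text):
    """Return one (start_seconds, end_seconds) pair per dialogue entry."""
    segments = []
    for match in TIME_RANGE.finditer(srt_text):
        h1, m1, s1, ms1, h2, m2, s2, ms2 = (int(g) for g in match.groups())
        start = h1 * 3600 + m1 * 60 + s1 + ms1 / 1000
        end = h2 * 3600 + m2 * 60 + s2 + ms2 / 1000
        segments.append((start, end))
    return segments


sample = (
    "1\n00:00:01,000 --> 00:00:04,500\nHello.\n\n"
    "2\n00:01:00,250 --> 00:01:02,000\nBye.\n"
)
segments = subtitle_segments(sample)
```

Each pair gives the start and end time of one dialogue, which is the information the abstract says the subtitle segments are based on.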
-
Publication No.: US20210191975A1
Publication date: 2021-06-24
Application No.: US16722363
Filing date: 2019-12-20
Applicant: Juwei LU, Sayem Mohammad SIAM, Peng DAI, Wei LI, Jin TANG
Inventor: Juwei LU, Sayem Mohammad SIAM, Peng DAI, Wei LI, Jin TANG
IPC classification: G06F16/71, G06F3/0488, G06F3/0482, G06F16/783
Abstract: Methods and systems for managing an image collection are described. Metadata associated with a captured image includes data identifying each human in the captured image. A linkage score may be generated, representing a relationship between first and second identified humans in the captured image. Records in an image collection database are updated to include the generated linkage score. The linkage information may be used to render a graphical user interface (GUI) for navigating the image collection.
-
Publication No.: US20210142106A1
Publication date: 2021-05-13
Application No.: US17095257
Filing date: 2020-11-11
Applicant: Niamul QUADER, Md Ibrahim KHALIL, Juwei LU, Peng DAI, Wei LI
Inventor: Niamul QUADER, Md Ibrahim KHALIL, Juwei LU, Peng DAI, Wei LI
Abstract: Methods and systems for updating the weights of a set of convolution kernels of a convolutional layer of a neural network are described. A set of convolution kernels having attention-infused weights is generated by using an attention mechanism based on characteristics of the weights. For example, a set of location-based attention multipliers is applied to weights in the set of convolution kernels, a magnitude-based attention function is applied to the weights in the set of convolution kernels, or both. An output activation map is generated using the set of convolution kernels with attention-infused weights. A loss for the neural network is computed, and the gradient is backpropagated to update the attention-infused weights of the convolution kernels.
-
Publication No.: US20220114424A1
Publication date: 2022-04-14
Application No.: US17066220
Filing date: 2020-10-08
Applicant: Niamul QUADER, Md Ibrahim KHALIL, Juwei LU, Peng DAI, Wei LI
Inventor: Niamul QUADER, Md Ibrahim KHALIL, Juwei LU, Peng DAI, Wei LI
Abstract: Methods, processing units and media for multi-bandwidth separated feature extraction convolution in a neural network are described. A convolution block splits input channels of an activation map into multiple branches, each branch undergoing convolution at a different bandwidth by using down-sampling of the inputs. The outputs are concatenated by up-sampling the outputs of the low-bandwidth branches using pixel shuffling. The concatenation operation may be a shuffled concatenation operation that preserves separated multi-bandwidth feature information for use by subsequent layers of the neural network. Embodiments are described which apply frequency-based and magnitude-based attention to the weights of the convolution kernels based on the frequency band locations of the weights.
-
Publication No.: US20220300823A1
Publication date: 2022-09-22
Application No.: US17204670
Filing date: 2021-03-17
Applicant: Hanwen LIANG, Peng DAI, Qiong ZHANG, Juwei LU
Inventor: Hanwen LIANG, Peng DAI, Qiong ZHANG, Juwei LU
Abstract: Methods, systems, and media for training deep neural networks for cross-domain few-shot classification are described. The methods train an autoencoder, comprising an encoder and a decoder, of a deep neural network. The training of the autoencoder comprises two training stages. For each iteration in the first training stage, a batch of data samples is sampled from a source dataset and fed to the encoder to generate a plurality of source feature maps, and a first training stage loss is determined and used to update the autoencoder's parameters. For each iteration in the second training stage, a novel dataset is split into a support set and a query set. The support set is fed to the encoder to determine a prototype for each class label. The query set is also fed to the encoder to calculate a query set metric classification loss. The query set metric classification loss is used to update the autoencoder's parameters.
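The second-stage idea of one prototype per class label can be sketched as follows. This is an illustration only: it assumes a prototype is the mean of the encoded support features for a class and that a query is matched by Euclidean distance, neither of which the abstract states. The encoder is omitted, so the vectors stand in for its outputs, and the function names are hypothetical.

```python
import math
from collections import defaultdict


def class_prototypes(features, labels):
    """Mean feature vector per class label (assumed definition of a prototype)."""
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def nearest_prototype(query, prototypes):
    """Label whose prototype is closest to the query in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


# Stand-ins for encoder outputs on the support set.
support = [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]]
support_labels = ["cat", "cat", "dog", "dog"]
protos = class_prototypes(support, support_labels)
predicted = nearest_prototype([1.0, 1.0], protos)
```

In a training loop, the distances from each query feature to the prototypes would feed a classification loss; that step is left out here because the abstract does not define the loss.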
-
Publication No.: US20220405322A1
Publication date: 2022-12-22
Application No.: US17354786
Filing date: 2021-06-22
Applicant: Varshanth RAO, Md Ibrahim KHALIL, Peng DAI, Juwei LU
Inventor: Varshanth RAO, Md Ibrahim KHALIL, Peng DAI, Juwei LU
IPC classification: G06F16/55, G06F16/53, G06F16/51, G06F16/583, G06K9/62
Abstract: Methods, systems, and media for image searching are described. Images comprising one query image and a plurality of candidate images are received. For each candidate image, a first model similarity measure is determined from an output of a first model configured for scene classification to perceive scenes in the images. Further, for each candidate image, a second model similarity measure is determined from the output of a second model configured for attribute classification to perceive attributes in the images. For each candidate image, a similarity agglomerate index, a weighted aggregate of the first model similarity measure and the second model similarity measure, is computed. The candidate images are ranked based on the respective similarity agglomerate index of each candidate image, and the first ranked candidate images corresponding to the searched images are generated.
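The weighted-aggregate ranking described in this abstract can be sketched in a few lines. The sketch assumes the two similarity measures are already computed and lie on a comparable scale, and that the weights sum to 1; the abstract does not specify the weights, and the function name, file names and scores are hypothetical.

```python
def rank_candidates(scene_sims, attr_sims, scene_weight=0.5):
    """Rank candidates by a weighted aggregate of two similarity measures.

    scene_sims / attr_sims map a candidate name to its similarity to the
    query under the scene model and the attribute model respectively.
    """
    attr_weight = 1.0 - scene_weight  # assumption: weights sum to 1
    index = {
        name: scene_weight * scene_sims[name] + attr_weight * attr_sims[name]
        for name in scene_sims
    }
    # Highest aggregate similarity first.
    return sorted(index, key=index.get, reverse=True)


scene = {"a.jpg": 0.9, "b.jpg": 0.4, "c.jpg": 0.7}
attrs = {"a.jpg": 0.2, "b.jpg": 0.8, "c.jpg": 0.7}
ranking = rank_candidates(scene, attrs, scene_weight=0.5)
```

With equal weights, a candidate that is moderately similar under both models outranks one that is very similar under only one of them.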
-
Publication No.: US20240054757A1
Publication date: 2024-02-15
Application No.: US18327384
Filing date: 2023-06-01
Applicant: Yanhui GUO, Deepak SRIDHAR, Peng DAI, Juwei LU
Inventor: Yanhui GUO, Deepak SRIDHAR, Peng DAI, Juwei LU
CPC classification: G06V10/62, G06V10/24, G06V10/44, G06V10/764, G06V10/806, G06V10/82
Abstract: Systems and methods for temporal action localization of video data are described. A feature representation extracted from video data has a temporal dimension and a spatial dimension. The feature representation is self-aligned in the spatial dimension. Spatial multi-sampling is performed to obtain a plurality of sparse samples of the self-aligned representation along the spatial dimension, and the multi-sampled representation is fused with the self-aligned representation. Attention-based context information aggregation is applied on the fused representation to obtain a spatially refined representation. Local temporal information aggregation is applied on the self-aligned representation to obtain a temporally refined representation. Action localization is performed on a concatenation of the spatially refined representation and the temporally refined representation.
-
Publication No.: US20220303560A1
Publication date: 2022-09-22
Application No.: US17203613
Filing date: 2021-03-16
Applicant: Deepak SRIDHAR, Niamul QUADER, Srikanth MURALIDHARAN, Yaoxin LI, Juwei LU, Peng DAI
Inventor: Deepak SRIDHAR, Niamul QUADER, Srikanth MURALIDHARAN, Yaoxin LI, Juwei LU, Peng DAI
Abstract: Systems, methods, and computer media for processing a video are disclosed. An example method may include: receiving a plurality of video frames of a video; generating a plurality of first input features based on the plurality of video frames; generating a plurality of second input features based on reversing a temporal order of the plurality of first input features; generating a first set of joint attention features based on the plurality of first input features; generating a second set of joint attention features based on the plurality of second input features; and concatenating the first set of joint attention features and the second set of joint attention features to generate a final set of joint attention features.
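The data flow in this example method (process the features in forward order, process them again in reversed temporal order, then concatenate) can be sketched as below. The joint attention step is not defined in the abstract, so a causal running mean is used purely as a placeholder for it; the one-number-per-frame features and all function names are hypothetical.

```python
def running_mean(features):
    """Placeholder for the joint attention step (NOT the patented method).

    Output at step t is the mean of inputs up to and including t, so the
    result depends on the temporal order of the inputs.
    """
    out, total = [], 0.0
    for t, x in enumerate(features, start=1):
        total += x
        out.append(total / t)
    return out


def bidirectional_features(first_inputs):
    """Forward pass, reversed-order pass, then concatenation of both sets."""
    second_inputs = first_inputs[::-1]       # reverse the temporal order
    first_set = running_mean(first_inputs)   # stand-in for joint attention
    second_set = running_mean(second_inputs)
    return first_set + second_set            # list concatenation


final = bidirectional_features([1.0, 3.0, 5.0])
```

The concatenation axis is an assumption; the abstract only says the two sets are concatenated.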
-
Publication No.: US20210294423A1
Publication date: 2021-09-23
Application No.: US16842717
Filing date: 2020-04-07
Applicant: Wei ZHOU, Mona HOSSEINKHANI LOORAK, Gaganpreet SINGH, Xiu YI, Juwei LU, Wei LI
Inventor: Wei ZHOU, Mona HOSSEINKHANI LOORAK, Gaganpreet SINGH, Xiu YI, Juwei LU, Wei LI
IPC classification: G06F3/01
Abstract: Methods and apparatus for gesture-based control of a device in a multi-user environment are described. The methods prioritize users or gestures based on a predetermined priority ruleset. A first-user-in-time ruleset prioritizes gestures based on when in time they were begun by a user in the camera FOV. An action-hierarchy ruleset prioritizes gestures based on the actions they correspond to, and the relative positions of those actions within an action hierarchy. A designated-master-user ruleset prioritizes gestures performed by an explicitly designated master user. Methods for designating a new master user and for providing gesture-control-related user feedback in a multi-user environment are also described.
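Two of the rulesets named in this abstract can be illustrated with a short sketch. The `Gesture` record, the action names, and the convention that the hierarchy list is ordered from highest to lowest priority are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Gesture:
    user_id: str
    action: str
    start_time: float  # seconds since the camera started observing


def first_user_in_time(gestures):
    """First-user-in-time ruleset: the gesture that began earliest wins."""
    return min(gestures, key=lambda g: g.start_time)


def action_hierarchy(gestures, hierarchy):
    """Action-hierarchy ruleset: the gesture whose action ranks highest wins.

    Assumption: `hierarchy` lists actions from highest to lowest priority.
    """
    return min(gestures, key=lambda g: hierarchy.index(g.action))


gestures = [
    Gesture("alice", "volume_up", start_time=2.0),
    Gesture("bob", "mute", start_time=3.5),
]
earliest = first_user_in_time(gestures)
top_action = action_hierarchy(gestures, hierarchy=["mute", "volume_up"])
```

The two rulesets can disagree on the same input, which is why the abstract describes selecting one predetermined ruleset.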
-
Publication No.: US20210333884A1
Publication date: 2021-10-28
Application No.: US17085866
Filing date: 2020-10-30
Applicant: Wei LI, Wei ZHOU, Sachi MIZOBUCHI, Ghazaleh SANIEE-MONFARED, Juwei LU, Taslim Arefin KHAN, Rafael VERAS GUIMARAES
Inventor: Wei LI, Wei ZHOU, Sachi MIZOBUCHI, Ghazaleh SANIEE-MONFARED, Juwei LU, Taslim Arefin KHAN, Rafael VERAS GUIMARAES
Abstract: Methods, devices, and processor-readable media for adjusting the control-display gain of a gesture-controlled device are described. Adjusting the control-display gain may facilitate user interaction with content or UI elements rendered on a display screen of the gesture-controlled device. The control-display gain may be adjusted based on a property of how a mid-air dragging gesture is being performed by a user's hand. The property may be the location of the gesture, the orientation of the hand performing the gesture, or the velocity of the gesture. A hand that becomes stationary for a threshold time period while performing the dragging gesture may adjust the control-display gain to a different level. Control-display gain may be set to a different value based on the current velocity of the hand performing the gesture. The control-display gain levels may be selected from a continuous range of values or a set of discrete values. Devices for performing the methods are described.
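The velocity-based variant with discrete gain levels can be sketched as follows. The thresholds, the three gain values, and the function names are hypothetical; the abstract gives no numbers.

```python
def control_display_gain(hand_speed, slow_threshold=0.1, fast_threshold=0.5):
    """Pick a discrete control-display gain from the current hand speed.

    All thresholds and gain values here are illustrative assumptions.
    """
    if hand_speed < slow_threshold:
        return 0.5   # slow hand: fine-grained control
    if hand_speed < fast_threshold:
        return 1.0   # moderate speed: one-to-one mapping
    return 2.0       # fast hand: cover more of the display


def cursor_delta(hand_delta, hand_speed):
    """Scale a hand displacement into an on-screen displacement."""
    return control_display_gain(hand_speed) * hand_delta


slow_move = cursor_delta(10.0, hand_speed=0.05)
fast_move = cursor_delta(10.0, hand_speed=0.8)
```

A continuous variant would replace the thresholds with a function of `hand_speed`, which the abstract also allows.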