-
Publication number: US20080285849A1
Publication date: 2008-11-20
Application number: US11750099
Filing date: 2007-05-17
Applicants: Juwei Lu, Hui Zhou, Mohanaraj Thiyagarajah
Inventors: Juwei Lu, Hui Zhou, Mohanaraj Thiyagarajah
IPC classification: G06K9/46
CPC classification: G06K9/00234
Abstract: A method and system for scanning a digital image to detect the representation of an object, such as a face, while reducing the memory requirements of the computer system performing the image scan. One example method includes identifying an original image and downsampling it in an x-dimension and in a y-dimension to obtain a downsampled image that requires less storage space than the original digital image. A first scan is performed on the downsampled image to detect the representation of an object within it. The original digital image is then divided into at least two image blocks, where each image block contains a portion of the original digital image. A second scan is then performed on each of the image blocks to detect the representation of the object within them.
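The two-pass scan lends itself to a brief illustration. The following sketch is a minimal, hypothetical rendering of the idea rather than the patented implementation: it downsamples an image, scans the coarse copy, then scans the full-resolution image block by block so that only one block is processed at full detail at a time; the detect() callback is a stand-in for any sliding-window object detector.

import numpy as np

def downsample(image, factor):
    # Keep every factor-th pixel along both the y- and x-dimension.
    return image[::factor, ::factor]

def split_into_blocks(image, rows=2, cols=2):
    # Divide the original image into rows x cols non-overlapping blocks (views, no copies).
    h, w = image.shape[:2]
    for r in range(rows):
        for c in range(cols):
            yield image[r * h // rows:(r + 1) * h // rows,
                        c * w // cols:(c + 1) * w // cols]

def two_pass_scan(image, detect, factor=4):
    # First pass: scan the memory-light downsampled copy.
    detections = list(detect(downsample(image, factor)))
    # Second pass: scan each full-resolution block separately.
    for block in split_into_blocks(image):
        detections.extend(detect(block))
    return detections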
-
Publication number: US20080107341A1
Publication date: 2008-05-08
Application number: US11556082
Filing date: 2006-11-02
Applicant: Juwei Lu
Inventor: Juwei Lu
IPC classification: G06K9/00
CPC classification: G06K9/00248, G06K9/4614, G06K9/4647, G06K9/56, G06K9/6256
Abstract: A method of detecting faces in a digital image comprises selecting a sub-window of the digital image. Sample regions of the sub-window are then selected. The sample regions are analyzed to determine whether the sub-window likely represents a face.
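A rough sketch of the sub-window analysis follows; the sampling pattern (corners plus centre) and the region_score() classifier are illustrative assumptions, not the claimed method.

import numpy as np

def sample_regions(sub_window, size=8):
    # Example sampling pattern: the four corners and the centre of the sub-window.
    h, w = sub_window.shape[:2]
    offsets = [(0, 0), (0, w - size), (h - size, 0), (h - size, w - size),
               ((h - size) // 2, (w - size) // 2)]
    return [sub_window[y:y + size, x:x + size] for y, x in offsets]

def likely_face(sub_window, region_score, threshold=0.5):
    # Average a per-region classifier score; above the threshold => likely a face.
    scores = [region_score(region) for region in sample_regions(sub_window)]
    return float(np.mean(scores)) > threshold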
-
Publication number: US20240193866A1
Publication date: 2024-06-13
Application number: US18078832
Filing date: 2022-12-09
Applicants: Yannick VERDIE, Zihao YANG, Deepak SRIDHAR, Steven George MCDONAGH, Juwei LU
Inventors: Yannick VERDIE, Zihao YANG, Deepak SRIDHAR, Steven George MCDONAGH, Juwei LU
Abstract: Methods and systems for estimation of a 3D hand pose are disclosed. A 2D image containing a detected hand is processed using a U-net network to obtain a global feature vector and a heatmap for the keypoints of the hand. Information from the global feature vector and the heatmap is concatenated to obtain a set of input tokens that are processed using a transformer encoder to obtain a first set of 2D keypoints representing estimated 2D locations of the keypoints in a first view. The first set of 2D keypoints is input as a query to a transformer decoder to obtain a second set of 2D keypoints representing estimated 2D locations of the keypoints in a second view. The first and second sets of 2D keypoints are aggregated to output the set of estimated 3D keypoints.
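At a shape level the described pipeline might be sketched as follows; the trained components (unet, encoder, decoder, aggregate) are passed in as callables, and the token construction shown here is an illustrative assumption rather than the patented architecture.

import numpy as np

def estimate_hand_pose(image, unet, encoder, decoder, aggregate):
    # U-net output: a global feature vector (D,) and per-keypoint heatmaps (K, H, W).
    global_feat, heatmaps = unet(image)
    # One input token per keypoint: its flattened heatmap concatenated with the global feature.
    tokens = np.stack([np.concatenate([h.ravel(), global_feat]) for h in heatmaps])
    keypoints_view1 = encoder(tokens)                    # (K, 2) 2D keypoints, first view
    keypoints_view2 = decoder(keypoints_view1, tokens)   # (K, 2) 2D keypoints, second view
    return aggregate(keypoints_view1, keypoints_view2)   # (K, 3) estimated 3D keypoints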
-
Publication number: US11966516B2
Publication date: 2024-04-23
Application number: US17827939
Filing date: 2022-05-30
Applicants: Juwei Lu, Sayem Mohammad Siam, Wei Zhou, Peng Dai, Xiaofei Wu, Songcen Xu
Inventors: Juwei Lu, Sayem Mohammad Siam, Wei Zhou, Peng Dai, Xiaofei Wu, Songcen Xu
Abstract: Methods and systems for gesture-based control of a device are described. A virtual gesture-space is determined in a received input frame. The virtual gesture-space is associated with a primary user from a ranked user list of users. The received input frame is processed only within the virtual gesture-space to detect and track a hand. Using a hand bounding box generated by detecting and tracking the hand, gesture classification is performed to determine a gesture input associated with the hand. A command input associated with the determined gesture input is processed. The device may be a smart television, a smart phone, a tablet, etc.
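A control-flow sketch of the described loop is given below; the detector, gesture classifier, and command table are hypothetical placeholders, and the gesture_space attribute on the primary user is assumed for illustration.

def process_frame(frame, ranked_users, detect_hand, classify_gesture, command_table):
    primary = ranked_users[0]                      # highest-ranked (primary) user
    x0, y0, x1, y1 = primary.gesture_space         # virtual gesture-space within the frame
    roi = frame[y0:y1, x0:x1]                      # process only this region of the frame
    hand_bbox = detect_hand(roi)                   # detect and track the hand
    if hand_bbox is None:
        return None
    gesture = classify_gesture(roi, hand_bbox)     # e.g. "swipe_left", "thumbs_up"
    return command_table.get(gesture)              # map gesture input to a command input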
-
Publication number: US20240054757A1
Publication date: 2024-02-15
Application number: US18327384
Filing date: 2023-06-01
Applicants: Yanhui GUO, Deepak SRIDHAR, Peng DAI, Juwei LU
Inventors: Yanhui GUO, Deepak SRIDHAR, Peng DAI, Juwei LU
CPC classification: G06V10/62, G06V10/24, G06V10/44, G06V10/764, G06V10/806, G06V10/82
Abstract: Systems and methods for temporal action localization of video data are described. A feature representation extracted from video data has a temporal dimension and a spatial dimension. The feature representation is self-aligned in the spatial dimension. Spatial multi-sampling is performed to obtain a plurality of sparse samples of the self-aligned representation along the spatial dimension, and the multi-sampled representation is fused with the self-aligned representation. Attention-based context information aggregation is applied on the fused representation to obtain a spatially refined representation. Local temporal information aggregation is applied on the self-aligned representation to obtain a temporally refined representation. Action localization is performed on a concatenation of the spatially refined representation and the temporally refined representation.
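One way to picture the stages is the shape-level sketch below, assuming a (T, S, C) feature layout for the temporal, spatial, and channel dimensions; every module (self_align, fuse, attend, temporal_agg, head) is a caller-supplied callable, not a reproduction of the described modules.

import numpy as np

def localize_actions(features, self_align, fuse, attend, temporal_agg, head, num_samples=4):
    # features: (T, S, C) = (temporal, spatial, channel) dimensions.
    aligned = self_align(features)                     # spatial self-alignment
    idx = np.linspace(0, aligned.shape[1] - 1, num_samples).astype(int)
    sparse = aligned[:, idx, :]                        # spatial multi-sampling (sparse samples)
    fused = fuse(aligned, sparse)                      # fuse with the self-aligned representation
    spatially_refined = attend(fused)                  # attention-based context aggregation
    temporally_refined = temporal_agg(aligned)         # local temporal information aggregation
    combined = np.concatenate([spatially_refined, temporally_refined], axis=-1)
    return head(combined)                              # action start/end/class predictions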
-
Publication number: US11778223B2
Publication date: 2023-10-03
Application number: US17406845
Filing date: 2021-08-19
Applicants: Wentao Liu, Yuanhao Yu, Yang Wang, Juwei Lu, Xiaolin Wu, Jin Tang
Inventors: Wentao Liu, Yuanhao Yu, Yang Wang, Juwei Lu, Xiaolin Wu, Jin Tang
IPC classification: H04N19/59, H04N19/51, H04N19/184, H04N19/136
CPC classification: H04N19/51, H04N19/136, H04N19/184
Abstract: A method, device and computer-readable medium for generating a super-resolution (SR) version of a compressed video stream. By leveraging the motion information and residual information in compressed video streams, the described examples are able to skip the time-consuming motion-estimation step for most frames and make the most use of the SR results of key frames. A key frame SR module generates SR versions of I-frames and other key frames of a compressed video stream using techniques similar to existing multi-frame approaches to video super-resolution (VSR). A non-key frame SR module generates SR versions of the non-key inter frames between these key frames by making use of the motion information and residual information used to encode the inter frames in the compressed video stream.
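The frame routing can be outlined roughly as below; sr_keyframe, warp, and upscale_residual are hypothetical helpers, and the is_key, pixels, motion_vectors, and residual fields are assumed to come from a compressed-bitstream parser.

def super_resolve_stream(frames, sr_keyframe, warp, upscale_residual):
    # Assumes the stream starts with a key frame, as compressed streams normally do.
    outputs = []
    last_key_sr = None
    for f in frames:
        if f.is_key:                                   # I-frames / other key frames
            last_key_sr = sr_keyframe(f.pixels)        # multi-frame SR path
            outputs.append(last_key_sr)
        else:                                          # non-key inter frames
            # Reuse the key-frame SR result: warp it with the bitstream's motion
            # vectors and add the upscaled residual, skipping motion estimation.
            predicted = warp(last_key_sr, f.motion_vectors)
            outputs.append(predicted + upscale_residual(f.residual))
    return outputs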
-
Publication number: US11698926B2
Publication date: 2023-07-11
Application number: US17524862
Filing date: 2021-11-12
Applicants: Arnab Kumar Mondal, Deepak Sridhar, Niamul Quader, Juwei Lu, Peng Dai, Chao Xing
Inventors: Arnab Kumar Mondal, Deepak Sridhar, Niamul Quader, Juwei Lu, Peng Dai, Chao Xing
IPC classification: G06F16/30, G06F16/732, G06N3/04, G06F16/783, G06V20/40
CPC classification: G06F16/7343, G06F16/783, G06N3/04, G06V20/40
Abstract: Methods and systems are described for performing video retrieval together with video grounding. A word-based query for a video is received and encoded into a query representation using a trained query encoder. One or more video representations that are similar to the query representation are identified from a plurality of video representations. Each similar video representation represents a respective relevant video. A grounding is generated for each relevant video by forward propagating each respective similar video representation together with the query representation through a trained grounding module. The relevant videos, or identifiers of the relevant videos, are outputted together with the grounding generated for each relevant video.
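A minimal retrieval-plus-grounding sketch, assuming cosine similarity over precomputed video representations; encode_query and grounding_module stand in for the trained networks and are assumptions for illustration. (The same flow applies to the related published application listed next.)

import numpy as np

def retrieve_and_ground(query_text, video_reprs, video_ids, encode_query,
                        grounding_module, top_k=5):
    q = encode_query(query_text)                                   # query representation
    sims = video_reprs @ q / (np.linalg.norm(video_reprs, axis=1)
                              * np.linalg.norm(q) + 1e-8)          # cosine similarity
    top = np.argsort(-sims)[:top_k]                                # most similar video representations
    results = []
    for i in top:
        start, end = grounding_module(video_reprs[i], q)           # temporal grounding in the video
        results.append((video_ids[i], float(sims[i]), start, end))
    return results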
-
Publication number: US20230153352A1
Publication date: 2023-05-18
Application number: US17524862
Filing date: 2021-11-12
Applicants: Arnab Kumar MONDAL, Deepak SRIDHAR, Niamul QUADER, Juwei LU, Peng DAI, Chao XING
Inventors: Arnab Kumar MONDAL, Deepak SRIDHAR, Niamul QUADER, Juwei LU, Peng DAI, Chao XING
IPC classification: G06F16/732, G06F16/783, G06K9/00, G06N3/04
CPC classification: G06F16/7343, G06F16/783, G06K9/00711, G06N3/04
Abstract: Methods and systems are described for performing video retrieval together with video grounding. A word-based query for a video is received and encoded into a query representation using a trained query encoder. One or more video representations that are similar to the query representation are identified from a plurality of video representations. Each similar video representation represents a respective relevant video. A grounding is generated for each relevant video by forward propagating each respective similar video representation together with the query representation through a trained grounding module. The relevant videos, or identifiers of the relevant videos, are outputted together with the grounding generated for each relevant video.
-
Publication number: US20220303560A1
Publication date: 2022-09-22
Application number: US17203613
Filing date: 2021-03-16
Applicants: Deepak SRIDHAR, Niamul QUADER, Srikanth MURALIDHARAN, Yaoxin LI, Juwei LU, Peng DAI
Inventors: Deepak SRIDHAR, Niamul QUADER, Srikanth MURALIDHARAN, Yaoxin LI, Juwei LU, Peng DAI
Abstract: Systems, methods, and computer media for processing a video are disclosed. An example method may include: receiving a plurality of video frames of a video; generating a plurality of first input features based on the plurality of video frames; generating a plurality of second input features by reversing the temporal order of the plurality of first input features; generating a first set of joint attention features based on the plurality of first input features; generating a second set of joint attention features based on the plurality of second input features; and concatenating the first set of joint attention features and the second set of joint attention features to generate a final set of joint attention features.
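The bidirectional feature step reduces to a very small sketch: compute joint attention features on the frame features and on a time-reversed copy, then concatenate. The joint_attention() callable is a placeholder for the trained module, not the claimed network.

import numpy as np

def bidirectional_joint_attention(frame_features, joint_attention):
    forward = joint_attention(frame_features)               # first set of joint attention features
    backward = joint_attention(frame_features[::-1])         # second set, from reversed temporal order
    return np.concatenate([forward, backward], axis=-1)      # final set of joint attention features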
-
Publication number: US11430138B2
Publication date: 2022-08-30
Application number: US17102114
Filing date: 2020-11-23
Applicants: Zhixiang Chi, Rasoul Mohammadi Nasiri, Zheng Liu, Jin Tang, Juwei Lu
Inventors: Zhixiang Chi, Rasoul Mohammadi Nasiri, Zheng Liu, Jin Tang, Juwei Lu
Abstract: Systems and methods for multi-frame video frame interpolation. Higher-order motion modeling, such as cubic motion modeling, achieves predictions of intermediate optical flow between multiple interpolated frames, assisted by relaxation of the constraints imposed by the loss function used in initial optical flow estimation. A temporal pyramidal optical flow refinement module performs coarse-to-fine refinement of the optical flow maps used to generate the intermediate frames, focusing a proportionally greater amount of refinement attention on the optical flow maps for the high-error middle frames. A temporal pyramidal pixel refinement module performs coarse-to-fine refinement of the generated intermediate frames, focusing a proportionally greater amount of refinement attention on the high-error middle frames. A generative adversarial network (GAN) module calculates a loss function for training the neural networks used in the optical flow estimation module, the temporal pyramidal optical flow refinement module, and/or the temporal pyramidal pixel refinement module.
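The cubic (higher-order) motion model alone can be illustrated as below, assuming per-pixel velocity, acceleration, and jerk maps have been fitted elsewhere; the fitting procedure, pyramidal refinement, and GAN loss are outside this sketch.

import numpy as np

def cubic_intermediate_flows(velocity, acceleration, jerk, num_intermediate=3):
    # velocity / acceleration / jerk: (H, W, 2) per-pixel motion coefficients.
    # Evaluate the cubic model d(t) = v*t + a*t^2/2 + j*t^3/6 at timestamps
    # strictly inside (0, 1) to predict optical flow to each intermediate frame.
    times = np.linspace(0.0, 1.0, num_intermediate + 2)[1:-1]
    return [velocity * t + 0.5 * acceleration * t ** 2 + (1.0 / 6.0) * jerk * t ** 3
            for t in times]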