-
Publication No.: US20240020948A1
Publication Date: 2024-01-18
Application No.: US17865178
Filing Date: 2022-07-14
Applicant: Jian Ren , Yang Wen , Ju Hu , Georgios Evangelidis , Sergey Tulyakov , Yanyu Li , Geng Yuan
Inventor: Jian Ren , Yang Wen , Ju Hu , Georgios Evangelidis , Sergey Tulyakov , Yanyu Li , Geng Yuan
IPC: G06V10/764 , G06V10/82 , G06V10/44
CPC classification number: G06V10/764 , G06V10/82 , G06V10/454
Abstract: A vision transformer network having extremely low latency and usable on mobile devices, such as smart eyewear devices and other augmented reality (AR) and virtual reality (VR) devices. The transformer network processes an input image, and the network includes a convolution stem configured to patch embed the image. A first stack of stages including at least two stages of 4-Dimension (4D) metablocks (MBs) (MB4D) follows the convolution stem. A second stack of stages including at least two stages of 3-Dimension MBs (MB3D) follows the MB4D stages. Each of the MB4D stages and each of the MB3D stages has a different layer configuration, and each includes a token mixer. The MB3D stages each additionally include a multi-head self-attention (MHSA) processing block.
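The stage structure the abstract describes can be sketched minimally: a pooling-style token mixer standing in for the MB4D stages, and a toy multi-head self-attention block for the MB3D stages. This is an illustrative NumPy sketch, not the patented network; all names, the 1D pooling mixer, and the identity Q/K/V projections are assumptions for brevity.

```python
import numpy as np

def pool_token_mixer(x, k=3):
    """Average-pooling token mixer (MB4D-style sketch): each token becomes the
    mean of its k-token neighborhood minus itself (the residual form commonly
    used by pooling mixers)."""
    n, d = x.shape
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)), mode="edge")
    pooled = np.stack([xp[i:i + n] for i in range(k)]).mean(axis=0)
    return pooled - x

def mhsa(x, num_heads=2):
    """Toy multi-head self-attention (MB3D-style sketch) with identity Q/K/V
    projections, showing the extra block the MB3D stages carry."""
    n, d = x.shape
    hd = d // num_heads
    out = np.zeros_like(x)
    for h in range(num_heads):
        q = x[:, h * hd:(h + 1) * hd]
        scores = q @ q.T / np.sqrt(hd)
        w = np.exp(scores - scores.max(axis=1, keepdims=True))
        w /= w.sum(axis=1, keepdims=True)           # softmax attention weights
        out[:, h * hd:(h + 1) * hd] = w @ q
    return out

tokens = np.random.default_rng(0).normal(size=(8, 4))
y = tokens + pool_token_mixer(tokens)   # an MB4D stage: token mixing only
z = y + mhsa(y)                         # an MB3D stage adds MHSA on top
```

The MB4D path avoids attention entirely, which is why it can sit early in the network where token counts are large; MHSA appears only in the later MB3D stages.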
-
Publication No.: US20250104290A1
Publication Date: 2025-03-27
Application No.: US18429251
Filing Date: 2024-01-31
Applicant: Erli Ding , Colin Eles , Amir Fruchtman , Riza Alp Guler , Yanyu Li , Xian Liu , Ergeta Muca , Mohammad Rami Koujan , Jian Ren , Dhritiman Sagar , Aliaksandr Siarohin , Ivan Skorokhodov , Sergey Tulyakov
Inventor: Erli Ding , Colin Eles , Amir Fruchtman , Riza Alp Guler , Yanyu Li , Xian Liu , Ergeta Muca , Mohammad Rami Koujan , Jian Ren , Dhritiman Sagar , Aliaksandr Siarohin , Ivan Skorokhodov , Sergey Tulyakov
Abstract: Examples described herein relate to automatic image generation. A plurality of inputs is accessed. The inputs include first input data and second input data. The first input data includes a text prompt describing a desired image and the second input data is indicative of one or more structural features of the desired image. One or more intermediate outputs are generated via a first generative machine learning model that uses the plurality of inputs as first control signals. An output image is generated via a second generative machine learning model that uses at least a subset of the plurality of inputs and at least a subset of the one or more intermediate outputs as second control signals. The output image is presented at a user device of a user.
-
Publication No.: US20240221314A1
Publication Date: 2024-07-04
Application No.: US18090724
Filing Date: 2022-12-29
Applicant: Menglei Chai , Riza Alp Guler , Yash Mukund Kant , Jian Ren , Aliaksandr Siarohin , Sergey Tulyakov
Inventor: Menglei Chai , Riza Alp Guler , Yash Mukund Kant , Jian Ren , Aliaksandr Siarohin , Sergey Tulyakov
Abstract: Invertible Neural Networks (INNs) are used to build an Invertible Neural Skinning (INS) pipeline for reposing characters during animation. A Pose-conditioned Invertible Network (PIN) is built to learn pose-conditioned deformations. The end-to-end Invertible Neural Skinning (INS) pipeline is produced by placing two PINs around a differentiable Linear Blend Skinning (LBS) module using a pose-free canonical representation. The PINs help capture the non-linear surface deformations of clothes across poses and alleviate the volume loss suffered from the LBS operation. Since the canonical representation remains pose-free, the expensive mesh extraction is performed exactly once, and the mesh is reposed by warping it with the learned LBS during an inverse pass through the INS pipeline.
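The LBS module at the center of the pipeline, and the volume-loss artifact the abstract says the PINs alleviate, can be shown with a generic linear blend skinning sketch. This is standard LBS in NumPy, not the patented PIN networks; the two-bone example and all names are illustrative.

```python
import numpy as np

def linear_blend_skinning(points, weights, rotations, translations):
    """Differentiable LBS: each point is a weight-blended sum of its
    bone-transformed copies. points: (N,3), weights: (N,B),
    rotations: (B,3,3), translations: (B,3)."""
    # Transform every point by every bone: (B, N, 3)
    per_bone = np.einsum("bij,nj->bni", rotations, points) + translations[:, None, :]
    # Blend with per-point skinning weights: (N, 3)
    return np.einsum("nb,bni->ni", weights, per_bone)

# Two bones: identity, and a 90-degree rotation about z.
R_id = np.eye(3)
R_z90 = np.array([[0.0, -1.0, 0.0],
                  [1.0,  0.0, 0.0],
                  [0.0,  0.0, 1.0]])
rotations = np.stack([R_id, R_z90])
translations = np.zeros((2, 3))

points = np.array([[1.0, 0.0, 0.0]])
weights = np.array([[0.5, 0.5]])       # halfway between the two bones
posed = linear_blend_skinning(points, weights, rotations, translations)
```

Blending the two rotated copies pulls the point to (0.5, 0.5, 0), whose norm is below 1: the classic LBS shrinkage ("candy-wrapper" volume loss) that the pose-conditioned invertible networks are placed around the LBS module to correct.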
-
Publication No.: US20240177414A1
Publication Date: 2024-05-30
Application No.: US18071821
Filing Date: 2022-11-30
Applicant: Hsin-Ying Lee , Jian Ren , Aliaksandr Siarohin , Ivan Skorokhodov , Sergey Tulyakov , Yinghao Xu
Inventor: Hsin-Ying Lee , Jian Ren , Aliaksandr Siarohin , Ivan Skorokhodov , Sergey Tulyakov , Yinghao Xu
CPC classification number: G06T17/00 , G06T7/50 , G06T7/90 , G06V10/82 , G06T2207/10024
Abstract: A three-dimensional (3D) scene is generated from non-aligned generic camera priors by producing a tri-plane representation for an input scene received in random latent code, obtaining a camera posterior including posterior parameters representing color and density data from the random latent code and from generic camera priors without alignment assumptions, and volumetrically rendering an image of the input scene from the color and density data to provide a scene having pixel colors and depth values from an arbitrary camera viewpoint. A depth adaptor processes depth values to generate an adapted depth map that bridges domains of rendered and estimated depth maps for the image of the input scene. The adapted depth map, color data, and scene geometry information from an external dataset are provided to a discriminator for selection of a 3D representation of the input scene.
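The step of "volumetrically rendering an image ... from the color and density data to provide ... pixel colors and depth values" follows the standard emission-absorption quadrature. A minimal single-ray sketch, assuming NeRF-style compositing rather than any implementation detail from the patent:

```python
import numpy as np

def volume_render(colors, densities, deltas):
    """Composite color/density samples along one ray.
    colors: (S,3) RGB per sample, densities: (S,), deltas: (S,) segment lengths."""
    alphas = 1.0 - np.exp(-densities * deltas)        # per-sample opacity
    # Transmittance: probability the ray reaches each sample unoccluded.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = trans * alphas
    rgb = (weights[:, None] * colors).sum(axis=0)     # pixel color
    depth = (weights * np.cumsum(deltas)).sum()       # expected depth value
    return rgb, depth

# A red sample in front of a blue sample along the ray.
colors = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
densities = np.array([10.0, 10.0])
deltas = np.array([0.1, 0.1])
rgb, depth = volume_render(colors, densities, deltas)
```

The depth values this produces are what the abstract's depth adaptor then processes into the adapted depth map fed to the discriminator.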
-
Publication No.: US20240112401A1
Publication Date: 2024-04-04
Application No.: US17957049
Filing Date: 2022-09-30
Applicant: Panagiotis Achlioptas , Menglei Chai , Hsin-Ying Lee , Kyle Olszewski , Jian Ren , Sergey Tulyakov
Inventor: Panagiotis Achlioptas , Menglei Chai , Hsin-Ying Lee , Kyle Olszewski , Jian Ren , Sergey Tulyakov
CPC classification number: G06T17/10 , G06T19/00 , G06T2210/16 , G06T2210/56
Abstract: A system and method are described for generating 3D garments from two-dimensional (2D) scribble images drawn by users. The system includes a conditional 2D generator, a conditional 3D generator, and two intermediate media including dimension-coupling color-density pairs and flat point clouds that bridge the gap between dimensions. Given a scribble image, the 2D generator synthesizes dimension-coupling color-density pairs including the RGB projection and density map from the front and rear views of the scribble image. A density-aware sampling algorithm converts the 2D dimension-coupling color-density pairs into a 3D flat point cloud representation, where the depth information is ignored. The 3D generator predicts the depth information from the flat point cloud. Dynamic variations per garment due to deformations resulting from a wearer's pose as well as irregular wrinkles and folds may be bypassed by taking advantage of 2D generative models to bridge the dimension gap in a non-parametric way.
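The "density-aware sampling algorithm converts the 2D dimension-coupling color-density pairs into a 3D flat point cloud" step can be sketched as importance sampling over the density map. This is a minimal illustrative version; the function name, grid size, and sampling scheme are assumptions, not the patented algorithm.

```python
import numpy as np

def density_aware_sampling(density, colors, n_points, rng=None):
    """Sample 2D pixel locations with probability proportional to density,
    carrying each sampled pixel's RGB -- a 'flat' point cloud whose depth
    channel is left for the 3D generator to predict.
    density: (H,W), colors: (H,W,3)."""
    rng = rng or np.random.default_rng(0)
    h, w = density.shape
    p = density.ravel() / density.sum()
    idx = rng.choice(h * w, size=n_points, p=p)
    ys, xs = np.divmod(idx, w)
    points = np.stack([xs, ys], axis=1).astype(float)   # (N, 2) flat positions
    return points, colors[ys, xs]                       # colors: (N, 3)

density = np.zeros((4, 4))
density[1:3, 1:3] = 1.0                 # garment mass only in the center
colors = np.full((4, 4, 3), 0.5)
pts, cols = density_aware_sampling(density, colors, 100)
```

Because sampling is proportional to density, no points land outside the garment silhouette, and the 3D generator only has to lift the flat (x, y) points by predicting depth.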
-
Publication No.: US08005277B2
Publication Date: 2011-08-23
Application No.: US11713123
Filing Date: 2007-03-02
Applicant: Sergey Tulyakov , Faisal Farooq , Sharat Chikkerur , Venu Govindaraju
Inventor: Sergey Tulyakov , Faisal Farooq , Sharat Chikkerur , Venu Govindaraju
IPC: G06K9/00
CPC classification number: G06K9/00073
Abstract: A method and apparatus for obtaining, hashing, storing and using fingerprint data related to fingerprint minutia including the steps of: a) determining minutia points within a fingerprint, b) determining a plurality of sets of proximate determined minutia points, c) subjecting a plurality of representations of the determined sets of minutia points to a hashing function, and d) storing or comparing resulting hashed values for fingerprint matching.
Abstract translation: 一种用于获取,散列,存储和使用与指纹细节相关的指纹数据的方法和装置,包括以下步骤:a)确定指纹内的细节点,b)确定多组邻近确定的细节点,c) 将所确定的细节集合的表示形式表示为哈希函数,以及d)存储或比较用于指纹匹配的所得散列值。
-
Publication No.: US20070253608A1
Publication Date: 2007-11-01
Application No.: US11713123
Filing Date: 2007-03-02
Applicant: Sergey Tulyakov , Faisal Farooq , Sharat Chikkerur , Venu Govindaraju
Inventor: Sergey Tulyakov , Faisal Farooq , Sharat Chikkerur , Venu Govindaraju
IPC: G06K9/00
CPC classification number: G06K9/00073
Abstract: A method and apparatus for obtaining, hashing, storing and using fingerprint data related to fingerprint minutia including the steps of: a) determining minutia points within a fingerprint, b) determining a plurality of sets of proximate determined minutia points, c) subjecting a plurality of representations of the determined sets of minutia points to a hashing function, and d) storing or comparing resulting hashed values for fingerprint matching.
Abstract translation: 一种用于获取,散列,存储和使用与指纹细节相关的指纹数据的方法和装置,包括以下步骤:a)确定指纹内的细节点,b)确定多组邻近确定的细节点,c) 将所确定的细节集合的表示形式表示为哈希函数,以及d)存储或比较用于指纹匹配的所得散列值。
-
Publication No.: US20240221258A1
Publication Date: 2024-07-04
Application No.: US18089984
Filing Date: 2022-12-28
Applicant: Menglei Chai , Hsin-Ying Lee , Willi Menapace , Kyle Olszewski , Jian Ren , Aliaksandr Siarohin , Ivan Skorokhodov , Sergey Tulyakov
Inventor: Menglei Chai , Hsin-Ying Lee , Willi Menapace , Kyle Olszewski , Jian Ren , Aliaksandr Siarohin , Ivan Skorokhodov , Sergey Tulyakov
CPC classification number: G06T13/40 , G06T7/70 , G06T19/20 , G06T2207/20081 , G06T2207/20084 , G06T2207/30201 , G06T2219/2004 , G06T2219/2021
Abstract: Unsupervised volumetric 3D animation (UVA) of non-rigid deformable objects without annotations learns the 3D structure and dynamics of objects solely from single-view red/green/blue (RGB) videos and decomposes the single-view RGB videos into semantically meaningful parts that can be tracked and animated. Using a 3D autodecoder framework, paired with a keypoint estimator via a differentiable perspective-n-point (PnP) algorithm, the UVA model learns the underlying object 3D geometry and parts decomposition in an entirely unsupervised manner from still or video images. This allows the UVA model to perform 3D segmentation, 3D keypoint estimation, novel view synthesis, and animation. The UVA model can obtain animatable 3D objects from a single or a few images. The UVA method also features a space in which all objects are represented in their canonical, animation-ready form. Applications include the creation of lenses from images or videos for social media applications.
-
Publication No.: US20230410376A1
Publication Date: 2023-12-21
Application No.: US18238979
Filing Date: 2023-08-28
Applicant: Jian Ren , Menglei Chai , Sergey Tulyakov , Qing Jin
Inventor: Jian Ren , Menglei Chai , Sergey Tulyakov , Qing Jin
Abstract: System and methods for compressing image-to-image models. Generative Adversarial Networks (GANs) have achieved success in generating high-fidelity images. An image compression system and method adds a novel variant to class-dependent parameters (CLADE), referred to as CLADE-Avg, which recovers the image quality without introducing extra computational cost. An extra layer of average smoothing is performed between the parameter and normalization layers. Compared to CLADE, this image compression system and method smooths abrupt boundaries, and introduces more possible values for the scaling and shift. In addition, the kernel size for the average smoothing can be selected as a hyperparameter, such as a 3×3 kernel size. This method does not introduce extra multiplications but only addition, and thus does not introduce much computational overhead, as the division can be absorbed into the parameters after training.
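The "extra layer of average smoothing ... between the parameter and normalization layers" with a 3×3 kernel can be shown directly on a per-class modulation map. A minimal NumPy sketch of the CLADE-Avg idea as the abstract states it; the variable names and two-class example are illustrative.

```python
import numpy as np

def clade_avg(param_map, k=3):
    """Average-smooth a per-class modulation map (CLADE-Avg-style): a k x k
    mean filter softens the abrupt boundaries between class regions. The loop
    uses only additions; the division by k*k can be folded into the
    parameters after training, as the abstract notes."""
    pad = k // 2
    xp = np.pad(param_map, pad, mode="edge")
    h, w = param_map.shape
    out = np.zeros_like(param_map, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += xp[dy:dy + h, dx:dx + w]   # accumulate window sums
    return out / (k * k)

# Two-class scaling map with a hard vertical boundary between classes.
gamma = np.concatenate([np.zeros((4, 4)), np.ones((4, 4))], axis=1)
smoothed = clade_avg(gamma)
```

Interior pixels keep their original class value, while pixels near the boundary take intermediate values (for example 1/3 and 2/3 in the columns adjacent to the seam), which is exactly how the smoothing "introduces more possible values for the scaling and shift".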
-
Publication No.: US11790565B2
Publication Date: 2023-10-17
Application No.: US17191970
Filing Date: 2021-03-04
Applicant: Jian Ren , Menglei Chai , Sergey Tulyakov , Qing Jin
Inventor: Jian Ren , Menglei Chai , Sergey Tulyakov , Qing Jin
Abstract: System and methods for compressing image-to-image models. Generative Adversarial Networks (GANs) have achieved success in generating high-fidelity images. An image compression system and method adds a novel variant to class-dependent parameters (CLADE), referred to as CLADE-Avg, which recovers the image quality without introducing extra computational cost. An extra layer of average smoothing is performed between the parameter and normalization layers. Compared to CLADE, this image compression system and method smooths abrupt boundaries, and introduces more possible values for the scaling and shift. In addition, the kernel size for the average smoothing can be selected as a hyperparameter, such as a 3×3 kernel size. This method does not introduce extra multiplications but only addition, and thus does not introduce much computational overhead, as the division can be absorbed into the parameters after training.
-