-
Publication number: US20240221309A1
Publication date: 2024-07-04
Application number: US18090657
Filing date: 2022-12-29
Applicant: Menglei Chai , Hsin-Ying Lee , Chieh Lin , Willi Menapace , Aliaksandr Siarohin , Sergey Tulyakov
Inventor: Menglei Chai , Hsin-Ying Lee , Chieh Lin , Willi Menapace , Aliaksandr Siarohin , Sergey Tulyakov
CPC classification number: G06T17/05 , G06T7/174 , G06T17/005 , G06T2207/20081 , G06T2207/20084
Abstract: An environment synthesis framework generates virtual environments from a synthesized two-dimensional (2D) satellite map of a geographic area, a three-dimensional (3D) voxel environment, and a voxel-based neural rendering framework. In an example implementation, the synthesized 2D satellite map is generated by a map synthesis generative adversarial network (GAN) which is trained using sample city datasets. The multi-stage framework lifts the 2D map into a set of 3D octrees, generates an octree-based 3D voxel environment, and then converts it into a texturized 3D virtual environment using a neural rendering GAN and a set of pseudo ground truth images. The resulting 3D virtual environment is texturized, lifelike, editable, traversable in virtual reality (VR) and augmented reality (AR) experiences, and very large in scale.
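The octree-based voxel stage in this abstract can be illustrated with a minimal sketch. The abstract does not say how the 2D map is lifted into 3D, so this sketch assumes a hypothetical per-cell height map as input; `lift_to_voxels`, `build_octree`, and `count_leaves` are illustrative names, not taken from the patent.

```python
def lift_to_voxels(height_map, size):
    """Fill voxel (x, y, z) when z is below the (assumed) height at map cell (x, y)."""
    return [[[z < height_map[y][x] for x in range(size)]
             for y in range(size)]
            for z in range(size)]


def build_octree(vox, x0, y0, z0, size):
    """Collapse a uniform cube into one boolean leaf; otherwise split into 8 children."""
    first = vox[z0][y0][x0]
    uniform = all(vox[z][y][x] == first
                  for z in range(z0, z0 + size)
                  for y in range(y0, y0 + size)
                  for x in range(x0, x0 + size))
    if uniform or size == 1:
        return first
    half = size // 2
    return [build_octree(vox, x0 + dx, y0 + dy, z0 + dz, half)
            for dz in (0, half) for dy in (0, half) for dx in (0, half)]


def count_leaves(node):
    """Count leaves; a leaf is a bool, an inner node is a list of 8 children."""
    return 1 if isinstance(node, bool) else sum(count_leaves(c) for c in node)


# A flat 4x4 map of height 2 needs only 8 leaves instead of 64 voxels.
flat = [[2] * 4 for _ in range(4)]
tree = build_octree(lift_to_voxels(flat, 4), 0, 0, 0, 4)
print(count_leaves(tree))
```

The point of the octree here is only that large uniform regions (open air, solid ground) collapse into single nodes, which is one plausible reason such a structure suits very large environments.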
-
Publication number: US20230214639A1
Publication date: 2023-07-06
Application number: US17566877
Filing date: 2021-12-31
Applicant: Sumant Milind Hanumante , Qing Jin , Sergei Korolev , Denys Makoviichuk , Jian Ren , Dhritiman Sagar , Patrick Timothy McSweeney Simons , Sergey Tulyakov , Yang Wen , Richard Zhuang
Inventor: Sumant Milind Hanumante , Qing Jin , Sergei Korolev , Denys Makoviichuk , Jian Ren , Dhritiman Sagar , Patrick Timothy McSweeney Simons , Sergey Tulyakov , Yang Wen , Richard Zhuang
IPC: G06N3/08
CPC classification number: G06N3/08
Abstract: Techniques for training a neural network having a plurality of computational layers with associated weights and activations for computational layers in fixed-point formats include determining an optimal fractional length for weights and activations for the computational layers; training a learned clipping-level with fixed-point quantization using a PACT process for the computational layers; and quantizing on effective weights that fuses a weight of a convolution layer with a weight and running variance from a batch normalization layer. A fractional length for weights of the computational layers is determined from current values of weights using the determined optimal fractional length for the weights of the computational layers. A fixed-point activation between adjacent computational layers is related using PACT quantization of the clipping-level and an activation fractional length from a node in a following computational layer. The resulting fixed-point weights and activation values are stored as a compressed representation of the neural network.
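The quantization steps named in this abstract can be sketched on scalars as follows. This is a minimal illustration under stated assumptions, not the patented procedure: the 8-bit word length, the squared-error criterion for choosing the fractional length, and all function names are assumptions introduced here. The clipping level `alpha` would be learned during training in PACT; here it is simply passed in.

```python
import math


def quantize_fixed_point(x, frac_len, word_len=8, signed=True):
    """Round x to a fixed-point grid with `frac_len` fractional bits, saturating on overflow."""
    scale = 2 ** frac_len
    if signed:
        lo, hi = -(2 ** (word_len - 1)), 2 ** (word_len - 1) - 1
    else:
        lo, hi = 0, 2 ** word_len - 1
    q = max(lo, min(hi, round(x * scale)))
    return q / scale


def effective_weight(w, bn_gamma, bn_running_var, eps=1e-5):
    """Fuse a convolution weight with the batch-norm weight and running variance."""
    return w * bn_gamma / math.sqrt(bn_running_var + eps)


def best_frac_len(values, word_len=8, candidates=range(0, 9)):
    """Pick the fractional length with the smallest squared quantization error (assumed criterion)."""
    def err(fl):
        return sum((v - quantize_fixed_point(v, fl, word_len)) ** 2 for v in values)
    return min(candidates, key=err)


def pact_activation(x, alpha, frac_len, word_len=8):
    """PACT-style activation: clip to [0, alpha], then quantize as unsigned fixed point."""
    clipped = max(0.0, min(alpha, x))
    return quantize_fixed_point(clipped, frac_len, word_len, signed=False)


weights = [effective_weight(w, bn_gamma=0.5, bn_running_var=4.0) for w in (1.0, -0.5, 0.25)]
fl = best_frac_len(weights)
print(fl, [quantize_fixed_point(w, fl) for w in weights])
```

Quantizing the fused weight rather than the raw convolution weight means the stored fixed-point value is the one actually used at inference, after batch normalization has been folded in.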
-
Publication number: US20220292724A1
Publication date: 2022-09-15
Application number: US17191970
Filing date: 2021-03-04
Applicant: Jian Ren , Menglei Chai , Sergey Tulyakov , Qing Jin
Inventor: Jian Ren , Menglei Chai , Sergey Tulyakov , Qing Jin
Abstract: Systems and methods for compressing image-to-image models. Generative Adversarial Networks (GANs) have achieved success in generating high-fidelity images. An image compression system and method adds a novel variant to class-dependent parameters (CLADE), referred to as CLADE-Avg, which recovers the image quality without introducing extra computational cost. An extra layer of average smoothing is performed between the parameter and normalization layers. Compared to CLADE, this image compression system and method smooths abrupt boundaries and introduces more possible values for the scaling and shift. In addition, the kernel size for the average smoothing can be selected as a hyperparameter, such as a 3×3 kernel size. This method does not introduce extra multiplications but only additions, and thus does not introduce much computational overhead, as the division can be absorbed into the parameters after training.
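The average-smoothing step can be sketched on plain nested lists. This is an illustration only: the edge-replication padding, the scale-and-shift form `gamma * x + beta`, and the function names are assumptions made here, since the abstract specifies only that a k×k average is applied to the class-dependent parameters before normalization.

```python
def lookup_map(label_map, per_class_param):
    """Turn a semantic label map into a per-pixel parameter map (the CLADE lookup)."""
    return [[per_class_param[c] for c in row] for row in label_map]


def box_smooth(param_map, k=3):
    """k x k average using additions and one constant divisor; edges are replicated (assumption)."""
    h, w = len(param_map), len(param_map[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += param_map[yy][xx]
            out[y][x] = total / (k * k)
    return out


def clade_avg_modulate(normalized, label_map, gammas, betas, k=3):
    """Scale and shift normalized features with smoothed class-dependent parameters."""
    gamma_map = box_smooth(lookup_map(label_map, gammas), k)
    beta_map = box_smooth(lookup_map(label_map, betas), k)
    return [[g * v + b for v, g, b in zip(v_row, g_row, b_row)]
            for v_row, g_row, b_row in zip(normalized, gamma_map, beta_map)]


labels = [[0, 0, 1, 1]]  # two classes meeting at a hard boundary
print(box_smooth(lookup_map(labels, [1.0, 4.0])))
```

At the class boundary the scale ramps through intermediate values instead of jumping from one class parameter to the other, which is the "more possible values" effect the abstract describes. Because the divisor `k * k` is constant, it could be folded into the stored per-class parameters after training, leaving only additions at inference.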
-
Publication number: US12236668B2
Publication date: 2025-02-25
Application number: US17865178
Filing date: 2022-07-14
Applicant: Jian Ren , Yang Wen , Ju Hu , Georgios Evangelidis , Sergey Tulyakov , Yanyu Li , Geng Yuan
Inventor: Jian Ren , Yang Wen , Ju Hu , Georgios Evangelidis , Sergey Tulyakov , Yanyu Li , Geng Yuan
IPC: G06K9/00 , G06V10/44 , G06V10/764 , G06V10/82
Abstract: A vision transformer network having extremely low latency and usable on mobile devices, such as smart eyewear devices and other augmented reality (AR) and virtual reality (VR) devices. The transformer network processes an input image, and the network includes a convolution stem configured to patch embed the image. A first stack of stages including at least two stages of 4-Dimension (4D) metablocks (MBs) (MB4D) follows the convolution stem. A second stack of stages including at least two stages of 3-Dimension MBs (MB3D) follows the MB4D stages. Each of the MB4D stages and each of the MB3D stages includes different layer configurations, and each of the MB4D stages and each of the MB3D stages includes a token mixer. The MB3D stages each additionally include a multi-head self attention (MHSA) processing block.
-
Publication number: US20240296606A1
Publication date: 2024-09-05
Application number: US18176971
Filing date: 2023-03-01
Applicant: Sergey Smetanin , Arnab Ghosh , Pavel Savchenkov , Jian Ren , Sergey Tulyakov , Ivan Babanin , Timur Zakirov , Roman Golobokov , Aleksandr Zakharov , Dor Ayalon , Nikita Demidov , Vladimir Gordienko , Daniel Moreno , Nikita Belosludtcev , Sofya Savinova
Inventor: Sergey Smetanin , Arnab Ghosh , Pavel Savchenkov , Jian Ren , Sergey Tulyakov , Ivan Babanin , Timur Zakirov , Roman Golobokov , Aleksandr Zakharov , Dor Ayalon , Nikita Demidov , Vladimir Gordienko , Daniel Moreno , Nikita Belosludtcev , Sofya Savinova
IPC: G06T11/60 , G06F3/0482
CPC classification number: G06T11/60 , G06F3/0482 , G06T2200/24
Abstract: Examples disclosed herein describe techniques related to automated image generation in an interaction system. An image generation request is received from a first user device associated with a first user of an interaction system. The image generation request comprises a text prompt. Responsive to receiving the image generation request, an image is automatically generated by an automated text-to-image generator, based on the text prompt. The image is caused to be presented on the first user device. An indication of user input to select the image is received from the user device. Responsive to receiving the indication of the user input to select the image, the image is associated with the first user within the interaction system, and a second user of the interaction system is enabled to be presented with the image.
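The request, generate, select, and share sequence in this abstract can be sketched as a small in-memory flow. Everything named here is hypothetical: `InteractionSystem`, its methods, and the stub generator stand in for whatever the real system uses, and the text-to-image model is replaced by a placeholder function.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class InteractionSystem:
    # Hypothetical stand-in for the automated text-to-image generator.
    generator: Callable[[str], str]
    images_by_user: Dict[str, List[str]] = field(default_factory=dict)
    visible_to: Dict[str, List[str]] = field(default_factory=dict)

    def handle_request(self, user: str, prompt: str) -> str:
        """Generate an image from the text prompt; the caller presents it on the user's device."""
        return self.generator(prompt)

    def handle_selection(self, user: str, image: str, share_with: str) -> None:
        """On selection, associate the image with the first user and expose it to a second user."""
        self.images_by_user.setdefault(user, []).append(image)
        self.visible_to.setdefault(share_with, []).append(image)


system = InteractionSystem(generator=lambda prompt: f"<image for: {prompt}>")
image = system.handle_request("alice", "a red fox")
system.handle_selection("alice", image, share_with="bob")
print(system.visible_to["bob"])
```

Note that the image is associated with the first user only after the selection step, matching the order of operations in the abstract.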
-
Publication number: US20240193855A1
Publication date: 2024-06-13
Application number: US18080089
Filing date: 2022-12-13
Applicant: Menglei Chai , Hsin-Ying Lee , Aliaksandr Siarohin , Sergey Tulyakov , Yinghao Xu , Ivan Skorokhodov
Inventor: Menglei Chai , Hsin-Ying Lee , Aliaksandr Siarohin , Sergey Tulyakov , Yinghao Xu , Ivan Skorokhodov
Abstract: A 3D-aware generative model for high-quality and controllable scene synthesis uses an abstract object-level representation (i.e., 3D bounding boxes without semantic annotation) as the scene layout prior, which is simple to obtain, general to describe various scene contents, and yet informative to disentangle objects and background. An overall layout for the scene is identified and then each object is located in the layout to facilitate the scene composition process. The object-level representation serves as an intuitive user control for scene editing. Based on such a prior, the system spatially disentangles the whole scene into object-centric generative radiance fields by learning on only 2D images with global-local discrimination. Once the model is trained, users can generate and edit a scene by explicitly controlling the camera and the layout of objects' bounding boxes.
-
Publication number: US20240221281A1
Publication date: 2024-07-04
Application number: US18090692
Filing date: 2022-12-29
Applicant: Rameen Abdal , Menglei Chai , Hsin-Ying Lee , Aliaksandr Siarohin , Sergey Tulyakov , Peihao Zhu
Inventor: Rameen Abdal , Menglei Chai , Hsin-Ying Lee , Aliaksandr Siarohin , Sergey Tulyakov , Peihao Zhu
Abstract: Domain adaptation frameworks for producing a 3D avatar generative adversarial network (GAN) capable of generating an avatar based on a single photographic image. The 3D avatar GAN is produced by training a target domain using an artistic dataset. Each artistic dataset includes a plurality of source images, each associated with a style type, such as caricature, cartoon, and comic. The domain adaptation framework in some implementations starts with a source domain that has been trained according to a 3D GAN and a target domain trained with a 2D GAN. The framework fine-tunes the 2D GAN by training it with the artistic datasets. The resulting 3D avatar GAN generates a 3D artistic avatar and an editing module for performing semantic and geometric edits.
-
Publication number: US20240203114A1
Publication date: 2024-06-20
Application number: US18080993
Filing date: 2022-12-14
Applicant: Jian Ren , Yanyu Li , Ju Hu , Yang Wen , Georgios Evangelidis , Sergey Tulyakov , Kamyar Salahi
Inventor: Jian Ren , Yanyu Li , Ju Hu , Yang Wen , Georgios Evangelidis , Sergey Tulyakov , Kamyar Salahi
CPC classification number: G06V10/95 , G06V10/7715
Abstract: A mobile vision transformer network for use on mobile devices, such as smart eyewear devices and other augmented reality (AR) and virtual reality (VR) devices. The mobile vision transformer network considers factors including number of parameters, latency, and model performance, as they reflect disk storage, mobile frames per second (FPS), and application quality, respectively. The mobile vision transformer network processes images, e.g., for image classification, segmentation, and detection. The mobile vision transformer network has a fine-grained architecture including a search algorithm performing latency-driven slimming that jointly improves model size and speed.
-
Publication number: US20240202869A1
Publication date: 2024-06-20
Application number: US18080959
Filing date: 2022-12-14
Applicant: Jian Ren , Pavlo Chemerys , Vladislav Shakhrai , Ju Hu , Denys Makoviichuk , Sergey Tulyakov , Junli Cao
Inventor: Jian Ren , Pavlo Chemerys , Vladislav Shakhrai , Ju Hu , Denys Makoviichuk , Sergey Tulyakov , Junli Cao
IPC: G06T3/40
CPC classification number: G06T3/4046 , G06T3/4053
Abstract: A neural light field (NeLF) that runs real-time on mobile devices for neural rendering of three dimensional (3D) scenes, referred to as MobileR2L. The MobileR2L architecture runs efficiently on mobile devices with low latency and small size, and it achieves high-resolution generation while maintaining real-time inference for both synthetic and real-world 3D scenes on mobile devices. The MobileR2L has a network backbone including a convolutional layer embedding an input image at a resolution, residual blocks uploading the embedded image, and super-resolution modules receiving the uploaded embedded image and rendering an output image having a higher resolution than the embedded image. The convolution layer generates a number of rays equal to a number of pixels in the input image, where a partial number of the rays is uploaded to the super-resolution modules.
-
Publication number: US20240070521A1
Publication date: 2024-02-29
Application number: US17893241
Filing date: 2022-08-23
Applicant: Jian Ren , Sergey Tulyakov , Yanyu Li , Geng Yuan
Inventor: Jian Ren , Sergey Tulyakov , Yanyu Li , Geng Yuan
CPC classification number: G06N20/00 , G06K9/6256
Abstract: A layer freezing and data sieving technique used in a sparse training domain for object recognition, providing end-to-end dataset-efficient training. The layer freezing and data sieving methods are seamlessly incorporated into a sparse training algorithm to form a generic framework. The generic framework consistently outperforms prior approaches and significantly reduces training floating point operations (FLOPs) and memory costs while preserving high accuracy. The reduction in training FLOPs comes from three sources: weight sparsity, frozen layers, and a shrunken dataset. The training acceleration depends on different factors, e.g., the support of the sparse computation, layer type and size, and system overhead. The FLOPs reduction from the frozen layers and shrunken dataset leads to higher actual training acceleration than weight sparsity.
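The three sources of FLOPs reduction can be combined in a back-of-envelope estimate. This is a rough model under assumptions that are not in the abstract: the backward pass is taken to cost about twice the forward pass, a frozen layer is taken to need only its forward pass (which holds when the frozen layers are the earliest ones), and sparse layers are taken to cost in proportion to their weight density. The function name and the numbers are illustrative.

```python
def training_flops(layer_fwd_flops, density, frozen, samples, epochs):
    """Estimate total training FLOPs from per-layer forward costs.

    layer_fwd_flops: dense forward FLOPs per sample for each layer
    density:         fraction of weights kept (1.0 means dense)
    frozen:          one bool per layer, True if the layer is frozen
    """
    per_sample = 0.0
    for fwd, is_frozen in zip(layer_fwd_flops, frozen):
        passes = 1 if is_frozen else 3  # forward only, or forward plus ~2x backward
        per_sample += fwd * density * passes
    return per_sample * samples * epochs


dense = training_flops([100, 100], 1.0, [False, False], samples=1000, epochs=10)
reduced = training_flops([100, 100], 0.5, [True, False], samples=800, epochs=10)
print(reduced / dense)
```

This estimate counts operations only. As the abstract notes, the wall-clock speedup from weight sparsity depends on whether the hardware and libraries support sparse computation, whereas skipped layers and removed samples save time directly.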