-
Publication Number: US20230385551A1
Publication Date: 2023-11-30
Application Number: US18201075
Filing Date: 2023-05-23
Applicant: Snap Inc.
Inventor: Di Lu , Leonardo Ribas Machado das Neves , Vitor Rocha de Carvalho , Ning Zhang
IPC: G06F40/295 , G06N20/00 , G06N3/08 , G06F40/30
CPC classification number: G06F40/295 , G06N20/00 , G06N3/08 , G06F40/30
Abstract: Terms in the caption of a multimodal message (e.g., a social media post) can be identified as named entities using an entity recognition system. The entity recognition system can use a visual attention based mechanism to generate a visual context representation from an image and caption. The system can use the visual context representation to identify one or more terms of the caption as a named entity.
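As a rough illustration of the attention mechanism this abstract describes, the sketch below computes a visual context vector as a caption-conditioned soft attention over image region features. The names, shapes, and the bilinear scoring form are illustrative assumptions, not taken from the patent:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def visual_context(region_feats, caption_feat, W):
    """Caption-conditioned soft attention over image regions.

    region_feats: (R, d) features for R image regions
    caption_feat: (d,)   summary vector for the caption
    W:            (d, d) bilinear scoring weights (hypothetical)
    """
    scores = region_feats @ W @ caption_feat   # (R,) relevance of each region
    weights = softmax(scores)                  # soft attention distribution
    return weights @ region_feats              # (d,) visual context representation
```

The soft weights let the regions most relevant to the caption dominate the context vector that a downstream tagger would then consume alongside the caption terms.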
-
Publication Number: US20230237706A1
Publication Date: 2023-07-27
Application Number: US18128128
Filing Date: 2023-03-29
Applicant: Snap Inc.
Inventor: Drake Austin Rehfeld , Rahul Bhupendra Sheth , Ning Zhang
Abstract: Disclosed are methods for encoding information in a graphic image. The information may be encoded so as to have a visual appearance that adopts a particular style, so that the encoded information is visually pleasing in the environment in which it is displayed. An encoder and decoder are trained during an integrated training process, where the encoder is tuned to minimize a loss when its encoded images are decoded. Similarly, the decoder is also trained to minimize loss when decoding the encoded images. Both the encoder and decoder may utilize a convolutional neural network in some aspects to analyze data and/or images. Once data is encoded, a style from a sample image is transferred to the encoded data. When decoding, the decoder may largely ignore the style aspects of the encoded data and decode based on a content portion of the data.
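A training objective of the kind this abstract describes can be sketched as a message-recovery loss plus a style-matching term; the Gram-matrix style proxy below is a common choice in style transfer, and the loss forms and weighting are assumptions for illustration, not the patent's actual losses:

```python
import numpy as np

def gram(x):
    # Gram matrix of a flattened feature map x: (H*W, C); captures style statistics
    return x.T @ x / x.shape[0]

def combined_loss(decoded_bits, true_bits, encoded_feat, style_feat, alpha=0.5):
    # message loss: penalise the decoder for failing to recover the hidden bits
    msg_loss = np.mean((decoded_bits - true_bits) ** 2)
    # style loss: pull the encoded image's statistics toward the sample style image
    style_loss = np.mean((gram(encoded_feat) - gram(style_feat)) ** 2)
    return msg_loss + alpha * style_loss
```

Minimising both terms jointly is what lets the decoder "largely ignore" style at decode time: style lives in the feature statistics while the message survives in the content portion.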
-
Publication Number: US20220327358A1
Publication Date: 2022-10-13
Application Number: US17808274
Filing Date: 2022-06-22
Applicant: Snap Inc.
Inventor: Jacob Minyoung Huh , Shao-Hua Sun , Ning Zhang
IPC: G06N3/04 , G06V30/194 , G06N3/08
Abstract: Disclosed is a feedback adversarial learning framework, a recurrent framework for generative adversarial networks that can be widely adapted to not only stabilize training but also generate higher quality images. In some aspects, a discriminator's spatial outputs are distilled to improve generation quality. The disclosed embodiments model the discriminator into the generator, and the generator learns from its mistakes over time. In some aspects, a discriminator architecture encourages the model to be locally and globally consistent.
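A toy sketch of the feedback loop the abstract describes: the discriminator's spatial (per-location) output is fed back into the generator on the next pass, so refinement is recurrent. The generator, discriminator, and update rule here are stand-ins chosen only to show that recurrence, not the patent's architecture:

```python
import numpy as np

def discriminator_spatial(img):
    # toy patch discriminator: one realism score per spatial location, in (0, 1)
    return 1.0 / (1.0 + np.exp(-img.mean(axis=-1)))

def toy_generator(z, prev_img, score_map):
    # adjust most where the discriminator scored the image least real,
    # i.e. the generator "learns from its mistakes"
    return prev_img + (1.0 - score_map) * z

def feedback_step(z, prev_img):
    # distil the discriminator's spatial output and feed it back to the generator
    score_map = discriminator_spatial(prev_img)[..., None]   # (H, W, 1)
    return toy_generator(z, prev_img, score_map)

rng = np.random.default_rng(0)
img = rng.normal(size=(8, 8, 3))         # initial generated image
z = rng.normal(scale=0.1, size=(8, 8, 3))
for _ in range(3):                       # recurrent refinement over time
    img = feedback_step(z, img)
```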
-
Publication Number: US20220172448A1
Publication Date: 2022-06-02
Application Number: US17651524
Filing Date: 2022-02-17
Applicant: Snap Inc.
Inventor: Travis Chen , Samuel Edward Hare , Yuncheng Li , Tony Mathew , Jonathan Solichin , Jianchao Yang , Ning Zhang
Abstract: Systems, devices, media, and methods are presented for object detection and inserting graphical elements into an image stream in response to detecting the object. The systems and methods detect an object of interest in received frames of a video stream. The systems and methods identify a bounding box for the object of interest and estimate a three-dimensional position of the object of interest based on a scale of the object of interest. The systems and methods generate one or more graphical elements having a size based on the scale of the object of interest and a position based on the three-dimensional position estimated for the object of interest. The one or more graphical elements are generated within the video stream to form a modified video stream. The systems and methods cause presentation of the modified video stream including the object of interest and the one or more graphical elements.
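The scale-to-depth step can be illustrated with a pinhole-camera model: if the object's real-world size is known (an assumption made here for illustration), the bounding-box height gives depth, and the box centre back-projects to a 3D position. The function names and parameters are hypothetical, not from the patent:

```python
def estimate_3d_position(bbox, real_height_m, focal_px, cx, cy):
    # bbox: (x_min, y_min, x_max, y_max) in pixels for the detected object
    x_min, y_min, x_max, y_max = bbox
    pixel_h = y_max - y_min
    depth = focal_px * real_height_m / pixel_h       # pinhole model: scale -> depth
    u = (x_min + x_max) / 2.0                        # bbox centre in the image
    v = (y_min + y_max) / 2.0
    # back-project the centre to 3D camera coordinates (X, Y, Z)
    return ((u - cx) * depth / focal_px, (v - cy) * depth / focal_px, depth)

def graphic_size(base_px, depth, ref_depth=1.0):
    # a graphical element sized to the object's scale shrinks with distance
    return base_px * ref_depth / depth
```

A 0.2 m-tall object that spans 100 px under a 500 px focal length sits at depth 1 m; a graphical element anchored to it is then scaled by the same depth ratio.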
-
Publication Number: US20210256736A1
Publication Date: 2021-08-19
Application Number: US17302361
Filing Date: 2021-04-30
Applicant: Snap Inc.
Inventor: Drake Austin Rehfeld , Rahul Bhupendra Sheth , Ning Zhang
Abstract: Disclosed are methods for encoding information in a graphic image. The information may be encoded so as to have a visual appearance that adopts a particular style, so that the encoded information is visually pleasing in the environment in which it is displayed. An encoder and decoder are trained during an integrated training process, where the encoder is tuned to minimize a loss when its encoded images are decoded. Similarly, the decoder is also trained to minimize loss when decoding the encoded images. Both the encoder and decoder may utilize a convolutional neural network in some aspects to analyze data and/or images. Once data is encoded, a style from a sample image is transferred to the encoded data. When decoding, the decoder may largely ignore the style aspects of the encoded data and decode based on a content portion of the data.
-
Publication Number: US20200258313A1
Publication Date: 2020-08-13
Application Number: US15929374
Filing Date: 2020-04-29
Applicant: Snap Inc.
Inventor: Travis Chen , Samuel Edward Hare , Yuncheng Li , Tony Mathew , Jonathan Solichin , Jianchao Yang , Ning Zhang
Abstract: Systems, devices, media, and methods are presented for object detection and inserting graphical elements into an image stream in response to detecting the object. The systems and methods detect an object of interest in received frames of a video stream. The systems and methods identify a bounding box for the object of interest and estimate a three-dimensional position of the object of interest based on a scale of the object of interest. The systems and methods generate one or more graphical elements having a size based on the scale of the object of interest and a position based on the three-dimensional position estimated for the object of interest. The one or more graphical elements are generated within the video stream to form a modified video stream. The systems and methods cause presentation of the modified video stream including the object of interest and the one or more graphical elements.
-
Publication Number: US10467274B1
Publication Date: 2019-11-05
Application Number: US15808617
Filing Date: 2017-11-09
Applicant: Snap Inc.
Inventor: Zhou Ren , Xiaoyu Wang , Ning Zhang , Xutao Lv , Jia Li
Abstract: An image captioning system and method is provided for generating a caption for an image. The image captioning system utilizes a policy network and a value network to generate the caption. The policy network serves as a local guidance and the value network serves as a global and lookahead guidance.
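A minimal sketch of combining the two guidance signals during decoding: each candidate word's local policy log-probability is mixed with a value estimate of the partial caption that word would lead to. The mixing weight `lam` and the linear score form are illustrative assumptions, not the patent's formulation:

```python
import numpy as np

def score_candidates(policy_logprobs, values, lam=0.4):
    # policy: local, per-word guidance; value: global, lookahead guidance
    return lam * policy_logprobs + (1.0 - lam) * values

def decode_step(policy_logprobs, values, beam=2):
    # keep the best-scoring candidate words for the next decoding step
    scores = score_candidates(policy_logprobs, values)
    return np.argsort(scores)[::-1][:beam]
```

The value term lets a word with a modest local probability survive the beam if it leads toward a globally better caption.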
-
Publication Number: US20240054687A1
Publication Date: 2024-02-15
Application Number: US18382729
Filing Date: 2023-10-23
Applicant: Snap Inc.
Inventor: Drake Austin Rehfeld , Rahul Bhupendra Sheth , Ning Zhang
Abstract: An example system includes an encoder configured to receive a bit string and encode the bit string into a visual representation, and a decoder configured to receive an image including the visual representation and decode the bit string from the visual representation. In some examples, the encoder and decoder are trained as a pair by obtaining a training bit string, encoding the training bit string into a training visual representation using the encoder, decoding the training visual representation using the decoder to generate a decoded bit string, determining an error between the training bit string and the decoded bit string, and updating parameters of the encoder and decoder to reduce the error.
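The training loop this abstract describes (encode, decode, measure error, update both networks) maps naturally onto a toy linear encoder/decoder pair. The sketch below, with made-up shapes and a plain MSE gradient step, mirrors that cycle; it is a sketch of the idea, not the patent's networks:

```python
import numpy as np

def train_step(params, bits, lr=0.1):
    W_enc, W_dec = params["enc"], params["dec"]
    img = bits @ W_enc                 # encode: bit string -> visual representation
    decoded = img @ W_dec              # decode the visual representation back
    err = decoded - bits               # error between training and decoded bits
    # gradient updates that reduce the mean squared error for both networks
    params["dec"] = W_dec - lr * img.T @ err / len(bits)
    params["enc"] = W_enc - lr * bits.T @ (err @ W_dec.T) / len(bits)
    return float(np.mean(err ** 2))

rng = np.random.default_rng(1)
params = {"enc": rng.normal(scale=0.1, size=(8, 16)),   # encoder weights
          "dec": rng.normal(scale=0.1, size=(16, 8))}   # decoder weights
bits = rng.integers(0, 2, size=(32, 8)).astype(float)   # training bit strings
losses = [train_step(params, bits) for _ in range(300)]
```

Training the pair jointly is the point: the encoder learns representations the decoder can recover, and vice versa.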
-
Publication Number: US11861854B2
Publication Date: 2024-01-02
Application Number: US17825994
Filing Date: 2022-05-26
Applicant: Snap Inc.
Inventor: Shenlong Wang , Linjie Luo , Ning Zhang , Jia Li
CPC classification number: G06T7/248 , G06T7/33 , G06V10/454 , G06V10/764 , G06V10/82 , G06T7/40 , G06T2207/20084
Abstract: Dense feature scale detection can be implemented using multiple convolutional neural networks trained on scale data to more accurately and efficiently match pixels between images. An input image can be used to generate multiple scaled images. The multiple scaled images are input into a feature net, which outputs feature data for the multiple scaled images. An attention net is used to generate an attention map from the input image. The attention map assigns emphasis as a soft distribution to different scales based on texture analysis. The feature data and the attention data can be combined through a multiplication process and then summed to generate dense features for comparison.
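The multiply-then-sum combination of per-scale features and attention weights can be sketched in a few lines; the shapes are assumptions for illustration:

```python
import numpy as np

def dense_features(scaled_feats, attention):
    # scaled_feats: (S, H, W, C) feature-net output, one map per image scale
    # attention:    (S, H, W)    soft emphasis over scales from the attention net
    weighted = scaled_feats * attention[..., None]   # multiply features by attention
    return weighted.sum(axis=0)                      # sum over scales -> (H, W, C)
```

With a uniform attention map this reduces to averaging over scales; the learned map instead emphasises the scale whose texture best matches each pixel.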
-
Publication Number: US20230419512A1
Publication Date: 2023-12-28
Application Number: US18367034
Filing Date: 2023-09-12
Applicant: Snap Inc.
Inventor: Shenlong Wang , Linjie Luo , Ning Zhang , Jia Li
IPC: G06T7/246 , G06T7/33 , G06V10/764 , G06V10/82 , G06V10/44
CPC classification number: G06T7/248 , G06T7/33 , G06V10/764 , G06V10/82 , G06V10/454 , G06T2207/20084 , G06T7/40
Abstract: Dense feature scale detection can be implemented using multiple convolutional neural networks trained on scale data to more accurately and efficiently match pixels between images. An input image can be used to generate multiple scaled images. The multiple scaled images are input into a feature net, which outputs feature data for the multiple scaled images. An attention net is used to generate an attention map from the input image. The attention map assigns emphasis as a soft distribution to different scales based on texture analysis. The feature data and the attention data can be combined through a multiplication process and then summed to generate dense features for comparison.