-
Publication No.: US10956793B1
Publication Date: 2021-03-23
Application No.: US16192419
Application Date: 2018-11-15
Applicant: Snap Inc.
Inventor: Xiaoyu Wang , Ning Xu , Ning Zhang , Vitor R. Carvalho , Jia Li
Abstract: Systems, methods, devices, media, and computer-readable instructions are described for local image tagging in a resource-constrained environment. One embodiment involves processing image data using a deep convolutional neural network (DCNN) comprising at least a first subgraph and a second subgraph, the first subgraph comprising at least a first layer and a second layer; processing the image data using at least the first layer of the first subgraph to generate first intermediate output data; processing, by the mobile device, the first intermediate output data using at least the second layer of the first subgraph to generate first subgraph output data; and, in response to a determination that each layer reliant on the first intermediate output data has completed processing, deleting the first intermediate output data from the mobile device. Additional embodiments involve convolving entire pixel resolutions of the image data against kernels in different layers of the DCNN.
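The memory-saving idea in this abstract can be sketched in a few lines: run the layers of a subgraph in order, and free the intermediate activation as soon as every layer that depends on it has finished. The layer arithmetic and names below are illustrative stand-ins, not the patented network.

```python
import numpy as np

def conv_layer(x, kernel):
    """Toy 'layer': scales the full-resolution input by the kernel's mass
    (a stand-in for a real convolution over the entire pixel resolution)."""
    return x * kernel.sum()

def run_first_subgraph(image):
    activations = {}
    k1 = np.ones((3, 3))
    k2 = np.full((3, 3), 0.5)

    # First layer of the first subgraph -> first intermediate output data.
    activations["intermediate"] = conv_layer(image, k1)

    # Second layer consumes the intermediate output -> subgraph output data.
    subgraph_output = conv_layer(activations["intermediate"], k2)

    # Every layer reliant on the intermediate data has completed processing,
    # so delete it to reclaim memory on the resource-constrained device.
    del activations["intermediate"]
    return subgraph_output, activations

out, remaining = run_first_subgraph(np.ones((4, 4)))
print(out.shape, "intermediate" in remaining)
```

In a real graph executor the deletion point comes from reference counting over the dependency graph rather than a hard-coded `del`.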
-
Publication No.: US20210073597A1
Publication Date: 2021-03-11
Application No.: US16949856
Application Date: 2020-11-17
Applicant: Snap Inc.
Inventor: Wei Han , Jianchao Yang , Ning Zhang , Jia Li
Abstract: Systems, devices, media, and methods are presented for identifying and categorically labeling objects within a set of images. The systems and methods receive an image depicting an object of interest, detect at least a portion of the object of interest within the image using a multilayer object model, determine context information, and identify the object of interest included in two or more bounding boxes.
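One way to read "identify the object of interest included in two or more bounding boxes" is to group overlapping detections and keep the most confident categorical label. The sketch below is an assumed interpretation using IoU grouping, with made-up boxes and scores rather than the patented multilayer object model.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def label_object(boxes, scores, labels, overlap=0.5):
    # Treat boxes with high mutual IoU as the same object of interest and
    # return the categorical label of the most confident detection.
    best = max(range(len(boxes)), key=lambda i: scores[i])
    group = [i for i in range(len(boxes)) if iou(boxes[i], boxes[best]) >= overlap]
    return labels[best], group

boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (200, 200, 240, 240)]
scores = [0.7, 0.9, 0.4]
labels = ["hat", "hat", "shoe"]
print(label_object(boxes, scores, labels))
```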
-
Publication No.: US10861170B1
Publication Date: 2020-12-08
Application No.: US16206684
Application Date: 2018-11-30
Applicant: Snap Inc.
Inventor: Yuncheng Li , Linjie Luo , Xuecheng Nie , Ning Zhang
Abstract: Systems, devices, media, and methods are presented for a human pose tracking framework. The human pose tracking framework may identify a message with video frames and generate, using a composite convolutional neural network, joint data representing joint locations of a human depicted in the video frames, with the joint data generated by a deep convolutional neural network operating on one portion of the video frames and a shallow convolutional neural network operating on another portion of the video frames, and may track the joint locations using a one-shot learner neural network that is trained to track the joint locations based on a concatenation of feature maps and a convolutional pose machine. The human pose tracking framework may store the joint locations and cause presentation of a rendition of the joint locations on a user interface of a client device.
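The deep/shallow split described above can be sketched as alternating a precise pass on some frames with a cheap local refinement on the others. Both "networks" below are toy stand-ins (a global argmax and a windowed argmax over a heatmap), assumed purely for illustration.

```python
import numpy as np

def deep_net(frame):
    # Precise joint estimate: global peak of a per-joint heatmap.
    return np.argwhere(frame == frame.max())[0]

def shallow_net(frame, prior):
    # Cheap refinement: search a small window around the prior location.
    r, c = int(prior[0]), int(prior[1])
    r0, c0 = max(r - 1, 0), max(c - 1, 0)
    window = frame[r0:r + 2, c0:c + 2]
    dr, dc = np.argwhere(window == window.max())[0]
    return np.array([r0 + dr, c0 + dc])

def track_joint(frames):
    joints, prior = [], None
    for i, frame in enumerate(frames):
        if i % 2 == 0:          # deep network on one portion of the frames
            prior = deep_net(frame)
        else:                   # shallow network on the other portion
            prior = shallow_net(frame, prior)
        joints.append(tuple(int(v) for v in prior))
    return joints

f0 = np.zeros((5, 5)); f0[2, 2] = 1.0
f1 = np.zeros((5, 5)); f1[2, 3] = 1.0
print(track_joint([f0, f1]))
```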
-
Publication No.: US10726059B1
Publication Date: 2020-07-28
Application No.: US16448900
Application Date: 2019-06-21
Applicant: Snap Inc.
Inventor: Zhou Ren , Xiaoyu Wang , Ning Zhang , Xutao Lv , Jia Li
Abstract: An image captioning system and method is provided for generating a caption for an image. The image captioning system utilizes a policy network and a value network to generate the caption. The policy network serves as a local guidance and the value network serves as a global and lookahead guidance.
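The policy-as-local-guidance, value-as-lookahead idea can be illustrated with a tiny decoder that at each step mixes the policy's next-word probability with a value score for the resulting partial caption. The probability tables and value function below are invented toy numbers, not the patented networks.

```python
POLICY = {  # P(next word | last word) -- local guidance
    "<s>": {"a": 0.6, "the": 0.4},
    "a": {"dog": 0.7, "cat": 0.3},
    "the": {"dog": 0.5, "cat": 0.5},
    "dog": {"runs": 0.8, "</s>": 0.2},
    "cat": {"sits": 0.8, "</s>": 0.2},
    "runs": {"</s>": 1.0},
    "sits": {"</s>": 1.0},
}

def value_net(partial):
    # Global, lookahead guidance: a toy value favoring captions with "dog".
    return 1.0 if "dog" in partial else 0.5

def decode(max_len=5, mix=0.5):
    caption, word = [], "<s>"
    for _ in range(max_len):
        choices = POLICY[word]
        # Pick the word maximizing a mix of local policy and global value.
        word = max(choices,
                   key=lambda w: mix * choices[w] + (1 - mix) * value_net(caption + [w]))
        if word == "</s>":
            break
        caption.append(word)
    return " ".join(caption)

print(decode())
```

Note how the value term steers the decoder toward "dog" even where the policy alone is indifferent, which is the point of pairing the two networks.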
-
Publication No.: US12198357B2
Publication Date: 2025-01-14
Application No.: US18367034
Application Date: 2023-09-12
Applicant: Snap Inc.
Inventor: Shenlong Wang , Linjie Luo , Ning Zhang , Jia Li
Abstract: Dense feature scale detection can be implemented using multiple convolutional neural networks trained on scale data to more accurately and efficiently match pixels between images. An input image can be used to generate multiple scaled images. The multiple scaled images are input into a feature net, which outputs feature data for the multiple scaled images. An attention net is used to generate an attention map from the input image. The attention map assigns emphasis as a soft distribution to different scales based on texture analysis. The feature data and the attention data can be combined through a multiplication process and then summed to generate dense features for comparison.
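The multiply-then-sum combine step reads naturally as a per-pixel softmax over scales applied to stacked feature maps. The shapes and values below are assumptions for illustration; the real feature and attention nets are convolutional networks.

```python
import numpy as np

def softmax(x, axis=0):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def combine(features, attention_logits):
    # features: (num_scales, H, W) per-scale feature maps from the feature net
    # attention_logits: (num_scales, H, W) raw scores from the attention net
    attn = softmax(attention_logits, axis=0)  # soft distribution over scales
    return (features * attn).sum(axis=0)      # dense features for matching

features = np.stack([np.full((2, 2), s) for s in (1.0, 2.0, 4.0)])
logits = np.zeros((3, 2, 2))                  # uniform attention over scales
dense = combine(features, logits)
print(dense)
```

With uniform logits the result is simply the per-pixel mean across scales; a trained attention net would instead emphasize the scale matching the local texture.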
-
Publication No.: US12165335B2
Publication Date: 2024-12-10
Application No.: US18460335
Application Date: 2023-09-01
Applicant: Snap Inc.
Inventor: Yuncheng Li , Linjie Luo , Xuecheng Nie , Ning Zhang
IPC: G06T7/246 , G06T7/73 , G06V10/764 , G06V10/82 , G06V20/40 , G06V40/20 , G06F3/04817 , H04L51/04 , H04L67/01
Abstract: Systems, devices, media, and methods are presented for a human pose tracking framework. The human pose tracking framework may identify a message with video frames and generate, using a composite convolutional neural network, joint data representing joint locations of a human depicted in the video frames, with the joint data generated by a deep convolutional neural network operating on one portion of the video frames and a shallow convolutional neural network operating on another portion of the video frames, and may track the joint locations using a one-shot learner neural network that is trained to track the joint locations based on a concatenation of feature maps and a convolutional pose machine. The human pose tracking framework may store the joint locations and cause presentation of a rendition of the joint locations on a user interface of a client device.
-
Publication No.: US12056454B2
Publication Date: 2024-08-06
Application No.: US18201075
Application Date: 2023-05-23
Applicant: Snap Inc.
Inventor: Di Lu , Leonardo Ribas Machado das Neves , Vitor Rocha de Carvalho , Ning Zhang
IPC: G06F40/295 , G06F40/30 , G06N3/08 , G06N20/00
CPC classification number: G06F40/295 , G06F40/30 , G06N3/08 , G06N20/00
Abstract: A caption of a multimodal message (e.g., social media post) can be identified as a named entity using an entity recognition system. The entity recognition system can use a visual attention based mechanism to generate a visual context representation from an image and caption. The system can use the visual context representation to identify one or more terms of the caption as a named entity.
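A minimal sketch of the visual-attention step: build a context vector as an attention-weighted sum of image region features, then score each caption token against that context. All feature vectors and the sigmoid scorer here are invented toy values, not the patented model.

```python
import numpy as np

def visual_context(regions, query):
    # regions: (R, D) image region features; query: (D,) caption summary.
    logits = regions @ query
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()                 # attention over regions
    return weights @ regions                 # visual context representation

def tag_entities(token_feats, regions, query, threshold=0.5):
    ctx = visual_context(regions, query)
    scores = 1.0 / (1.0 + np.exp(-(token_feats @ ctx)))  # sigmoid scores
    # Tokens scoring above the threshold are labeled as named entities.
    return [bool(s > threshold) for s in scores]

toks = np.array([[3.0, 0.0], [-3.0, 0.0]])   # two caption token features
regs = np.array([[1.0, 0.0], [0.0, 1.0]])    # two image region features
q = np.array([2.0, 0.0])                     # caption summary vector
print(tag_entities(toks, regs, q))
```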
-
Publication No.: US12056442B2
Publication Date: 2024-08-06
Application No.: US17811822
Application Date: 2022-07-11
Applicant: Snap Inc.
Inventor: Rahul Sheth , Kevin Dechau Tang , Ning Zhang
IPC: G06F40/169 , G06F3/0482 , G06F3/04842 , G06T11/60 , H04W4/80
CPC classification number: G06F40/169 , G06F3/0482 , G06F3/04842 , G06T11/60 , H04W4/80 , G06T2200/24
Abstract: A system according to various exemplary embodiments includes a processor and a user interface, communication module, and memory coupled to the processor. The memory stores instructions that, when executed by the processor, cause the system to: retrieve a digital image from a server using the communication module; present the digital image on a display of the user interface; receive edits to the digital image via the user interface; generate, based on the edits, a modified digital image, wherein generating the modified digital image includes transforming a format of the digital image to include a field containing an identifier associated with the modified digital image; and transmit the modified digital image to the server using the communication module.
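The "transform the format to include an identifier field" step can be sketched as wrapping the edited payload in a container that carries an identifier derived from the modified image. The field names and hash-based identifier below are assumptions for illustration, not the claimed format.

```python
import hashlib
import json

def apply_edits(image_bytes, edits):
    # Stand-in edit step: append an edit log; a real client would re-render pixels.
    return image_bytes + json.dumps(edits).encode()

def to_modified_format(image_bytes, edits):
    modified = apply_edits(image_bytes, edits)
    return {
        # Field containing an identifier associated with the modified image.
        "identifier": hashlib.sha256(modified).hexdigest(),
        "payload": modified,
    }

msg = to_modified_format(b"rawpixels", [{"op": "crop"}])
print(sorted(msg), len(msg["identifier"]))
```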
-
Publication No.: US20240037141A1
Publication Date: 2024-02-01
Application No.: US18378376
Application Date: 2023-10-10
Applicant: Snap Inc.
Inventor: Xiaoyu Wang , Ning Xu , Ning Zhang , Vitor Rocha de Carvalho , Jia Li
IPC: G06F16/58 , G06T1/00 , G06N3/08 , G06F16/9038 , G06N3/04 , G06F18/24 , G06N3/045 , H04N23/63 , G06V10/764 , G06V10/82 , G06V10/75
CPC classification number: G06F16/5866 , G06T1/0007 , G06N3/08 , G06F16/9038 , G06N3/04 , G06F18/24 , G06N3/045 , H04N23/63 , G06V10/764 , G06V10/82 , G06V10/751 , G06N5/022
Abstract: Systems, methods, devices, media, and computer-readable instructions are described for local image tagging in a resource-constrained environment. One embodiment involves processing image data using a deep convolutional neural network (DCNN) comprising at least a first subgraph and a second subgraph, the first subgraph comprising at least a first layer and a second layer; processing the image data using at least the first layer of the first subgraph to generate first intermediate output data; processing, by the mobile device, the first intermediate output data using at least the second layer of the first subgraph to generate first subgraph output data; and, in response to a determination that each layer reliant on the first intermediate output data has completed processing, deleting the first intermediate output data from the mobile device. Additional embodiments involve convolving entire pixel resolutions of the image data against kernels in different layers of the DCNN.
-
Publication No.: US11830209B2
Publication Date: 2023-11-28
Application No.: US17651524
Application Date: 2022-02-17
Applicant: Snap Inc.
Inventor: Travis Chen , Samuel Edward Hare , Yuncheng Li , Tony Mathew , Jonathan Solichin , Jianchao Yang , Ning Zhang
IPC: G06T7/50 , G06T19/20 , G06T7/20 , G06T19/00 , G06T7/73 , G06V10/20 , G06V20/20 , G06V20/40 , G06V10/764 , G06V20/64
CPC classification number: G06T7/50 , G06T7/20 , G06T7/73 , G06T19/006 , G06T19/20 , G06V10/255 , G06V10/764 , G06V20/20 , G06V20/40 , G06V20/64 , G06T2207/10016 , G06T2207/20084 , G06T2210/12 , G06T2219/2016
Abstract: Systems, devices, media, and methods are presented for object detection and inserting graphical elements into an image stream in response to detecting the object. The systems and methods detect an object of interest in received frames of a video stream. The systems and methods identify a bounding box for the object of interest and estimate a three-dimensional position of the object of interest based on a scale of the object of interest. The systems and methods generate one or more graphical elements having a size based on the scale of the object of interest and a position based on the three-dimensional position estimated for the object of interest. The one or more graphical elements are generated within the video stream to form a modified video stream. The systems and methods cause presentation of the modified video stream including the object of interest and the one or more graphical elements.
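The scale-to-depth estimate in this abstract follows from a pinhole camera model: apparent size falls off inversely with distance, so a bounding-box height in pixels yields a depth, which in turn sizes and anchors a graphical element. The focal length, assumed real-world object height, and relative element size below are all invented example numbers.

```python
def estimate_depth(box_height_px, real_height_m=0.25, focal_px=800.0):
    # Pinhole model: box_height_px = focal_px * real_height_m / depth
    return focal_px * real_height_m / box_height_px

def place_element(box, real_height_m=0.25, focal_px=800.0, rel_size=0.5):
    x1, y1, x2, y2 = box
    depth = estimate_depth(y2 - y1, real_height_m, focal_px)
    center = ((x1 + x2) / 2.0, (y1 + y2) / 2.0)
    # Graphical element scaled relative to the object and anchored at it.
    size_px = rel_size * (y2 - y1)
    return {"depth_m": depth, "anchor": center, "size_px": size_px}

print(place_element((100, 100, 200, 300)))
```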
-