-
Publication Number: US20250054322A1
Publication Date: 2025-02-13
Application Number: US18787616
Filing Date: 2024-07-29
Applicant: Google LLC
Inventor: Keren Ye , Yicheng Zhu , Junjie Ke , Jiahui Yu , Leonidas John Guibas , Peyman Milanfar , Feng Yang
IPC: G06V20/70 , G06F40/279
Abstract: Systems and methods for attribute recognition can include obtaining an image and a text string. The text string can be processed with a language model to generate a set of candidate attributes based on sequence-based prediction. The image and the candidate attributes can be processed with an image-text model to determine, for each candidate attribute, a likelihood that the attribute is depicted in the image. The likelihood determination can then be utilized to determine a predicted attribute for an object of interest.
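The selection step this abstract describes (score each candidate attribute against the image, keep the most likely one) can be sketched as follows. The language model and image-text model are stubbed out with a hypothetical score table; only the likelihood-based selection logic is illustrated, not the patented models themselves.

```python
def predict_attribute(candidates, likelihood_fn):
    """Pick the candidate attribute that the image-text model scores
    as most likely to be depicted in the image."""
    scores = {attr: likelihood_fn(attr) for attr in candidates}
    return max(scores, key=scores.get)

# Hypothetical likelihoods an image-text model might assign to
# color attributes for a photo of a red car.
mock_scores = {"red": 0.91, "blue": 0.04, "green": 0.05}
predicted = predict_attribute(mock_scores.keys(), mock_scores.get)
```

In practice `likelihood_fn` would wrap a forward pass of the image-text model on the (image, attribute) pair; the dictionary here merely stands in for those scores.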
-
Publication Number: US12217382B2
Publication Date: 2025-02-04
Application Number: US18527528
Filing Date: 2023-12-04
Applicant: Google LLC
Inventor: Junjie Ke , Feng Yang , Qifei Wang , Yilin Wang , Peyman Milanfar
Abstract: The technology employs a patch-based multi-scale Transformer (300) that is usable with various imaging applications. This avoids constraints of a fixed image input size and predicts quality effectively on a native-resolution image. A native-resolution image (304) is transformed into a multi-scale representation (302), enabling the Transformer's self-attention mechanism to capture information on both fine-grained detailed patches and coarse-grained global patches. Spatial embedding (316) is employed to map patch positions to a fixed grid, in which patch locations at each scale are hashed to the same grid. A separate scale embedding (318) is employed to distinguish patches coming from different scales in the multi-scale representation. Self-attention (508) is performed to create a final image representation. In some instances, prior to performing self-attention, the system may prepend a learnable classification token (322) to the set of input tokens.
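The spatial-embedding hashing described above (patch positions at every scale mapped onto one fixed grid) can be sketched as below. The 10×10 grid size and the floor-based bucketing are assumptions made for illustration; the patent does not fix these choices.

```python
import math

def hash_to_grid(row, col, num_rows, num_cols, grid_size=10):
    """Map a patch position from an arbitrary-resolution scale onto a
    fixed grid_size x grid_size grid, so that patches from different
    scales can share the same table of spatial embeddings."""
    g_row = min(grid_size - 1, math.floor(row * grid_size / num_rows))
    g_col = min(grid_size - 1, math.floor(col * grid_size / num_cols))
    return g_row, g_col
```

A fine scale with a 20×20 patch layout and a coarse scale with a 5×5 layout both index into the same 10×10 embedding grid, which is what lets one embedding table serve every scale of the multi-scale representation.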
-
Publication Number: US20250022269A1
Publication Date: 2025-01-16
Application Number: US18902546
Filing Date: 2024-09-30
Applicant: Google LLC
Inventor: Yinxiao Li , Feng Yang , Peyman Milanfar , Han Zhang , Zhengzhong Tu , Hossein Talebi
Abstract: Provided is an efficient and scalable attention model that can be referred to as multi-axis attention. Example implementations can include two aspects: blocked local and dilated global attention. These design choices allow global-local spatial interactions on arbitrary input resolutions with only linear complexity. The present disclosure also presents a new architectural element by effectively blending the proposed multi-axis attention model with convolutions. In addition, the present disclosure proposes a simple hierarchical vision backbone, example implementations of which can be referred to as MaxViT, by simply repeating the basic building block over multiple stages. Notably, MaxViT is able to “see” globally throughout the entire network, even in earlier, high-resolution stages.
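The two partition schemes named in this abstract (blocked local attention and dilated global attention) can be sketched with NumPy reshapes. The 8×8 feature map and the block/grid size of 4 are illustrative only; attention itself is omitted, since the point is how tokens are grouped before it runs.

```python
import numpy as np

def block_partition(x, b):
    """Group an (H, W, C) feature map into non-overlapping b x b local
    windows: each group is a contiguous block (local attention)."""
    H, W, C = x.shape
    return (x.reshape(H // b, b, W // b, b, C)
             .transpose(0, 2, 1, 3, 4)
             .reshape(-1, b * b, C))

def grid_partition(x, g):
    """Group an (H, W, C) feature map into a g x g uniform grid: each
    group gathers pixels strided H//g apart, spanning the whole map
    (dilated global attention)."""
    H, W, C = x.shape
    return (x.reshape(g, H // g, g, W // g, C)
             .transpose(1, 3, 0, 2, 4)
             .reshape(-1, g * g, C))
```

Both functions produce groups of b*b (or g*g) tokens of linear total size, which is why the combined scheme allows global-local interaction at linear complexity in the number of pixels.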
-
Publication Number: US12198229B2
Publication Date: 2025-01-14
Application Number: US17782727
Filing Date: 2020-01-08
Applicant: GOOGLE LLC
Inventor: Xiyang Luo , Innfarn Yoo , Feng Yang
Abstract: Example embodiments allow for training of encoders (e.g., artificial neural networks (ANNs)) to generate a color palette based on an input image. The color palette can then be used to generate, using the input image, a quantized, reduced color depth image that corresponds to the input image. Differences between a plurality of such input images and corresponding quantized images are used to train the encoder. Encoders trained in this manner are especially suited for generating color palettes used to convert images into different reduced color depth image file formats. Such an encoder also has benefits, with respect to memory use and computational time or cost, relative to the median-cut algorithm or other methods for producing reduced color depth color palettes for images.
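Once the encoder has produced a color palette, the quantization step described here reduces to mapping each pixel to its nearest palette entry. Nearest-neighbor matching in RGB space is an assumption for illustration; the palette below is arbitrary rather than encoder-generated.

```python
import numpy as np

def quantize(image, palette):
    """Produce a reduced-color-depth image by replacing every pixel
    with its nearest palette color (Euclidean distance in RGB)."""
    flat = image.reshape(-1, 3).astype(float)                        # (N, 3)
    dists = np.linalg.norm(flat[:, None] - palette[None], axis=-1)   # (N, K)
    nearest = dists.argmin(axis=1)                                   # (N,)
    return palette[nearest].reshape(image.shape)
```

During training, the difference between `image` and `quantize(image, palette)` is the kind of reconstruction error the abstract says is used to update the encoder.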
-
Publication Number: US20240119555A1
Publication Date: 2024-04-11
Application Number: US18527528
Filing Date: 2023-12-04
Applicant: Google LLC
Inventor: Junjie Ke , Feng Yang , Qifei Wang , Yilin Wang , Peyman Milanfar
CPC classification number: G06T3/0012 , G06T3/40 , G06T7/0002 , G06T2207/20016 , G06T2207/20081 , G06T2207/30168
Abstract: The technology employs a patch-based multi-scale Transformer (300) that is usable with various imaging applications. This avoids constraints of a fixed image input size and predicts quality effectively on a native-resolution image. A native-resolution image (304) is transformed into a multi-scale representation (302), enabling the Transformer's self-attention mechanism to capture information on both fine-grained detailed patches and coarse-grained global patches. Spatial embedding (316) is employed to map patch positions to a fixed grid, in which patch locations at each scale are hashed to the same grid. A separate scale embedding (318) is employed to distinguish patches coming from different scales in the multi-scale representation. Self-attention (508) is performed to create a final image representation. In some instances, prior to performing self-attention, the system may prepend a learnable classification token (322) to the set of input tokens.
-
Publication Number: US20230325959A1
Publication Date: 2023-10-12
Application Number: US17926213
Filing Date: 2021-06-21
Applicant: Google LLC
Inventor: Dake He , Tianhao Zhang , Elnaz Barshan Tashnizi , Xiyang Luo , Huiwen Chang , Feng Yang , Ryan Matthew Haggarty
IPC: G06T1/00 , G06T3/40 , G06T5/20 , G06V10/764
CPC classification number: G06T1/0021 , G06T3/40 , G06T5/20 , G06V10/764 , G06T2201/0065 , G06T2207/20081
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for detecting and decoding a visually imperceptible or perceptible watermark. A watermark detection apparatus determines whether a particular image includes a visually imperceptible or perceptible watermark using a detector machine learning model. If the watermark detection apparatus detects a watermark, the particular image is routed to a watermark decoder. If the watermark detection apparatus cannot detect a watermark in the particular image, the particular image is filtered from further processing. The watermark decoder decodes the visually imperceptible or perceptible watermark detected in the particular image. After decoding, an item depicted in the particular image is validated based on data extracted from the decoded visually imperceptible or perceptible watermark.
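The routing logic this abstract describes (detect, then either decode or filter out) is a simple pipeline. In this sketch the detector and decoder are passed in as plain callables; the actual apparatus uses a machine learning model for detection, which is not reproduced here.

```python
def process_image(image, detector, decoder):
    """Route an image through the watermark pipeline: images with no
    detected watermark are filtered from further processing (None);
    otherwise the image is handed to the decoder for data extraction."""
    if not detector(image):
        return None  # filtered from further processing
    return decoder(image)

# Hypothetical stand-ins: an "image" is a dict, the detector checks for
# a watermark field, and the decoder reads the embedded payload.
detector = lambda img: "wm" in img
decoder = lambda img: img["wm"]
```

The filtering step matters for throughput: only images that pass the (cheap) detector ever reach the (more expensive) decoder.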
-
Publication Number: US20230111326A1
Publication Date: 2023-04-13
Application Number: US17792062
Filing Date: 2020-01-13
Applicant: GOOGLE LLC
Inventor: Ruohan Zhan , Feng Yang , Xiyang Luo , Peyman Milanfar , Huiwen Chang , Ce Liu
Abstract: Methods, systems, and computer programs encoded on a computer storage medium, that relate to extracting digital watermarks from images, irrespective of distortions introduced into these images. Methods can include inputting a first data item into a channel encoder that can generate a first encoded data item that is greater in length than the first data item and that (1) includes the input data item and (2) new data that is redundant of the input data item. Based on the first encoded data item and a first image, an encoder model can generate a first encoded image into which the first encoded data item is embedded as a digital watermark. A decoder model can decode the first encoded data item to generate a second data item, which can be decoded by the channel decoder to generate data that is predicted to be the first data item.
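A repetition code is the simplest instance of the channel-coding idea described here: the encoded data item is longer than the input and carries redundant copies of it, so the channel decoder can recover the original bits even after distortion corrupts some of them. The patent does not specify a repetition code; this is an illustrative stand-in for whatever channel code is actually used.

```python
def channel_encode(bits, r=3):
    """Repeat each bit r times: the output is longer than the input and
    consists of the input plus redundant copies of it."""
    return [b for bit in bits for b in [bit] * r]

def channel_decode(coded, r=3):
    """Majority-vote each group of r received bits to predict the
    original data, tolerating up to r//2 flipped bits per group."""
    return [int(sum(coded[i:i + r]) > r // 2)
            for i in range(0, len(coded), r)]
```

In the full pipeline, `channel_encode` runs before the watermark is embedded into the image and `channel_decode` runs after the decoder model extracts the (possibly distorted) watermark bits.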
-
Publication Number: US20220415039A1
Publication Date: 2022-12-29
Application Number: US17762289
Filing Date: 2019-11-26
Applicant: Google LLC
Inventor: Yilin Wang , Hossein Talebi , Peyman Milanfar , Feng Yang , Balineedu Adsumilli
Abstract: A trained model is retrained for video quality assessment and used to identify sets of adaptive compression parameters for transcoding user generated video content. Using transfer learning, the model, which is initially trained for image object detection, is retrained for technical content assessment and then again retrained for video quality assessment. The model is then deployed into a transcoding pipeline and used for transcoding an input video stream of user generated content. The transcoding pipeline may be structured in one of several ways. In one example, a secondary pathway for video content analysis using the model is introduced into the pipeline, which does not interfere with the ultimate output of the transcoding should there be a network or other issue. In another example, the model is introduced as a library within the existing pipeline, which would maintain a single pathway, but ultimately is not expected to introduce significant latency.
-
Publication Number: US12190403B2
Publication Date: 2025-01-07
Application Number: US17792062
Filing Date: 2020-01-13
Applicant: GOOGLE LLC
Inventor: Ruohan Zhan , Feng Yang , Xiyang Luo , Peyman Milanfar , Huiwen Chang , Ce Liu
Abstract: Methods, systems, and computer programs encoded on a computer storage medium, that relate to extracting digital watermarks from images, irrespective of distortions introduced into these images. Methods can include inputting a first data item into a channel encoder that can generate a first encoded data item that is greater in length than the first data item and that (1) includes the input data item and (2) new data that is redundant of the input data item. Based on the first encoded data item and a first image, an encoder model can generate a first encoded image into which the first encoded data item is embedded as a digital watermark. A decoder model can decode the first encoded data item to generate a second data item, which can be decoded by the channel decoder to generate data that is predicted to be the first data item.
-
Publication Number: US20240346546A1
Publication Date: 2024-10-17
Application Number: US18584716
Filing Date: 2024-02-22
Applicant: Google LLC
Inventor: Catherine Shyu , Luying Li , Feng Yang , Junjie Ke , Xiyang Luo , Hao Feng , Chao-Hung Chen , Wenjing Kang , Zheng Xia , Shun-Chuan Chen , Yicong Tian , Xia Li , Han Ke
IPC: G06Q30/0242
CPC classification number: G06Q30/0244
Abstract: Systems, devices, methods, and computer readable medium for evaluating visual quality of digital content are disclosed. Methods can include identifying content assets including one or more images that are combined to create different digital components distributed to one or more client devices. A quality of each of the one or more images is evaluated using one or more machine learning models trained to evaluate one or more visual aspects that are deemed indicative of visual quality. An aggregate quality for the content assets is determined based, at least in part, on an output of the one or more machine learning models indicating the visual quality of each of the one or more images. A graphical user interface of a first computing device is updated to present a visual indication of the aggregate quality of the content assets.