-
Publication No.: US12010440B2
Publication date: 2024-06-11
Application No.: US18184179
Filing date: 2023-03-15
Applicant: Google LLC
Inventor: Yi Hung Chen , Chia-Kai Liang , Bartlomiej Maciej Wronski , Peyman Milanfar , Ignacio Garcia Dorado
IPC: H04N23/951 , G06T3/4069 , H04N23/68
CPC classification number: H04N23/951 , G06T3/4069 , H04N23/6811 , H04N23/6812 , H04N23/685 , H04N23/687
Abstract: The present disclosure describes systems and techniques directed to optical image stabilization movement to create a super-resolution image of a scene. The systems and techniques include a user device (102) introducing (502), through an optical image stabilization system (114), movement to one or more components of a camera system (112) of the user device (102). The user device (102) then captures (504) respective and multiple frames (306) of an image of a scene, where the respective and multiple frames (306) of the image of the scene have respective, sub-pixel offsets of the image of the scene across the multiple frames (306) as a result of the introduced movement to the one or more components of the camera system (112). The user device (102) performs (506), based on the respective, sub-pixel offsets of the image of the scene across the respective, multiple frames (306), super-resolution computations and creates (508) the super-resolution image of the scene based on the super-resolution computations.
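Illustrative sketch (not taken from the patent): one simple way to use sub-pixel offsets across frames is shift-and-add, where each low-resolution frame is placed on a finer grid at its known offset. The function name `shift_and_add`, the integer offsets in high-resolution pixels, and the assumption that offsets are already known are all hypothetical simplifications; the super-resolution computations in the disclosure may differ.

```python
def shift_and_add(frames, offsets, scale):
    """Merge low-res frames onto a grid `scale` times finer.

    frames:  list of 2-D lists (rows x cols), all the same size
    offsets: per-frame (dy, dx) shifts in high-res pixels, 0 <= d < scale
             (assumed known here; a real pipeline would have to estimate them)
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    acc = [[0.0] * (cols * scale) for _ in range(rows * scale)]
    cnt = [[0] * (cols * scale) for _ in range(rows * scale)]
    for frame, (dy, dx) in zip(frames, offsets):
        for r in range(rows):
            for c in range(cols):
                # Each low-res sample lands on one cell of the fine grid.
                acc[r * scale + dy][c * scale + dx] += frame[r][c]
                cnt[r * scale + dy][c * scale + dx] += 1
    # Average where samples landed; None marks cells no frame covered.
    return [[acc[y][x] / cnt[y][x] if cnt[y][x] else None
             for x in range(cols * scale)]
            for y in range(rows * scale)]
```

With four frames offset by (0, 0), (0, 1), (1, 0) and (1, 1) at `scale=2`, every cell of the fine grid receives exactly one sample.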
-
Publication No.: US20240119555A1
Publication date: 2024-04-11
Application No.: US18527528
Filing date: 2023-12-04
Applicant: Google LLC
Inventor: Junjie Ke , Feng Yang , Qifei Wang , Yilin Wang , Peyman Milanfar
CPC classification number: G06T3/0012 , G06T3/40 , G06T7/0002 , G06T2207/20016 , G06T2207/20081 , G06T2207/30168
Abstract: The technology employs a patch-based multi-scale Transformer (300) that is usable with various imaging applications. This avoids constraints on image fixed input size and predicts the quality effectively on a native resolution image. A native resolution image (304) is transformed into a multi-scale representation (302), enabling the Transformer's self-attention mechanism to capture information on both fine-grained detailed patches and coarse-grained global patches. Spatial embedding (316) is employed to map patch positions to a fixed grid, in which patch locations at each scale are hashed to the same grid. A separate scale embedding (318) is employed to distinguish patches coming from different scales in the multiscale representation. Self-attention (508) is performed to create a final image representation. In some instances, prior to performing self-attention, the system may prepend a learnable classification token (322) to the set of input tokens.
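Illustrative sketch (not taken from the patent): the abstract says patch locations at each scale are hashed to the same fixed grid. One plausible reading, assumed here, is a proportional mapping from a patch's row and column to a cell of a `grid_size` x `grid_size` grid, so patch grids of different sizes share one spatial embedding table. The name `hash_to_grid` and the exact formula are hypothetical.

```python
def hash_to_grid(row, col, n_rows, n_cols, grid_size):
    """Map patch (row, col) in an n_rows x n_cols patch grid to a cell
    of a fixed grid_size x grid_size grid (assumed proportional mapping)."""
    t_r = row * grid_size // n_rows
    t_c = col * grid_size // n_cols
    return t_r, t_c
```

Under this assumption, the last patch of a 20 x 30 patch grid and the last patch of a 5 x 5 patch grid both land near the bottom-right corner of a 10 x 10 grid; a separate per-scale embedding would then tell the two scales apart.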
-
Publication No.: US20230111326A1
Publication date: 2023-04-13
Application No.: US17792062
Filing date: 2020-01-13
Applicant: GOOGLE LLC
Inventor: Ruohan Zhan , Feng Yang , Xiyang Luo , Peyman Milanfar , Huiwen Chang , Ce Liu
Abstract: Methods, systems, and computer programs encoded on a computer storage medium, that relate to extracting digital watermarks from images, irrespective of distortions introduced into these images. Methods can include inputting a first data item into a channel encoder that can generate a first encoded data item that is greater in length than the first data item and that includes (1) the input data item and (2) new data that is redundant of the input data item. Based on the first encoded data item and a first image, an encoder model can generate a first encoded image into which the first encoded data is embedded as a digital watermark. A decoder model can decode the first encoded data item to generate a second data, which can be decoded by the channel decoder to generate data that is predicted to be the first data.
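Illustrative sketch (not taken from the patent): a channel encoder whose output contains the input plus redundant data can be illustrated with a systematic repetition code and majority-vote decoding. This is a deliberately simple stand-in; the names `channel_encode` and `channel_decode` and the choice of a repetition code are assumptions, and the channel code in the disclosure may be quite different.

```python
def channel_encode(bits, copies=3):
    """Return the original bits followed by (copies - 1) redundant repeats."""
    return list(bits) * copies


def channel_decode(received, n_bits, copies=3):
    """Recover each bit by majority vote over its `copies` occurrences."""
    out = []
    for i in range(n_bits):
        votes = sum(received[i + k * n_bits] for k in range(copies))
        out.append(1 if 2 * votes > copies else 0)
    return out
```

With three copies, any single flipped bit in the encoded item is outvoted by the two intact copies, which is the sense in which redundancy makes the watermark payload robust to distortion.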
-
Publication No.: US20220415039A1
Publication date: 2022-12-29
Application No.: US17762289
Filing date: 2019-11-26
Applicant: Google LLC
Inventor: Yilin Wang , Hossein Talebi , Peyman Milanfar , Feng Yang , Balineedu Adsumilli
Abstract: A trained model is retrained for video quality assessment and used to identify sets of adaptive compression parameters for transcoding user generated video content. Using transfer learning, the model, which is initially trained for image object detection, is retrained for technical content assessment and then again retrained for video quality assessment. The model is then deployed into a transcoding pipeline and used for transcoding an input video stream of user generated content. The transcoding pipeline may be structured in one of several ways. In one example, a secondary pathway for video content analysis using the model is introduced into the pipeline, which does not interfere with the ultimate output of the transcoding should there be a network or other issue. In another example, the model is introduced as a library within the existing pipeline, which would maintain a single pathway, but ultimately is not expected to introduce significant latency.
-
Publication No.: US10929952B2
Publication date: 2021-02-23
Application No.: US15970393
Filing date: 2018-05-03
Applicant: Google LLC
Inventor: Peyman Milanfar , Yaniv Romano
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for upscaling an image. One of the methods includes upscaling a low resolution image, creating first pixel subsets of the first upscaled image, creating second pixel subsets of a high resolution image, determining, for each subset in the pixel subsets, a value of a property of the pixel subset, determining, for each subset in the pixel subsets, a group of subsets to which the corresponding pixel subset belongs using the value of the property, and determining, for each of the groups of subsets, a filter to apply to each of the first pixel subsets that correspond to the pixel subsets in the group to create a final pixel subset that approximates the corresponding second pixel subset using the first pixel subset, a combination of all of the final pixel subsets representing a second upscaled image.
-
Publication No.: US12230024B2
Publication date: 2025-02-18
Application No.: US17762289
Filing date: 2019-11-26
Applicant: Google LLC
Inventor: Yilin Wang , Hossein Talebi , Peyman Milanfar , Feng Yang , Balineedu Adsumilli
Abstract: A trained model is retrained for video quality assessment and used to identify sets of adaptive compression parameters for transcoding user generated video content. Using transfer learning, the model, which is initially trained for image object detection, is retrained for technical content assessment and then again retrained for video quality assessment. The model is then deployed into a transcoding pipeline and used for transcoding an input video stream of user generated content. The transcoding pipeline may be structured in one of several ways. In one example, a secondary pathway for video content analysis using the model is introduced into the pipeline, which does not interfere with the ultimate output of the transcoding should there be a network or other issue. In another example, the model is introduced as a library within the existing pipeline, which would maintain a single pathway, but ultimately is not expected to introduce significant latency.
-
Publication No.: US20240187715A1
Publication date: 2024-06-06
Application No.: US18546670
Filing date: 2021-05-19
Applicant: Google LLC
Inventor: Ignacio Garcia Dorado , Shambhavi Punja , Peyman Milanfar , Kiran Murthy , Janne Kontkanen , Isaac Reynolds , Damien Kelly , Alexander Schiffhauer
Abstract: An example embodiment may involve capturing a sequence of images, wherein there are 4 or more images in the sequence of images, and wherein each of the sequence of images has an exposure length of 4-100 seconds; applying a sliding window over the sequence of images as downsampled, wherein at least 4 images are encompassed within the sliding window, and wherein for each position of the sliding window the applying involves: (i) aligning a set of images within the sliding window, and (ii) merging the set of images as aligned into a video frame; combining video frames generated by way of the sliding window into a video file; and storing, by the mobile device, the video file in memory of the mobile device.
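Illustrative sketch (not taken from the patent): the sliding-window step can be pictured as producing one merged video frame per window position. Here images are flat lists of pixel values, the alignment step is assumed to have been done already, and merging is a plain average; the name `sliding_window_frames` and the averaging merge are hypothetical simplifications.

```python
def sliding_window_frames(images, window=4):
    """Merge each run of `window` consecutive, pre-aligned images into
    one video frame by per-pixel averaging."""
    frames = []
    for start in range(len(images) - window + 1):
        group = images[start:start + window]
        # zip(*group) yields, for each pixel position, its value in every image.
        merged = [sum(px) / window for px in zip(*group)]
        frames.append(merged)
    return frames
```

A sequence of N images yields N - window + 1 video frames, so five captures with a window of four give two frames.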
-
Publication No.: US20240169498A1
Publication date: 2024-05-23
Application No.: US18550997
Filing date: 2021-07-22
Applicant: Google LLC
Inventor: Fuhao Shi , Mauricio Delbracio , Chia-Kai Liang , Damien Martin Kelly , Peyman Milanfar
CPC classification number: G06T5/73 , H04N23/6812 , G06T2207/20201
Abstract: Systems and methods for real-time image deblur and stabilization can utilize sensor data for estimating motion blur without the high computational cost of image analysis techniques. The estimated motion blur can then be utilized to generate a motion blur kernel for image correction. The systems and methods can further refine the correction by processing the motion blur kernel with a polynomial filter to generate a sharpening kernel. The systems and methods can provide for real-time correction even with minimal to no stabilization masking.
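Illustrative sketch (not taken from the patent): one way to turn a blur kernel into a sharpening kernel with a polynomial is to truncate the series 1/x = 1 + (1 - x) + (1 - x)^2 + ..., which gives the polynomial 3 - 3x + x^2. Applied to a 1-D kernel k, with products read as convolutions, this yields 3*delta - 3*k + k*k. The specific polynomial, the 1-D setting, and the names `convolve` and `sharpening_kernel` are assumptions; the filter in the disclosure may use different coefficients.

```python
def convolve(a, b):
    """Full discrete convolution of two 1-D sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def sharpening_kernel(blur):
    """Approximate inverse of an odd-length, sum-to-one 1-D blur kernel
    using the assumed polynomial 3 - 3x + x^2."""
    n = len(blur)
    kk = convolve(blur, blur)              # k * k, length 2n - 1
    pad = (n - 1) // 2                     # centre k inside the k * k footprint
    k_padded = [0.0] * pad + list(blur) + [0.0] * pad
    delta = [0.0] * len(kk)
    delta[len(kk) // 2] = 1.0              # unit impulse at the centre
    return [3 * d - 3 * k + q for d, k, q in zip(delta, k_padded, kk)]
```

For a normalized blur kernel the result also sums to one (3 - 3 + 1), so overall brightness is preserved while the negative side lobes counteract the blur.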
-
Publication No.: US11887270B2
Publication date: 2024-01-30
Application No.: US17787699
Filing date: 2021-07-01
Applicant: Google LLC
Inventor: Junjie Ke , Feng Yang , Qifei Wang , Yilin Wang , Peyman Milanfar
CPC classification number: G06T3/0012 , G06T3/40 , G06T7/0002 , G06T2207/20016 , G06T2207/20081 , G06T2207/30168
Abstract: The technology employs a patch-based multi-scale Transformer (300) that is usable with various imaging applications. This avoids constraints on image fixed input size and predicts the quality effectively on a native resolution image. A native resolution image (304) is transformed into a multi-scale representation (302), enabling the Transformer's self-attention mechanism to capture information on both fine-grained detailed patches and coarse-grained global patches. Spatial embedding (316) is employed to map patch positions to a fixed grid, in which patch locations at each scale are hashed to the same grid. A separate scale embedding (318) is employed to distinguish patches coming from different scales in the multiscale representation. Self-attention (508) is performed to create a final image representation. In some instances, prior to performing self-attention, the system may prepend a learnable classification token (322) to the set of input tokens.
-
Publication No.: US20240020788A1
Publication date: 2024-01-18
Application No.: US18256783
Filing date: 2021-03-24
Applicant: Google LLC
Inventor: Xiyang Luo , Feng Yang , Ce Liu , Huiwen Chang , Peyman Milanfar , Yinxiao Li
IPC: G06T1/00
CPC classification number: G06T1/0085 , G06T2201/0083
Abstract: Systems and methods of the present disclosure are directed to a computing system. The computing system can obtain a message vector and video data comprising a plurality of video frames. The computing system can process the input video with a transformation portion of a machine-learned watermark encoding model to obtain a three-dimensional feature encoding of the input video. The computing system can process the three-dimensional feature encoding of the input video and the message vector with an embedding portion of the machine-learned watermark encoding model to obtain spatial-temporal watermark encoding data descriptive of the message vector. The computing system can generate encoded video data comprising a plurality of encoded video frames, wherein at least one of the plurality of encoded video frames includes the spatial-temporal watermark encoding data.