-
Publication No.: US20240296596A1
Publication date: 2024-09-05
Application No.: US18569844
Filing date: 2023-08-23
Applicant: Google LLC
Inventor: Kfir Aberman , Nataniel Ruiz Gutierrez , Michael Rubinstein , Yuanzhen Li , Yael Pritch Knaan , Varun Jampani
IPC: G06T11/00 , G06V10/764
CPC classification number: G06T11/00 , G06V10/764 , G06V2201/07
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a text-to-image model so that the text-to-image model generates images that each depict a variable instance of an object class when the object class without the unique identifier is provided as a text input, and that generates images that each depict a same subject instance of the object class when the unique identifier is provided as the text input.
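As a rough illustration of the two kinds of text input the abstract above describes, the sketch below pairs subject images with a prompt containing a unique identifier and class images with a prompt naming only the object class. The function name, the file names, and the identifier token "sks" are hypothetical; the abstract does not specify how the training data or prompts are built, so this is an assumption about one possible setup, not the claimed method.

```python
def build_training_pairs(subject_images, class_images, class_name, identifier):
    """Pair each image with a text prompt (hypothetical data layout).

    Subject images get a prompt with the unique identifier; generic class
    images get a prompt with the class name only.
    """
    subject_prompt = f"a {identifier} {class_name}"  # identifier + class
    class_prompt = f"a {class_name}"                 # class without identifier
    pairs = [(image, subject_prompt) for image in subject_images]
    pairs += [(image, class_prompt) for image in class_images]
    return pairs


# Toy usage with made-up file names and a made-up identifier token.
pairs = build_training_pairs(
    subject_images=["subject_01.jpg", "subject_02.jpg"],
    class_images=["class_01.jpg"],
    class_name="dog",
    identifier="sks",
)
```

Under this assumption, fine-tuning on both kinds of pair is what would let the identifier prompt map to one subject while the plain class prompt keeps producing varied instances.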
-
Publication No.: US20230021805A1
Publication date: 2023-01-26
Application No.: US17961388
Filing date: 2022-10-06
Applicant: Google LLC
Inventor: Mikaël Bonnevie , Yuanzhen Li , Ce Liu
Abstract: A multimedia communication system and computer-implemented method for transmitting auxiliary display content to an end-user communication device to be rendered on a display device with a special effect to emphasize an image included in the auxiliary display content, comprising a processor and a transmitter. The processor can be arranged to analyze image data included in an auxiliary display content, detect an object image or a background image in the auxiliary display content based on the analysis of the image data, determine a special effect based on the analysis of the image data, and apply the special effect to the auxiliary display content to modify display properties for the auxiliary display content such that the object image is emphasized or pops out. The transmitter can be arranged to send the auxiliary display content with modified display properties to an end-user communication device. The special effect can comprise a non-customization special effect, a simple foreground special effect or a selective foreground special effect.
-
Publication No.: US20240404154A1
Publication date: 2024-12-05
Application No.: US18804462
Filing date: 2024-08-14
Applicant: Google LLC
Inventor: Mikaël Bonnevie , Yuanzhen Li , Ce Liu
Abstract: A multimedia communication system and computer-implemented method for transmitting auxiliary display content to an end-user communication device to be rendered on a display device with a special effect to emphasize an image included in the auxiliary display content, comprising analyzing image data included in an auxiliary display content to detect an object image or a background image, determining a special effect based on the analysis of the image data, applying the special effect to the auxiliary display content to modify display properties for the auxiliary display content such that the object image is emphasized or pops out, and sending the auxiliary display content with modified display properties to the end-user communication device. The special effect can comprise a non-customization special effect, a simple foreground special effect or a selective foreground special effect.
-
Publication No.: US20240394840A1
Publication date: 2024-11-28
Application No.: US18669939
Filing date: 2024-05-21
Applicant: Google LLC
Inventor: Elchonon Zeav Lapin , Xibing Yang , Amit Handa , Apurv Suman , Siddhant Mittal , Ashish Dilipchand Bora , Thorne Wolfenbarger , Naga Sreenivas Meruva , Yudong Sun , Rahul Guin , Arie Sharon , Beatriz Alessio Robles Orozco , Yuanzhen Li , Zhongyue Zheng , Mohammad Izadi
Abstract: Using artificial intelligence (AI), imagery may be created for content in response to verbal or textual input. The imagery includes an object, such as a product, and the quality of the image is improved using pre-processing techniques before the image is generated and post-processing techniques after the image is generated. The pre-processing may include upscaling the object in the original image, segmenting the object from its background in the captured image, and adding an outline or border stroke to the object. The post-processing techniques may include removing the object from the AI-generated background while keeping shadows and other effects in place, blurring portions of the AI-generated background where the object will be positioned, removing the outline from the object, and re-positioning the object in the AI-generated background with the outline removed.
-
Publication No.: US20240037145A1
Publication date: 2024-02-01
Application No.: US17878845
Filing date: 2022-08-01
Applicant: Google LLC
Inventor: Marco Ziccardi , Min-hsuan Tsai , Wei-Hong Chuang , Rahul Sunil Bhalerao , Ye Xia , Madhuri Shanbhogue , Mojtaba Seyedhosseini , Mike Krainin , Andrei Kapishnikov , Yuanzhen Li
IPC: G06F16/783 , G06F16/78 , G06F40/56
CPC classification number: G06F16/7837 , G06F16/7867 , G06F40/56
Abstract: A method includes obtaining first data including a first identifier of a first product determined in association with a content item based on first metadata of the content item. The method further includes obtaining a first confidence value associated with the first product and the content item. The method further includes obtaining second data including a second identifier of the first product and a second confidence value. The method further includes providing the first data and the second data to a trained machine learning model. The method further includes obtaining a third confidence value from the trained machine learning model associated with the first product. The method further includes adjusting second metadata of the content item in view of the third confidence value.
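The flow in the abstract above (two confidence values in, a third confidence value out, then a metadata adjustment) can be sketched as follows. A logistic function with made-up weights stands in for the trained machine learning model, and the metadata layout, threshold, and function names are all hypothetical; the abstract does not describe the model or the adjustment rule.

```python
import math

# Hypothetical parameters standing in for a trained model.
WEIGHTS = (2.0, 3.0)
BIAS = -2.5


def fused_confidence(first_confidence, second_confidence):
    """Combine two confidence values into a third (logistic stand-in)."""
    z = WEIGHTS[0] * first_confidence + WEIGHTS[1] * second_confidence + BIAS
    return 1.0 / (1.0 + math.exp(-z))


def adjust_metadata(metadata, product_id, confidence, threshold=0.5):
    """Return a copy of the metadata with the product kept or dropped."""
    updated = dict(metadata)
    products = set(updated.get("products", []))
    if confidence >= threshold:
        products.add(product_id)
    else:
        products.discard(product_id)
    updated["products"] = sorted(products)
    return updated


# Toy usage: two sources agree fairly strongly on product "p-123".
third_confidence = fused_confidence(0.9, 0.8)
new_metadata = adjust_metadata({"title": "demo"}, "p-123", third_confidence)
```

The threshold rule is only one way to act "in view of" the third confidence value; a real system might instead store the score or rank products by it.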
-
Publication No.: US20240320912A1
Publication date: 2024-09-26
Application No.: US18611236
Filing date: 2024-03-20
Applicant: Google LLC
Inventor: Yuanzhen Li , Amit Raj , Varun Jampani , Benjamin Joseph Mildenhall , Benjamin Michael Poole , Jonathan Tilton Barron , Kfir Aberman , Michael Niemeyer , Michael Rubinstein , Nataniel Ruiz Gutierrez , Shiran Elyahu Zada , Srinivas Kaza
IPC: G06T17/00 , H04N13/279 , H04N13/351
CPC classification number: G06T17/00 , H04N13/279 , H04N13/351
Abstract: A fractional training process can be performed with a plurality of training images on an instance of a machine-learned generative image model to obtain a partially trained instance of the model. A fractional optimization process can be performed with the partially trained instance on an instance of a machine-learned three-dimensional (3D) implicit representation model to obtain a partially optimized instance of the model. Based on the plurality of training images, pseudo multi-view subject images can be generated with the partially optimized instance of the 3D implicit representation model and a fully trained instance of the generative image model. The partially trained instance of the model can be trained with a set of training data. The partially optimized instance of the machine-learned 3D implicit representation model can be trained with the machine-learned multi-view image model.
-
Publication No.: US12086913B2
Publication date: 2024-09-10
Application No.: US17961388
Filing date: 2022-10-06
Applicant: Google LLC
Inventor: Mikaël Bonnevie , Yuanzhen Li , Ce Liu
CPC classification number: G06T11/60 , G06T3/40 , G06T5/70 , G06T7/194 , G06T2200/24
Abstract: A multimedia communication system and computer-implemented method for transmitting auxiliary display content to an end-user communication device to be rendered on a display device with a special effect to emphasize an image included in the auxiliary display content, comprising a processor and a transmitter. The processor can be arranged to analyze image data included in an auxiliary display content, detect an object image or a background image in the auxiliary display content based on the analysis of the image data, determine a special effect based on the analysis of the image data, and apply the special effect to the auxiliary display content to modify display properties for the auxiliary display content such that the object image is emphasized or pops out. The transmitter can be arranged to send the auxiliary display content with modified display properties to an end-user communication device. The special effect can comprise a non-customization special effect, a simple foreground special effect or a selective foreground special effect.
-
Publication No.: US20220036613A1
Publication date: 2022-02-03
Application No.: US17009194
Filing date: 2020-09-01
Applicant: Google LLC
Inventor: Mikaël Bonnevie , Yuanzhen Li , Ce Liu
Abstract: A multimedia communication system and computer-implemented method for transmitting auxiliary display content to an end-user communication device to be rendered on a display device with a special effect to emphasize an image included in the auxiliary display content, comprising a processor and a transmitter. The processor can be arranged to analyze image data included in an auxiliary display content, detect an object image or a background image in the auxiliary display content based on the analysis of the image data, determine a special effect based on the analysis of the image data, and apply the special effect to the auxiliary display content to modify display properties for the auxiliary display content such that the object image is emphasized or pops out. The transmitter can be arranged to send the auxiliary display content with modified display properties to an end-user communication device. The special effect can comprise a non-customization special effect, a simple foreground special effect or a selective foreground special effect.
-
Publication No.: US20240412458A1
Publication date: 2024-12-12
Application No.: US18741680
Filing date: 2024-06-12
Applicant: Google LLC
Inventor: Varun Jampani , Chun-Han Yao , Amit Raj , Wei-Chih Hung , Ming-Hsuan Yang , Michael Rubinstein , Yuanzhen Li
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for editing images based on decoder-based accumulative score sampling (DASS) losses.
-
Publication No.: US12026201B2
Publication date: 2024-07-02
Application No.: US17334923
Filing date: 2021-05-31
Applicant: GOOGLE LLC
Inventor: Yuanzhen Li
IPC: G06F16/783 , G06F16/71 , G06K9/00 , G06V20/20 , G06V20/40
CPC classification number: G06F16/7837 , G06F16/71 , G06F16/7844 , G06V20/20 , G06V20/46
Abstract: Automated product identification within hosted and streamed videos is performed based on video content of a video received at an online video platform and text content associated with the video. First embeddings representative of one or more first candidate products are determined based on video content of the video, such as one or more frames selected from within the video. Second embeddings representative of one or more second candidate products are determined based on text content associated with the video, such as a title, description, or transcript of the video. A product candidate index is produced based on the second embeddings. A product identification representative of a product featured in the video is determined based on a comparison of the first embeddings against entries of the product candidate index, such as including by a nearest neighbor search responsive to the comparison. An indication of the product identification is then output at the online video platform.
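The comparison step in the abstract above can be illustrated with a small nearest neighbor search. In this minimal sketch, the product candidate index is a plain dictionary of text-derived embeddings and the video frames supply the query embeddings. The cosine similarity metric, the three-dimensional toy vectors, and the product names are assumptions for illustration; the abstract does not state the metric or the index structure.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify_product(frame_embeddings, candidate_index):
    """Return (product_id, score) of the index entry nearest to any frame."""
    best_id, best_score = None, -1.0
    for frame_vec in frame_embeddings:
        for product_id, text_vec in candidate_index.items():
            score = cosine(frame_vec, text_vec)
            if score > best_score:
                best_id, best_score = product_id, score
    return best_id, best_score


# Hypothetical index built from text embeddings (title, description, transcript).
candidate_index = {
    "blender": [0.9, 0.1, 0.0],
    "toaster": [0.1, 0.9, 0.1],
}
# Hypothetical embeddings of frames selected from the video.
frame_embeddings = [[0.8, 0.2, 0.1], [0.7, 0.1, 0.0]]

product_id, score = identify_product(frame_embeddings, candidate_index)
```

The exhaustive loop is fine for a handful of candidates; a large index would more plausibly use an approximate nearest neighbor structure, which the abstract leaves open.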