-
Publication No.: US20250068901A1
Publication Date: 2025-02-27
Application No.: US18423081
Filing Date: 2024-01-25
Applicant: Salesforce, Inc.
Inventor: Shiyu Wang , Yihao Feng , Tian Lan , Ning Yu , Yu Bai , Ran Xu , Huan Wang , Caiming Xiong , Silvio Savarese
IPC: G06N3/08
Abstract: Embodiments described herein provide a diffusion-based framework that is trained on a dataset with limited text labels to generate a distribution of data samples in the dataset given a specific text description label. Specifically, unlabeled data is first used to train the diffusion model to learn the overall data distribution of the dataset; text-labeled data samples are then used to finetune the diffusion model to generate the data distribution given a specific text description label, thus enhancing the controllability of training.
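A minimal PyTorch sketch of the two-stage idea described above (not the patented implementation): a toy denoiser is pretrained on unlabeled samples using a learned null-text embedding, then finetuned on the smaller text-labeled subset. All module names, dimensions, and the noise schedule are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TinyConditionalDenoiser(nn.Module):
    """Toy denoiser: predicts noise from a noisy sample, timestep, and optional text embedding."""
    def __init__(self, dim=32, text_dim=16):
        super().__init__()
        self.null_text = nn.Parameter(torch.zeros(text_dim))  # stand-in for "no label"
        self.net = nn.Sequential(nn.Linear(dim + 1 + text_dim, 128), nn.ReLU(), nn.Linear(128, dim))

    def forward(self, x_noisy, t, text_emb=None):
        if text_emb is None:  # unconditional pass used during pretraining on unlabeled data
            text_emb = self.null_text.expand(x_noisy.size(0), -1)
        return self.net(torch.cat([x_noisy, t[:, None].float(), text_emb], dim=-1))

def diffusion_step(model, x0, text_emb=None):
    """One denoising step: add noise, predict it, return the MSE loss."""
    t = torch.randint(0, 1000, (x0.size(0),))
    alpha = 1.0 - t[:, None].float() / 1000.0          # simplistic linear schedule
    noise = torch.randn_like(x0)
    x_noisy = alpha.sqrt() * x0 + (1 - alpha).sqrt() * noise
    return nn.functional.mse_loss(model(x_noisy, t, text_emb), noise)

model = TinyConditionalDenoiser()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

# Stage 1: pretrain on unlabeled samples (no text embedding).
for _ in range(10):
    x0 = torch.randn(8, 32)                             # placeholder unlabeled batch
    loss = diffusion_step(model, x0)
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: finetune on the smaller text-labeled subset to gain controllability.
for _ in range(10):
    x0, text_emb = torch.randn(8, 32), torch.randn(8, 16)  # placeholder labeled batch
    loss = diffusion_step(model, x0, text_emb)
    opt.zero_grad(); loss.backward(); opt.step()
```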
-
Publication No.: US20240169746A1
Publication Date: 2024-05-23
Application No.: US18161661
Filing Date: 2023-01-30
Applicant: Salesforce, Inc.
Inventor: Manli Shu , Le Xue , Ning Yu , Roberto Martín-Martín , Juan Carlos Niebles Duque , Caiming Xiong , Ran Xu
CPC classification number: G06V20/64 , G06T3/4007 , G06V10/46 , G06V10/82
Abstract: Embodiments described herein provide a system for three-dimensional (3D) object detection. The system includes an input interface configured to obtain 3D point data describing spatial information of a plurality of points, and a memory storing a neural network based 3D object detection model having an encoder and a decoder. The system also includes processors to perform operations including: encoding, by the encoder, a first set of coordinates into a first set of point features and a set of object features; sampling a second set of point features from the first set of point features; generating, by attention layers at the decoder, a set of attention weights by applying cross-attention over at least the set of object features and the second set of point features; and generating, by the decoder, a predicted bounding box among the plurality of points based at least in part on the set of attention weights.
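The cross-attention step can be pictured with a toy decoder in PyTorch; everything here (module names, feature dimensions, the 6-parameter box head) is an assumption for illustration, not the claimed model.

```python
import torch
import torch.nn as nn

class ToyDetectionDecoder(nn.Module):
    """Decoder sketch: object features cross-attend to the sampled point features,
    then a small head regresses bounding-box parameters per object query."""
    def __init__(self, d_model=256, n_heads=8):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.box_head = nn.Linear(d_model, 6)  # (cx, cy, cz, w, h, d) per query

    def forward(self, object_feats, point_feats):
        # attn_weights is the decoder's cross-attention map over the point features
        attended, attn_weights = self.cross_attn(
            query=object_feats, key=point_feats, value=point_feats)
        boxes = self.box_head(attended)
        return boxes, attn_weights

decoder = ToyDetectionDecoder()
object_feats = torch.randn(2, 64, 256)    # 64 object features from the encoder
point_feats = torch.randn(2, 1024, 256)   # second (sampled) set of point features
boxes, attn = decoder(object_feats, point_feats)
print(boxes.shape, attn.shape)            # (2, 64, 6), (2, 64, 1024)
```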
-
Publication No.: US20240070868A1
Publication Date: 2024-02-29
Application No.: US18159318
Filing Date: 2023-01-25
Applicant: Salesforce, Inc.
Inventor: Ning Yu , Vibashan Vishnukumar Sharmini , Chen Xing , Juan Carlos Niebles Duque , Ran Xu
CPC classification number: G06T7/11 , G06V10/273
Abstract: Embodiments described herein provide an open-vocabulary instance segmentation framework that adopts a pre-trained vision-language model to develop a pipeline for detecting novel categories of instances.
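One common way such a pipeline is realized, sketched here purely for illustration, is to embed class-agnostic mask proposals and open-vocabulary category names with the vision-language model's two towers and pick the best-matching name per region. The stand-in encoders and dimensions below are assumptions.

```python
import torch
import torch.nn as nn

# Stand-ins for the frozen vision-language model's image and text towers.
image_encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 512))
text_encoder = nn.Embedding(1000, 512)   # pretend token-id -> text embedding lookup

def classify_proposals(crops, category_token_ids):
    """Score each masked region crop against open-vocabulary category names."""
    region_emb = nn.functional.normalize(image_encoder(crops), dim=-1)            # (R, 512)
    text_emb = nn.functional.normalize(text_encoder(category_token_ids), dim=-1)  # (C, 512)
    return region_emb @ text_emb.t()                                              # (R, C) similarity

crops = torch.randn(5, 3, 64, 64)                 # 5 class-agnostic mask proposals
category_token_ids = torch.tensor([3, 57, 912])   # ids standing in for novel category names
scores = classify_proposals(crops, category_token_ids)
predicted = scores.argmax(dim=-1)                  # best-matching category per proposal
```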
-
Publication No.: US20240370718A1
Publication Date: 2024-11-07
Application No.: US18400477
Filing Date: 2023-12-29
Applicant: Salesforce, Inc.
Inventor: Artemis Panagopoulou , Le Xue , Ning Yu , Junnan Li , Dongxu Li , Silvio Savarese , Shafiq Rayhan Joty , Ran Xu , Caiming Xiong , Juan Carlos Niebles Duque
IPC: G06N3/08 , G06N3/0455
Abstract: Embodiments described herein provide a method of generating a multi-modal task output in response to a text instruction relating to inputs of multiple different modalities (e.g., text, audio, video, 3D). The method comprises receiving, via a data interface, a first input of a first modality, a second input of a second modality, and the text instruction relating to the first and the second inputs; encoding, by a first multimodal encoder adapted for the first modality, the first input into a first encoded representation conditioned on the text instruction; encoding, by a second multimodal encoder adapted for the second modality, the second input into a second encoded representation conditioned on the text instruction; and generating, by a neural network based language model, the multi-modal task output based on an input combining the first encoded representation, the second encoded representation, and the text instruction.
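A hedged sketch of how the pieces could be wired together: two instruction-conditioned modality encoders feed a language model alongside the instruction itself. The encoder design, the 128-dimensional instruction embedding, and the toy language model are all placeholders, not the patented architecture.

```python
import torch
import torch.nn as nn

class InstructionConditionedEncoder(nn.Module):
    """Toy modality encoder: fuses a modality feature with the instruction embedding."""
    def __init__(self, in_dim, instr_dim=128, out_dim=128):
        super().__init__()
        self.proj = nn.Linear(in_dim + instr_dim, out_dim)

    def forward(self, modality_feat, instr_emb):
        return self.proj(torch.cat([modality_feat, instr_emb], dim=-1))

# Placeholder dims: e.g. 3D point features (256-d) and audio features (64-d).
encoder_3d = InstructionConditionedEncoder(in_dim=256)
encoder_audio = InstructionConditionedEncoder(in_dim=64)
language_model = nn.Sequential(nn.Linear(128 * 2 + 128, 256), nn.ReLU(), nn.Linear(256, 32000))

instr_emb = torch.randn(1, 128)       # embedding of the text instruction
feat_3d = torch.randn(1, 256)         # first-modality input feature
feat_audio = torch.randn(1, 64)       # second-modality input feature

enc_a = encoder_3d(feat_3d, instr_emb)
enc_b = encoder_audio(feat_audio, instr_emb)
lm_input = torch.cat([enc_a, enc_b, instr_emb], dim=-1)  # combine both encodings with the instruction
logits = language_model(lm_input)                         # toy next-token logits for the task output
```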
-
Publication No.: US20240312128A1
Publication Date: 2024-09-19
Application No.: US18493035
Filing Date: 2023-10-24
Applicant: Salesforce, Inc.
Inventor: Le Xue , Ning Yu , Shu Zhang , Junnan Li , Caiming Xiong , Silvio Savarese , Juan Carlos Niebles Duque , Ran Xu
Abstract: A method of training a neural network based three-dimensional (3D) encoder is provided. A first plurality of samples of a training dataset is generated using a first 3D model. An image generator with multi-view rendering is used to generate a plurality of two-dimensional (2D) images having different viewpoints of the first 3D model. A first language model is used to generate a plurality of texts corresponding to the plurality of 2D images, respectively. A first text for a first image is generated using one or more text descriptions generated by the first language model. A point cloud is generated by randomly sampling points in the first 3D model. The first plurality of samples is generated using the plurality of 2D images, the corresponding plurality of texts, and the point cloud. The neural network based 3D encoder is trained using the training dataset including the first plurality of samples.
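The sample-generation pipeline can be sketched as follows; the renderer, captioner, and point sampler are stubs standing in for the actual multi-view image generator, language model, and surface sampling, and the counts (8 views, 8192 points) are assumptions.

```python
import torch

def render_views(mesh, n_views=8):
    """Stand-in for a multi-view renderer: one RGB image per camera viewpoint."""
    return [torch.rand(3, 224, 224) for _ in range(n_views)]

def caption_image(image):
    """Stand-in for the captioning language model; returns one text per rendered view."""
    return "a rendered view of the object"   # placeholder description

def sample_point_cloud(mesh, n_points=8192):
    """Stand-in for randomly sampling points from the 3D model."""
    return torch.rand(n_points, 3)

def make_training_sample(mesh):
    images = render_views(mesh)
    texts = [caption_image(img) for img in images]   # one description per viewpoint
    points = sample_point_cloud(mesh)
    return {"images": images, "texts": texts, "points": points}

sample = make_training_sample(mesh=None)   # mesh=None: loading the 3D asset is out of scope here
```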
-
Publication No.: US20240303882A1
Publication Date: 2024-09-12
Application No.: US18350876
Filing Date: 2023-07-12
Applicant: Salesforce, Inc.
Inventor: Shu Zhang , Xinyi Yang , Yihao Feng , Ran Xu , Ning Yu , Chia-Chih Chen
CPC classification number: G06T11/60 , G06T5/70 , G06T2207/20081 , G06T2207/20084
Abstract: Embodiments described herein provide a feedback-based instructional image editing framework that employs a diffusion process to follow user instructions for image editing. A diffusion model is fine-tuned using a reward model, which may be trained via human annotation. The reward model may be trained by having the image editing model output a number of images, which a human annotator ranks based on their alignment with the original image and a given instruction.
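As a loose illustration only (not the patented procedure), a reward model trained from human rankings can be used to score candidate edits and nudge the editor toward higher-reward outputs. The toy feature-space "editor", reward model, and update rule below are all assumptions.

```python
import torch
import torch.nn as nn

# Stand-ins: an editing model proposing edits and a reward model trained on human rankings.
edit_model = nn.Linear(64, 64)                      # toy "image editor" over 64-d image features
reward_model = nn.Sequential(nn.Linear(64 * 2 + 16, 64), nn.ReLU(), nn.Linear(64, 1))

opt = torch.optim.Adam(edit_model.parameters(), lr=1e-5)

original = torch.randn(4, 64)       # features of the input image
instruction = torch.randn(4, 16)    # embedding of the edit instruction

# Sample several candidate edits, score each with the reward model, and
# push the editor toward higher-reward outputs (a simple reward-guided update).
candidates = [edit_model(original + 0.01 * torch.randn_like(original)) for _ in range(4)]
losses = []
for cand in candidates:
    reward = reward_model(torch.cat([original, cand, instruction], dim=-1)).squeeze(-1)
    losses.append((-reward).mean())                 # maximize reward = minimize negative reward
loss = torch.stack(losses).mean()
opt.zero_grad(); loss.backward(); opt.step()
```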
-
Publication No.: US20240202530A1
Publication Date: 2024-06-20
Application No.: US18303313
Filing Date: 2023-04-19
Applicant: Salesforce, Inc.
Inventor: Rui Meng , Yingbo Zhou , Ye Liu , Semih Yavuz , Ning Yu
IPC: G06N3/084 , G06F40/20 , G06F40/40 , G06N3/0455 , G06N3/088
CPC classification number: G06N3/084 , G06F40/20 , G06F40/40 , G06N3/0455 , G06N3/088
Abstract: Embodiments described herein provide systems and methods for training a text retrieval model. A system may generate queries associated with provided documents. The queries may be generated in one or more different ways, such as extracting relevant spans of text from the documents or prompting a language model for a topic, title, abstractive summary, and/or extractive summary based on the documents. Metadata such as titles or other HTML tags may also be used as queries. Using the generated queries, the text retrieval model may be trained with contrastive learning over positive and negative sample documents. A fine-tuning phase may then be performed on domain-specific data, either with generated query pairs or in a supervised fashion with provided queries. The trained text retrieval model may be used to locate documents given an input query.
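A minimal sketch of the contrastive-learning step, assuming in-batch negatives and an InfoNCE-style loss; the shared encoder, feature dimensions, and temperature are illustrative choices rather than details from the patent.

```python
import torch
import torch.nn as nn

embed = nn.Sequential(nn.Linear(768, 256))   # toy shared encoder over precomputed text features

def contrastive_loss(query_feats, doc_feats, temperature=0.05):
    """InfoNCE with in-batch negatives: query i should match document i."""
    q = nn.functional.normalize(embed(query_feats), dim=-1)
    d = nn.functional.normalize(embed(doc_feats), dim=-1)
    logits = q @ d.t() / temperature                       # (B, B) similarity matrix
    labels = torch.arange(q.size(0))                       # diagonal entries are positive pairs
    return nn.functional.cross_entropy(logits, labels)

# Placeholder features for generated queries (e.g. extracted spans, titles, LM-written
# summaries) and their source documents; off-diagonal documents act as negatives.
query_feats = torch.randn(16, 768)
doc_feats = torch.randn(16, 768)
loss = contrastive_loss(query_feats, doc_feats)
loss.backward()
```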
-
Publication No.: US20240185035A1
Publication Date: 2024-06-06
Application No.: US18162535
Filing Date: 2023-01-31
Applicant: Salesforce, Inc.
Inventor: Ning Yu , Can Qin , Chen Xing , Shu Zhang , Stefano Ermon , Caiming Xiong , Ran Xu
IPC: G06N3/0455 , G06T5/00
CPC classification number: G06N3/0455 , G06T5/002 , G06T2207/20084
Abstract: Embodiments described herein provide a mechanism for replacing existing text encoders in text-to-image generation models with more powerful pre-trained language models. Specifically, a translation network is trained to map features from the pre-trained language model output into the space of the target text encoder. The training preserves the rich structure of the pre-trained language model while allowing it to operate within the text-to-image generation model. The resulting modularized text-to-image model receives a prompt and generates an image representing the features contained in the prompt.
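A small sketch of what such a translation network might look like, assuming the goal is to regress the frozen language model's token features onto the space the image generator's original text encoder produced for the same prompt. The dimensions (1024-d LM features, 768-d target space, 77 tokens) and the MSE objective are assumptions for illustration.

```python
import torch
import torch.nn as nn

class TranslationNetwork(nn.Module):
    """Maps pre-trained language-model features into the space of the original text encoder."""
    def __init__(self, lm_dim=1024, target_dim=768):
        super().__init__()
        self.map = nn.Sequential(nn.Linear(lm_dim, 1024), nn.GELU(), nn.Linear(1024, target_dim))

    def forward(self, lm_features):
        return self.map(lm_features)

translator = TranslationNetwork()
opt = torch.optim.Adam(translator.parameters(), lr=1e-4)

# Train the translator so translated LM features land near the original encoder's features
# for the same prompt; both feature sets are placeholders here.
lm_features = torch.randn(8, 77, 1024)       # frozen language-model token features
target_features = torch.randn(8, 77, 768)    # features the image generator was trained to expect
loss = nn.functional.mse_loss(translator(lm_features), target_features)
opt.zero_grad(); loss.backward(); opt.step()
# At inference the translated features replace the original text-encoder output,
# so the image generator runs unchanged while using the stronger language model.
```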
-
Publication No.: US20240104809A1
Publication Date: 2024-03-28
Application No.: US18161680
Filing Date: 2023-01-30
Applicant: Salesforce, Inc.
Inventor: Ning Yu , Chia-Chih Chen , Zeyuan Chen , Caiming Xiong , Juan Carlos Niebles Duque , Ran Xu , Rui Meng
IPC: G06T11/60 , G06F40/106 , G06F40/126 , G06N20/00 , G06T9/00
CPC classification number: G06T11/60 , G06F40/106 , G06F40/126 , G06N20/00 , G06T9/00 , G06T2200/24 , G06T2210/12
Abstract: Embodiments described herein provide systems and methods for multimodal layout generation for digital publications. The system may receive, as inputs, a background image, one or more foreground texts, and one or more foreground images. Feature representations of the background image may be generated. The foreground inputs may be input to a layout generator which applies cross-attention over the background image feature representations in order to generate a layout comprising bounding box parameters for each input item. A composite layout may be generated based on the inputs and the generated bounding boxes. The resulting composite layout may then be displayed on a user interface.
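A toy PyTorch sketch of a layout generator with cross-attention from foreground elements to background-image features, emitting one normalized (x, y, w, h) box per element; module names and dimensions are assumptions rather than the claimed design.

```python
import torch
import torch.nn as nn

class ToyLayoutGenerator(nn.Module):
    """Foreground elements (texts/images) cross-attend to background-image features;
    a head predicts one bounding box (x, y, w, h) per element."""
    def __init__(self, d_model=256, n_heads=8):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.box_head = nn.Sequential(nn.Linear(d_model, d_model), nn.ReLU(),
                                      nn.Linear(d_model, 4), nn.Sigmoid())  # normalized coordinates

    def forward(self, foreground_feats, background_feats):
        attended, _ = self.cross_attn(foreground_feats, background_feats, background_feats)
        return self.box_head(attended)

layout_gen = ToyLayoutGenerator()
foreground_feats = torch.randn(1, 5, 256)    # e.g. 3 foreground texts + 2 foreground images
background_feats = torch.randn(1, 196, 256)  # patch features of the background image
boxes = layout_gen(foreground_feats, background_feats)   # (1, 5, 4): one box per element
```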