Patent search ap:("QUALCOMM Incorporated") AND inv:"Arvind Krishna SRIDHAR" Page 1

1.

Invention application
FAITHFUL GENERATION OF OUTPUT TEXT FOR MULTIMODAL APPLICATIONS (status: in force)

Publication No.: US20250078818A1

Publication date: 2025-03-06

Application No.: US18590222

Filing date: 2024-02-28

Applicant: QUALCOMM Incorporated

Inventor: Arvind Krishna SRIDHAR, Rehana MAHFUZ, Erik VISSER, Yinyi GUO

IPC: G10L15/16, G06T3/4046, G10L15/08, G10L15/24, G10L25/57

Abstract: Systems and techniques are described for generating and using unimodal/multimodal generative models that mitigate hallucinations. For example, a computing device can encode input data to generate encoded representations of the input data. The computing device can obtain intermediate data including a plurality of partial sentences associated with the input data and can generate, based on the intermediate data, at least one complete sentence associated with the input data. The computing device can encode the at least one complete sentence to generate at least one encoded representation of the at least one complete sentence. The computing device can generate a faithfulness score based on a comparison of the encoded representations of the input data and the at least one encoded representation of the at least one complete sentence. The computing device can re-rank the plurality of partial sentences of the intermediate data based on the faithfulness score to generate re-ranked data.
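The re-ranking step described in this abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the bag-of-words `encode` function stands in for a learned encoder, cosine similarity stands in for the unspecified comparison, and the `COMPLETIONS` table stands in for a generative model that completes partial sentences. The abstract does not state which encoder or comparison the application uses.

```python
import math
from collections import Counter

def encode(text):
    """Toy encoder (assumption): word counts stand in for a learned embedding."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rerank(input_text, partial_sentences, complete):
    """Order partial sentences by how faithful their completions are to the input."""
    source = encode(input_text)
    scored = [(cosine(source, encode(complete(p))), p) for p in partial_sentences]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored]

# Hypothetical completions; a real system would generate these with a model.
COMPLETIONS = {
    "a cat": "a cat meows on a piano",
    "a dog": "a dog barks in the rain",
}

reranked = rerank("a dog barks while rain falls", ["a cat", "a dog"], COMPLETIONS.get)
print(reranked)
```

In this toy run the partial sentence whose completion shares more content with the input moves to the front, which is the effect the faithfulness score is meant to have.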

2.

Invention application
SEMANTICALLY-AUGMENTED CONTEXT REPRESENTATION GENERATION (status: in force)

Publication No.: US20230034450A1

Publication date: 2023-02-02

Application No.: US17383284

Filing date: 2021-07-22

Applicant: QUALCOMM Incorporated

Inventor: Arvind Krishna SRIDHAR, Ravi CHOUDHARY, Lae-Hoon KIM, Erik VISSER

IPC: G10L15/18, G06K9/72, G10L15/22

Abstract: A device includes a memory configured to store instructions. The device also includes one or more processors configured to execute the instructions to provide context and one or more items of interest corresponding to the context to a dependency network encoder to generate a semantic-based representation of the context. The one or more processors are also configured to provide the context to a data dependent encoder to generate a context-based representation. The one or more processors are further configured to combine the semantic-based representation and the context-based representation to generate a semantically-augmented representation of the context.
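As a rough sketch of the data flow in this abstract, the two encoders and the combination step might look as follows. The encoder bodies are placeholders, not the patented networks: `semantic_encoder` simply flags tokens that are items of interest, `context_encoder` emits a normalized token length, and `combine` pairs the two values per token. The abstract does not say how the representations are combined, so per-token pairing is an assumption.

```python
def semantic_encoder(context_tokens, items_of_interest):
    """Placeholder for the dependency network encoder: flags items of interest."""
    return [1.0 if token in items_of_interest else 0.0 for token in context_tokens]

def context_encoder(context_tokens):
    """Placeholder for the data dependent encoder: normalized token length."""
    longest = max(len(token) for token in context_tokens)
    return [len(token) / longest for token in context_tokens]

def combine(semantic, contextual):
    """Assumed combination: pair the two features for each token."""
    return list(zip(semantic, contextual))

tokens = ["play", "jazz", "music"]
augmented = combine(semantic_encoder(tokens, {"jazz"}), context_encoder(tokens))
print(augmented)
```

The point of the sketch is only the shape of the pipeline: the same context passes through both encoders, and the semantically-augmented representation carries both views for every token.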

3.

Invention application
AUTOMATED AUDIO CAPTION CORRECTION USING FALSE ALARM AND MISS DETECTION (status: in force)

Publication No.: US20250078828A1

Publication date: 2025-03-06

Application No.: US18811349

Filing date: 2024-08-21

Applicant: QUALCOMM Incorporated

Inventor: Rehana MAHFUZ, Yinyi GUO, Arvind Krishna SRIDHAR, Erik VISSER

IPC: G10L15/19, G10L15/01, G10L15/02

Abstract: Systems and techniques are provided for natural language processing. A system generates a plurality of tokens (e.g., words or portions thereof) based on input content (e.g., text and/or speech). The system searches through the plurality of tokens to generate a first ranking of the plurality of tokens based on probability. The system generates natural language inference (NLI) scores for the plurality of tokens to generate a second ranking of the plurality of tokens based on faithfulness to the input content (e.g., whether the tokens produce statements that are true based on the input content). The system generates output text that includes at least one token selected from the plurality of tokens based on the first ranking and the second ranking.

4.

Invention application
KNOWLEDGE-BASED AUDIO SCENE GRAPH (status: in force)

Publication No.: US20240419731A1

Publication date: 2024-12-19

Application No.: US18738243

Filing date: 2024-06-10

Applicant: QUALCOMM Incorporated

Inventor: Arvind Krishna SRIDHAR, Yinyi GUO, Erik VISSER

IPC: G06F16/68, G06F16/638

Abstract: A device includes a processor configured to obtain a first audio embedding of a first audio segment and obtain a first text embedding of a first tag assigned to the first audio segment. The first audio segment corresponds to a first audio event of audio events. The processor is configured to obtain a first event representation based on a combination of the first audio embedding and the first text embedding. The processor is configured to obtain a second event representation of a second audio event of the audio events. The processor is also configured to determine, based on knowledge data, relations between the audio events. The processor is configured to construct an audio scene graph based on a temporal order of the audio events. The audio scene graph is constructed to include a first node corresponding to the first audio event and a second node corresponding to the second audio event.
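A small sketch can make the graph construction in this abstract concrete. The names `AudioEvent`, `build_scene_graph`, and the `knowledge` lookup table are hypothetical, the embeddings are made-up numbers, and concatenation is only one assumed way to combine the audio and text embeddings. The fallback relation `followed_by` is likewise an illustration choice, not something the abstract specifies.

```python
from dataclasses import dataclass

@dataclass
class AudioEvent:
    tag: str                # text tag assigned to the audio segment
    start: float            # start time in seconds, used for temporal order
    audio_embedding: list   # embedding of the audio segment
    text_embedding: list    # embedding of the tag

    def representation(self):
        """Assumed combination: concatenate audio and text embeddings."""
        return self.audio_embedding + self.text_embedding

def build_scene_graph(events, knowledge):
    """Create one node per event in temporal order and label edges from knowledge data."""
    ordered = sorted(events, key=lambda event: event.start)
    nodes = {event.tag: event.representation() for event in ordered}
    edges = []
    for earlier, later in zip(ordered, ordered[1:]):
        relation = knowledge.get((earlier.tag, later.tag), "followed_by")
        edges.append((earlier.tag, relation, later.tag))
    return nodes, edges

events = [
    AudioEvent("siren", 2.0, [0.3], [0.7]),
    AudioEvent("glass_break", 0.5, [0.1], [0.9]),
]
knowledge = {("glass_break", "siren"): "may_cause"}  # hypothetical knowledge data
nodes, edges = build_scene_graph(events, knowledge)
print(nodes)
print(edges)
```

Here the events are sorted by start time before nodes are created, so the graph reflects temporal order even when the input list does not.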

5.

Invention publication
HALLUCINATION MITIGATION FOR GENERATIVE TRANSFORMER MODELS (status: under examination, published)

Publication No.: US20240184988A1

Publication date: 2024-06-06

Application No.: US18193572

Filing date: 2023-03-30

Applicant: QUALCOMM Incorporated

Inventor: Arvind Krishna SRIDHAR, Erik VISSER

IPC: G06F40/284, G06F40/253

CPC classification number: G06F40/284, G06F40/253

Abstract: Systems and techniques are provided for natural language processing. A system generates a plurality of tokens (e.g., words or portions thereof) based on input content (e.g., text and/or speech). The system searches through the plurality of tokens to generate a first ranking of the plurality of tokens based on probability. The system generates natural language inference (NLI) scores for the plurality of tokens to generate a second ranking of the plurality of tokens based on faithfulness to the input content (e.g., whether the tokens produce statements that are true based on the input content). The system generates output text that includes at least one token selected from the plurality of tokens based on the first ranking and the second ranking.
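One way to picture selecting a token from two rankings is simple rank fusion, sketched below. The probabilities and NLI scores are made-up inputs, and summing rank positions (with the faithfulness rank as tie-breaker) is an assumption chosen for illustration. The abstract only states that selection is based on both rankings, not how they are merged.

```python
def rank(scores):
    """Map each token to its rank position (0 is best) by descending score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {token: position for position, token in enumerate(ordered)}

def select_token(probabilities, nli_scores):
    """Pick the token with the best combined rank; ties favor the more faithful token."""
    first = rank(probabilities)   # first ranking: model probability
    second = rank(nli_scores)     # second ranking: faithfulness (NLI score)
    return min(probabilities, key=lambda t: (first[t] + second[t], second[t]))

# Hypothetical candidate tokens for one decoding step.
probabilities = {"Paris": 0.5, "Lyon": 0.3, "Rome": 0.2}
nli_scores = {"Paris": 0.1, "Lyon": 0.9, "Rome": 0.4}

chosen = select_token(probabilities, nli_scores)
print(chosen)
```

In this example the most probable token is not selected because its faithfulness rank is the worst of the three, which illustrates how the second ranking can override a likely but unsupported continuation.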
