Audio Understanding with Fixed Language Models

    Publication number: US20240127001A1

    Publication date: 2024-04-18

    Application number: US17964633

    Filing date: 2022-10-12

    CPC classification number: G06F40/40 G10L15/26

    Abstract: Techniques for audio understanding using fixed language models are provided. In one aspect, a system for performing audio understanding tasks includes: a fixed text embedder for, on receipt of a prompt sequence having a number (e.g., 0 to 10) of demonstrations of an audio understanding task followed by a new question, converting the prompt sequence into text embeddings; a pretrained audio encoder for converting the prompt sequence into audio embeddings; and a fixed autoregressive language model for answering the new question using the text embeddings and the audio embeddings. A method for performing audio understanding tasks is also provided.
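A minimal sketch of how such a few-shot prompt might be assembled into a single embedding sequence. Everything here is an assumption for illustration: the names `Demo`, `embed_text`, `encode_audio`, and `build_prompt` are hypothetical, the two embedding functions are toy stand-ins for the fixed text embedder and the pretrained audio encoder, and the interleaved ordering (audio embedding first, then its text) is one plausible layout that the abstract does not specify.

```python
from dataclasses import dataclass

DIM = 4  # toy embedding width shared by both modalities (assumption)


@dataclass
class Demo:
    """One demonstration: an audio clip with a question and its answer."""
    audio: list   # toy waveform samples
    question: str
    answer: str


def embed_text(text):
    """Toy stand-in for the fixed text embedder: one vector per whitespace token."""
    vectors = []
    for token in text.split():
        seed = sum(ord(ch) for ch in token)
        vectors.append([((seed * (i + 1)) % 10) / 10 for i in range(DIM)])
    return vectors


def encode_audio(samples):
    """Toy stand-in for the pretrained audio encoder: one summary vector per clip."""
    n = len(samples)
    mean = sum(samples) / n
    peak = max(abs(s) for s in samples)
    energy = sum(s * s for s in samples) / n
    return [mean, peak, energy, float(n)]


def build_prompt(demos, new_audio, new_question):
    """Concatenate demonstrations and the new question into one embedding sequence."""
    sequence = []
    for demo in demos:
        sequence.append(encode_audio(demo.audio))
        sequence.extend(embed_text(f"{demo.question} {demo.answer}"))
    # The new question comes last and has no answer; the language model completes it.
    sequence.append(encode_audio(new_audio))
    sequence.extend(embed_text(new_question))
    return sequence


demos = [Demo([0.1, -0.3, 0.2], "what sound is this?", "dog barking")]
zero_shot = build_prompt([], [0.0, 0.5], "what sound is this?")
one_shot = build_prompt(demos, [0.0, 0.5], "what sound is this?")
print(len(zero_shot), len(one_shot))
```

In this sketch the resulting sequence would be passed to the fixed autoregressive language model as its input embeddings; passing an empty list of demonstrations corresponds to the zero-demonstration case mentioned in the abstract.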

    Counterfactual Debiasing Inference for Compositional Action Recognition

    Publication number: US20230368529A1

    Publication date: 2023-11-16

    Application number: US17662663

    Filing date: 2022-05-10

    CPC classification number: G06V20/41 G06V20/46 G06V10/806

    Abstract: One or more computer processors improve action recognition by removing inference introduced by visual appearances of objects within a received video segment. The one or more computer processors extract appearance information and structure information from a received video segment. The one or more computer processors calculate a factual inference (TE) for the received video segment utilizing the extracted appearance information and structure information. The one or more computer processors calculate a counterfactual debiasing inference (NDE) for the received video segment. The one or more computer processors calculate a total indirect effect (TIE) by subtracting the calculated counterfactual debiasing inference from the calculated factual inference. The one or more computer processors recognize an action in the received video segment by selecting a classification result associated with a highest calculated TIE.
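The final two steps of the abstract above (TIE = TE − NDE per class, then pick the class with the highest TIE) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the function names, the action labels, and the score values are all hypothetical, and how the factual and counterfactual scores are produced from appearance and structure information is outside the sketch.

```python
def total_indirect_effect(factual_scores, counterfactual_scores):
    """Per-class TIE: factual inference (TE) minus counterfactual inference (NDE)."""
    if len(factual_scores) != len(counterfactual_scores):
        raise ValueError("score lists must have one entry per class")
    return [te - nde for te, nde in zip(factual_scores, counterfactual_scores)]


def classify(factual_scores, counterfactual_scores, labels):
    """Return the label with the highest TIE, together with all TIE values."""
    tie = total_indirect_effect(factual_scores, counterfactual_scores)
    best = max(range(len(tie)), key=tie.__getitem__)
    return labels[best], tie


# Hypothetical per-class scores for one video segment.
labels = ["pushing", "pulling", "lifting"]
te = [2.0, 1.5, 0.5]    # factual: appearance + structure
nde = [1.8, 0.4, 0.3]   # counterfactual: appearance-driven component only

label, tie = classify(te, nde, labels)
print(label)  # pulling
```

With these made-up numbers the factual scores alone would favor "pushing", but most of that score is explained by appearance; after subtraction, "pulling" has the largest remaining effect and is selected.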
