-
Publication number: US12266379B2
Publication date: 2025-04-01
Application number: US17937765
Application date: 2022-10-03
Applicant: QUALCOMM Incorporated
Inventor: Byeonggeun Kim, Seunghan Yang, Hyunsin Park, Juntae Lee, Simyung Chang
IPC: G10L21/034, G10L17/04, G10L17/18, G10L25/30, G10L25/51
Abstract: Techniques and apparatus for training a neural network to classify audio into one of a plurality of categories and using such a trained neural network. An example method generally includes receiving a data set including a plurality of audio samples. A relaxed feature-normalized data set is generated by normalizing each audio sample of the plurality of audio samples. A neural network is trained to classify audio into one of a plurality of categories based on the relaxed feature-normalized data set, and the trained neural network is deployed.
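The abstract does not spell out what makes the normalization "relaxed". One possible reading, assumed here purely for illustration, is a blend of per-frequency-bin normalization with whole-sample normalization, controlled by a relaxation factor. The function name, the `lam` parameter, and the list-of-rows spectrogram layout are all hypothetical and are not taken from the patent.

```python
from statistics import fmean, pstdev

EPS = 1e-5  # avoids division by zero for constant inputs


def _normalize(values, mean, std):
    """Shift and scale a list of numbers by the given statistics."""
    return [(v - mean) / (std + EPS) for v in values]


def relaxed_feature_normalize(spec, lam=0.5):
    """Blend two normalizations of one audio sample (assumed formulation).

    spec: list of frequency rows, each a list of values over time frames.
    lam:  hypothetical relaxation factor; 0 gives pure per-frequency
          normalization, 1 gives pure whole-sample normalization.
    """
    flat = [v for row in spec for v in row]
    g_mean, g_std = fmean(flat), pstdev(flat)
    out = []
    for row in spec:
        per_freq = _normalize(row, fmean(row), pstdev(row))
        whole = _normalize(row, g_mean, g_std)
        out.append([lam * w + (1 - lam) * f for w, f in zip(whole, per_freq)])
    return out


# Each sample in the data set would be normalized independently.
dataset = [[[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]]
normalized_dataset = [relaxed_feature_normalize(s, lam=0.5) for s in dataset]
```

The normalized data set would then be passed to whatever classifier training loop is in use; that part is outside the scope of this sketch.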
-
Publication number: US11798204B2
Publication date: 2023-10-24
Application number: US17685278
Application date: 2022-03-02
Applicant: QUALCOMM Incorporated
Inventor: Hyunsin Park, Juntae Lee, Simyung Chang, Byeonggeun Kim, Jaewon Choi, Kyu Woong Hwang
CPC classification number: G06T11/00, G06F3/013, G06V40/174, G06V40/18
Abstract: Imaging systems and techniques are described. An imaging system receives image data representing at least a portion (e.g., a face) of a first user as captured by a first image sensor. The imaging system identifies that a gaze of the first user as represented in the image data is directed toward a displayed representation of at least a portion (e.g., a face) of a second user. The imaging system identifies an arrangement of representations of users for output. The imaging system generates modified image data based on the gaze and the arrangement at least in part by modifying the image data to modify at least the portion of the first user in the image data to be visually directed toward a direction corresponding to the second user based on the gaze and the arrangement. The imaging system outputs the modified image data arranged according to the arrangement.
-
Publication number: US12019641B2
Publication date: 2024-06-25
Application number: US18153899
Application date: 2023-01-12
Applicant: QUALCOMM Incorporated
Inventor: Byeonggeun Kim, Juntae Lee, Simyung Chang
IPC: G06F7/00, G06F16/2458, G06F16/28
CPC classification number: G06F16/2462, G06F16/285
Abstract: Systems and techniques are provided for processing one or more data samples. For example, a neural network classifier can be trained to perform few-shot open-set recognition (FSOSR) based on a task-agnostic open-set prototype. A process can include determining one or more prototype representations for each class included in a plurality of support samples. A task-agnostic open-set prototype representation can be determined, in a same learned metric space as the one or more prototype representations. One or more distance metrics can be determined for each query sample of one or more query samples, based on the one or more prototype representations and the task-agnostic open-set prototype representation. Based on the one or more distance metrics, each query sample can be classified into one of classes associated with the one or more prototype representations or an open-set class associated with the task-agnostic open-set prototype representation.
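The classification step in this abstract can be sketched as nearest-prototype matching with one extra prototype for the open-set class. This is a minimal illustration under stated assumptions: class prototypes are taken as the mean of the support embeddings, the distance metric is squared Euclidean, and the open-set prototype is a fixed vector standing in for a learned one. None of these choices, nor the function names, are confirmed by the abstract.

```python
from collections import defaultdict

OPEN_SET = "open-set"  # hypothetical label for the open-set class


def class_prototypes(support):
    """Average the embeddings of each class in the support set.

    support: list of (embedding, label) pairs, embeddings as lists of floats.
    """
    groups = defaultdict(list)
    for emb, label in support:
        groups[label].append(emb)
    return {
        label: [sum(dim) / len(embs) for dim in zip(*embs)]
        for label, embs in groups.items()
    }


def sq_dist(a, b):
    """Squared Euclidean distance (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(query, prototypes, open_proto):
    """Return the label of the nearest prototype, open-set one included."""
    candidates = dict(prototypes)
    candidates[OPEN_SET] = open_proto
    return min(candidates, key=lambda label: sq_dist(query, candidates[label]))


support = [([0.0, 0.0], "a"), ([0.0, 2.0], "a"), ([10.0, 10.0], "b")]
protos = class_prototypes(support)
open_proto = [5.0, -5.0]  # stand-in for a learned, task-agnostic vector
labels = [classify(q, protos, open_proto) for q in ([0.5, 1.0], [5.0, -4.0])]
```

Because the open-set prototype lives in the same metric space as the class prototypes, rejecting an unknown query needs no separate threshold in this sketch; it is simply the case where the open-set prototype is the closest one.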
-
Publication number: US12067777B2
Publication date: 2024-08-20
Application number: US17654986
Application date: 2022-03-15
Applicant: QUALCOMM Incorporated
Inventor: Hanul Kim, Mihir Jain, Juntae Lee, Sungrack Yun, Fatih Murat Porikli
Abstract: Certain aspects of the present disclosure provide a method of processing video data. In one example, the method includes receiving input video data; sampling a first subset of clips from the input video data; providing the first subset of clips to a first component of a machine learning model to generate first output; sampling a second subset of clips from the input video data, wherein the second subset of clips comprises fewer clips than the first subset of clips; providing the second subset of clips to a second component of the machine learning model to generate a second output; aggregating the first output from the first component of the machine learning model with the second output from the second component of the machine learning model to generate aggregated output; and determining a characteristic of the input video data based on the aggregated output.
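The steps listed in this abstract can be sketched as a small pipeline. The sketch assumes evenly spaced clip sampling, per-component averaging of class scores, and an equal-weight average as the aggregation; the abstract does not specify any of these, and the function names and default clip counts are hypothetical.

```python
def sample_uniform(clips, n):
    """Pick n clips at evenly spaced positions (assumed sampling scheme)."""
    step = len(clips) / n
    return [clips[int(i * step)] for i in range(n)]


def mean_scores(score_lists):
    """Average several equal-length score vectors element-wise."""
    return [sum(col) / len(score_lists) for col in zip(*score_lists)]


def classify_video(clips, first_component, second_component,
                   n_first=8, n_second=2):
    """Run two model components on differently sized clip subsets.

    first_component / second_component: callables mapping one clip to a
    list of class scores. The second subset is smaller than the first.
    Returns (predicted class index, aggregated scores).
    """
    if not 1 <= n_second < n_first <= len(clips):
        raise ValueError("need 1 <= n_second < n_first <= len(clips)")
    first_out = mean_scores(
        [first_component(c) for c in sample_uniform(clips, n_first)])
    second_out = mean_scores(
        [second_component(c) for c in sample_uniform(clips, n_second)])
    # Equal-weight average is an assumption; other aggregations are possible.
    aggregated = [(a + b) / 2 for a, b in zip(first_out, second_out)]
    best = max(range(len(aggregated)), key=aggregated.__getitem__)
    return best, aggregated


# Toy usage with stand-in components that ignore the clip content.
clips = list(range(16))
label, scores = classify_video(
    clips, lambda c: [1.0, 0.0], lambda c: [0.0, 3.0])
```

One plausible motivation for the asymmetric subset sizes is to run an inexpensive component on many clips and a costlier one on few, although the abstract itself does not state the relative cost of the two components.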
-
Publication number: US20220301310A1
Publication date: 2022-09-22
Application number: US17654986
Application date: 2022-03-15
Applicant: QUALCOMM Incorporated
Inventor: Hanul Kim, Mihir Jain, Juntae Lee, Sungrack Yun, Fatih Murat Porikli
Abstract: Certain aspects of the present disclosure provide a method of processing video data. In one example, the method includes receiving input video data; sampling a first subset of clips from the input video data; providing the first subset of clips to a first component of a machine learning model to generate first output; sampling a second subset of clips from the input video data, wherein the second subset of clips comprises fewer clips than the first subset of clips; providing the second subset of clips to a second component of the machine learning model to generate a second output; aggregating the first output from the first component of the machine learning model with the second output from the second component of the machine learning model to generate aggregated output; and determining a characteristic of the input video data based on the aggregated output.