-
Publication No.: US20240394592A1
Publication date: 2024-11-28
Application No.: US18434691
Filing date: 2024-02-06
Applicant: Samsung Electronics Co., Ltd.
Inventor: Rakshith Sharma Srinivasa, Jaejin Cho, Chouchang Yang, Yashas Malur Saidutta, Ching-Hua Lee, Yilin Shen, Hongxia Jin
IPC: G06N20/00
Abstract: A method includes accessing a training dataset having multiple samples, where each sample includes a data point for each of multiple modalities. The method also includes generating, using a first encoder associated with a first modality of the multiple modalities, first modality embeddings for data points of the first modality in the training dataset. The method further includes, for each first modality embedding, determining a similarity metric to other first modality embeddings. The method also includes generating, using a second encoder associated with a second modality of the multiple modalities, second modality embeddings for data points of the second modality in the training dataset. In addition, the method includes training the second encoder based on a contrastive loss function to align the first modality embeddings and the second modality embeddings from different samples of the training dataset, where the contrastive loss function is weighted using the similarity metrics.
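
The weighting idea in this abstract can be illustrated with a minimal NumPy sketch. It is not the patented method: the function name `weighted_contrastive_loss`, the cosine similarity metric, the softmax weighting, and the `temperature` parameter are all assumptions chosen for illustration. The similarities among first-modality embeddings become soft targets for a cross-modal cross-entropy, so samples that are close in the first modality are penalized less for being close across modalities.

```python
import numpy as np

def weighted_contrastive_loss(first_emb, second_emb, temperature=0.1):
    """Cross-modal contrastive loss whose targets are weighted by
    first-modality self-similarity (illustrative weighting scheme)."""
    # L2-normalize rows so dot products are cosine similarities.
    a = first_emb / np.linalg.norm(first_emb, axis=1, keepdims=True)
    b = second_emb / np.linalg.norm(second_emb, axis=1, keepdims=True)

    # Similarity metric of each first-modality embedding to the others.
    sim_first = a @ a.T
    # Assumption: convert similarities to per-row weights with a softmax.
    weights = np.exp(sim_first / temperature)
    weights /= weights.sum(axis=1, keepdims=True)

    # Cross-modal logits: second-modality rows against first-modality columns.
    logits = (b @ a.T) / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Weighted cross-entropy, averaged over the batch.
    return float(-(weights * log_prob).sum(axis=1).mean())
```

In a real training loop the first encoder would typically be held fixed and only the second encoder's parameters would receive gradients from this loss; that part is omitted here.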
-
Publication No.: US20240046946A1
Publication date: 2024-02-08
Application No.: US18058104
Filing date: 2022-11-22
Applicant: Samsung Electronics Co., Ltd.
Inventor: Chou-Chang Yang, Ching-Hua Lee, Rakshith Sharma Srinivasa, Yashas Malur Saidutta, Yilin Shen, Hongxia Jin
IPC: G10L21/0232, G10L15/06, G10L15/02, G10L25/18
CPC classification number: G10L21/0232, G10L15/063, G10L15/02, G10L25/18, G10L2021/02166
Abstract: A method includes obtaining, using at least one processing device, noisy speech signals and extracting, using the at least one processing device, acoustic features from the noisy speech signals. The method also includes receiving, using the at least one processing device, a predicted speech mask from a speech mask prediction model based on a first acoustic feature subset and receiving, using the at least one processing device, a predicted noise mask from a noise mask prediction model based on a second acoustic feature subset. The method further includes providing, using the at least one processing device, predicted speech features determined using the predicted speech mask and predicted noise features determined using the predicted noise mask to a filtering mask prediction model. In addition, the method includes generating, using the at least one processing device, a clean speech signal using a predicted filtering mask output by the filtering mask prediction model.
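
The data flow in this abstract (speech mask and noise mask feeding a filtering mask) can be sketched as follows. The three prediction models are not described here, so the sketch takes the two masks as inputs and replaces the filtering mask prediction model with a Wiener-like ratio; that substitution, the function name `enhance`, and the `eps` parameter are assumptions for illustration only.

```python
import numpy as np

def enhance(noisy_mag, speech_mask, noise_mask, eps=1e-8):
    """Apply predicted masks to a magnitude spectrogram.

    noisy_mag, speech_mask, noise_mask: arrays of the same shape
    (for example frequency bins x frames), masks in [0, 1].
    """
    # Predicted speech and noise features from the two masks.
    speech_feat = speech_mask * noisy_mag
    noise_feat = noise_mask * noisy_mag

    # Stand-in for the filtering mask prediction model (assumption):
    # a Wiener-like ratio of speech energy to total energy.
    filtering_mask = speech_feat**2 / (speech_feat**2 + noise_feat**2 + eps)

    # Estimated clean-speech magnitude.
    return filtering_mask * noisy_mag
```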
-
Publication No.: US11854528B2
Publication date: 2023-12-26
Application No.: US17402045
Filing date: 2021-08-13
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Yen-Chang Hsu, Yilin Shen, Avik Ray, Hongxia Jin
Abstract: An apparatus for detecting unsupported utterances in natural language understanding includes a memory storing instructions and at least one processor configured to execute the instructions to classify a feature that is extracted from an input utterance of a user as one of in-domain and out-of-domain (OOD) for a response to the input utterance, obtain an OOD score of the extracted feature, and identify whether the feature is classified as OOD. The at least one processor is further configured to execute the instructions to, based on the feature being identified to be classified as in-domain, identify whether the obtained OOD score is greater than a predefined threshold, and based on the OOD score being identified to be greater than the predefined threshold, re-classify the feature as OOD.
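
The two-stage decision described in this abstract reduces to a small piece of control flow. The sketch below assumes the classifier label and the OOD score are already available; the function name `detect_unsupported`, the label strings, and the default threshold of 0.5 are hypothetical.

```python
def detect_unsupported(classifier_label, ood_score, threshold=0.5):
    """Return "OOD" or "in-domain" for one extracted feature.

    classifier_label: "OOD" or "in-domain", from the first-stage classifier.
    ood_score: score obtained for the same feature (higher = more OOD).
    threshold: predefined threshold (0.5 is an arbitrary placeholder).
    """
    # Stage 1: trust the classifier when it already says OOD.
    if classifier_label == "OOD":
        return "OOD"
    # Stage 2: an in-domain feature is re-classified as OOD when its
    # score is greater than the predefined threshold.
    if ood_score > threshold:
        return "OOD"
    return "in-domain"
```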
-
Publication No.: US11720814B2
Publication date: 2023-08-08
Application No.: US16234433
Filing date: 2018-12-27
Applicant: Samsung Electronics Co., Ltd.
Inventor: Yilin Shen, Yue Deng, Hongxia Jin
IPC: G06N20/00, G06F18/24, G06V10/764, G06N3/049
CPC classification number: G06N20/00, G06F18/24, G06V10/764, G06N3/049
Abstract: A recognition method includes retrieving an input including data of a first window size. The method further includes classifying the input based on comparison of warping distance of the input with a pruning threshold.
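
One common way to combine a warping distance with a pruning threshold is early-abandoning dynamic time warping (DTW) inside a nearest-template classifier. The sketch below shows that general technique, not necessarily the method claimed in the patent; the names `dtw_distance` and `classify`, the absolute-difference cost, and the labeled-template layout are assumptions.

```python
import math

def dtw_distance(a, b, prune_above=math.inf):
    """DTW distance between two 1-D sequences.

    Returns math.inf as soon as every cell in a row exceeds prune_above,
    since the final distance can then no longer fall below it.
    """
    prev = [math.inf] * (len(b) + 1)
    prev[0] = 0.0
    for x in a:
        cur = [math.inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cur[j] = abs(x - y) + min(prev[j], prev[j - 1], cur[j - 1])
        if min(cur[1:], default=math.inf) > prune_above:
            return math.inf  # pruned: cannot beat the threshold
        prev = cur
    return prev[len(b)]

def classify(query, templates, pruning_threshold):
    """Return the label of the closest template within the threshold,
    or None if every template is pruned.

    templates: list of (label, sequence) pairs.
    """
    best_label, best = None, pruning_threshold
    for label, seq in templates:
        # Tighten the pruning bound to the best distance found so far.
        d = dtw_distance(query, seq, prune_above=best)
        if d <= best:
            best_label, best = label, d
    return best_label
```

Tightening the bound after each match means later templates are abandoned sooner, which is where most of the savings come from.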
-
Publication No.: US11687570B2
Publication date: 2023-06-27
Application No.: US16900664
Filing date: 2020-06-12
Applicant: Samsung Electronics Co., Ltd.
IPC: G06F16/28, G06N3/04, G06F16/2455, G06N5/025, G06V30/196, G06F18/214
CPC classification number: G06F16/288, G06F16/24564, G06F18/214, G06N3/04, G06N5/025, G06V30/1988
Abstract: A method, an electronic device, and a computer readable medium for entity-relationship embeddings using automatically generated entity graphs instead of a traditional knowledge graph are provided. The method includes receiving, by a processor, an input text. The method also includes identifying a primary entity, a secondary entity, and a context from the input text, wherein the context comprises a relationship between the primary entity and the secondary entity. The method additionally includes generating, by the processor, an entity context graph based on the primary entity, the secondary entity, and the context by: extracting, from the context, one or more text segments comprising a plurality of words describing one or more additional relationships between the primary entity and the secondary entity, and generating a plurality of context triples from the one or more text segments, each of the plurality of context triples defining a respective relationship between the primary entity and the secondary entity.
-
Publication No.: US20230177338A1
Publication date: 2023-06-08
Application No.: US18073383
Filing date: 2022-12-01
Applicant: Samsung Electronics Co., Ltd.
Inventor: Qian Lou, Yen-Chang Hsu, Burak Uzkent, Ting Hua, Yilin Shen, Hongxia Jin
IPC: G06N3/082, G06V10/82, G06V10/772
CPC classification number: G06N3/082, G06V10/82, G06V10/772
Abstract: A method includes obtaining, using a first electronic device, a weight matrix associated with a trained transformer model. The method also includes factorizing the weight matrix into a dictionary weight matrix and an intermediate matrix. The method further includes pruning the intermediate matrix to generate a sparse intermediate matrix. The method also includes fine-tuning the sparse intermediate matrix based on a training dataset to generate a fine-tuned sparse intermediate matrix. The method further includes determining an index matrix and a coefficient matrix based on the fine-tuned sparse intermediate matrix. In addition, the method includes deploying the dictionary weight matrix, the index matrix, and the coefficient matrix to a second electronic device without deploying the weight matrix to the second electronic device. A number of parameters in the dictionary weight matrix, the index matrix, and the coefficient matrix is smaller than a number of parameters in the weight matrix.
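
The storage layout described in this abstract (dictionary matrix, index matrix, coefficient matrix) can be sketched in NumPy. The abstract does not say how the factorization or pruning is done, so the truncated SVD, the magnitude-based pruning per column, and the names `compress_weight` and `reconstruct` are assumptions; the fine-tuning step is omitted entirely.

```python
import numpy as np

def compress_weight(W, rank, keep_per_column):
    """Factorize W (m x n) and keep a sparse intermediate matrix."""
    # Assumption: factorize W ~= D @ C with a truncated SVD.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    D = U[:, :rank] * s[:rank]   # dictionary weight matrix, shape (m, rank)
    C = Vt[:rank, :]             # intermediate matrix, shape (rank, n)

    # Assumption: prune by keeping the largest-magnitude entries per column.
    idx = np.argsort(-np.abs(C), axis=0)[:keep_per_column, :]  # index matrix
    coef = np.take_along_axis(C, idx, axis=0)                  # coefficient matrix
    return D, idx, coef

def reconstruct(D, idx, coef):
    """Rebuild the approximate weight matrix on the target device."""
    # Column j is the sum over t of coef[t, j] * D[:, idx[t, j]].
    return np.einsum("mkn,kn->mn", D[:, idx], coef)
```

For a 20 x 20 weight matrix with `rank=2` and `keep_per_column=1`, the three stored arrays hold 40 + 20 + 20 = 80 values instead of 400, at the cost of a coarser approximation.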
-
Publication No.: US20230073835A1
Publication date: 2023-03-09
Application No.: US17900126
Filing date: 2022-08-31
Applicant: Samsung Electronics Co., Ltd.
Inventor: Miao Yin, Burak Uzkent, Yilin Shen, Hongxia Jin
IPC: G06V10/70, G06V10/774, G06V10/776, G06V10/74
Abstract: In one embodiment, a method includes accessing a batch B of a plurality of images, wherein each image in the batch is part of a training set of images used to train a vision transformer comprising a plurality of attention heads. The method further includes determining, for each attention head A, a similarity between (1) the output of the attention head evaluated using each image in the batch and (2) the output of each attention head evaluated using each image in the batch. The method further includes determining, based on the determined similarities, an importance score for each attention head; and pruning, based on the importance scores, one or more attention heads from the vision transformer.
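
A minimal sketch of similarity-based head scoring follows. The abstract does not specify the similarity measure or how similarities map to importance, so the choices here are assumptions: cosine similarity over flattened batch outputs, and an importance score of one minus the mean similarity to the other heads (so redundant heads score low). The names `head_importance` and `heads_to_prune` are hypothetical.

```python
import numpy as np

def head_importance(head_outputs):
    """Score each attention head by how dissimilar it is to the others.

    head_outputs: array of shape (num_heads, batch, features), the output
    of every head evaluated on every image in the batch.
    """
    # Flatten each head's outputs over the batch and normalize.
    H = head_outputs.reshape(head_outputs.shape[0], -1)
    H = H / np.linalg.norm(H, axis=1, keepdims=True)

    # Pairwise cosine similarity between heads.
    sim = H @ H.T
    n = sim.shape[0]
    mean_other = (sim.sum(axis=1) - np.diag(sim)) / (n - 1)

    # Assumption: a head similar to many others is redundant.
    return 1.0 - mean_other

def heads_to_prune(head_outputs, num_prune):
    """Indices of the num_prune lowest-scoring heads, in ascending order."""
    scores = head_importance(head_outputs)
    return sorted(np.argsort(scores)[:num_prune].tolist())
```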
-
Publication No.: US11501753B2
Publication date: 2022-11-15
Application No.: US16728672
Filing date: 2019-12-27
Applicant: Samsung Electronics Co., Ltd.
Inventor: Yilin Shen, Avik Ray, Hongxia Jin
Abstract: A method includes receiving, from an electronic device, information defining a user utterance associated with a skill to be performed, where the skill is not recognized by a natural language understanding (NLU) engine. The method also includes receiving, from the electronic device, information defining one or more actions for performing the skill. The method further includes identifying, using at least one processor, one or more known skills having one or more slots that map to at least one word or phrase in the user utterance. The method also includes creating, using the at least one processor, a plurality of additional utterances based on the one or more mapped slots. In addition, the method includes training, using the at least one processor, the NLU engine using the plurality of additional utterances.
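
The step of creating additional utterances from mapped slots can be illustrated with simple slot-value substitution. This is a sketch of the general idea only: the function name `expand_utterance`, the slot dictionary layout, and the plain substring matching are assumptions, and a real NLU system would map slots with a tagger rather than string search.

```python
def expand_utterance(utterance, known_slots):
    """Generate additional training utterances by swapping slot values.

    known_slots: mapping of slot name -> list of known values, taken from
    skills the NLU engine already recognizes.
    """
    variants = []
    for slot, values in known_slots.items():
        for matched in values:
            # Assumption: a slot maps to the utterance if one of its
            # known values appears verbatim in the text.
            if matched in utterance:
                for other in values:
                    if other != matched:
                        variants.append(utterance.replace(matched, other))
    return variants
```

For example, `expand_utterance("order a pizza from Dominos", {"food": ["pizza", "burger", "sushi"]})` yields the two variants with "burger" and "sushi" substituted for "pizza".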
-
Publication No.: US11430124B2
Publication date: 2022-08-30
Application No.: US16946504
Filing date: 2020-06-24
Applicant: Samsung Electronics Co., Ltd.
Inventor: Dawei Li, Wenbo Li, Hongxia Jin
Abstract: A method includes training, using at least one processor, a specialized teacher model to perform visual object instance segmentation in order to segment and classify objects in first training images. The first training images contain foreground objects without backgrounds. The method also includes training, using the at least one processor, a student model to perform visual object instance segmentation in order to segment and classify objects in second training images. The second training images contain the foreground objects and the backgrounds. Training the student model includes using selected outputs of the specialized teacher model. The method further includes deploying the trained student model to perform visual object instance segmentation in an external device.
-
Publication No.: US11423225B2
Publication date: 2022-08-23
Application No.: US16946746
Filing date: 2020-07-02
Applicant: Samsung Electronics Co., Ltd.
Inventor: Yilin Shen, Xiangyu Zeng, Hongxia Jin
IPC: G10L15/16, G06F40/30, G06N3/04, G06F40/279
Abstract: A method includes obtaining, using at least one processor of an electronic device, a base model trained to perform natural language understanding. The method also includes generating, using the at least one processor, a first model expansion based on knowledge from the base model. The method further includes training, using the at least one processor, the first model expansion based on first utterances without modifying parameters of the base model. The method also includes receiving, using the at least one processor, an additional utterance from a user. In addition, the method includes determining, using the at least one processor, a meaning of the additional utterance using the base model and the first model expansion.