-
1.
Publication No.: WO2023081504A1
Publication date: 2023-05-11
Application No.: PCT/US2022/049251
Application date: 2022-11-08
Applicant: GENESYS CLOUD SERVICES, INC.
Inventor: HAIKIN, Lev; MAZZA, Arnon; ORBACH, Eyal; FAIZAKOF, Avraham
IPC: G10L15/197, G10L15/06
Abstract: A system and method of automatically discovering unigrams in a speech data element may include receiving a language model that includes a plurality of n-grams, where each n-gram includes one or more unigrams; applying an acoustic machine-learning (ML) model on one or more speech data elements to obtain a character distribution function; applying a greedy decoder on the character distribution function, to predict an initial corpus of unigrams; filtering out one or more unigrams of the initial corpus to obtain a corpus of candidate unigrams, where the candidate unigrams are not included in the language model; analyzing the one or more first speech data elements, to extract at least one n-gram that comprises a candidate unigram; and updating the language model to include the extracted at least one n-gram.
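The steps named in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes a CTC-style greedy decoder (most probable character per frame, repeats collapsed, blanks dropped), which the abstract does not specify, and every name in it (`BLANK`, `greedy_decode`, `candidate_unigrams`, `extract_ngrams`) is hypothetical.

```python
from typing import Dict, Iterable, Sequence, Set

BLANK = "_"  # hypothetical blank symbol; a CTC-style decoder is an assumption


def greedy_decode(frames: Sequence[Dict[str, float]]) -> str:
    """Take the most probable character per frame, collapse repeats, drop blanks."""
    out = []
    prev = None
    for dist in frames:
        best = max(dist, key=dist.get)
        if best != prev and best != BLANK:
            out.append(best)
        prev = best
    return "".join(out)


def candidate_unigrams(transcripts: Iterable[str], lm_ngrams: Iterable[str]) -> Set[str]:
    """Keep only decoded unigrams that do not appear in any n-gram of the language model."""
    known = {word for ngram in lm_ngrams for word in ngram.split()}
    initial = {word for text in transcripts for word in text.split()}
    return initial - known


def extract_ngrams(transcripts: Iterable[str], candidates: Set[str], n: int = 2) -> Set[str]:
    """Collect n-grams from the transcripts that contain at least one candidate unigram."""
    found = set()
    for text in transcripts:
        words = text.split()
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if any(word in candidates for word in gram):
                found.add(" ".join(gram))
    return found


# Toy usage: the language model is represented as a plain set of n-gram strings.
language_model = {"reset my", "my password"}
decoded = ["reset my zoomba"]
new_words = candidate_unigrams(decoded, language_model)
language_model |= extract_ngrams(decoded, new_words)
```

In this toy run the unknown word `zoomba` becomes the only candidate unigram, and the bigram `my zoomba` is added to the language model. A real system would decode from an acoustic model's per-frame output instead of hand-written transcripts.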
-
Publication No.: WO2022240405A1
Publication date: 2022-11-17
Application No.: PCT/US2021/032007
Application date: 2021-05-12
Applicant: GENESYS CLOUD SERVICES, INC.
Inventor: ORBACH, Eyal; FAIZAKOF, Avraham; MAZZA, Arnon; HAIKIN, Lev
IPC: G06F40/258
Abstract: A method and system for automatic topic detection in text may include receiving a text document of a corpus of documents and extracting one or more phrases from the document, based on one or more syntactic patterns. For each phrase, embodiments of the invention may: apply a word embedding neural network on one or more words of the phrase, to obtain one or more respective word embedding vectors; calculate a weighted phrase embedding vector, and compute a phrase saliency score, based on the weighted phrase embedding vector. Embodiments of the invention may subsequently produce one or more topic labels, representing one or more respective topics in the document, based on the computed phrase saliency scores, and may select one or more topic labels according to their relevance to the business domain of the corpus.
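The phrase-scoring step in this abstract can be sketched as follows. The abstract does not say how the weights or the saliency score are computed, so this sketch assumes per-word weights supplied by the caller and defines saliency as cosine similarity to a reference vector (for example, a document centroid). Both choices, and all function names, are assumptions made for illustration only.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


def weighted_phrase_embedding(words: Sequence[str],
                              embeddings: Dict[str, Vector],
                              weights: Dict[str, float]) -> Vector:
    """Weighted average of the word vectors of a phrase (unknown words are skipped)."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for word in words:
        if word not in embeddings:
            continue
        w = weights.get(word, 1.0)  # default weight of 1.0 is an assumption
        for i, x in enumerate(embeddings[word]):
            total[i] += w * x
        weight_sum += w
    if weight_sum == 0.0:
        return total
    return [x / weight_sum for x in total]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def saliency(phrase_vec: Vector, reference_vec: Vector) -> float:
    """Hypothetical saliency score: similarity of the phrase to a reference vector."""
    return cosine(phrase_vec, reference_vec)


def top_topics(phrases: Sequence[str], embeddings: Dict[str, Vector],
               weights: Dict[str, float], reference: Vector, k: int = 1) -> List[str]:
    """Rank phrases by saliency and return the k highest-scoring ones as topic labels."""
    scored = [
        (saliency(weighted_phrase_embedding(p.split(), embeddings, weights), reference), p)
        for p in phrases
    ]
    scored.sort(reverse=True)
    return [p for _, p in scored[:k]]
```

In practice the toy `embeddings` dictionary would be replaced by the output of a word embedding network, and the reference vector would encode whatever notion of relevance to the business domain the system uses.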
-
Publication No.: WO2022240404A1
Publication date: 2022-11-17
Application No.: PCT/US2021/031991
Application date: 2021-05-12
Applicant: GENESYS CLOUD SERVICES, INC.
Inventor: MAZZA, Arnon; HAIKIN, Lev; ORBACH, Eyal; FAIZAKOF, Avraham
IPC: G06F40/30, G06F18/2113, G06F18/217, G06F18/241, G06F18/2431, G06F18/40, G06N20/20
Abstract: A method and system for finetuning automated sentiment classification by at least one processor may include: receiving a first machine learning (ML) model M0, pretrained to perform automated sentiment classification of utterances, based on a first annotated training dataset; associating one or more instances of model M0 to one or more corresponding sites; and for one or more (e.g., each) ML model M0 instance and/or site: receiving at least one utterance via the corresponding site; obtaining at least one data element of annotated feedback, corresponding to the at least one utterance; retraining the ML model M0, to produce a second ML model M1, based on a second annotated training dataset, wherein the second annotated training dataset may include the first annotated training dataset and the at least one annotated feedback data element; and using the second ML model M1, to classify utterances according to one or more sentiment classes.
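The retraining loop in this abstract (M0 trained on a first dataset, M1 retrained on that dataset plus site feedback) can be sketched with a deliberately tiny classifier. The word-count model below is a stand-in chosen to keep the example self-contained; the abstract does not describe the model architecture, and `train`, `classify`, and `retrain_for_site` are hypothetical names.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Example = Tuple[str, str]      # (utterance, sentiment label)
Model = Dict[str, Counter]     # per-label word counts; a toy stand-in for an ML model


def train(dataset: List[Example]) -> Model:
    """Build per-label word counts from annotated utterances."""
    model = defaultdict(Counter)
    for text, label in dataset:
        model[label].update(text.lower().split())
    return dict(model)


def classify(model: Model, utterance: str) -> str:
    """Return the label whose training words overlap most with the utterance."""
    words = utterance.lower().split()
    return max(model, key=lambda label: sum(model[label][w] for w in words))


def retrain_for_site(first_dataset: List[Example], feedback: List[Example]) -> Model:
    """Produce M1 from the first dataset plus the site's annotated feedback."""
    return train(first_dataset + feedback)


# First annotated dataset and the pretrained model M0 shared by all sites.
first_dataset = [
    ("terrible awful sick of waiting", "negative"),
    ("great helpful agent", "positive"),
]
m0 = train(first_dataset)

# Annotated feedback collected at one site, where "sick" is used as praise.
site_feedback = [("that was sick", "positive"), ("sick deal", "positive")]
m1 = retrain_for_site(first_dataset, site_feedback)
```

Here M0 labels "that was sick" as negative, while the site-specific M1 labels it positive after retraining on the combined dataset. Each site would keep its own M1, trained on its own feedback.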
-