-
Publication No.: US11568305B2
Publication date: 2023-01-31
Application No.: US16379110
Application date: 2019-04-09
Inventor: Sapna Negi , Maciej Dabrowski , Aravind Ganapathiraju , Emir Munoz , Veera Elluru Raghavendra , Felix Immanuel Wyss
Abstract: A system and method are presented for customer journey event representation learning and outcome prediction using neural sequence models. A plurality of events are input into a module where each event has a schema comprising characteristics of the events and their modalities (web clicks, calls, emails, chats, etc.). The events of different modalities can be captured using different schemas and therefore embodiments described herein are schema-agnostic. Each event is represented as a vector of some number of numbers by the module with a plurality of vectors being generated in total for each customer visit. The vectors are then used in sequence learning to predict real-time next best actions or outcome probabilities in a customer journey using machine learning algorithms such as recurrent neural networks.
-
Publication No.: US20190355348A1
Publication date: 2019-11-21
Application No.: US16414885
Application date: 2019-05-17
Inventor: Ramasubramanian Sundaram , Aravind Ganapathiraju , Yingyi Tan
Abstract: A system and method are presented for a multiclass approach for confidence modeling in automatic speech recognition systems. A confidence model may be trained offline using supervised learning. A decoding module is utilized within the system that generates features for audio files in audio data. The features are used to generate a hypothesized segment of speech which is compared to a known segment of speech using edit distances. Comparisons are labeled from one of a plurality of output classes. The labels correspond to the degree to which speech is converted to text correctly or not. The trained confidence models can be applied in a variety of systems, including interactive voice response systems, keyword spotters, and open-ended dialog systems.
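As a rough illustration of the labeling step described in this abstract, the sketch below compares a hypothesized transcript to a known one using word-level edit distance and maps the result to one of several output classes. The function names, the three class names, and the 0.5 threshold are assumptions made for illustration; the abstract does not specify them.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def confidence_label(ref_text, hyp_text):
    """Map an edit-distance rate to a class label (hypothetical classes)."""
    ref = ref_text.split()
    hyp = hyp_text.split()
    rate = edit_distance(ref, hyp) / max(len(ref), 1)
    if rate == 0:
        return "correct"
    if rate <= 0.5:  # assumed threshold, not taken from the patent
        return "partially_correct"
    return "incorrect"
```

Labels produced this way could serve as supervised training targets for a confidence model; the choice of classes and thresholds would depend on the application.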
-
Publication No.: US11574642B2
Publication date: 2023-02-07
Application No.: US16915160
Application date: 2020-06-29
Abstract: A system and method are presented for the correction of packet loss in audio in automatic speech recognition (ASR) systems. Packet loss correction, as presented herein, occurs at the recognition stage without modifying any of the acoustic models generated during training. The behavior of the ASR engine in the absence of packet loss is thus not altered. To accomplish this, the actual input signal may be rectified, the recognition scores may be normalized to account for signal errors, and a best-estimate method using information from previous frames and acoustic models may be used to replace the noisy signal.
-
Publication No.: US11211065B2
Publication date: 2021-12-28
Application No.: US16265148
Application date: 2019-02-01
Inventor: Tejas Godambe , Aravind Ganapathiraju
Abstract: A system and method are presented for the automatic filtering of test utterance mismatches in automatic speech recognition (ASR) systems. Test data are evaluated for match between audio and text in a language-independent manner. Utterances having mismatch are identified and isolated for either removal or manual verification to prevent incorrect measurements of the ASR system performance. In an embodiment, contiguous stretches of low probabilities in every utterance are searched for and removed. Such segments may be intra-word or cross-word. In another embodiment, scores may be determined using log DNN probability for every word in each utterance. Words may be sorted in the order of the scores and those utterances containing the least word scores are removed.
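The second embodiment mentioned in this abstract (scoring every word, sorting by score, and removing utterances that contain the lowest-scoring words) can be sketched as follows. This is a minimal sketch under stated assumptions: the per-word log probabilities are taken as given input, and the function name, data layout, and `num_lowest` cutoff are hypothetical.

```python
def filter_utterances(word_scores, num_lowest):
    """Flag utterances that contain any of the num_lowest lowest-scoring words.

    word_scores maps an utterance id to a list of (word, log_prob) pairs.
    Returns (kept_ids, flagged_ids).
    """
    # Rank every word in the test set by its score, lowest first.
    ranked = sorted(
        (score, utt_id)
        for utt_id, words in word_scores.items()
        for _, score in words
    )
    # Utterances owning the lowest-scoring words are treated as mismatched.
    flagged = {utt_id for _, utt_id in ranked[:num_lowest]}
    kept = [utt_id for utt_id in word_scores if utt_id not in flagged]
    return kept, sorted(flagged)


scores = {
    "utt1": [("hello", -0.2), ("world", -0.3)],
    "utt2": [("good", -0.1), ("morning", -9.5)],
    "utt3": [("thank", -0.4), ("you", -0.2)],
}
kept, flagged = filter_utterances(scores, num_lowest=1)
```

Flagged utterances could then be removed or sent for manual verification, as the abstract describes.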
-
Publication No.: US20190392815A1
Publication date: 2019-12-26
Application No.: US16448384
Application date: 2019-06-21
Inventor: Elluru Veera Raghavendra , Aravind Ganapathiraju
Abstract: A system and method are presented for F0 transfer learning for improving F0 prediction with deep neural network models. Larger models are trained using long short-term memory (LSTM) and multi-layer perceptron (MLP) feed-forward hidden layer modeling. The fundamental frequency values for voiced and unvoiced segments are identified and extracted from the larger models. The values for voiced regions are transferred and applied to training a smaller model and the smaller model is applied in the text to speech system for real-time speech synthesis output.
-
Publication No.: US20180286385A1
Publication date: 2018-10-04
Application No.: US16000742
Application date: 2018-06-05
Inventor: Aravind Ganapathiraju , Yingyi Tan , Felix Immanuel Wyss , Scott Allen Randal
CPC classification number: G10L15/01 , G10L2015/088
Abstract: A system and method are presented for predicting speech recognition performance using accuracy scores in speech recognition systems within the speech analytics field. A keyword set is selected. Figure of Merit (FOM) is computed for the keyword set. Relevant features that describe the word individually and in relation to other words in the language are computed. A mapping from these features to FOM is learned. This mapping can be generalized via a suitable machine learning algorithm and be used to predict FOM for a new keyword. In at least one embodiment, the predicted FOM may be used to adjust internals of the speech recognition engine to achieve a consistent behavior for all inputs for various settings of confidence values.
-
Publication No.: US11714965B2
Publication date: 2023-08-01
Application No.: US16677989
Application date: 2019-11-08
Inventor: Felix Immanuel Wyss , Aravind Ganapathiraju , Pavan Buduguppa
IPC: G06F40/295 , H04L51/02 , G06F40/253 , G06N3/044 , G06N3/08
CPC classification number: G06F40/295 , G06F40/253 , G06N3/044 , H04L51/02 , G06N3/08
Abstract: A system and method are presented for model derivation for entity prediction. An LSTM with 100 memory cells is used in the system architecture. Sentences are truncated and provided with feature information to a named-entity recognition model. A forward and a backward pass of the LSTM are performed, and each pass is concatenated. The concatenated bi-directional LSTM encodings are obtained for the various features for each word. A fully connected set of neurons shared across all encoded words is obtained and the final encoded outputs with dimensions equal to the number of entities is determined.
-
Publication No.: US11302307B2
Publication date: 2022-04-12
Application No.: US16448384
Application date: 2019-06-21
Inventor: Elluru Veera Raghavendra , Aravind Ganapathiraju
Abstract: A system and method are presented for F0 transfer learning for improving F0 prediction with deep neural network models. Larger models are trained using long short-term memory (LSTM) and multi-layer perceptron (MLP) feed-forward hidden layer modeling. The fundamental frequency values for voiced and unvoiced segments are identified and extracted from the larger models. The values for voiced regions are transferred and applied to training a smaller model and the smaller model is applied in the text to speech system for real-time speech synthesis output.
-
Publication No.: US20200327444A1
Publication date: 2020-10-15
Application No.: US16379110
Application date: 2019-04-09
Inventor: Sapna Negi , Maciej Dabrowski , Aravind Ganapathiraju , Emir Munoz , Veera Elluru Raghavendra , Felix Immanuel Wyss
Abstract: A system and method are presented for customer journey event representation learning and outcome prediction using neural sequence models. A plurality of events are input into a module where each event has a schema comprising characteristics of the events and their modalities (web clicks, calls, emails, chats, etc.). The events of different modalities can be captured using different schemas and therefore embodiments described herein are schema-agnostic. Each event is represented as a vector of some number of numbers by the module with a plurality of vectors being generated in total for each customer visit. The vectors are then used in sequence learning to predict real-time next best actions or outcome probabilities in a customer journey using machine learning algorithms such as recurrent neural networks.
-
Publication No.: US10789962B2
Publication date: 2020-09-29
Application No.: US16186851
Application date: 2018-11-12
Abstract: A system and method are presented for the correction of packet loss in audio in automatic speech recognition (ASR) systems. Packet loss correction, as presented herein, occurs at the recognition stage without modifying any of the acoustic models generated during training. The behavior of the ASR engine in the absence of packet loss is thus not altered. To accomplish this, the actual input signal may be rectified, the recognition scores may be normalized to account for signal errors, and a best-estimate method using information from previous frames and acoustic models may be used to replace the noisy signal.