-
Publication No.: US20210407511A1
Publication date: 2021-12-30
Application No.: US16910717
Filing date: 2020-06-24
Inventor: Felix Immanuel Wyss, Conor P. McGann
Abstract: A method for selectively transcribing voice communications that includes: receiving keywords; receiving an audio stream of audio data of speech; searching the audio stream to detect keywords or keyword detections and recording parameter data for each that includes a location of the keyword within the audio stream; generating one or more cumulative datasets for one or more portions of the audio stream that each includes parameter data for the keyword detections occurring therein; for each of the one or more portions of the audio stream, calculating a transcription favorableness score via inputting the corresponding one of the one or more cumulative datasets into an algorithm; and determining whether to transcribe each of the one or more portions of the audio stream by comparing the corresponding transcription favorableness score against a predetermined threshold.
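The selection step described in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the `Detection` fields, the fixed-length portions, the per-keyword weights, and the weighted-sum scoring are all assumptions made for illustration, since the abstract only states that a cumulative dataset is fed into "an algorithm" and the result is compared against a predetermined threshold.

```python
import math
from dataclasses import dataclass


@dataclass
class Detection:
    keyword: str
    location_s: float  # offset of the keyword within the audio stream, in seconds
    confidence: float  # hypothetical detector confidence in [0, 1]


def cumulative_datasets(detections, stream_len_s, portion_len_s):
    """Group keyword detections by the fixed-length portion they fall in."""
    n = max(1, math.ceil(stream_len_s / portion_len_s))
    portions = [[] for _ in range(n)]
    for d in detections:
        idx = min(int(d.location_s // portion_len_s), n - 1)
        portions[idx].append(d)
    return portions


def favorableness(dataset, weights):
    """Assumed scoring rule: weighted sum of detection confidences."""
    return sum(weights.get(d.keyword, 1.0) * d.confidence for d in dataset)


def select_portions(detections, stream_len_s, portion_len_s, weights, threshold):
    """Return one boolean per portion: True means 'transcribe this portion'."""
    datasets = cumulative_datasets(detections, stream_len_s, portion_len_s)
    return [favorableness(ds, weights) >= threshold for ds in datasets]


detections = [
    Detection("refund", 5.0, 0.9),
    Detection("cancel", 12.0, 0.8),
    Detection("refund", 65.0, 0.4),
]
decisions = select_portions(
    detections,
    stream_len_s=120.0,
    portion_len_s=60.0,
    weights={"refund": 1.0, "cancel": 2.0},
    threshold=1.0,
)
print(decisions)
```

With these made-up inputs, only the first 60-second portion scores above the threshold, so only that portion would be sent for transcription.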
-
Publication No.: US11134155B1
Publication date: 2021-09-28
Application No.: US17139033
Filing date: 2020-12-31
Inventor: Felix Immanuel Wyss, Ramasubramanian Sundaram, Aravind Ganapathiraju
Abstract: A method for automated generation of contact center system embeddings according to one embodiment includes determining, by a computing system, contact center system agents, contact center system agent skills, and/or contact center system virtual queue experiences; generating, by the computing system, a matrix representation based on the contact center system agents, the contact center system agent skills, and/or the contact center system virtual queue experiences; generating, by the computing system and based on the matrix representation, contact center system agent identifiers, contact center system agent skills identifiers, and/or contact center system virtual queue identifiers; transforming, by the computing system, the contact center system agent identifiers, the contact center system agent skills identifiers, and/or the contact center system virtual queue identifiers into the contact center system agent embeddings, contact center system agent skills embeddings, and/or contact center system virtual queue embeddings, wherein weights of the contact center system agent embeddings, the contact center system agent skills embeddings, and/or the contact center system virtual queue embeddings are randomly initialized; and training, by the computing system, the contact center system agent embeddings, the contact center system agent skills embeddings, and/or the contact center system virtual queue embeddings by applying machine learning to obtain final weights of the contact center system agent embeddings, the contact center system agent skills embeddings, and/or the contact center system virtual queue embeddings.
-
Publication No.: US10360898B2
Publication date: 2019-07-23
Application No.: US16000742
Filing date: 2018-06-05
Inventor: Aravind Ganapathiraju, Yingyi Tan, Felix Immanuel Wyss, Scott Allen Randal
Abstract: A system and method are presented for predicting speech recognition performance using accuracy scores in speech recognition systems within the speech analytics field. A keyword set is selected. Figure of Merit (FOM) is computed for the keyword set. Relevant features that describe the word individually and in relation to other words in the language are computed. A mapping from these features to FOM is learned. This mapping can be generalized via a suitable machine learning algorithm and be used to predict FOM for a new keyword. In at least one embodiment, the predicted FOM may be used to adjust internals of the speech recognition engine to achieve consistent behavior for all inputs for various settings of confidence values.
-
Publication No.: US20180343180A1
Publication date: 2018-11-29
Application No.: US16055118
Filing date: 2018-08-05
Inventor: Richard M. Neidermyer, Kevin Elliott King, Felix Immanuel Wyss
CPC classification number: H04L43/0811, H04L63/0428, H04L67/2842, H04L69/40
Abstract: A system and method are presented for on premises and offline survivability of an interactive voice response system in a cloud telephony system. Voice interaction control may be divided from the media resources. Survivability is invoked when the communication technology between the Cloud and the voice interaction's resource provider is degraded or disrupted. The system is capable of recovering after a disruption event such that a seamless transition between failure and non-failure states is provided for a limited impact to a user's experience. When communication paths or Cloud control is re-established, the user resumes normal processing and full functionality as if the failure had not occurred.
-
Publication No.: US11574642B2
Publication date: 2023-02-07
Application No.: US16915160
Filing date: 2020-06-29
Abstract: A system and method are presented for the correction of packet loss in audio in automatic speech recognition (ASR) systems. Packet loss correction, as presented herein, occurs at the recognition stage without modifying any of the acoustic models generated during training. The behavior of the ASR engine in the absence of packet loss is thus not altered. To accomplish this, the actual input signal may be rectified, the recognition scores may be normalized to account for signal errors, and a best-estimate method using information from previous frames and acoustic models may be used to replace the noisy signal.
-
Publication No.: US11468897B2
Publication date: 2022-10-11
Application No.: US16910717
Filing date: 2020-06-24
Inventor: Felix Immanuel Wyss, Conor P. McGann
Abstract: A method for selectively transcribing voice communications that includes: receiving keywords; receiving an audio stream of audio data of speech; searching the audio stream to detect keywords or keyword detections and recording parameter data for each that includes a location of the keyword within the audio stream; generating one or more cumulative datasets for one or more portions of the audio stream that each includes parameter data for the keyword detections occurring therein; for each of the one or more portions of the audio stream, calculating a transcription favorableness score via inputting the corresponding one of the one or more cumulative datasets into an algorithm; and determining whether to transcribe each of the one or more portions of the audio stream by comparing the corresponding transcription favorableness score against a predetermined threshold.
-
Publication No.: US11294955B2
Publication date: 2022-04-05
Application No.: US16378452
Filing date: 2019-04-08
Inventor: Srinath Cheluvaraja, Ananth Nagaraja Iyer, Felix Immanuel Wyss
IPC: G06F16/683, G10L25/51
Abstract: A system and method are presented for optimization of audio fingerprint search. In an embodiment, the audio fingerprints are organized into a recursive tree with different branches containing fingerprint sets that are dissimilar to each other. The tree is constructed using a clustering algorithm based on a similarity measure. The similarity measure may comprise a Hamming distance for a binary fingerprint or a Euclidean distance for continuous valued fingerprints. In another embodiment, each fingerprint is stored at a plurality of resolutions and clustering is performed hierarchically. The recognition of an incoming fingerprint begins from the root of the tree and proceeds down its branches until a match or mismatch is declared. In yet another embodiment, a fingerprint definition is generalized to include more detailed audio information than in the previous definition.
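The tree-based search described in this abstract can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the patented construction: fingerprints are modelled as plain integers whose bits form a binary fingerprint, the clustering step is a simple two-pivot split by Hamming distance (standing in for whatever clustering algorithm an embodiment uses), and `leaf_size` and `max_distance` are hypothetical parameters.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary fingerprints."""
    return bin(a ^ b).count("1")


def build_tree(fps, leaf_size=2):
    """Recursively split fingerprints into two dissimilar branches."""
    if len(fps) <= leaf_size:
        return {"leaf": list(fps)}
    a = fps[0]
    b = max(fps, key=lambda f: hamming(a, f))  # fingerprint farthest from a
    if hamming(a, b) == 0:  # all fingerprints identical, nothing to split
        return {"leaf": list(fps)}
    left = [f for f in fps if hamming(f, a) <= hamming(f, b)]
    right = [f for f in fps if hamming(f, a) > hamming(f, b)]
    return {
        "pivots": (a, b),
        "children": (build_tree(left, leaf_size), build_tree(right, leaf_size)),
    }


def search(tree, query, max_distance):
    """Descend from the root toward the closer pivot, then match in the leaf.

    Returns the matching fingerprint, or None when a mismatch is declared.
    The greedy descent is approximate: it inspects only one leaf.
    """
    node = tree
    while "leaf" not in node:
        a, b = node["pivots"]
        go_left = hamming(query, a) <= hamming(query, b)
        node = node["children"][0 if go_left else 1]
    best = min(node["leaf"], key=lambda f: hamming(query, f), default=None)
    if best is not None and hamming(query, best) <= max_distance:
        return best
    return None


fingerprints = [0b00000000, 0b00000001, 0b11111111, 0b11111110, 0b00001111]
tree = build_tree(fingerprints)
match = search(tree, 0b11111100, max_distance=2)
miss = search(tree, 0b00110011, max_distance=1)
print(match, miss)
```

For continuous-valued fingerprints, the abstract names Euclidean distance as the similarity measure; in this sketch that would mean swapping out `hamming` for a Euclidean distance over feature vectors.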
-
Publication No.: US20190236101A1
Publication date: 2019-08-01
Application No.: US16378452
Filing date: 2019-04-08
Inventor: Srinath Cheluvaraja, Ananth Nagaraja Iyer, Felix Immanuel Wyss
IPC: G06F16/683, G10L25/51
CPC classification number: G06F16/683, G10L25/51
Abstract: A system and method are presented for optimization of audio fingerprint search. In an embodiment, the audio fingerprints are organized into a recursive tree with different branches containing fingerprint sets that are dissimilar to each other. The tree is constructed using a clustering algorithm based on a similarity measure. The similarity measure may comprise a Hamming distance for a binary fingerprint or a Euclidean distance for continuous valued fingerprints. In another embodiment, each fingerprint is stored at a plurality of resolutions and clustering is performed hierarchically. The recognition of an incoming fingerprint begins from the root of the tree and proceeds down its branches until a match or mismatch is declared. In yet another embodiment, a fingerprint definition is generalized to include more detailed audio information than in the previous definition.
-
Publication No.: US20180286385A1
Publication date: 2018-10-04
Application No.: US16000742
Filing date: 2018-06-05
Inventor: Aravind Ganapathiraju, Yingyi Tan, Felix Immanuel Wyss, Scott Allen Randal
CPC classification number: G10L15/01, G10L2015/088
Abstract: A system and method are presented for predicting speech recognition performance using accuracy scores in speech recognition systems within the speech analytics field. A keyword set is selected. Figure of Merit (FOM) is computed for the keyword set. Relevant features that describe the word individually and in relation to other words in the language are computed. A mapping from these features to FOM is learned. This mapping can be generalized via a suitable machine learning algorithm and be used to predict FOM for a new keyword. In at least one embodiment, the predicted FOM may be used to adjust internals of the speech recognition engine to achieve consistent behavior for all inputs for various settings of confidence values.
-
Publication No.: US11694697B2
Publication date: 2023-07-04
Application No.: US16915160
Filing date: 2020-06-29
CPC classification number: G10L19/005, G10L15/08, G10L15/14, G10L15/142, G10L15/20, G10L15/02, G10L25/18, G10L25/21, G10L2015/025, G10L2019/0012
Abstract: A system and method are presented for the correction of packet loss in audio in automatic speech recognition (ASR) systems. Packet loss correction, as presented herein, occurs at the recognition stage without modifying any of the acoustic models generated during training. The behavior of the ASR engine in the absence of packet loss is thus not altered. To accomplish this, the actual input signal may be rectified, the recognition scores may be normalized to account for signal errors, and a best-estimate method using information from previous frames and acoustic models may be used to replace the noisy signal.
-