Abstract:
A method and apparatus for improving speech recognition results for an audio signal captured within an organization, comprising: receiving the audio signal captured by a capturing or logging device; extracting a phonetic feature and an acoustic feature from the audio signal; decoding the phonetic feature into a phonetic searchable structure; storing the phonetic searchable structure and the acoustic feature in an index; performing a phonetic search for a word or a phrase in the phonetic searchable structure to obtain a result; and activating an audio analysis engine that receives the acoustic feature to validate the result and obtain an enhanced result.
Abstract:
A method and system for indicating in real time that an interaction is associated with a problem or issue, comprising: receiving a segment of an interaction in which a representative of the organization participates; extracting a feature from the segment; extracting a global feature associated with the interaction; aggregating the feature and the global feature; and classifying the segment or the interaction in association with the problem or issue by applying a model to the feature and the global feature. The method and system may also use features extracted from earlier segments within the interaction. The method and system can also evaluate the model based on features extracted from training interactions and manual tagging assigned to the interactions or segments thereof.
Abstract:
The subject matter discloses a method for two-phase phonetic indexing and search, comprising: receiving a digital representation of an audio signal; producing a phonetic index of the audio signal; producing a phonetic N-gram sequence from the phonetic index by segmenting the phonetic index into a plurality of phonetic N-grams; and producing an inverted index of the plurality of phonetic N-grams.
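The indexing steps in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names, the use of overlapping N-grams with N = 3, and the idea that the inverted index serves as a first-phase candidate filter are all assumptions made for the example.

```python
from collections import defaultdict


def phonetic_ngrams(phonemes, n=3):
    """Segment a phoneme sequence into overlapping N-grams (assumed overlap)."""
    return [tuple(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1)]


def build_inverted_index(phonetic_indexes, n=3):
    """Map each phonetic N-gram to the (audio_id, position) pairs where it occurs."""
    inverted = defaultdict(list)
    for audio_id, phonemes in phonetic_indexes.items():
        for pos, gram in enumerate(phonetic_ngrams(phonemes, n)):
            inverted[gram].append((audio_id, pos))
    return dict(inverted)


def candidate_audio(inverted, query_phonemes, n=3):
    """Hypothetical first-phase search: ids sharing at least one N-gram with the query."""
    hits = set()
    for gram in phonetic_ngrams(query_phonemes, n):
        hits.update(audio_id for audio_id, _ in inverted.get(gram, []))
    return hits
```

Under these assumptions, a second phase would run a finer phonetic match only on the audio ids returned by `candidate_audio`.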
Abstract:
A method and apparatus combining the advantages of phonetic search, such as rapid implementation and deployment and medium accuracy, comprising steps and components for: receiving the audio signal captured in the call center environment; extracting a multiplicity of feature vectors from the audio signal; creating a phoneme lattice from the multiplicity of feature vectors, wherein the phoneme lattice comprises one or more allophones and each allophone comprises two or more phonemes; creating a hybrid phoneme-word lattice from the phoneme lattice; and extracting the word by analyzing the hybrid phoneme-word lattice.
Abstract:
A method and apparatus for analyzing and segmenting a vocal interaction captured in a test audio source, the test audio source captured within an environment. The method and apparatus first use text and acoustic features extracted from the interaction with tagging information, for constructing a model. Then, at production time, text and acoustic features are extracted from the interactions, and by applying the model, tagging information is retrieved for the interaction, enabling analysis, flow visualization or further processing of the interaction.
Abstract:
In a multi-lingual environment, a method and apparatus for determining a language spoken in a speech utterance. The method and apparatus test acoustic feature vectors extracted from the utterances against acoustic models associated with one or more of the languages. Speech to text is then performed for the language indicated by the acoustic testing, followed by textual verification of the resulting text. During verification, the resulting text is processed by language specific NLP and verified against textual models associated with the language. The system is self-learning, i.e., once a language is verified or rejected, the relevant feature vectors are used for enhancing one or more acoustic models associated with one or more languages, so that acoustic determination may improve.
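The acoustic-then-textual flow described in this abstract might look like the sketch below. Everything named here is hypothetical: the scorers, the `transcribe` and `verify_text` callables stand in for the acoustic models, the speech-to-text engine, and the language-specific textual verification. Falling back to the next-best language after a rejection is an assumption, and the self-learning model update is left out.

```python
def identify_language(feature_vectors, acoustic_scorers, transcribe, verify_text):
    """Return (language, text) for the first acoustically ranked language
    whose transcript passes textual verification, or (None, None)."""
    # Rank candidate languages by their acoustic model score, best first.
    ranked = sorted(
        acoustic_scorers,
        key=lambda lang: acoustic_scorers[lang](feature_vectors),
        reverse=True,
    )
    for lang in ranked:
        text = transcribe(feature_vectors, lang)  # speech to text for this language
        if verify_text(text, lang):               # textual verification step
            return lang, text
    return None, None
```

In a fuller system, the accept or reject decision for each language would also feed back into the acoustic models, as the abstract notes.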
Abstract:
Methods and apparatus for the enhancement of speech to text engines, by providing indications of the correctness of the found words, based on additional sources besides the internal indication provided by the STT engine. The enhanced indications comprise sources of data such as acoustic features, CTI features, phonetic search and others. The apparatus and methods also enable the detection of important or significant keywords found in audio files, thus enabling more efficient use, such as further processing or transfer of interactions to relevant agents, escalation of issues, or the like. The methods and apparatus employ a training phase in which a word model and a key phrase model are generated for determining an enhanced correctness indication for a word and an enhanced importance indication for a key phrase, based on the additional features.
Abstract:
The disclosed method and apparatus combine interactions and transactions in order to detect fraudulent acts or fraud attempts. In one embodiment, one or more interactions are correlated with one or more transactions, the interactions and transactions are combined, and features are extracted from the combined structure. The features are compared against one or more profiles, and a combined risk score is determined for the interactions or transactions. If the risk score exceeds a predetermined threshold, a preventive or corrective action can be taken. In another embodiment, behavioral characteristics extracted from one or more interactions associated with a transaction are combined with a risk score obtained by analyzing the transaction. The behavioral characteristics are used to enhance the suspicion level related to a transaction being fraudulent, and to enable the taking of measures related to the transaction or to the person handling the transaction. The combination thus enables better assessment of whether a particular interaction or transaction is fraudulent, and therefore provides for better detection or prevention of such activities. In addition, making the fraud assessment more reliable enables more efficient allocation of personnel for monitoring the transactions and interactions, better usage of communication time by avoiding lengthy identification where not required, and generally higher efficiency.
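The profile comparison and threshold test in the first embodiment can be illustrated with a small sketch. The scoring formula (a weighted mean of clipped absolute deviations), the feature names, and the default threshold are all assumptions chosen for the example; the abstract does not specify how the risk score is computed.

```python
def combined_risk_score(features, profile, weights):
    """Weighted mean of per-feature deviations from a profile, each clipped to [0, 1].

    Assumes features and profile values are already normalized to [0, 1].
    """
    total_weight = sum(weights.values())
    score = 0.0
    for name, weight in weights.items():
        deviation = min(abs(features[name] - profile[name]), 1.0)
        score += weight * deviation
    return score / total_weight


def needs_action(score, threshold=0.5):
    """True when the score exceeds the (hypothetical) predetermined threshold."""
    return score > threshold
```

For instance, with hypothetical features `{"amount": 0.9, "stress": 0.8}`, a profile of `{"amount": 0.1, "stress": 0.2}` and equal weights, the deviations are large enough that `needs_action` reports the pair as worth a preventive or corrective step.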
Abstract:
A method and apparatus for providing real-time assistance related to an interaction associated with a contact center, comprising steps or components for: receiving at least a part of an audio signal of an interaction captured by a capturing device associated with an organization, and metadata information associated with the interaction; performing audio analysis of the at least part of the audio signal, while the interaction is still in progress, to obtain audio information; categorizing at least a part of the metadata information and the audio information, while the interaction is still in progress, to determine a category associated with the interaction; and taking an action associated with the category.