SYSTEM AND METHOD FOR USING A USER-ACTION LOG TO LEARN TO CLASSIFY ENCRYPTED TRAFFIC

    Publication number: US20200042897A1

    Publication date: 2020-02-06

    Application number: US16527373

    Filing date: 2019-07-31

    Abstract: Machine learning techniques for classifying encrypted traffic with a high degree of accuracy. The techniques do not require decrypting any traffic and may not require any manually-labeled traffic samples. An automated system uses an application of interest to perform a large number of user actions of various types. The system further records, in a log, the respective times at which the actions were performed. The system further receives the encrypted traffic exchanged between the system and the application server, and records properties of this traffic in a time series. Subsequently, by correlating between the times in the log and the times at which the traffic was received, the system matches each of the user actions with a corresponding portion of the traffic, which is assumed to have been generated by the user action. The system thus automatically builds a labeled training set, which may be used to train a network-traffic classifier.
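
The time-correlation step described in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the function name `label_traffic`, the fixed `window` length, and the `(timestamp, size)` packet representation are all assumptions made for the example.

```python
from bisect import bisect_left, bisect_right

def label_traffic(action_log, packets, window=2.0):
    """Match each logged user action with the traffic observed shortly after it.

    action_log: list of (timestamp, action_type) tuples (hypothetical format).
    packets:    list of (timestamp, size) tuples, sorted by timestamp.
    window:     assumed number of seconds of traffic attributed to one action.
    Returns a list of (action_type, [packet sizes]) training samples.
    """
    times = [t for t, _ in packets]
    labeled = []
    for action_time, action_type in action_log:
        # Select packets received in [action_time, action_time + window].
        lo = bisect_left(times, action_time)
        hi = bisect_right(times, action_time + window)
        sizes = [size for _, size in packets[lo:hi]]
        if sizes:  # skip actions with no matching traffic
            labeled.append((action_type, sizes))
    return labeled

action_log = [(10.0, "post"), (20.0, "like")]
packets = [(9.5, 100), (10.1, 1500), (11.0, 400), (15.0, 60), (20.5, 300)]
samples = label_traffic(action_log, packets)
```

Each resulting pair is one labeled sample; a real system would likely derive richer features from the time series than raw packet sizes.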

    DIARIZATION USING LINGUISTIC LABELING
    23.
    Patent application

    Publication number: US20200035245A1

    Publication date: 2020-01-30

    Application number: US16587518

    Filing date: 2019-09-30

    Abstract: Systems and methods of diarization using linguistic labeling include receiving a set of diarized textual transcripts. At least one heuristic is automatically applied to the diarized textual transcripts to select transcripts likely to be associated with an identified group of speakers. The selected transcripts are analyzed to create at least one linguistic model. The linguistic model is applied to transcribed audio data to label a portion of the transcribed audio data as having been spoken by the identified group of speakers. Still further embodiments of diarization using linguistic labeling may serve to label agent speech and customer speech in a recorded and transcribed customer service interaction.
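
One way to picture the heuristic-then-model flow in this abstract is the sketch below. It rests on several assumptions that do not come from the patent: the cue phrases in `AGENT_CUES`, the unigram model with add-one smoothing, and all function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical scripted phrases used as the selection heuristic.
AGENT_CUES = ("thank you for calling", "how may i help")

def select_agent_turns(transcripts):
    """Split speaker texts into likely-agent and other, using the cue heuristic.

    transcripts: list of dicts mapping a speaker id to that speaker's text.
    """
    agent, other = [], []
    for transcript in transcripts:
        for text in transcript.values():
            lowered = text.lower()
            target = agent if any(cue in lowered for cue in AGENT_CUES) else other
            target.append(lowered)
    return agent, other

def build_model(texts):
    """A very simple linguistic model: unigram word counts."""
    return Counter(word for text in texts for word in text.split())

def label_speaker(text, agent_model, other_model):
    """Label text by comparing smoothed unigram log-likelihoods."""
    vocab = set(agent_model) | set(other_model)

    def log_prob(model):
        total = sum(model.values()) + len(vocab) + 1  # add-one smoothing
        return sum(math.log((model[w] + 1) / total) for w in text.lower().split())

    return "agent" if log_prob(agent_model) > log_prob(other_model) else "customer"

transcripts = [{
    "A": "thank you for calling how may i help you today",
    "B": "my bill is wrong and i want a refund",
}]
agent_texts, other_texts = select_agent_turns(transcripts)
agent_model, other_model = build_model(agent_texts), build_model(other_texts)
```

With these toy inputs, new speech resembling the scripted greeting is labeled as agent speech and the rest as customer speech.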

    Tagging relations with N-best
    24.
    Granted patent

    Publication number: US10255346B2

    Publication date: 2019-04-09

    Application number: US14608737

    Filing date: 2015-01-29

    Abstract: Systems, methods, and media for developing ontologies and analyzing communication data are provided herein. In an example implementation, the method includes: identifying terms in a set of communication data; identifying a list of possible relations of the identified terms; scoring the possible relations according to a set of predefined merits; ranking the possible relations into a list of possible relations in descending order according to their score; and tagging relations in the set of communication data. The relations may be tagged by identifying the possible relations in the communication data in order corresponding with the list of possible relations. The possible relations that have lower rankings that conflict with higher ranking relations are not tagged. The conflicts may be determined by a predefined set of conflict criteria.
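
The rank-then-tag procedure in this abstract resembles a greedy selection, sketched below. The conflict criterion used here (overlapping token spans) is an assumption chosen for illustration; the abstract only says that conflicts follow a predefined set of criteria. The `Relation` type and all names are hypothetical.

```python
from typing import List, NamedTuple

class Relation(NamedTuple):
    name: str
    start: int    # index of the first token covered by the relation
    end: int      # index one past the last covered token
    score: float  # merit score; higher is better

def overlaps(a: Relation, b: Relation) -> bool:
    """Assumed conflict criterion: two relations conflict if their spans overlap."""
    return a.start < b.end and b.start < a.end

def tag_relations(candidates: List[Relation]) -> List[Relation]:
    """Tag relations in descending score order, skipping conflicting ones."""
    ranked = sorted(candidates, key=lambda r: r.score, reverse=True)
    tagged: List[Relation] = []
    for rel in ranked:
        # A lower-ranked relation is dropped if it conflicts with a tagged one.
        if not any(overlaps(rel, kept) for kept in tagged):
            tagged.append(rel)
    return tagged

candidates = [
    Relation("pay->bill", 0, 3, 0.9),
    Relation("bill->late", 2, 5, 0.6),
    Relation("call->agent", 6, 8, 0.7),
]
tagged = tag_relations(candidates)
```

Here `bill->late` is left untagged because it overlaps the higher-scoring `pay->bill`.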

    Speech analytics system and system and method for determining structured speech
    26.
    Granted patent
    Legal status: in force

    Publication number: US09401145B1

    Publication date: 2016-07-26

    Application number: US14270280

    Filing date: 2014-05-05

    CPC classification number: G10L15/20 G10L15/19 G10L15/197 G10L15/26

    Abstract: A method for converting speech to text in a speech analytics system is provided. The method includes receiving audio data containing speech made up of sounds from an audio source, processing the sounds with a phonetic module resulting in symbols corresponding to the sounds, and processing the symbols with a language module and occurrence table resulting in text. The method also includes determining a probability of correct translation for each word in the text, comparing the probability of correct translation for each word in the text to the occurrence table, and adjusting the occurrence table based on the probability of correct translation for each word in the text.

    System and Method of Automated Model Adaptation
    27.
    Patent application
    Legal status: in force

    Publication number: US20150066502A1

    Publication date: 2015-03-05

    Application number: US14291893

    Filing date: 2014-05-30

    Abstract: Methods, systems, and computer readable media for automated transcription model adaptation include obtaining audio data from a plurality of audio files. The audio data is transcribed to produce at least one audio file transcription which represents a plurality of transcription alternatives for each audio file. Speech analytics are applied to each audio file transcription. A best transcription is selected from the plurality of transcription alternatives for each audio file. Statistics from the selected best transcription are calculated. An adapted model is created from the calculated statistics.
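
The select-then-adapt loop in this abstract can be illustrated with a small sketch. The abstract does not say how the best alternative is chosen or which statistics are computed, so the example assumes that each alternative carries a confidence score and that the adapted model is a table of relative word frequencies; `adapt_model` is a hypothetical name.

```python
from collections import Counter

def adapt_model(nbest_per_file):
    """Build a unigram model from the best transcription of each audio file.

    nbest_per_file: one list per audio file, each holding (text, score)
    transcription alternatives (assumed representation).
    Returns a dict mapping each word to its relative frequency.
    """
    counts = Counter()
    for alternatives in nbest_per_file:
        # Assumed selection rule: keep the highest-scoring alternative.
        best_text, _ = max(alternatives, key=lambda alt: alt[1])
        counts.update(best_text.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()} if total else {}

nbest = [
    [("pay my bill", 0.8), ("pay my pill", 0.3)],
    [("my bill is late", 0.7), ("my bell is late", 0.6)],
]
model = adapt_model(nbest)
```

Words that appear only in rejected alternatives, such as "pill", never enter the adapted model.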

    Diarization Using Linguistic Labeling
    28.
    Patent application
    Legal status: under examination (published)

    Publication number: US20140142940A1

    Publication date: 2014-05-22

    Application number: US14084976

    Filing date: 2013-11-20

    CPC classification number: G10L17/005 G10L17/02

    Abstract: Systems and methods of diarization using linguistic labeling include receiving a set of diarized textual transcripts. At least one heuristic is automatically applied to the diarized textual transcripts to select transcripts likely to be associated with an identified group of speakers. The selected transcripts are analyzed to create at least one linguistic model. The linguistic model is applied to transcribed audio data to label a portion of the transcribed audio data as having been spoken by the identified group of speakers. Still further embodiments of diarization using linguistic labeling may serve to label agent speech and customer speech in a recorded and transcribed customer service interaction.

    System and method for de-anonymizing actions and messages on networks

    Publication number: US11444956B2

    Publication date: 2022-09-13

    Application number: US17221779

    Filing date: 2021-04-03

    Abstract: A traffic-monitoring system that monitors encrypted traffic exchanged between IP addresses used by devices and a network, and further receives the user-action details that are passed over the network. By correlating between the times at which the encrypted traffic is exchanged and the times at which the user-action details are received, the system associates the user-action details with the IP addresses. In particular, for each action specified in the user-action details, the system identifies one or more IP addresses that may be the source of the action. Based on the IP addresses, the system may identify one or more users who may have performed the action. The system may correlate between the respective action-times of the encrypted actions and the respective approximate action-times of the indicated actions. The system may hypothesize that the indicated action may correspond to one of the encrypted actions having these action-times.