SYSTEM AND METHOD FOR DISCOVERING AND EXPLORING CONCEPTS
    Type: Invention publication
    Status: Under examination (published)

    Publication No.: EP3025295A1

    Publication date: 2016-06-01

    Application No.: EP14828714.7

    Filing date: 2014-07-24

    CPC classification: G06F17/27 G06Q30/01

    Abstract: A method for identifying concepts in a plurality of interactions includes: filtering, on a processor, the interactions based on intervals; creating, on the processor, a plurality of sentences from the filtered interactions; computing, on the processor, a saliency of each of the sentences; pruning away, on the processor, sentences with low saliency for generating a set of informative sentences; clustering, on the processor, the sentences of the set of informative sentences for generating a plurality of sentence clusters, each of the clusters corresponding to a concept of the concepts; computing, on the processor, a saliency of each of the clusters; and naming, on the processor, each of the clusters.

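The steps listed in the abstract above (filter, split into sentences, score, prune, cluster, score clusters, name) can be sketched end to end in a few lines of Python. This is only an illustration under stated assumptions, not the patented method: the abstract does not define "intervals", the saliency measure, the clustering algorithm, or the naming rule. Here the interval is assumed to be a half-open time window, sentence saliency is the share of non-stopword tokens, clustering is a greedy Jaccard pass, cluster saliency is the sum of member saliencies, and names come from the two most frequent content words. `STOPWORDS`, `discover_concepts`, and all parameters are hypothetical names.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"i", "a", "an", "the", "my", "to", "is", "it", "has",
             "ok", "thanks", "please", "want", "and", "of"}


def content_words(sentence):
    """Lowercased tokens of a sentence with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]


def sentence_saliency(sentence):
    """Assumed measure: share of tokens that are content words."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return len(content_words(sentence)) / len(tokens) if tokens else 0.0


def discover_concepts(interactions, start, end,
                      min_saliency=0.3, min_similarity=0.3):
    """interactions: list of (timestamp, text) pairs.

    Returns (name, cluster_saliency, sentences) tuples, most salient first.
    """
    # 1. Filter interactions by interval (assumed: a half-open time window).
    texts = [text for ts, text in interactions if start <= ts < end]

    # 2. Create sentences from the filtered interactions.
    sentences = [s.strip() for text in texts
                 for s in re.split(r"[.!?]+", text) if s.strip()]

    # 3-4. Score each sentence and prune the low-saliency ones.
    informative = [s for s in sentences
                   if sentence_saliency(s) >= min_saliency]

    # 5. Greedy clustering: join the first cluster whose vocabulary is
    #    similar enough (Jaccard), otherwise start a new cluster.
    clusters = []  # each: {"vocab": set, "sentences": list}
    for s in informative:
        words = set(content_words(s))
        for c in clusters:
            union = words | c["vocab"]
            overlap = len(words & c["vocab"]) / len(union) if union else 0.0
            if overlap >= min_similarity:
                c["sentences"].append(s)
                c["vocab"] |= words
                break
        else:
            clusters.append({"vocab": set(words), "sentences": [s]})

    # 6-7. Cluster saliency (assumed: sum of member saliencies) and a name
    #      built from the two most frequent content words.
    results = []
    for c in clusters:
        score = sum(sentence_saliency(s) for s in c["sentences"])
        counts = Counter(w for s in c["sentences"] for w in content_words(s))
        name = " ".join(w for w, _ in counts.most_common(2))
        results.append((name, score, c["sentences"]))
    return sorted(results, key=lambda r: r[1], reverse=True)
```

Called with a list of `(timestamp, text)` pairs and a window, the function returns one tuple per discovered concept, ordered by cluster saliency. Filler sentences such as "Thanks" score zero under the assumed measure and are pruned before clustering.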

    FAST OUT-OF-VOCABULARY SEARCH IN AUTOMATIC SPEECH RECOGNITION SYSTEMS
    Type: Invention publication
    Status: Under examination (published)

    Publication No.: EP2939234A1

    Publication date: 2015-11-04

    Application No.: EP13866559.1

    Filing date: 2013-12-24

    Abstract: A method including: receiving, on a computer system, a text search query, the query including one or more query words; generating, on the computer system, for each query word in the query, one or more anchor segments within a plurality of speech recognition processed audio files, the one or more anchor segments identifying possible locations containing the query word; post-processing, on the computer system, the one or more anchor segments, the post-processing including: expanding the one or more anchor segments; sorting the one or more anchor segments; and merging overlapping ones of the one or more anchor segments; and searching, on the computer system, the post-processed one or more anchor segments for instances of at least one of the one or more query words using a constrained grammar.
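The post-processing step named in the abstract above (expand, sort, merge overlapping anchor segments) is a standard interval-merge pattern. The sketch below illustrates it for a single audio file, assuming each anchor segment is a `(start, end)` pair in seconds. The function name, the symmetric `padding`, the clamping to `audio_length`, and the choice to merge segments that merely touch are assumptions, not details taken from the patent.

```python
def postprocess_anchor_segments(segments, padding, audio_length):
    """Expand, sort, and merge (start, end) anchor segments of one audio file.

    segments: list of (start, end) times in seconds (hypothetical format).
    padding: seconds added on each side of every segment (assumed symmetric).
    audio_length: file duration in seconds, used to clamp expanded segments.
    """
    # Expand each segment, keeping it inside the bounds of the file.
    expanded = [(max(0.0, start - padding), min(audio_length, end + padding))
                for start, end in segments]

    # Sort by start time so that overlapping segments become adjacent.
    expanded.sort()

    # Merge: a segment that starts at or before the end of the previous
    # merged segment is folded into it (touching segments are merged too).
    merged = []
    for start, end in expanded:
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

The merged segments would then be the only regions handed to the constrained-grammar search, so the search runs once over each region instead of once per overlapping anchor.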

    OPTIMAL PII-SAFE TRAINING SET GENERATION FOR SPEECH RECOGNITION MODEL

    Publication No.: EP3813059A1

    Publication date: 2021-04-28

    Application No.: EP19218256.6

    Filing date: 2019-12-19

    Abstract: A method comprising receiving, as input, one or more audio files; applying a trained speech recognition algorithm to said one or more audio files, to obtain textual output corresponding to each of said one or more audio files; extracting, based on said textual output, from each of said one or more audio files, one or more portions having a specified syntactic pattern; selecting a subset of said portions based on at least one of: (i) a content of said textual output associated with each of said portions, (ii) a duration of each of said portions, and (iii) a confidence score assigned by said trained speech recognition algorithm to said obtained textual output; receiving, as input, transcriptions of each of said portions; generating a re-training set comprising: (iv) said portions in said subset, and (v) said transcriptions; and re-training said trained speech recognition algorithm on said re-training set.
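The selection and set-building steps in the abstract above can be illustrated with a small filter over extracted portions. This is a sketch under assumptions: the abstract names the three criteria (content, duration, confidence) but not their thresholds or direction, and it requires only "at least one of" them, whereas this example applies all three. The `Portion` type, the banned-word content check, the duration bounds, and the preference for lower-confidence portions are all hypothetical choices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Portion:
    portion_id: str
    text: str          # speech recognition output for this portion
    duration: float    # length in seconds
    confidence: float  # recognizer confidence, assumed to lie in [0, 1]


def select_portions(portions, banned_words=frozenset(),
                    min_duration=0.5, max_duration=5.0, max_confidence=0.8):
    """Keep portions that pass content, duration, and confidence checks."""
    selected = []
    for p in portions:
        words = set(p.text.lower().split())
        if words & banned_words:
            continue  # (i) content: drop portions with possibly sensitive words
        if not (min_duration <= p.duration <= max_duration):
            continue  # (ii) duration: drop portions that are too short or long
        if p.confidence > max_confidence:
            continue  # (iii) confidence: assumed preference for uncertain output
        selected.append(p)
    return selected


def build_retraining_set(selected, transcriptions):
    """Pair each selected portion with its received transcription, if any.

    transcriptions: dict mapping portion_id to a transcription string.
    """
    return [(p, transcriptions[p.portion_id])
            for p in selected if p.portion_id in transcriptions]
```

The resulting list of `(portion, transcription)` pairs stands in for the re-training set; the re-training call itself depends on the recognizer and is left out.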