SYSTEM AND METHOD FOR CALL PROGRESS DETECTION
    1.
    Invention application (under examination, published)

    Publication number: WO2016140977A1

    Publication date: 2016-09-09

    Application number: PCT/US2016/020281

    Filing date: 2016-03-01

    Abstract: A contact center includes an outbound server to make a call to a callee and a media device. The media device receives an audio signal based on the call, determines a Mel-frequency cepstral coefficient for the received audio signal, and matches the Mel-frequency cepstral coefficient for the audio signal to a Mel-frequency cepstral coefficient for a pre-recorded carrier message. The media device can determine a content of the audio signal based on the match.
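
    Illustrative note: the matching step described in this abstract can be sketched as follows. This is a minimal sketch, not the method claimed in the application. It assumes the MFCC frames have already been computed elsewhere, and the names `classify_audio`, `carrier_templates` and `threshold`, the Euclidean frame distance, and the frame-by-frame comparison are all assumptions made for illustration.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two MFCC vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sequence_distance(seq_a, seq_b):
    """Mean per-frame distance over the frames both sequences share."""
    n = min(len(seq_a), len(seq_b))
    if n == 0:
        return math.inf
    return sum(frame_distance(seq_a[i], seq_b[i]) for i in range(n)) / n


def classify_audio(mfcc_frames, carrier_templates, threshold=1.0):
    """Return the label of the closest pre-recorded carrier message.

    carrier_templates maps a label (hypothetical, e.g. "voicemail") to the
    MFCC frames of a pre-recorded message. Returns None when no template
    is within `threshold`, i.e. the content could not be determined.
    """
    best_label, best_dist = None, math.inf
    for label, template in carrier_templates.items():
        dist = sequence_distance(mfcc_frames, template)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= threshold else None


# Toy two-dimensional "MFCC" frames, invented for the example.
templates = {
    "voicemail": [[1.0, 2.0], [3.0, 4.0]],
    "disconnected": [[9.0, 9.0], [9.0, 9.0]],
}
label = classify_audio([[1.1, 2.0], [3.0, 3.9]], templates)
```

    A real system would likely need time alignment between the received audio and the template; the fixed frame-by-frame comparison here is only the simplest possible stand-in.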

SPEECH SEARCH DEVICE AND SPEECH SEARCH METHOD
    2.
    Invention application (under examination, published)

    Publication number: WO2010098209A1

    Publication date: 2010-09-02

    Application number: PCT/JP2010/051937

    Filing date: 2010-02-10

    CPC classification number: G10L15/12 G10L2015/025

    Abstract: Provided are a speech search device and a speech search method that perform fuzzy search with high search speed and good search performance. Fuzzy search over speech is performed using a suffix array together with dynamic programming. In addition, similarity is determined by computing distances between the phoneme discriminative features contained in the speech data; search targets are narrowed by splitting the search keyword by phoneme and applying search thresholds to the multiple split keywords; the search is repeated while the search threshold is increased successively; and whether to split the keyword is decided according to the length of the search keyword. Together these achieve speech search with high search speed and good search performance.

    METHOD FOR COMPRESSING DICTIONARY DATA
    4.
    Invention application (under examination, published)

    Publication number: WO2003042973A1

    Publication date: 2003-05-22

    Application number: PCT/FI2002/000875

    Filing date: 2002-11-08

    Inventor: TIAN, Jilei

    CPC classification number: G10L15/12 G10L2015/025 H03M7/30

    Abstract: The invention relates to pre-processing of a pronunciation dictionary for compression in a data processing device, the pronunciation dictionary comprising at least one entry, the entry comprising a sequence of character units and a sequence of phoneme units. According to one aspect of the invention, the sequence of character units and the sequence of phoneme units are aligned using a statistical algorithm. The aligned sequence of character units and the aligned sequence of phoneme units are interleaved by inserting each phoneme unit at a predetermined location relative to the corresponding character unit.
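
    Illustrative note: the interleaving step described in this abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure of the application. The statistical alignment is not shown and the inputs are assumed to be aligned already; the `EPSILON` placeholder, the function names, and the choice of placing each phoneme unit directly after its character unit are hypothetical.

```python
EPSILON = "_"  # hypothetical filler for "no unit" in an aligned sequence


def interleave(chars, phonemes):
    """Interleave two aligned sequences of equal length.

    Each phoneme unit is inserted directly after its corresponding
    character unit, which is one possible "predetermined location".
    """
    if len(chars) != len(phonemes):
        raise ValueError("aligned sequences must have equal length")
    out = []
    for char_unit, phoneme_unit in zip(chars, phonemes):
        out.append(char_unit)
        out.append(phoneme_unit)
    return out


def deinterleave(units):
    """Recover the aligned character and phoneme sequences."""
    return units[0::2], units[1::2]


# Invented example entry: the phoneme symbols are illustrative only.
chars = ["f", "a", "t", "h", "e", "r"]
phonemes = ["f", "A:", "D", EPSILON, "@", EPSILON]
entry = interleave(chars, phonemes)
```

    Placing each phoneme next to the character it is aligned with keeps related symbols adjacent in the stream passed to the compressor, and the split is reversible because the positions are fixed.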

    SPEECH RECOGNITION AND SIGNAL ANALYSIS BY STRAIGHT SEARCH OF SUBSEQUENCES WITH MAXIMAL CONFIDENCE MEASURE
    5.
    Invention application (under examination, published)

    Publication number: WO00051107A1

    Publication date: 2000-08-31

    Application number: PCT/IB2000/000189

    Filing date: 2000-02-22

    CPC classification number: G10L15/142 G10L15/10 G10L15/12 G10L2015/088

    Abstract: The invention belongs to the technical domain of decoding, classification, alignment and matching of data. The invention refers to new methods of keyword spotting in utterances, detection of subsequences in chains of organic matter (DNA) and recognition of objects in images. The proposed methods search, in an optimized way, for the matching that maximizes, over all possible matchings, certain confidence measures based on normalized posteriors. Three such confidence measures are used: two are inspired by prior work in speech recognition, and the third is new. Application fields for this invention are: man-machine interfaces (using speech recognition; e.g., control systems, banking, flight services), coordination systems (for industrial robots and automata) and development systems for pharmaceutical products.

    CONTINUOUS SPEECH RECOGNITION SYSTEM
    6.
    Invention application (under examination, published)

    Publication number: WO1981002943A1

    Publication date: 1981-10-15

    Application number: PCT/US1981000425

    Filing date: 1981-04-01

    CPC classification number: G10L15/12 G10L15/00

    Abstract: A continuous speech analyzer (Fig. 1) is adapted to recognize an utterance (101) as a series string of reference words (130) for which acoustic feature signals are stored (105). Responsive to the utterance (103) and reference word acoustic features (105), at least one reference word series is generated as a candidate for the utterance. Successive word positions for the utterance are identified. In each word position, partial candidate series are generated by a dynamic time warp partitioning circuit (110) determining a distance signal reference corresponding to a prescribed similarity of utterance segment intervals and reference template involving a partial candidate series of the preceding word position. The candidate utterance segments (130) have beginning points within a predetermined range of the utterance position endpoint for the preceding word position candidate series, to account for coarticulation and for differences between acoustic features of the utterance and those of reference words (105) spoken in isolation. A minimum distance signal (170) selected from a plurality of partial candidates identifies the candidate string closest to the utterance.

ANALOG-TO-DIGITAL CONVERTER AND ANALOG-TO-DIGITAL CONVERSION METHOD
    7.
    Invention application

    Publication number: WO2016101762A1

    Publication date: 2016-06-30

    Application number: PCT/CN2015/095695

    Filing date: 2015-11-26

    Inventor: 杨金达 周立人

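    Abstract: Embodiments of the present invention disclose an analog-to-digital converter and an analog-to-digital conversion method. The analog-to-digital converter includes: a clock generator including M transmission gates, the M transmission gates being configured to receive a periodic first clock signal and to each gate the first clock signal, generating M second clock signals, where M is an integer greater than or equal to 2, each period of the first clock signal includes M clock pulses, the period of each of the M second clock signals equals the period of the first clock signal, and each period of each of the M second clock signals includes one of the M clock pulses; M ADC channels configured in a time-interleaved manner, configured to receive one analog signal and, each under the control of one of the M second clock signals, to sample the analog signal and perform analog-to-digital conversion to obtain M digital signals, where each ADC channel corresponds to one of the M second clock signals; and an adder configured to add the M digital signals in the digital domain to obtain one digital output signal.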
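
    Illustrative note: the signal flow described in this abstract can be modelled in a few lines. This is a behavioural sketch only, not the circuit of the application. It assumes that each ADC channel outputs zero outside its own clock pulse, so that the adder reassembles the full-rate stream; that assumption, the 8-bit uniform quantizer, and all function names are hypothetical.

```python
def quantize(x, bits=8, full_scale=1.0):
    """Uniform quantizer: clip to [-full_scale, full_scale], return a signed code."""
    levels = 2 ** (bits - 1) - 1
    clipped = max(-full_scale, min(full_scale, x))
    return round(clipped / full_scale * levels)


def gated_clocks(m, n_pulses):
    """Gate a first clock (m pulses per period) into m second clocks.

    Second clock `ch` keeps only pulse number `ch` of every period, so it
    has the same period as the first clock and one pulse per period.
    """
    return [[1 if k % m == ch else 0 for k in range(n_pulses)]
            for ch in range(m)]


def time_interleaved_adc(samples, m):
    """Model m time-interleaved channels followed by a digital adder.

    samples[k] is the analog input value at pulse k of the first clock.
    """
    clocks = gated_clocks(m, len(samples))
    # Assumption: a channel emits its code on its pulse and 0 otherwise.
    channel_out = [[quantize(s) if clk[k] else 0
                    for k, s in enumerate(samples)]
                   for clk in clocks]
    # Adder: sum the m digital signals pulse by pulse.
    return [sum(ch[k] for ch in channel_out) for k in range(len(samples))]


samples = [0.0, 1.0, -1.0, 0.5, 0.25]
output = time_interleaved_adc(samples, m=2)
```

    Under the zero-outside-the-pulse assumption, the summed output equals what a single converter running at the full pulse rate would produce, which is the usual motivation for time interleaving.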

    SYSTEM AND METHOD FOR RECOGNIZING A USER VOICE COMMAND IN NOISY ENVIRONMENT
    8.
    Invention application (under examination, published)

    Publication number: WO2012025579A1

    Publication date: 2012-03-01

    Application number: PCT/EP2011/064588

    Filing date: 2011-08-24

    CPC classification number: G10L15/02 G10L15/063 G10L15/12 G10L15/16 G10L15/20

    Abstract: An automatic speech recognition system for recognizing a user (2) voice command in a noisy environment, comprising: matching means for matching elements retrieved from speech units forming said command with templates in a template library (44); characterized by: processing means (32, 36, 38) including a MultiLayer Perceptron (38) for computing posterior templates (P(O_template(q))) stored as said templates in said template library (44); and means for retrieving posterior vectors (P(O_test(q))) from said speech units, said posterior vectors being used as said elements. The present invention also relates to a method for recognizing a user voice command in noisy environments.

    METHOD AND APPARATUS FOR DYNAMIC BEAM CONTROL IN VITERBI SEARCH
    9.
    Invention application (under examination, published)

    Publication number: WO2003005344A1

    Publication date: 2003-01-16

    Application number: PCT/RU2001/000264

    Filing date: 2001-07-03

    CPC classification number: G10L15/08 G10L15/12 G10L2015/085

    Abstract: A method is presented including selecting an initial beam width. The method also includes determining whether a value per frame is changing. A beam width is dynamically adjusted. The method further decodes a speech input with the dynamically adjusted beam width. Also, a device is presented including a processor (420). A speech recognition component (610) is connected to the processor (420). A memory (410) is connected to the processor (420). The speech recognition component (610) dynamically adjusts a beam width to decode a speech input.

NOISE PADDING AND NORMALIZATION IN DYNAMIC TIME WARPING
    10.
    Invention application (under examination, published)

    Publication number: WO0041167B1

    Publication date: 2000-10-19

    Application number: PCT/IL0000007

    Filing date: 2000-01-03

    Inventor: ERELL ADORAM

    CPC classification number: G10L15/20 G10L15/12 G10L21/0216

    Abstract: Speech recognition uses a wide token builder (66), a gain and noise adapter (70) and noise adapted Dynamic Time Warping (60). The wide token builder produces a padded test token expanded with at least one blank frame before and after the input test utterance. The gain and noise adapter adapts each padded reference template with noise and gain qualities, producing adapted reference templates having noise frames wherever a blank frame was originally placed and noise adapted speech where speech exists. Dynamic Time Warping (DTW) is performed on the noise adapted templates.
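
    Illustrative note: the flow in this abstract (pad, adapt, then warp) can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of the application. Frames are reduced to single scalar features, noise is treated as additive, the test token is assumed to arrive already widened with one background frame on each side, and the names `pad`, `adapt_template` and `dtw_distance` are hypothetical.

```python
def pad(frames, fill, count=1):
    """Add `count` copies of `fill` before and after the frames."""
    return [fill] * count + list(frames) + [fill] * count


def adapt_template(padded_ref, gain, noise):
    """Blank (None) frames become noise frames; speech frames get gain plus noise."""
    return [noise if f is None else gain * f + noise for f in padded_ref]


def dtw_distance(test, ref):
    """Classic dynamic time warping cost between two scalar sequences."""
    inf = float("inf")
    n, m = len(test), len(ref)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(test[i - 1] - ref[j - 1])
            # Extend the cheapest of the three allowed predecessor paths.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


# Invented numbers: a clean reference and a louder, noisier, slower test token.
reference = [1.0, 3.0, 1.0]
adapted = adapt_template(pad(reference, None), gain=2.0, noise=0.5)
wide_test = [0.5, 2.5, 6.5, 6.5, 2.5, 0.5]
score = dtw_distance(wide_test, adapted)
```

    Because the padded reference carries noise frames at both ends, background frames in the widened test token have something to align with instead of being forced onto speech frames.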
