RADIOTELEPHONE VOICE CONTROL DEVICE, IN PARTICULAR FOR USE IN A MOTOR VEHICLE
    21.
    Invention application
    Status: under examination (published)

    Publication No.: WO98045997A1

    Publication date: 1998-10-15

    Application No.: PCT/FR1998/000687

    Filing date: 1998-04-03

    Abstract: The invention concerns a device comprising: a memory containing a series of numbers and voice prints; an acoustic transducer, for picking up a correspondent's name spoken by the user; voice recognition means, for analysing the recorded correspondent's name and transforming it into a voice print; means for selectively addressing the memory, comprising associative means, for finding in the memory voice print information corresponding to that supplied by the voice recognition means and, if they match, for addressing the memory at the corresponding position; and means, co-operating with the associative means, for applying the addressed directory number to the radiotelephone circuits. The voice recognition means evaluate and memorise a current sound level picked up by the transducer in the absence of a word signal; in the presence of a word signal, they subtract the previously evaluated current sound level from the picked-up signal and apply to the resulting signal a DTW voice recognition algorithm, with pattern recognition by dynamic programming adapted to the word, using functions for extracting dynamic parameters, in particular a dynamic predictive algorithm with forward and/or backward and/or frequency masking.
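
    The recognition chain described in this abstract (store the ambient level, subtract it, then match by DTW against stored voice prints) can be sketched roughly as below. This is a minimal illustration, not the patented method: the function names, the per-band frame representation, the Euclidean local distance and the acceptance threshold are all assumptions, and the masking and dynamic-parameter extraction steps are left out.

```python
import math

def estimate_noise_floor(silence_frames):
    """Mean level per band, measured while no word is being spoken."""
    n = len(silence_frames)
    return [sum(band) / n for band in zip(*silence_frames)]

def subtract_noise(frames, noise_floor):
    """Subtract the stored ambient level from each frame, clamping at zero."""
    return [[max(x - f, 0.0) for x, f in zip(frame, noise_floor)]
            for frame in frames]

def dtw_distance(a, b):
    """Plain DTW between two sequences of feature vectors."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # local frame-to-frame distance
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b
                                 cost[i][j - 1],      # stretch a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]

def lookup_number(frames, noise_floor, directory, threshold):
    """Return the number whose stored voice print matches best, or None.

    `directory` maps a phone number to a stored voice print (hypothetical layout).
    """
    cleaned = subtract_noise(frames, noise_floor)
    best_number, best_dist = None, math.inf
    for number, voice_print in directory.items():
        dist = dtw_distance(cleaned, voice_print)
        if dist < best_dist:
            best_number, best_dist = number, dist
    return best_number if best_dist <= threshold else None
```

    Rejecting matches above a threshold mirrors the "if they match" condition in the abstract; how the actual device decides a match is not stated there.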

    LPC WORD RECOGNIZER UTILIZING ENERGY FEATURES
    22.
    Invention application
    Status: under examination (published)

    Publication No.: WO1984001049A1

    Publication date: 1984-03-15

    Application No.: PCT/US1983001171

    Filing date: 1983-08-01

    CPC classification number: G10L15/12 G10L15/02 G10L19/06

    Abstract: In a speech recognition arrangement (Fig. 1), a plurality of reference templates (130) are stored. Each template comprises a time frame sequence of feature signals of a prescribed reference pattern. A time frame sequence of feature signals representative of an unknown speech pattern is produced (115). Responsive to the feature signals of the speech pattern and the reference pattern templates, a set of signals representative of the similarity between the speech pattern and the reference templates is formed (135). The speech pattern is identified (170) as one of the reference patterns responsive to the similarity signals. The similarity signal generation includes producing a plurality of signals for each frame of the speech pattern, each signal being representative of the correspondence between predetermined type speech pattern features and the same predetermined type features of the reference pattern. The similarity signal for the template is formed responsive to the plurality of predetermined type correspondence signals.

    FRAME SKIPPING WITH EXTRAPOLATION AND OUTPUTS ON DEMAND NEURAL NETWORK FOR AUTOMATIC SPEECH RECOGNITION
    23.
    Invention application
    Status: under examination (published)

    Publication No.: WO2016048486A1

    Publication date: 2016-03-31

    Application No.: PCT/US2015/045750

    Filing date: 2015-08-18

    CPC classification number: G10L15/16 G10L15/02 G10L15/08 G10L15/12

    Abstract: Techniques related to implementing neural networks for speech recognition systems are discussed. Such techniques may include implementing frame skipping with approximated skip frames and/or distances on demand such that only those outputs needed by a speech decoder are provided via the neural network or approximation techniques.
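
    The two ideas named in this abstract can be illustrated with a small sketch. The abstract does not say how skipped frames are approximated, so linear extrapolation from the two most recently evaluated frames is an assumption here, as are the function names, the `skip` parameter and the single-layer form of `outputs_on_demand`.

```python
def score_frames(frames, network, skip=1):
    """Evaluate `network` on every (skip + 1)-th frame and approximate the rest.

    Skipped frames are linearly extrapolated from the two most recently
    evaluated frames (an assumed approximation, not taken from the source).
    """
    outputs = []
    computed = []  # (frame index, output vector) for frames the network saw
    for t, frame in enumerate(frames):
        if t % (skip + 1) == 0:
            out = network(frame)
            computed.append((t, out))
        elif len(computed) >= 2:
            (t0, y0), (t1, y1) = computed[-2], computed[-1]
            scale = (t - t1) / (t1 - t0)
            out = [b + (b - a) * scale for a, b in zip(y0, y1)]
        else:
            # Only one evaluated frame so far: repeat its output.
            out = list(computed[-1][1])
        outputs.append(out)
    return outputs

def outputs_on_demand(hidden, weights, biases, needed):
    """Compute only the output units the decoder asked for.

    `weights[k]` is the weight row of output unit k over the last hidden layer.
    """
    return {k: sum(w * h for w, h in zip(weights[k], hidden)) + biases[k]
            for k in needed}
```

    With `skip=1` the network runs on roughly half the frames; the decoder still receives one output vector per frame.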

SYSTEM FOR IDENTIFYING SPEECH CONTENT BASED ON KEYWORD EXTRACTION FROM RECORDED VOICE DATA, INDEXING METHOD USING THE SYSTEM, AND METHOD FOR IDENTIFYING SPEECH CONTENT
    24.
    Invention application
    Status: under examination (published)

    Publication No.: WO2015068947A1

    Publication date: 2015-05-14

    Application No.: PCT/KR2014/008706

    Filing date: 2014-09-18

    Inventor: 지창진

    Abstract: Disclosed are a system for identifying speech content based on keyword extraction from recorded voice data, an indexing method using the system, and a method for identifying speech content. The indexing unit of the system receives voice data, performs phoneme-based speech recognition frame by frame to form a phoneme lattice, generates partitioned indexing information for time-limited frames, each consisting of a plurality of frames (the partitioned indexing information includes the phoneme lattice formed for each time-limited frame), and stores it in an indexing database. The search unit takes a keyword entered by the user as the search term, searches the partitioned indexing information stored in the indexing database for phoneme strings that match the search term by phoneme-based comparison, and locates the speech portion corresponding to the search term through precise acoustic analysis of the matching phoneme strings. The identification unit determines the topic words from the search results returned by the search unit and outputs them to the user so that the speech content of the voice data can be understood.
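
    A rough sketch of the indexing and phoneme-level search steps follows. It is heavily simplified and rests on assumptions not in the abstract: the lattice is reduced to one set of candidate phonemes per frame, each phoneme of the search term is taken to occupy exactly one frame, windows have a fixed length in frames, and the subsequent precise acoustic analysis is omitted. The names `build_index` and `search` are hypothetical.

```python
def build_index(lattice, window):
    """Partition a phoneme lattice into fixed-length windows.

    `lattice` holds one set of candidate phonemes per frame. Each index entry
    keeps the window's starting frame so that hits can be located in time.
    """
    return [(start, lattice[start:start + window])
            for start in range(0, len(lattice), window)]

def search(index, phonemes):
    """Return the frame positions where `phonemes` matches the stored lattice.

    A match requires each phoneme of the search term to appear among the
    candidates of consecutive frames inside a single window; matches that
    straddle a window boundary are not found by this simplified version.
    """
    hits = []
    for start, segment in index:
        for offset in range(len(segment) - len(phonemes) + 1):
            if all(p in segment[offset + k] for k, p in enumerate(phonemes)):
                hits.append(start + offset)
    return hits
```

    Keeping the index partitioned by window means a search only ever compares the term against short segments, which is presumably the point of the time-limited frames in the abstract.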

METHOD AND APPARATUS FOR ENCODING AND DECODING AN AUDIO SIGNAL
    25.
    Invention application
    Status: under examination (published)

    Publication No.: WO2015034115A1

    Publication date: 2015-03-12

    Application No.: PCT/KR2013/008040

    Filing date: 2013-09-05

    Abstract: Provided are an audio signal encoding method and apparatus that, when determining a masking threshold according to a psychoacoustic model, can produce results for a short-window-based audio signal that are as accurate as those obtained with a long-window-based audio signal. The audio signal encoding apparatus according to the invention includes a masking threshold determination unit that determines, based on the frame length of a first window into which the audio signal is divided, the masking threshold for a second window whose frame length differs from that of the first window.

    COMMUNICATION DEVICE HAVING SPEAKER INDEPENDENT SPEECH RECOGNITION
    26.
    Invention application
    Status: under examination (published)

    Publication No.: WO2007095277A3

    Publication date: 2007-10-11

    Application No.: PCT/US2007003876

    Filing date: 2007-02-13

    Inventor: RUWISCH DIETMAR

    CPC classification number: G10L15/12 G10L15/065 G10L15/187 H04M1/271

    Abstract: Techniques for performing speech recognition in a communication device with a voice dialing function are provided. Upon receipt of a voice input in a speech recognition mode, input feature vectors are generated from the voice input. A likelihood vector sequence, indicating the likelihood over time of an utterance of phonetic units, is calculated from the input feature vectors. In a warping operation, the likelihood vector sequence is compared to phonetic word models, and word model match likelihoods are calculated for those word models. After a best-matching word model is determined, the number corresponding to the name synthesized from the best-matching word model is dialed in a dialing operation.

    VOICE RECOGNITION SYSTEM USING IMPLICIT SPEAKER ADAPTATION
    28.
    Invention application
    Status: under examination (published)

    Publication No.: WO02080142A3

    Publication date: 2003-03-13

    Application No.: PCT/US0208727

    Filing date: 2002-03-22

    Applicant: QUALCOMM INC

    Abstract: A voice recognition (VR) system is disclosed that utilizes a combination of speaker independent (SI) (230 and 232) and speaker dependent (SD) (234) acoustic models. At least one SI acoustic model (230 and 232) is used in combination with at least one SD acoustic model (234) to provide a level of speech recognition performance that at least equals that of a purely SI acoustic model. The disclosed hybrid SI/SD VR system continually uses unsupervised training to update the acoustic templates in the one or more SD acoustic models (234). The hybrid VR system then uses the updated SD acoustic models (234) in combination with the at least one SI acoustic model (230 and 232) to provide improved VR performance during VR testing.

    COMBINING DTW AND HMM IN SPEAKER DEPENDENT AND INDEPENDENT MODES FOR SPEECH RECOGNITION
    29.
    Invention application
    Status: under examination (published)
    Translated title: System and method for automatic speech recognition by mapping

    Publication No.: WO02021513A1

    Publication date: 2002-03-14

    Application No.: PCT/US2001/027625

    Filing date: 2001-09-05

    CPC classification number: G10L15/32 G10L15/12 G10L15/142

    Abstract: A method and system that combines voice recognition engines (104, 108, 112, 114) and resolves differences between the results of individual voice recognition engines (104, 106, 108, 112, 114) using a mapping function. A speaker-independent voice recognition engine (104) and a speaker-dependent voice recognition engine (106) are combined. Hidden Markov Model (HMM) engines (108, 114) and Dynamic Time Warping (DTW) engines (104, 106, 112) are combined.

    BOUNDARY RELAXATION FOR SPEECH PATTERN RECOGNITION
    30.
    Invention application
    Status: under examination (published)

    Publication No.: WO1992006469A1

    Publication date: 1992-04-16

    Application No.: PCT/US1991007165

    Filing date: 1991-10-02

    CPC classification number: G10L15/12

    Abstract: A speech recognition algorithm is implemented in a computer program by sending speech input into a coder (2) and processing it in a standard computer (4) using reference patterns stored in memory (6). The algorithm uses the well-known technique called dynamic programming to include weighting and normalizing functions.
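
    The abstract does not state which weighting and normalizing functions are used. As one common illustration, the sketch below applies the symmetric slope weights often used with DTW (2 for a diagonal step, 1 for a horizontal or vertical step) and normalizes by the total path weight, which for these weights always equals the sum of the two sequence lengths. The scalar features, the default weights and the function name are assumptions, and the boundary relaxation named in the title is not modelled.

```python
import math

def weighted_dtw(a, b, w_diag=2.0, w_step=1.0):
    """Dynamic-programming match of two scalar sequences.

    Each step's local distance is multiplied by a slope weight, and the final
    score is divided by len(a) + len(b). With the default symmetric weights
    that divisor equals the summed weights along any path, so scores for
    utterances of different lengths remain comparable.
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = min(
                cost[i - 1][j - 1] + w_diag * d,  # diagonal step
                cost[i - 1][j] + w_step * d,      # vertical step
                cost[i][j - 1] + w_step * d,      # horizontal step
            )
    return cost[n][m] / (n + m)
```

    Without the normalization a longer reference pattern would accumulate a larger raw cost simply because its path is longer, which would bias recognition toward short templates.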
