Voice retrieval apparatus, voice retrieval method, and non-transitory recording medium

    Publication number: US09767790B2

    Publication date: 2017-09-19

    Application number: US14953729

    Filing date: 2015-11-30

    Inventor: Hiroki Tomita

    CPC classification number: G10L15/05 G10L15/02 G10L25/54 G10L2015/025

    Abstract: A voice retrieval apparatus executes processes of: converting a retrieval string into a phoneme string; obtaining, from a time length memory, a continuous time length for each phoneme contained in the converted phoneme string; deriving, based on the obtained continuous time lengths, a plurality of time lengths corresponding to a plurality of utterance rates as candidate utterance time lengths of voices corresponding to the retrieval string; specifying, for each of the plurality of time lengths, a plurality of likelihood obtainment segments having the derived time length within a time length of a retrieval sound signal; obtaining a likelihood showing a plausibility that the specified likelihood obtainment segment is a segment where the voices are uttered; and identifying, based on the obtained likelihood, for each of the specified likelihood obtainment segments, an estimation segment where utterance of the voices is estimated in the retrieval sound signal.
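
    The "deriving" and "specifying" steps of this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the duration table, the rate factors, the frame shift, and all function names below are hypothetical, and the likelihood computation itself is left out.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical stand-in for the "time length memory":
# an average duration in milliseconds for each phoneme.
PHONEME_DURATION_MS: Dict[str, int] = {"k": 60, "a": 100, "t": 50, "o": 110}

# Hypothetical utterance-rate factors applied to the total duration
# (below 1.0 models faster speech, above 1.0 slower speech).
RATE_FACTORS: Tuple[float, ...] = (0.75, 1.0, 1.25)


def candidate_lengths_ms(phonemes: Sequence[str],
                         factors: Sequence[float] = RATE_FACTORS) -> List[int]:
    """Derive one candidate utterance length per utterance rate."""
    base = sum(PHONEME_DURATION_MS[p] for p in phonemes)
    return [round(base * f) for f in factors]


def segments(signal_ms: int, length_ms: int,
             shift_ms: int = 10) -> List[Tuple[int, int]]:
    """Enumerate (start, end) windows of a fixed length inside the signal."""
    return [(start, start + length_ms)
            for start in range(0, signal_ms - length_ms + 1, shift_ms)]


def all_candidate_segments(phonemes: Sequence[str], signal_ms: int,
                           shift_ms: int = 10) -> Dict[int, List[Tuple[int, int]]]:
    """Map each candidate length to the windows that would be scored."""
    return {length: segments(signal_ms, length, shift_ms)
            for length in candidate_lengths_ms(phonemes)}
```

    Each window returned here would then be scored for likelihood, and the best-scoring windows reported as estimation segments. How the score is computed is not described in enough detail in the abstract to sketch.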

    VOICE PROCESSING APPARATUS
    Patent application

    Publication number: US20190172445A1

    Publication date: 2019-06-06

    Application number: US16193163

    Filing date: 2018-11-16

    Inventor: Hiroki Tomita

    Abstract: A voice processing apparatus includes a first storage unit which stores a known-word, and a processor. The processor executes a voice recognition process of extracting an unknown-word by executing a voice recognition process on an input voice signal, based on a storage content of the first storage unit, and a storage control process of executing storage control to the first storage unit, wherein the storage control process includes a process of storing, when information of a number of unknown-words which are recognized to be identical, among the extracted unknown-words by the voice recognition process, meets a predetermined condition, a corresponding unknown-word in the first storage unit as a known-word.
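
    The storage control described here can be sketched as a counter that promotes an unknown word once it has been recognized often enough. The abstract only says "a predetermined condition", so the simple count threshold below is an assumption, as are the class and method names.

```python
from collections import Counter
from typing import Iterable


class Vocabulary:
    """Known-word store that promotes repeatedly observed unknown words."""

    def __init__(self, known: Iterable[str], threshold: int = 3) -> None:
        self.known = set(known)           # stands in for the "first storage unit"
        self.unknown_counts = Counter()   # occurrences of each unknown word
        self.threshold = threshold        # assumed "predetermined condition"

    def observe(self, word: str) -> bool:
        """Record one recognized word; return True if it is now a known word."""
        if word in self.known:
            return True
        self.unknown_counts[word] += 1
        if self.unknown_counts[word] >= self.threshold:
            # Condition met: store the unknown word as a known word.
            self.known.add(word)
            del self.unknown_counts[word]
            return True
        return False
```

    With `threshold=2`, the first observation of a new word leaves it unknown and the second one promotes it.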

    Voice retrieval apparatus, voice retrieval method, and non-transitory recording medium

    Publication number: US09754024B2

    Publication date: 2017-09-05

    Application number: US14953775

    Filing date: 2015-11-30

    Inventor: Hiroki Tomita

    Abstract: A voice retrieval apparatus executes processes of: obtaining, from a time length memory, a continuous time length for each phoneme contained in a phoneme string of a retrieval string; obtaining user-specified information on an utterance rate; changing the continuous time length for each obtained phoneme in accordance with the obtained information; deriving, based on the changed continuous time length, an utterance time length of voices corresponding to the retrieval string; specifying a plurality of likelihood obtainment segments of the derived utterance time length in a time length of a retrieval sound signal; obtaining a likelihood showing a plausibility that the specified likelihood obtainment segment is a segment where the voices are uttered; and identifying, based on the obtained likelihood, an estimation segment where, within the retrieval sound signal, utterance of the voices is estimated, the estimation segment being identified for each specified likelihood obtainment segment.
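
    The "changing" and "deriving" steps can be sketched as scaling each stored phoneme duration by a user-supplied rate and summing the results. The abstract does not say what form the user-specified information takes, so treating it as a single multiplier (where a value above 1.0 means faster speech) is an assumption, and the function names are hypothetical.

```python
from typing import Dict, List, Sequence


def scaled_durations_ms(phonemes: Sequence[str],
                        durations_ms: Dict[str, float],
                        rate: float) -> List[float]:
    """Change each phoneme's duration according to the utterance rate.

    Assumption: rate > 1.0 means faster speech, so durations shrink.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    return [durations_ms[p] / rate for p in phonemes]


def utterance_length_ms(phonemes: Sequence[str],
                        durations_ms: Dict[str, float],
                        rate: float) -> float:
    """Derive the utterance time length from the changed durations."""
    return sum(scaled_durations_ms(phonemes, durations_ms, rate))
```

    The derived length would then set the width of the likelihood obtainment segments that are slid over the retrieval sound signal.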

    Voice search device, voice search method, and non-transitory recording medium
    Granted patent (in force)

    Publication number: US09431007B2

    Publication date: 2016-08-30

    Application number: US14597958

    Filing date: 2015-01-15

    Inventor: Hiroki Tomita

    Abstract: In a voice search device, a processor acquires a search word, converts the search word into a phoneme sequence, acquires, for each frame, an output probability of a feature quantity of a target voice signal being output from each phoneme included in the phoneme sequence, and executes relative calculation of the output probability acquired from each phoneme, based on an output probability acquired from another phoneme included in the phoneme sequence. In addition, the processor successively designates likelihood acquisition zones, acquires a likelihood indicating how likely a designated likelihood acquisition zone is a zone in which voice corresponding to the search word is spoken, and identifies from the target voice signal an estimated zone for which the voice corresponding to the search word is estimated to be spoken, based on the acquired likelihood.
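
    One way to read the "relative calculation" step is as a per-frame normalization: each phoneme's output probability is expressed relative to another phoneme's output probability in the same frame. The sketch below uses the highest-scoring phoneme of the frame as that reference and works in the log domain. Both choices are assumptions, not details taken from the abstract, and the function name is hypothetical.

```python
from typing import List


def relative_log_probs(frame_log_probs: List[List[float]]) -> List[List[float]]:
    """Normalize per-frame log output probabilities.

    frame_log_probs[t][i] is the log output probability of frame t
    under the i-th phoneme of the search word's phoneme sequence.
    Each value is shifted by the frame's maximum, so the best phoneme
    in a frame scores 0.0 and the others score below zero.
    """
    normalized = []
    for row in frame_log_probs:
        best = max(row)  # assumed reference: best phoneme in this frame
        normalized.append([lp - best for lp in row])
    return normalized
```

    A normalization of this kind makes frames comparable even when every phoneme fits a frame poorly, for example in noise or silence.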

