SPEECH RECOGNITION SYSTEM AND SPEECH RECOGNIZING METHOD
    21.
    Patent application
    SPEECH RECOGNITION SYSTEM AND SPEECH RECOGNIZING METHOD (status: in force)

    Publication No.: US20110224980A1

    Publication date: 2011-09-15

    Application No.: US13044737

    Filing date: 2011-03-10

    IPC classification: G10L15/20

    Abstract: A speech recognition system according to the present invention includes: a sound source separating section that separates mixed speech from multiple sound sources into individual signals; a mask generating section that generates, for each frequency spectral component of a separated speech signal, a soft mask taking continuous values between 0 and 1, using the distributions of speech and noise with respect to the separation reliability of the separated speech signal; and a speech recognizing section that recognizes the speech separated by the sound source separating section using the soft masks generated by the mask generating section.
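
One way to read the mask-generation step is as a per-component posterior: if the separation reliability of a frequency component follows one distribution when the component is speech and another when it is noise, the soft mask can be taken as the probability that the component is speech. The sketch below assumes Gaussian distributions and an equal prior. These modeling choices, the function names, and all parameter values are illustrative assumptions and are not taken from the patent.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def soft_mask(reliability, speech=(0.8, 0.1), noise=(0.3, 0.2), prior_speech=0.5):
    """Posterior probability that a component with this reliability is speech.

    speech and noise are hypothetical (mean, std) pairs describing how the
    separation reliability is distributed for speech and for noise.
    The result is a continuous value in [0, 1].
    """
    p_s = prior_speech * gaussian_pdf(reliability, *speech)
    p_n = (1.0 - prior_speech) * gaussian_pdf(reliability, *noise)
    total = p_s + p_n
    # Guard against both densities underflowing to zero.
    return p_s / total if total > 0.0 else 0.0

# One mask value per frequency spectral component of a separated signal.
reliabilities = [0.1, 0.4, 0.6, 0.9]
masks = [soft_mask(r) for r in reliabilities]
```

With these assumed parameters, components with high separation reliability receive a mask close to 1 and components with low reliability a mask close to 0.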

    Robotics visual and auditory system
    22.
    Granted patent
    Robotics visual and auditory system (status: in force)

    Publication No.: US07526361B2

    Publication date: 2009-04-28

    Application No.: US10506167

    Filing date: 2002-08-30

    IPC classification: G06F19/00

    CPC classification: G06K9/0057 B25J13/003

    Abstract: A robotics visual and auditory system is provided that can accurately localize the sound source of a target by associating visual and auditory information about the target. It includes an audition module (20), a face module (30), a stereo module (37), a motor control module (40), an association module (50) that generates streams by associating events from each of these modules (20, 30, 37, and 40), and an attention control module (57) that conducts attention control based on the streams generated by the association module (50). The association module (50) generates an auditory stream (55) and a visual stream (56) from an auditory event (28) from the audition module (20), a face event (39) from the face module (30), a stereo event (39a) from the stereo module (37), and a motor event (48) from the motor control module (40), and also generates an association stream (57) that associates these streams. Based on accurate sound source direction information from the association module (50), the audition module (20) collects sub-bands whose interaural phase difference (IPD) or interaural intensity difference (IID) lies within a preset range, using an active direction pass filter (23a) whose pass range, in accordance with auditory characteristics, is narrowest in the frontal direction and widens as the angle increases to the left and right, and performs sound source separation by reconstructing the waveform of the sound source.
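
The direction-dependent pass range described in the abstract can be illustrated with a small sketch. It assumes a two-microphone free-field model for the interaural phase difference, a pass range that grows linearly with the angle, and frequencies low enough that the phase does not wrap. The constants, the linear growth rule, and the function names are hypothetical and only show the idea of keeping sub-bands whose IPD matches the target direction.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s, assumed
MIC_DISTANCE = 0.18      # m between the two microphones, assumed

def pass_range_deg(direction_deg, minimum=10.0, growth=0.2):
    """Pass range: narrowest at the front (0 deg), wider toward the sides."""
    return minimum + growth * abs(direction_deg)

def expected_ipd(freq_hz, direction_deg):
    """Free-field IPD in radians for a source at the given azimuth."""
    delay = MIC_DISTANCE * math.sin(math.radians(direction_deg)) / SPEED_OF_SOUND
    return 2.0 * math.pi * freq_hz * delay

def collect_sub_bands(sub_bands, direction_deg):
    """Keep the sub-bands whose measured IPD falls inside the pass range.

    sub_bands is a list of (freq_hz, measured_ipd_rad) pairs.
    Phase wrapping is ignored, so this only suits low frequencies.
    """
    delta = pass_range_deg(direction_deg)
    lo_dir = max(direction_deg - delta, -90.0)
    hi_dir = min(direction_deg + delta, 90.0)
    kept = []
    for freq_hz, ipd in sub_bands:
        lo = expected_ipd(freq_hz, lo_dir)
        hi = expected_ipd(freq_hz, hi_dir)
        if lo <= ipd <= hi:
            kept.append((freq_hz, ipd))
    return kept

# Two 500 Hz sub-bands: one arriving from the front, one from 60 degrees.
sub_bands = [(500.0, expected_ipd(500.0, 0.0)), (500.0, expected_ipd(500.0, 60.0))]
frontal = collect_sub_bands(sub_bands, 0.0)
```

When the filter is steered to the front, only the first sub-band is kept. The collected sub-bands would then be used to reconstruct the waveform of the target source, a step this sketch does not cover.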

    Musical score position estimating device, musical score position estimating method, and musical score position estimating robot
    23.
    Granted patent
    Musical score position estimating device, musical score position estimating method, and musical score position estimating robot (status: in force)

    Publication No.: US08889976B2

    Publication date: 2014-11-18

    Application No.: US12851994

    Filing date: 2010-08-06

    IPC classification: G04B13/00 G10H1/36 G10L25/90

    Abstract: A musical score position estimating device includes an audio signal acquiring unit, a musical score information acquiring unit that acquires musical score information corresponding to an audio signal acquired by the audio signal acquiring unit, an audio signal feature extracting unit that extracts a feature amount of the audio signal, a musical score feature extracting unit that extracts a feature amount of the musical score information, a beat position estimating unit that estimates a beat position of the audio signal, and a matching unit that matches the feature amount of the audio signal with the feature amount of the musical score information using the estimated beat position, thereby estimating the position of the portion of the musical score information that corresponds to the audio signal.

    Reverberation suppressing apparatus and reverberation suppressing method
    24.
    Granted patent
    Reverberation suppressing apparatus and reverberation suppressing method (status: in force)

    Publication No.: US08391505B2

    Publication date: 2013-03-05

    Application No.: US12791428

    Filing date: 2010-06-01

    IPC classification: H04B3/20

    CPC classification: H04M9/082

    Abstract: A reverberation suppressing apparatus that separates sound source signals based on input signals output from microphones collecting a plurality of sound source signals includes: a sound signal output unit that generates sound signals and outputs the generated sound signals; a sound acquiring unit that acquires the input signals from the microphones; a first evaluation function calculation unit calculating a separation matrix, the input signals, and the sound source signals, and calculating a first evaluation function; a reverberation component suppressing unit that calculates an optimal separation matrix and suppresses a reverberation component by separating the sound source signals other than the generated sound signals; and a separation matrix updating unit that divides a step-size function into segments, approximates each segment with a linear function, calculates step sizes from the approximated linear functions, and repeatedly updates the separation matrix so that the degree of separation of the sound source signals exceeds a predetermined value.

    SPEECH RECOGNITION SYSTEM AND METHOD FOR GENERATING A MASK OF THE SYSTEM
    25.
    Patent application
    SPEECH RECOGNITION SYSTEM AND METHOD FOR GENERATING A MASK OF THE SYSTEM (status: in force)

    Publication No.: US20100082340A1

    Publication date: 2010-04-01

    Application No.: US12543759

    Filing date: 2009-08-19

    IPC classification: G10L15/20 G10L15/00

    CPC classification: G10L15/20 G10L21/0272

    Abstract: The speech recognition system of the present invention includes: a sound source separating section that separates mixed speech from multiple sound sources; a mask generating section that generates, for each separated speech signal, a soft mask taking continuous values between 0 and 1 according to the reliability of separation in the separating operation of the sound source separating section; and a speech recognizing section that recognizes the speech separated by the sound source separating section using the soft masks generated by the mask generating section.
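
To show how a soft mask might enter the recognition step, the sketch below maps a separation reliability to a mask value with a sigmoid and then weights each feature dimension of a diagonal-Gaussian log-likelihood by its mask. Both the sigmoid mapping and the weighted likelihood are assumptions chosen for illustration; the abstract does not state how the mask is computed or applied, and the parameter values are hypothetical.

```python
import math

def sigmoid_mask(reliability, slope=10.0, threshold=0.5):
    """Map a separation reliability to a continuous mask value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-slope * (reliability - threshold)))

def masked_log_likelihood(features, masks, means, variances):
    """Diagonal-Gaussian log-likelihood, each dimension weighted by its mask.

    A mask of 1 counts the dimension fully; a mask of 0 ignores it.
    """
    total = 0.0
    for x, m, mu, var in zip(features, masks, means, variances):
        log_p = -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)
        total += m * log_p
    return total

# The second feature is an outlier from a poorly separated component.
features = [0.0, 5.0]
masks_example = [sigmoid_mask(0.95), sigmoid_mask(0.05)]
score = masked_log_likelihood(features, masks_example, [0.0, 0.0], [1.0, 1.0])
```

Because the unreliable dimension gets a mask near 0, its large mismatch with the acoustic model barely lowers the score, which is the effect a soft mask is meant to achieve compared with treating all components equally.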

    Robot audiovisual system
    26.
    Granted patent
    Robot audiovisual system (status: lapsed)

    Publication No.: US06967455B2

    Publication date: 2005-11-22

    Application No.: US10468396

    Filing date: 2002-03-08

    Abstract: A robot visuoauditory system is disclosed that processes data in real time to track an object by vision and audition, integrates visual and auditory information on the object so that tracking continues without fail, and visualizes the real-time processing. In the system, the audition module (20), in response to sound signals from microphones, extracts pitches, separates the sound sources from each other, and locates the sound sources so as to identify each sound source as at least one speaker, thereby extracting an auditory event (28) for each speaker. The vision module (30), on the basis of an image taken by a camera, identifies each such speaker by face and locates the speaker, thereby extracting a visual event (39). The motor control module (40), which turns the robot horizontally, extracts a motor event (49) from the rotary position of the motor. The association module (60), which controls these modules, forms an auditory stream (65) and a visual stream (66) from the auditory, visual, and motor control events, and then associates these streams with each other to form an association stream (67). The attention control module (64) performs attention control to plan the course along which the drive motor is controlled, e.g., upon locating the sound source for the auditory event and locating the face for the visual event, thereby determining the direction in which each speaker lies. The system also includes a display (27, 37, 48, 68) for displaying at least a portion of the auditory, visual, and motor information. The attention control module (64) servo-controls the robot on the basis of the association stream or streams.

    Robot, method and program of correcting a robot voice in accordance with head movement
    27.
    Granted patent
    Robot, method and program of correcting a robot voice in accordance with head movement (status: in force)

    Publication No.: US08639511B2

    Publication date: 2014-01-28

    Application No.: US12881812

    Filing date: 2010-09-14

    IPC classification: G10L13/00 G10L13/02 G10L21/00

    Abstract: A robot may include a driving control unit configured to control the driving of a movable unit that is movably connected to a body unit, a voice generating unit configured to generate a voice, and a voice output unit configured to output the voice generated by the voice generating unit. The voice generating unit may correct the generated voice based on the bearing of the movable unit, as controlled by the driving control unit, relative to the body unit.

    Musical score position estimating apparatus, musical score position estimating method, and musical score position estimating program
    28.
    Granted patent
    Musical score position estimating apparatus, musical score position estimating method, and musical score position estimating program (status: in force)

    Publication No.: US08440901B2

    Publication date: 2013-05-14

    Application No.: US13038124

    Filing date: 2011-03-01

    IPC classification: G10H1/00

    CPC classification: G09B15/02

    Abstract: A musical score position estimating apparatus includes a sound feature quantity generating unit configured to generate a feature quantity of an input sound signal, and a score position estimating unit configured to calculate a weight coefficient based on the feature quantity of the sound signal and a feature quantity of musical score information, and to estimate a musical score position using a virtual musical score position and a virtual tempo corresponding to the weight coefficient.
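
Weighting virtual score positions and tempos by how well they match the audio resembles one update step of a particle filter. The sketch below takes that reading as an assumption: each hypothesis is advanced by its tempo, weighted by the similarity between the audio feature and the score feature at its position, and the estimate is the weighted mean. The scalar features, the exponential similarity, and every name here are hypothetical simplifications and not the patented method.

```python
import math

def estimate_score_position(particles, audio_feature, score_feature_at, dt, sharpness=5.0):
    """One update step over virtual (position, tempo) hypotheses.

    particles: list of (position_in_beats, tempo_in_beats_per_second).
    score_feature_at: hypothetical callable returning the score feature at a position.
    Returns (estimated_position, estimated_tempo, moved_particles, weights).
    """
    # Advance every virtual position by its own virtual tempo.
    moved = [(pos + tempo * dt, tempo) for pos, tempo in particles]
    # Weight coefficient: high when audio and score features agree.
    weights = [
        math.exp(-sharpness * abs(audio_feature - score_feature_at(pos)))
        for pos, _ in moved
    ]
    total = sum(weights)
    if total > 0.0:
        weights = [w / total for w in weights]
    else:
        # All weights underflowed; fall back to uniform weights.
        weights = [1.0 / len(moved)] * len(moved)
    position = sum(w * pos for w, (pos, _) in zip(weights, moved))
    tempo = sum(w * t for w, (_, t) in zip(weights, moved))
    return position, tempo, moved, weights

# Toy score whose feature simply equals the position in beats.
particles = [(0.0, 1.0), (0.0, 2.0), (0.0, 3.0)]
position, tempo, moved, weights = estimate_score_position(
    particles, audio_feature=2.0, score_feature_at=lambda pos: pos, dt=1.0
)
```

In this toy case the hypothesis with tempo 2.0 lands exactly where the audio feature matches, so it dominates the weights and the estimate settles at position 2.0 and tempo 2.0. A full tracker would also resample the hypotheses between steps, which is omitted here.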

    Robotics visual and auditory system
    29.
    Patent application
    Robotics visual and auditory system (status: under examination, published)

    Publication No.: US20090030552A1

    Publication date: 2009-01-29

    Application No.: US10539047

    Filing date: 2003-02-12

    Abstract: A robotics visual and auditory system is provided with an auditory module (20), a face module (30), a stereo module (37), a motor control module (40), and an association module (50) that controls these modules. Based on accurate sound source direction information from the association module (50), the auditory module (20) collects sub-bands whose interaural phase difference (IPD) or interaural intensity difference (IID) lies within a predetermined range, using an active direction pass filter (23a) whose pass range, in accordance with auditory characteristics, is narrowest in the frontal direction and widens as the angle increases to the left and right, and performs sound source separation by reconstructing the waveform of a sound source. It then performs speech recognition on the separated sound signals from the respective sound sources using a plurality of acoustic models (27d), integrates the speech recognition results from each acoustic model with a selector, and judges the most reliable speech recognition result among them.
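
The final step, integrating the results from several acoustic models and picking the most reliable one, could look like the sketch below. It assumes each acoustic model returns a recognized text with a numeric reliability, and that integration means summing the reliabilities of models that agree. The abstract does not specify the integration rule, so this scoring scheme and the model names are hypothetical.

```python
from collections import defaultdict

def select_most_reliable(results):
    """Combine per-model results and return (text, combined_reliability).

    results: list of (acoustic_model_name, recognized_text, reliability).
    Reliabilities of models that produced the same text are summed.
    """
    combined = defaultdict(float)
    for _model, text, reliability in results:
        combined[text] += reliability
    return max(combined.items(), key=lambda item: item[1])

# Three direction-dependent acoustic models, two of which agree.
results = [
    ("front", "hello", 0.6),
    ("left", "yellow", 0.7),
    ("right", "hello", 0.5),
]
best = select_most_reliable(results)
```

Here "hello" wins because two models agree on it, even though the single highest individual reliability belongs to "yellow".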

    Conversation System and Conversation Software
    30.
    Patent application
    Conversation System and Conversation Software (status: in force)

    Publication No.: US20080319748A1

    Publication date: 2008-12-25

    Application No.: US12087791

    Filing date: 2007-01-31

    IPC classification: G10L15/04

    CPC classification: G10L15/22 G10L15/1822

    Abstract: A first domain satisfying a first condition concerning a current utterance understanding result and a second domain satisfying a second condition concerning a selection history are specified. For each of the first and second domains, indices representing reliability in consideration of the utterance understanding history, selection history, and utterance generation history are evaluated. Based on the evaluation results, one of the first, second, and third domains is selected as a current domain according to a selection rule.