ANCHOR MODEL ADAPTATION DEVICE, INTEGRATED CIRCUIT, AV (AUDIO VIDEO) DEVICE, ONLINE SELF-ADAPTATION METHOD, AND PROGRAM THEREFOR
    11.
    Invention application
    ANCHOR MODEL ADAPTATION DEVICE, INTEGRATED CIRCUIT, AV (AUDIO VIDEO) DEVICE, ONLINE SELF-ADAPTATION METHOD, AND PROGRAM THEREFOR (status: under examination, published)

    Publication No.: US20120093327A1

    Publication date: 2012-04-19

    Application No.: US13379827

    Filing date: 2011-04-19

    IPC classification: H04R29/00

    CPC classification: G10L25/57 G10L2015/0631

    Abstract: The present invention provides a device and a method for online self-adaptation of anchor models for an acoustic space. The anchor models are used to categorize an AV stream based on the audio stream it contains. The device divides an input audio stream into audio segments, each estimated to have a single acoustic feature, and estimates a single probability model for each audio segment. The device then clusters the newly estimated probability models together with the probability models already stored in the device, thereby generating a new anchor model.
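
    Illustration (not part of the patent record): the segment, estimate, and cluster steps described in the abstract can be sketched as below. This is a minimal sketch under strong assumptions: features are one-dimensional, each probability model is a single Gaussian, and clustering is a greedy merge of the two models with the closest means. The names Gaussian, estimate_model, merge, and adapt_anchors are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from statistics import fmean, pvariance


@dataclass
class Gaussian:
    mean: float
    var: float
    weight: int  # number of feature frames the model was estimated from


def estimate_model(segment):
    """Estimate one Gaussian for an audio segment (a list of 1-D features)."""
    return Gaussian(fmean(segment), pvariance(segment), len(segment))


def merge(a, b):
    """Merge two Gaussians by moment matching, weighted by frame count."""
    n = a.weight + b.weight
    mean = (a.weight * a.mean + b.weight * b.mean) / n
    var = (
        a.weight * (a.var + (a.mean - mean) ** 2)
        + b.weight * (b.var + (b.mean - mean) ** 2)
    ) / n
    return Gaussian(mean, var, n)


def adapt_anchors(stored, segments, num_anchors):
    """Pool stored models with newly estimated ones, then merge the closest
    pair (by mean distance, an assumed criterion) until num_anchors remain."""
    if num_anchors < 1:
        raise ValueError("num_anchors must be at least 1")
    models = list(stored) + [estimate_model(s) for s in segments]
    while len(models) > num_anchors:
        i, j = min(
            ((i, j) for i in range(len(models)) for j in range(i + 1, len(models))),
            key=lambda p: abs(models[p[0]].mean - models[p[1]].mean),
        )
        merged = merge(models[i], models[j])
        models = [m for k, m in enumerate(models) if k not in (i, j)] + [merged]
    return models
```

    A real system would use multi-dimensional acoustic features and a model distance suited to them; the greedy merge here only shows how stored and new models enter the same clustering step.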

    Method and apparatus of speech recognition and speech control system using the speech recognition method
    12.
    Granted patent
    Method and apparatus of speech recognition and speech control system using the speech recognition method (status: in force)

    Publication No.: US06308152B1

    Publication date: 2001-10-23

    Application No.: US09337734

    Filing date: 1999-06-22

    IPC classification: G10L15/22

    CPC classification: G10L15/08

    Abstract: A string of acoustic feature parameters for each recognition-desired word and a string of acoustic feature parameters for each reception word are registered in advance. When an uttered word is received, a string of acoustic feature parameters is extracted from it and compared with the string of acoustic feature parameters of each recognition-desired word, and a recognition-desired word recognition score indicating the degree of similarity between the uttered word and each recognition-desired word is calculated. Likewise, a reception word recognition score indicating the degree of similarity between the uttered word and each reception word is calculated. When a particular recognition-desired word recognition score is higher than the highest reception word recognition score, the uttered word is recognized as the corresponding recognition-desired word, and the operation of an electric apparatus is controlled according to that word. Conversely, when a particular reception word recognition score is higher than the highest recognition-desired word recognition score, the uttered word is recognized as the corresponding reception word and is rejected, so that the electric apparatus is not operated.
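
    Illustration (not part of the patent record): the accept-or-reject decision described in the abstract can be sketched as a comparison of the two best scores. The function name recognize and the dictionary inputs are hypothetical, the scores are assumed to be precomputed similarity values, and treating a tie as a rejection is an assumption, since the abstract does not cover ties.

```python
def recognize(desired_scores, reception_scores):
    """Return the recognized command word, or None if the utterance is rejected.

    desired_scores:   dict mapping each recognition-desired word to its score
    reception_scores: dict mapping each reception (rejection) word to its score
    """
    best_desired = max(desired_scores, key=desired_scores.get)
    best_reception = max(reception_scores, key=reception_scores.get)
    # Accept only when the best command word beats every reception word.
    if desired_scores[best_desired] > reception_scores[best_reception]:
        return best_desired
    # Otherwise the utterance matches a reception word better: reject it
    # (ties are rejected too, which is an assumption of this sketch).
    return None
```

    Returning None stands in for "the electric apparatus is not operated"; a caller would act only on a non-None result.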

    Audio classification by comparison of feature sections and integrated features to known references
    13.
    Granted patent
    Audio classification by comparison of feature sections and integrated features to known references (status: in force)

    Publication No.: US08892497B2

    Publication date: 2014-11-18

    Application No.: US13382362

    Filing date: 2011-03-15

    Abstract: Moving images are classified using audio signals. An audio signal is acquired, and a section feature relating to the audio frequency distribution is extracted for each of a plurality of sections of predetermined length contained in the acquired audio signal. Each extracted section feature is compared with each of a set of reference section features to calculate a section similarity indicating the degree of correlation between that section feature and that reference section feature. An integrated feature, which relates to the plurality of sections and is calculated from the section similarities of those sections, is then extracted from the acquired audio signal. The extracted integrated feature is compared with each of one or more reference integrated features, and the audio signal is classified based on the comparison result. The classification result is then used for moving image classification.
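
    Illustration (not part of the patent record): a minimal sketch of the two-stage comparison described in the abstract. Several choices here are assumptions, not taken from the patent: section similarity is cosine similarity, the integrated feature is the fraction of sections whose most similar reference is each reference section feature, and classification picks the reference integrated feature at the smallest Euclidean distance. All function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def integrated_feature(sections, reference_sections):
    """Histogram of best-matching reference sections, normalized to sum to 1."""
    if not sections:
        raise ValueError("at least one section feature is required")
    counts = [0] * len(reference_sections)
    for section in sections:
        sims = [cosine(section, ref) for ref in reference_sections]
        counts[sims.index(max(sims))] += 1
    return [c / len(sections) for c in counts]


def classify(sections, reference_sections, reference_integrated):
    """Return the label whose reference integrated feature is nearest."""
    feature = integrated_feature(sections, reference_sections)
    return min(
        reference_integrated,
        key=lambda label: math.dist(feature, reference_integrated[label]),
    )
```

    The label returned for the audio track would then be attached to the moving image it belongs to.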

    HEARING AID APPARATUS
    14.
    Invention application
    HEARING AID APPARATUS (status: in force)

    Publication No.: US20120063620A1

    Publication date: 2012-03-15

    Application No.: US13320613

    Filing date: 2010-06-16

    IPC classification: H04R25/00

    Abstract: A call from someone other than the conversation partner, as well as various sounds, is detected from audio signals input from plural microphones without degrading voice recognition precision. The hearing aid apparatus according to the present invention corrects the frequency characteristic of a call voice other than the conversation partner's voice, based on the arrival direction of that call voice as estimated from the audio signals converted by the plural microphones; checks the corrected call voice against a call word standard pattern, which represents features of phonemes and syllabic sounds and is based on other voice data picked up by microphones having one characteristic, to determine whether the call voice is a call word; and forms a directivity in directions other than the arrival direction of the conversation partner's voice. The frequency characteristic of the call voice is corrected so as to match the characteristic of the microphones used at the time the audio standard pattern was created.

    APPARATUS AND METHOD OF OUTPUTTING SOUND INFORMATION
    15.
    Invention application
    APPARATUS AND METHOD OF OUTPUTTING SOUND INFORMATION (status: under examination, published)

    Publication No.: US20090154712A1

    Publication date: 2009-06-18

    Application No.: US11568219

    Filing date: 2005-04-19

    IPC classification: H04R5/00

    Abstract: An azimuth and distance calculator calculates the relative direction and distance to the next intersection to be guided, based on information on the intersection supplied from a storage for received information on objects to be guided and on information on the movement history of a user. The calculator then converts the relative direction into a horizontal angle and the distance into an elevation angle, and passes the angles to a stereophony generator. The stereophony generator creates output sound information whose sound image is localized outside the headphone and outputs it to the headphone. In this manner, the user can accurately perceive the distance to the object.
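
    Illustration (not part of the patent record): a minimal sketch of the azimuth and distance calculation. It assumes flat x/y coordinates in metres, takes the user's heading from the last two points of the movement history, and maps distance linearly onto an elevation angle between 0 and 90 degrees. The linear mapping, the max_distance cap, and the function names are assumptions; the abstract does not state how distance is converted.

```python
import math


def heading_from_history(history):
    """Heading in degrees (0 = +y axis, clockwise positive) from the last two
    positions; requires at least two (x, y) points in the history."""
    (x0, y0), (x1, y1) = history[-2], history[-1]
    return math.degrees(math.atan2(x1 - x0, y1 - y0))


def guidance_angles(history, intersection, max_distance=500.0):
    """Return (horizontal_angle, elevation_angle) in degrees for the target."""
    x, y = history[-1]
    dx, dy = intersection[0] - x, intersection[1] - y
    bearing = math.degrees(math.atan2(dx, dy))
    # Direction relative to the user's heading, wrapped into [-180, 180).
    horizontal = (bearing - heading_from_history(history) + 180.0) % 360.0 - 180.0
    distance = math.hypot(dx, dy)
    # Assumed mapping: farther targets get a higher elevation, capped at 90.
    elevation = 90.0 * min(distance, max_distance) / max_distance
    return horizontal, elevation
```

    The two angles would then be handed to whatever renders the spatialized sound, which is outside the scope of this sketch.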

    Device for extracting keywords in a conversation
    16.
    Granted patent
    Device for extracting keywords in a conversation (status: in force)

    Publication No.: US08370145B2

    Publication date: 2013-02-05

    Application No.: US12302633

    Filing date: 2008-03-14

    IPC classification: G10L15/28

    Abstract: The present invention aims to extract keywords from a conversation without having to anticipate and prepare the keywords in advance. The keyword extracting device of the present invention includes: an audio input section 101 through which a speech sound made by a speaker is input; a speech segment determination section 102 that determines a speech segment for each speaker from the input speech sound; a speech recognition section 103 that recognizes the speech sound of the determined speech segment for each speaker; an interrupt detection section 104 that detects, from another speaker's response to each speaker's speech, a feature of a speech response suggesting the presence of a keyword, namely an interrupt in which a preceding speech and a subsequent speech overlap; a keyword extraction section 105 that extracts the keyword from the speech in the speech segment specified on the basis of the interrupt; a keyword search section 106 that performs a keyword search using the keyword; and a display section 107 that displays the result of the keyword search.
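
    Illustration (not part of the patent record): interrupt detection as described in the abstract amounts to finding a subsequent speech segment, by a different speaker, that starts before the preceding segment ends. The sketch below assumes segments already carry recognized words and timestamps. Taking the last word of the interrupted segment as the keyword candidate is an assumption made only for illustration, and the names Segment, find_interrupts, and candidate_keywords are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str
    start: float  # seconds
    end: float    # seconds
    words: list   # recognized words, in spoken order


def find_interrupts(segments):
    """Return (preceding, subsequent) pairs where another speaker starts
    talking before the preceding segment has ended."""
    ordered = sorted(segments, key=lambda s: s.start)
    interrupts = []
    for prev, nxt in zip(ordered, ordered[1:]):
        if nxt.speaker != prev.speaker and nxt.start < prev.end:
            interrupts.append((prev, nxt))
    return interrupts


def candidate_keywords(segments):
    """Pick the last word of each interrupted segment (assumed heuristic)."""
    return [prev.words[-1] for prev, _ in find_interrupts(segments) if prev.words]
```

    The candidates would then be passed to a search step and the results shown to the users, mirroring sections 106 and 107.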

    LIFESTYLE COLLECTING APPARATUS, USER INTERFACE DEVICE, AND LIFESTYLE COLLECTING METHOD
    17.
    Invention application
    LIFESTYLE COLLECTING APPARATUS, USER INTERFACE DEVICE, AND LIFESTYLE COLLECTING METHOD (status: in force)

    Publication No.: US20110208790A1

    Publication date: 2011-08-25

    Application No.: US12957675

    Filing date: 2010-12-01

    IPC classification: G06F7/00

    Abstract: Provided is a lifestyle collecting apparatus that collects information for determining the lifestyle of a user. The apparatus includes: an object information detecting unit configured to detect object information representing an object around the user; a relevance degree calculating unit configured to calculate the degree of relevance of the user to the object, using the object information; an appearance information extracting unit configured to extract, from the object information, appearance information representing the appearance of the object, and to add the relevance degree to the extracted appearance information; and a lifestyle database that stores the appearance information to which the relevance degree has been added, as the information for determining the lifestyle of the user.

    Voice output apparatus and voice output method
    18.
    发明授权
    Voice output apparatus and voice output method 有权
    语音输出设备和语音输出方式

    公开(公告)号:US07809573B2

    公开(公告)日:2010-10-05

    申请号:US10542947

    申请日:2004-04-27

    IPC分类号: G10L21/00 G10L13/08

    CPC分类号: G06F3/16

    Abstract: A voice output apparatus enhances the robustness of the interface between a user and the apparatus by transmitting information to the user via a text message and a voice message. The voice output apparatus includes a display unit (107) that displays a text message carrying apparatus-transmitting information to be transmitted to the user, and a delay unit (105) and a voice output unit (106) that estimate the delay time (T) necessary for the user to visually identify the text message displayed by the display unit (107) and output the apparatus-transmitting information as a voice message once the delay time (T) has passed after the text message is displayed.
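
    Illustration (not part of the patent record): the display, delay, and speak sequence can be sketched as below. The delay estimate used here (a fixed base time plus a per-character reading time) is an assumption made only to have something concrete; the abstract does not say how the delay time T is estimated. The function names and the injected display, speak, and sleep callables are hypothetical.

```python
import time


def estimate_delay(text, base_seconds=0.5, seconds_per_char=0.05):
    """Assumed model of the time a user needs to notice and read the text."""
    return base_seconds + seconds_per_char * len(text)


def show_then_speak(text, display, speak, sleep=time.sleep):
    """Display the text, wait for the estimated delay T, then speak it.

    display and speak are callables supplied by the caller; sleep can be
    replaced in tests so that no real waiting takes place.
    """
    display(text)
    delay = estimate_delay(text)
    sleep(delay)
    speak(text)
    return delay
```

    Injecting the three callables keeps the timing logic separate from any particular screen or speech synthesizer.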

    KEYWORD EXTRACTING DEVICE
    19.
    Invention application
    KEYWORD EXTRACTING DEVICE (status: in force)

    Publication No.: US20090150155A1

    Publication date: 2009-06-11

    Application No.: US12302633

    Filing date: 2008-03-14

    IPC classification: G10L15/08

    Abstract: The present invention aims to extract keywords from a conversation without having to anticipate and prepare the keywords in advance. The keyword extracting device of the present invention includes: an audio input section 101 through which a speech sound made by a speaker is input; a speech segment determination section 102 that determines a speech segment for each speaker from the input speech sound; a speech recognition section 103 that recognizes the speech sound of the determined speech segment for each speaker; an interrupt detection section 104 that detects, from another speaker's response to each speaker's speech, a feature of a speech response suggesting the presence of a keyword, namely an interrupt in which a preceding speech and a subsequent speech overlap; a keyword extraction section 105 that extracts the keyword from the speech in the speech segment specified on the basis of the interrupt; a keyword search section 106 that performs a keyword search using the keyword; and a display section 107 that displays the result of the keyword search.

    Voice output device and voice output method
    20.
    Invention application
    Voice output device and voice output method (status: in force)

    Publication No.: US20060085195A1

    Publication date: 2006-04-20

    Application No.: US10542947

    Filing date: 2004-04-27

    IPC classification: G10L13/08

    CPC classification: G06F3/16

    Abstract: The voice output apparatus, which enhances the robustness of the interface between a user and the apparatus by transmitting information to the user via a text message and a voice message, comprises: a display unit (107) for displaying a text message carrying apparatus-transmitting information to be transmitted to the user; and a delay unit (105) and a voice output unit (106) for estimating the delay time (T) necessary for the user to visually identify the text message displayed by the display unit (107), and for outputting the apparatus-transmitting information as a voice message once the delay time (T) has passed after the text message is displayed.
