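Information Processing Device, Information Processing Method, and Program
    61.
    Invention application
    Status: Under examination (published)

    Publication No.: WO2016157642A1

    Publication date: 2016-10-06

    Application No.: PCT/JP2015/085187

    Filing date: 2015-12-16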

    Abstract: Provided is an information processing device including a communication determination unit that determines whether communication is occurring among a plurality of users, including a given user, based on a feature quantity indicating interaction among the plurality of users, the feature quantity being extracted from audio data that contains at least speech uttered by that user.

    Voice Control Method and System
    63.
    Invention application

    Publication No.: WO2015180430A1

    Publication date: 2015-12-03

    Application No.: PCT/CN2014/091948

    Filing date: 2014-11-21

    Inventor: 程德凯 吕艳红

    CPC classification number: G10L15/20

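    Abstract: A voice control method and system. The method includes: a controlled terminal detecting, in real time or at regular intervals, voice data sent by a noise device, and obtaining the playback time points of the detected voice data (S10); upon detecting a first audio signal, the controlled terminal determining the voice data corresponding to the playback time point that matches the current time point, and converting the determined voice data into a second audio signal (S20); the controlled terminal removing from the first audio signal the portion that matches the second audio signal, so as to generate a voice control instruction (S30); and the controlled terminal responding to the generated voice control instruction (S40). Removing the second audio signal produced by the noise device from the received first audio signal improves the accuracy of voice control.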
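    Step S30 of the abstract above can be sketched roughly as follows. This is a minimal illustration, not the method claimed in the application: it assumes the second audio signal is already time-aligned with the first, and it removes it by subtracting a least-squares scaled copy. The function name `remove_reference` and the sample values are hypothetical.

```python
def remove_reference(mic, reference):
    """Subtract the best-fitting scaled copy of `reference` from `mic`.

    `mic` stands for the first audio signal (command plus device playback),
    `reference` for the second audio signal (known device playback).
    Assumption: both lists are already aligned sample by sample.
    """
    n = min(len(mic), len(reference))
    ref_power = sum(r * r for r in reference[:n])
    if ref_power == 0.0:
        return list(mic)  # nothing to remove
    # Least-squares gain that best explains the playback inside the mic signal.
    gain = sum(m * r for m, r in zip(mic, reference)) / ref_power
    cleaned = [m - gain * r for m, r in zip(mic[:n], reference[:n])]
    return cleaned + list(mic[n:])


# Hypothetical data: a short "command" mixed with attenuated device playback.
reference = [1.0, -1.0, 1.0, -1.0]
command = [0.2, 0.2, -0.2, -0.2]
mic = [c + 0.5 * r for c, r in zip(command, reference)]
residual = remove_reference(mic, reference)
```

    In this toy case the command is uncorrelated with the playback, so the residual is close to the command itself; real signals would need proper time alignment and an adaptive filter.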

    MODIFYING OPERATIONS BASED ON ACOUSTIC AMBIENCE CLASSIFICATION
    65.
    Invention application
    Status: Under examination (published)

    Publication No.: WO2015102921A1

    Publication date: 2015-07-09

    Application No.: PCT/US2014/071105

    Filing date: 2014-12-18

    Abstract: Methods and systems for modification of electronic system operation based on acoustic ambience classification are presented. In an example method, at least one audio signal present in a physical environment of a user is detected. The at least one audio signal is analyzed to extract at least one audio feature from the audio signal. The audio signal is classified based on the audio feature to produce at least one classification of the audio signal. Operation of an electronic system interacting with the user in the physical environment is modified based on the classification of the audio signal.

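    The detect, extract, classify, modify sequence in this abstract can be illustrated with a toy pipeline. Everything specific here is an assumption made for illustration: the single RMS feature, the two class names, the 0.05 threshold, and the idea that the modified operation is a prompt volume.

```python
import math


def extract_rms(samples):
    """Audio feature: root-mean-square level of the captured samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def classify_ambience(rms, quiet_below=0.05):
    """Classification: map the feature to a hypothetical ambience class."""
    return "quiet" if rms < quiet_below else "loud"


# Hypothetical operating parameter chosen per ambience class.
PROMPT_VOLUME = {"quiet": 0.3, "loud": 0.9}


def adjusted_prompt_volume(samples):
    """Modify system operation based on the classification."""
    return PROMPT_VOLUME[classify_ambience(extract_rms(samples))]
```

    A real system would use richer features and a trained classifier; the sketch only shows how the four stages connect.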

    ACCOUSTIC ACTIVITY DETECTION APPARATUS AND METHOD
    66.
    Invention application
    Status: Under examination (published)

    Publication No.: WO2015057757A1

    Publication date: 2015-04-23

    Application No.: PCT/US2014/060567

    Filing date: 2014-10-15

    Abstract: Streaming audio is received. The streaming audio includes a frame having a plurality of samples. An energy estimate is obtained for the plurality of samples. The energy estimate is compared to at least one threshold. In addition, a band-pass estimate of the signal is determined. An energy estimate is obtained for the band-passed plurality of samples. The two energy estimates are each compared to at least one threshold. Based upon the comparison operation, a determination is made as to whether speech is detected.

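    The two-energy comparison described in this abstract can be sketched as below. The abstract does not say how the two comparisons are combined, which filter is used, or what the thresholds are, so the AND rule, the second-order band-pass centered at 1 kHz, and the 0.01 thresholds are all assumptions.

```python
import math


def frame_energy(samples):
    """Energy estimate: mean squared amplitude of one frame."""
    return sum(s * s for s in samples) / len(samples)


def band_pass(samples, sample_rate, center_hz, q=1.0):
    """Second-order band-pass filter with unity gain at center_hz."""
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0  # the b1 coefficient is zero
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in samples:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


def speech_detected(frame, sample_rate=16000, full_threshold=0.01,
                    band_threshold=0.01, center_hz=1000.0):
    """Assumed rule: speech only if both energies exceed their thresholds."""
    full = frame_energy(frame)
    band = frame_energy(band_pass(frame, sample_rate, center_hz))
    return full > full_threshold and band > band_threshold


def tone(freq_hz, n=320, sample_rate=16000, amplitude=0.5):
    """Hypothetical test signal: one 20 ms frame of a sine tone."""
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]
```

    With these settings a 1 kHz tone passes both checks, while a 50 Hz hum of the same level passes only the full-band check and is rejected, which is the point of adding the band-passed estimate.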

    METHOD AND SYSTEM FOR SPEECH INTELLIBILITY ENHANCEMENT IN NOISY ENVIRONMENTS
    67.
    Invention application
    Status: Under examination (published)

    Publication No.: WO2015027168A1

    Publication date: 2015-02-26

    Application No.: PCT/US2014/052316

    Filing date: 2014-08-22

    Applicant: GOOGLE INC.

    Abstract: Provided are methods and systems for enhancing the intelligibility of an audio (e.g., speech) signal rendered in a noisy environment, subject to a constraint on the power of the rendered signal. A quantitative measure of intelligibility is the mean probability of decoding the message correctly. The methods and systems simplify the procedure by approximating the maximization of the decoding probability with the maximization of the similarity of the spectral dynamics of the noisy speech to the spectral dynamics of the corresponding noise-free speech. The intelligibility enhancement procedures provided are based on this principle, and all have low computational cost and require little delay, thus facilitating real-time implementation.


    METHOD AND DEVICE FOR AUDIO IMPUT ROUTING
    68.
    Invention application
    Status: Under examination (published)

    Publication No.: WO2015013201A2

    Publication date: 2015-01-29

    Application No.: PCT/US2014/047448

    Filing date: 2014-07-21

    Abstract: A method on a mobile device (100) for processing an audio input is described. A trigger for the audio input is received. At least one parameter is determined for an audio processor (303) based on at least one input characteristic for the audio input. The audio input is routed to the audio processor (303) with the at least one parameter.


    VOICE RECOGNITION CONFIGURATION SELECTOR AND METHOD OF OPERATION THEREFOR
    69.
    Invention application
    Status: Under examination (published)

    Publication No.: WO2014143447A1

    Publication date: 2014-09-18

    Application No.: PCT/US2014/014758

    Filing date: 2014-02-05

    Abstract: A method includes obtaining a speech sample from a pre-processing front end (120) of a first device, identifying at least one condition, and selecting a voice recognition speech model from a database of speech models (160), the selected voice recognition speech model trained under the at least one condition. The method may include performing voice recognition on the speech sample using the selected speech model. A device includes a microphone signal pre-processing front end (120) and operating-environment logic (130), operatively coupled to the pre-processing front end (120). The operating-environment logic (130) is operative to identify at least one condition. A voice recognition configuration selector (140) is operatively coupled to the operating-environment logic (130), and is operative to receive information related to the at least one condition from the operating-environment logic (130) and to provide voice recognition logic (150) with an identifier (135) for a voice recognition speech model trained under the at least one condition.

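    The selection step in this abstract amounts to a lookup from an identified condition to the identifier of a model trained under that condition. The sketch below is only an illustration under assumptions: the condition names, the model identifiers, the 0.1 noise threshold, and the rule used by `identify_condition` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeechModel:
    identifier: str         # what the selector hands to the recognizer
    trained_condition: str  # condition the model was trained under


# Hypothetical stand-in for the database of speech models.
MODEL_DATABASE = [
    SpeechModel("vr-model-quiet", "quiet"),
    SpeechModel("vr-model-noisy", "noisy"),
    SpeechModel("vr-model-vehicle", "in_vehicle"),
]


def identify_condition(noise_rms, in_vehicle):
    """Hypothetical operating-environment logic."""
    if in_vehicle:
        return "in_vehicle"
    return "noisy" if noise_rms > 0.1 else "quiet"


def select_model_identifier(condition, database=MODEL_DATABASE):
    """Return the identifier of a model trained under `condition`, or None."""
    for model in database:
        if model.trained_condition == condition:
            return model.identifier
    return None
```

    Returning None for an unknown condition leaves the fallback policy to the caller, which the abstract does not specify.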

    APPARATUS AND METHOD FOR BEAMFORMING TO OBTAIN VOICE AND NOISE SIGNALS
    70.
    Invention application
    Status: Under examination (published)

    Publication No.: WO2014143439A1

    Publication date: 2014-09-18

    Application No.: PCT/US2014/014375

    Filing date: 2014-02-03

    Abstract: One method of operation includes beamforming a plurality of microphone outputs to obtain a plurality of virtual microphone audio channels. Each virtual microphone audio channel corresponds to a beamform. The virtual microphone audio channels include at least one voice channel (135) and at least one noise channel (136). The method includes performing voice activity detection (151) on the at least one voice channel (135) and adjusting a corresponding voice beamform until voice activity detection (151) indicates that voice is present on the at least one voice channel (135). Another method beamforms the plurality of microphone outputs to obtain a plurality of virtual microphone audio channels, where each virtual microphone audio channel corresponds to a beamform, and with at least one voice channel (135) and at least one noise channel (136). The method performs voice recognition on the at least one voice channel (135) and adjusts the corresponding voice beamform to improve a voice recognition confidence metric (159).

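    The first method in this abstract (adjust the voice beamform until voice activity is detected) can be sketched with a delay-and-sum beamformer and an energy-based detector. Both are swapped-in simplifications, since the abstract names neither; the integer-sample delays, the 0.05 threshold, and the two-microphone test signal are assumptions, and the noise channel is left out.

```python
import math


def delay_and_sum(channels, delays):
    """Delay each microphone channel by an integer sample count, then average."""
    n = len(channels[0])
    out = [0.0] * n
    for channel, delay in zip(channels, delays):
        for i in range(n):
            j = i - delay
            if 0 <= j < n:
                out[i] += channel[j]
    return [value / len(channels) for value in out]


def voice_active(samples, threshold=0.05):
    """Stand-in voice activity detector: mean energy above a threshold."""
    return sum(s * s for s in samples) / len(samples) > threshold


def steer_until_voice(channels, candidate_delays, threshold=0.05):
    """Try candidate beamforms in order until the detector reports voice."""
    for delays in candidate_delays:
        beam = delay_and_sum(channels, delays)
        if voice_active(beam, threshold):
            return delays, beam
    return None, None


# Hypothetical scene: the source reaches microphone 1 four samples later.
source = [math.sin(2.0 * math.pi * i / 8.0) for i in range(64)]
mic0 = source
mic1 = [0.0] * 4 + source[:-4]
chosen, voice_channel = steer_until_voice([mic0, mic1], [(0, 0), (4, 0)])
```

    In this scene the unsteered beam (0, 0) nearly cancels the tone, so the loop moves on and settles on (4, 0), which aligns the two microphones.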
