Patent search cpc:"G10L25/30" Page 3

21.

Invention application
MULTI-CHANNEL SPEECH SEPARATION (Under examination - published)

Publication (announcement) number: WO2019089486A1

Publication (announcement) date: 2019-05-09

Application number: PCT/US2018/058067

Application date: 2018-10-30

Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventor: CHEN, Zhuo; LI, Jinyu; XIAO, Xiong; YOSHIOKA, Takuya; WANG, Huaming; WANG, Zhenghao; GONG, Yifan

IPC: G10L21/0272 , G10L25/30 , G10L21/0308 , G10L21/0216

CPC classification number: G10L21/0216 , G06N3/0445 , G06N3/0454 , G06N3/084 , G10L21/0272 , G10L21/0308 , G10L25/30 , G10L2021/02087 , G10L2021/02166 , H04R3/005 , H04R2430/20

Abstract: Representative embodiments disclose mechanisms to separate and recognize multiple audio sources (e.g., picking out individual speakers) in an environment where they overlap and interfere with each other. The architecture uses a microphone array to spatially separate the audio signals. The spatially filtered signals are then input into a plurality of separators, so that each signal is input into a corresponding separator. The separators use neural networks to separate the audio sources, and each separator typically produces multiple output signals for its single input signal. A post-selection processor then assesses the separator outputs to pick the signals with the highest quality. These signals can be used in a variety of systems, such as speech recognition, meeting transcription and enhancement, hearing aids, music information retrieval, and speech enhancement.
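
The post-selection step described in this abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not say how output quality is measured, so mean signal energy is used as a stand-in score, and the names `post_select` and `mean_energy` are hypothetical.

```python
def mean_energy(signal):
    """Stand-in quality score (an assumption): average squared amplitude."""
    return sum(s * s for s in signal) / len(signal)


def post_select(separator_outputs, quality, num_sources):
    """Pool the outputs of all separators and keep the best-scoring ones.

    separator_outputs: one list of output signals per separator.
    quality: function mapping a signal to a score (higher is better).
    num_sources: how many signals to keep.
    """
    candidates = [sig for outputs in separator_outputs for sig in outputs]
    return sorted(candidates, key=quality, reverse=True)[:num_sources]


# Two hypothetical separators, each producing two output signals.
separator_outputs = [
    [[0.9, -0.8, 0.7], [0.1, 0.0, -0.1]],
    [[0.2, 0.1, -0.2], [0.6, -0.5, 0.6]],
]
selected = post_select(separator_outputs, mean_energy, num_sources=2)
```

In a real system the score would more likely come from a speech-quality or recognition-confidence measure; the sketch only shows the select-the-best structure.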

22.

Invention application
EMPLOYING VEHICULAR SENSOR INFORMATION FOR RETRIEVAL OF DATA (Under examination - published)

Publication (announcement) number: WO2018231766A1

Publication (announcement) date: 2018-12-20

Application number: PCT/US2018/037012

Application date: 2018-06-12

Applicant: VISTEON GLOBAL TECHNOLOGIES, INC.

Inventor: BOWDEN, Upton, Beall; WHIKEHART, J., William; SCHUPFNER, Markus

IPC: G10L25/51 , G10L25/30 , G06K9/00 , G06K9/46 , G06N3/04 , G06N3/08 , H04R1/40

CPC classification number: G10L25/51 , G06K9/00805 , G06K9/4628 , G06K9/6271 , G06N3/04 , G10L25/30 , H04R1/406 , H04R2499/13

Abstract: Disclosed herein are systems, methods, and devices for optimally performing object identification employing a neural network (NN), for example a convolutional neural network (CNN). The aspects disclosed herein employ audio data captured by one or more microphones in order to at least identify an object, or to augment image capture to perform the same. The audio data and the image data are each propagated to the NN to perform object identification.

23.

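Invention application
INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND RECORDING MEDIUM (Under examination - published)

Publication (announcement) number: WO2018042791A1

Publication (announcement) date: 2018-03-08

Application number: PCT/JP2017/020507

Application date: 2017-06-01

Applicant: ソニー株式会社

Inventor: 大迫 慶一; 光藤 祐基; 浅田 宏平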

IPC: G10L21/0308 , G10L21/028 , G10L25/30

CPC classification number: G10L21/028 , G06F17/16 , G06N3/08 , G06N20/00 , G10L21/0308 , G10L25/30

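Abstract: [Problem] To provide a sound source separation technique capable of improving separation performance. [Solution] An information processing device comprising: an acquisition unit that acquires an observation signal obtained by observing sound; and a sound source separation unit that separates the observation signal acquired by the acquisition unit into a plurality of separated signals, each corresponding to one of a plurality of assumed sound sources, by applying a nonlinear function to the matrix product of a coefficient vector and an input vector corresponding to each of the plurality of assumed sound sources.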

24.

Invention application
HIERARCHICAL ATTENTION FOR SPOKEN DIALOGUE STATE TRACKING (Under examination - published)

Publication (announcement) number: WO2017168246A1

Publication (announcement) date: 2017-10-05

Application number: PCT/IB2017/000411

Application date: 2017-03-29

Applicant: MALUUBA INC.

Inventor: SCHULZ, Hannes; HE, Jing

IPC: G10L15/22 , G06F17/27 , G10L15/16

CPC classification number: G10L15/22 , G10L13/08 , G10L15/16 , G10L15/197 , G10L15/265 , G10L25/30 , G10L2015/223

Abstract: Described herein are systems and methods for providing hierarchical state tracking in a spoken dialogue system. A sequence of turns is received by a spoken dialogue system. Each turn includes a user utterance and a machine act. At each turn, a value pointer and a turn pointer are provided for that turn. The value pointer represents a probability distribution over the one or more words in the user utterance that indicates whether each word in the user utterance is a slot value for a slot. The turn pointer identifies which turn in a set of turns includes a currently relevant slot value for the slot, where the set of turns includes the current turn for which the turn pointer is being provided and all turns that precede the current turn.
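
The two pointers in this abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the claimed method: the raw scores are hypothetical stand-ins for what a trained network would produce, a softmax is assumed as the way to turn scores into a distribution, and `resolve_slot` is a hypothetical helper name.

```python
import math


def softmax(scores):
    """Turn raw scores into a probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def resolve_slot(turns, word_scores, turn_scores):
    """Pick the relevant turn, then the most probable slot-value word in it.

    turns: list of tokenized user utterances, oldest first.
    word_scores: per-turn raw scores, one score per word (assumed inputs).
    turn_scores: one raw score per turn (assumed inputs).
    """
    turn_pointer = softmax(turn_scores)            # distribution over turns
    t = max(range(len(turns)), key=turn_pointer.__getitem__)
    value_pointer = softmax(word_scores[t])        # distribution over words
    w = max(range(len(turns[t])), key=value_pointer.__getitem__)
    return t, turns[t][w], value_pointer


# Hypothetical dialogue: the user changes the requested cuisine in turn 1.
turns = [["i", "want", "thai", "food"], ["actually", "italian", "please"]]
word_scores = [[0.1, 0.0, 2.5, 0.3], [0.2, 3.0, 0.1]]
turn_scores = [0.4, 1.9]
turn_index, slot_value, value_pointer = resolve_slot(turns, word_scores, turn_scores)
```

Here the turn pointer favors the later turn, so the slot value is read from that turn's value pointer rather than from the earlier mention.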

25.

Invention application
VOICE ACTIVITY DETECTION (Under examination - published)

Publication (announcement) number: WO2017052739A1

Publication (announcement) date: 2017-03-30

Application number: PCT/US2016/043552

Application date: 2016-07-22

Applicant: GOOGLE INC.

Inventor: SAINATH, Tara N.; SIMKO, Gabor; SAN MARTIN, Maria Carolina Parada

IPC: G10L25/78 , G10L25/30

CPC classification number: G10L25/78 , G10L25/30

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for detecting voice activity. In one aspect, a method includes actions of receiving, by a neural network included in an automated voice activity detection system, a raw audio waveform; processing, by the neural network, the raw audio waveform to determine whether the audio waveform includes speech; and providing, by the neural network, a classification of the raw audio waveform indicating whether the raw audio waveform includes speech.

26.

Invention application
ORDER STATISTIC TECHNIQUES FOR NEURAL NETWORKS (Under examination - published)

Publication (announcement) number: WO2017031172A1

Publication (announcement) date: 2017-02-23

Application number: PCT/US2016/047289

Application date: 2016-08-17

Applicant: NUANCE COMMUNICATIONS, INC.

Inventor: RENNIE, Steven, John; GOEL, Vaibhava

IPC: G10L15/16

CPC classification number: G10L15/063 , G06N3/08 , G10L15/16 , G10L25/30

Abstract: According to some aspects, a method of classifying speech recognition results is provided, using a neural network comprising a plurality of interconnected network units, each network unit having one or more weight values, the method comprising using at least one computer, performing acts of providing a first vector as input to a first network layer comprising one or more network units of the neural network, transforming, by a first network unit of the one or more network units, the input vector to produce a plurality of values, the transformation being based at least in part on a plurality of weight values of the first network unit, sorting the plurality of values to produce a sorted plurality of values, and providing the sorted plurality of values as input to a second network layer of the neural network.
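
The transform-then-sort step in this abstract can be sketched as follows. The sketch assumes the transformation is a set of dot products (one per weight row) and that values are sorted in descending order; the abstract specifies neither, and the names `unit_values`, `sorted_unit_output`, and all numbers are hypothetical.

```python
def unit_values(weight_rows, x):
    """One network unit producing several values: one dot product per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight_rows]


def sorted_unit_output(weight_rows, x):
    """Sort the unit's values (descending order is an assumption)."""
    return sorted(unit_values(weight_rows, x), reverse=True)


# Hypothetical first-layer unit with three weight rows and a 2-dimensional input.
weight_rows = [[1.0, 0.0], [0.0, 2.0], [-1.0, 1.0]]
x = [1.0, 3.0]
sorted_values = sorted_unit_output(weight_rows, x)

# A second-layer unit then weights the values by rank rather than by origin.
rank_weights = [0.5, 0.25, 0.25]
second_layer_out = sum(w * v for w, v in zip(rank_weights, sorted_values))
```

Because the second layer sees values in rank order, its first weight always applies to the largest value, whichever weight row produced it.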

27.

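Invention application
VEHICLE-MOUNTED VOICE COMMAND RECOGNITION METHOD AND APPARATUS, AND STORAGE MEDIUM (Under examination - published)

Publication (announcement) number: WO2017000489A1

Publication (announcement) date: 2017-01-05

Application number: PCT/CN2015/095269

Application date: 2015-11-23

Applicant: 百度在线网络技术(北京)有限公司

Inventor: 旬丽辉; 欧阳能钧; 穆向禹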

IPC: G10L15/16 , G10L15/26

CPC classification number: G10L15/22 , G10L15/16 , G10L15/1815 , G10L25/09 , G10L25/21 , G10L25/24 , G10L25/30 , G10L25/51 , G10L25/63 , G10L2015/223 , G10L2015/227

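Abstract: A vehicle-mounted voice command recognition method and apparatus, and a storage medium. The method comprises: acquiring a voice command input by a user (S11); determining basic information of the user according to a pre-trained deep neural network (DNN) model (S12); performing content recognition on the voice command according to the basic information of the user, and determining at least one possible user intent according to the recognized content and the context of the scenario page on which the user input the voice command (S13); determining a confidence level for each possible user intent according to the DNN model (S14); determining the user's true intent from the possible user intents according to the confidence levels (S15); and executing a corresponding action according to the user's true intent (S16). The method, apparatus, and storage medium can effectively improve the correct recognition rate of voice commands.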
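
Steps S14 to S16 can be illustrated with a short Python sketch. It assumes the true intent is simply the candidate with the highest confidence, which the abstract does not state explicitly; the intent names, confidence values, and the `ACTIONS` table are all hypothetical, and the DNN that would produce the confidences is not modeled.

```python
def choose_true_intent(candidates):
    """S15 (assumed rule): return the intent with the highest confidence.

    candidates: list of (intent_name, confidence) pairs from S13 and S14.
    """
    return max(candidates, key=lambda pair: pair[1])[0]


# Hypothetical mapping from intents to actions (S16).
ACTIONS = {
    "navigate_home": lambda: "starting navigation to home",
    "call_home": lambda: "dialing home",
}

# Hypothetical output of S13/S14 for the spoken command "home".
candidates = [("navigate_home", 0.82), ("call_home", 0.11)]
intent = choose_true_intent(candidates)
result = ACTIONS[intent]()
```

A production system might also reject all candidates below a minimum confidence and ask the user to repeat the command; that threshold is omitted here.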

28.

Invention application
RELEVANCE SCORE ASSIGNMENT FOR ARTIFICIAL NEURAL NETWORK (Under examination - published)

Publication (announcement) number: WO2016150472A1

Publication (announcement) date: 2016-09-29

Application number: PCT/EP2015/056008

Application date: 2015-03-20

Applicant: FRAUNHOFER-GESELLSCHAFT ZUR FÖRDERUNG DER ANGEWANDTEN FORSCHUNG E.V.; TECHNISCHE UNIVERSITÄT BERLIN

Inventor: BACH, Sebastian; SAMEK, Wojciech; MÜLLER, Klaus-Robert; BINDER, Alexander; MONTAVON, Grégoire

IPC: G06K9/00 , G06T1/20 , G06N3/04 , G06N3/08

CPC classification number: G06N3/02 , G06F17/2765 , G06K9/4628 , G06K9/6247 , G06N3/0481 , G06N3/08 , G10L25/30

Abstract: The task of relevance score assignment to a set of items onto which an artificial neural network is applied is obtained by redistributing an initial relevance score derived from the network output, onto the set of items by reversely propagating the initial relevance score through the artificial neural network so as to obtain a relevance score for each item. In particular, this reverse propagation is applicable to a broader set of artificial neural networks and/or at lower computational efforts by performing same in a manner so that for each neuron, preliminarily redistributed relevance scores of a set of downstream neighbor neurons of the respective neuron are distributed on a set of upstream neighbor neurons of the respective neuron according to a distribution function.
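
The reverse propagation described in this abstract can be sketched for a single linear layer. The abstract leaves the distribution function open, so this sketch assumes a proportional rule in which each neuron passes its relevance to its upstream neighbors in proportion to their contribution `weight * activation`; the function name `redistribute` and all numbers are hypothetical.

```python
def redistribute(relevance_out, activations, weights, eps=1e-9):
    """Distribute each neuron's relevance onto its upstream neighbors.

    relevance_out[j]: relevance already assigned to neuron j.
    activations[i]:   activation of upstream neuron i.
    weights[j][i]:    weight connecting upstream neuron i to neuron j.
    Assumed rule: shares proportional to weights[j][i] * activations[i].
    """
    relevance_in = [0.0] * len(activations)
    for j, r_j in enumerate(relevance_out):
        contributions = [weights[j][i] * a for i, a in enumerate(activations)]
        total = sum(contributions)
        if abs(total) < eps:
            continue  # nothing to distribute for a neuron with no net input
        for i, z in enumerate(contributions):
            relevance_in[i] += r_j * z / total
    return relevance_in


# Hypothetical layer: two upstream neurons feeding two neurons.
activations = [1.0, 2.0]
weights = [[1.0, 1.0], [2.0, 0.0]]
relevance_out = [3.0, 1.0]
relevance_in = redistribute(relevance_out, activations, weights)
```

With this rule the total relevance is conserved from one layer to the next (4.0 in the example), which is what lets the scores at the input items be read as shares of the network output.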

29.

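Invention application
VOICE RECOGNITION SYSTEM AND METHOD FOR A ROBOT SYSTEM (Under examination - published)

Publication (announcement) number: WO2016112634A1

Publication (announcement) date: 2016-07-21

Application number: PCT/CN2015/081409

Application date: 2015-06-12

Applicant: 芋头科技(杭州)有限公司

Inventor: 蔡鹏; 高鹏; 江涛; 程一堂; 向文杰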

IPC: G10L15/00

CPC classification number: G10L17/22 , G10L15/22 , G10L15/30 , G10L25/30 , G10L25/78 , G10L2015/223

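Abstract: A voice recognition system for a robot system, comprising: a microphone for receiving voice commands; a local voice detector that detects the voice commands and outputs the result; a local voice recognition module that receives the human-voice signal output by the voice detector, screens it, and selectively outputs it; a local voice encoding module for encoding the human-voice signal and outputting it; a remote voice decoding module for receiving the encoded voice signal output by the local voice encoding module, decoding it, and outputting it; a remote voice recognition module and a remote language processing module, wherein the remote voice recognition module receives the decoded human-voice signal output by the remote voice decoding module, converts it, and outputs it to the remote language processing module, and the remote language processing module generates a corresponding operation instruction according to the converted human-voice signal; and an execution module for executing the operation instruction of the remote language processing module.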

30.

Invention application
BINAURAL RECORDING FOR PROCESSING AUDIO SIGNALS TO ENABLE ALERTS (Under examination - published)

Publication (announcement) number: WO2016105620A1

Publication (announcement) date: 2016-06-30

Application number: PCT/US2015/054051

Application date: 2015-10-05

Applicant: INTEL CORPORATION

Inventor: POORNACHANDRAN, Rajesh; GOTTARDO, David; KAR, Swarnendu; MACDONALD, Mark; DADU, Saurabh

IPC: H04R5/033 , H04S7/00

CPC classification number: H04R5/04 , G10L25/30 , G10L25/51 , H04R1/1008 , H04R1/1041 , H04R2420/01 , H04S7/40

Abstract: A wearable device for binaural audio is described. The wearable device includes a feedback mechanism, a microphone, an always-on binaural recorder (AOBR), and a processor. The AOBR is to capture ambient noise via the microphone and interpret the ambient noise. An alert is issued by the processor to the feedback mechanism based on a notification detected via the microphone in the ambient noise.
