SYSTEMS AND METHODS FOR ADDING PUNCTUATIONS
    1. Invention Application, Pending (Published)

    Publication No.: WO2014187069A1

    Publication Date: 2014-11-27

    Application No.: PCT/CN2013/085347

    Filing Date: 2013-10-16

    Abstract: Systems and methods are provided for adding punctuations. For example, one or more first feature units are identified in a voice file taken as a whole; the voice file is divided into multiple segments; one or more second feature units are identified in the voice file; a first aggregate weight of first punctuation states of the voice file and a second aggregate weight of second punctuation states of the voice file are determined, using a language model established based on word separation and third semantic features; a weighted calculation is performed to generate a third aggregate weight based on at least information associated with the first aggregate weight and the second aggregate weight; and one or more final punctuations are added to the voice file based on at least information associated with the third aggregate weight.
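
    The weighted calculation described above can be sketched as a simple linear combination of the two aggregate weights, with the highest-scoring punctuation state winning. This is a minimal illustration only; the `alpha=0.6` mixing weight, function names, and score values are invented for the example and are not taken from the patent.

```python
# Hypothetical sketch of the weighted-score combination: scores from
# whole-file analysis (first aggregate weight) and segment-level analysis
# (second aggregate weight) are mixed into a third aggregate weight per
# candidate punctuation state. The 0.6/0.4 split is an assumption.

def combine_weights(whole_file_weight, segment_weight, alpha=0.6):
    """Weighted calculation producing the third aggregate weight."""
    return alpha * whole_file_weight + (1 - alpha) * segment_weight

def choose_punctuation(candidates):
    """Pick the punctuation state with the highest combined weight.

    `candidates` maps a punctuation mark to a (first_weight, second_weight)
    pair from the whole-file and per-segment language-model scores.
    """
    scored = {
        mark: combine_weights(w1, w2) for mark, (w1, w2) in candidates.items()
    }
    return max(scored, key=scored.get)
```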

DATA PARALLEL PROCESSING METHOD AND APPARATUS BASED ON MULTIPLE GRAPHIC PROCESSING UNITS
    2. Invention Application, Pending (Published)

    Publication No.: WO2015192812A1

    Publication Date: 2015-12-23

    Application No.: PCT/CN2015/081988

    Filing Date: 2015-06-19

    Abstract: A parallel data processing method based on multiple graphic processing units (GPUs) is provided, including: creating, in a central processing unit (CPU), a plurality of worker threads for controlling a plurality of worker groups respectively, the worker groups including one or more GPUs; binding each worker thread to a corresponding GPU; loading a plurality of batches of training data from a nonvolatile memory to GPU video memories in the plurality of worker groups; and controlling the plurality of GPUs to perform data processing in parallel through the worker threads. The method can enhance the efficiency of multi-GPU parallel data processing. In addition, a parallel data processing apparatus is also provided.
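
    The worker-group scheme above can be sketched on the CPU with one thread per (simulated) GPU pulling batches from a shared queue. This is a CPU-only illustration under stated assumptions; in real code each thread would bind to a device (e.g. via a CUDA device-selection call) before processing, and the batch "processing" here is a stand-in for a training step.

```python
import threading
from queue import Queue, Empty

# Illustrative CPU-only sketch: one worker thread per simulated GPU, each
# pulling batches from a shared queue and processing them in parallel.
# All names and the sum() stand-in for a training step are assumptions.

def run_workers(batches, num_workers=2):
    queue = Queue()
    for b in batches:
        queue.put(b)
    results, lock = [], threading.Lock()

    def worker(gpu_id):
        # A real implementation would bind this thread to GPU `gpu_id` here.
        while True:
            try:
                batch = queue.get_nowait()
            except Empty:
                return
            processed = sum(batch)  # stand-in for one data-processing step
            with lock:
                results.append((gpu_id, processed))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```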

SYSTEMS AND METHODS FOR AUDIO COMMAND RECOGNITION
    3. Invention Application, Pending (Published)

    Publication No.: WO2015081681A1

    Publication Date: 2015-06-11

    Application No.: PCT/CN2014/079766

    Filing Date: 2014-06-12

    Abstract: A method, an electronic system and a non-transitory computer readable storage medium for recognizing audio commands in an electronic device are disclosed. The electronic device obtains audio data based on an audio signal provided by a user and extracts characteristic audio fingerprint features from the audio data. The electronic device further determines whether the corresponding audio signal is generated by an authorized user by comparing the characteristic audio fingerprint features with an audio fingerprint model for the authorized user and with a universal background model that represents user-independent audio fingerprint features, respectively. When the corresponding audio signal is generated by the authorized user of the electronic device, an audio command is extracted from the audio data, and an operation is performed according to the audio command.
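
    The verification decision above is the classic speaker model vs. universal background model (UBM) likelihood-ratio test. A minimal sketch, assuming single one-dimensional Gaussians as stand-ins for the real fingerprint models and an invented threshold of 0.0:

```python
import math

# Minimal sketch: fingerprint features are scored against the authorized
# user's model and against a universal background model (UBM); the
# log-likelihood ratio decides whether the speaker is authorized.
# The Gaussian models and the 0.0 threshold are illustrative assumptions.

def gaussian_loglik(x, mean, var):
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def is_authorized(features, speaker, ubm, threshold=0.0):
    """`speaker` and `ubm` are (mean, variance) pairs; `features` is a list of floats."""
    score = sum(
        gaussian_loglik(f, *speaker) - gaussian_loglik(f, *ubm) for f in features
    )
    return score > threshold
```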

METHOD AND DEVICE FOR ACOUSTIC LANGUAGE MODEL TRAINING
    4. Invention Application, Pending (Published)

    Publication No.: WO2014117548A1

    Publication Date: 2014-08-07

    Application No.: PCT/CN2013/085948

    Filing Date: 2013-10-25

    CPC classification number: G10L15/063 G10L15/05 G10L2015/0631

    Abstract: A method and a device for training an acoustic language model include: conducting word segmentation for training samples in a training corpus using an initial language model containing no word class labels, to obtain initial word segmentation data containing no word class labels; performing word class replacement for the initial word segmentation data containing no word class labels, to obtain first word segmentation data containing word class labels; using the first word segmentation data containing word class labels to train a first language model containing word class labels; using the first language model containing word class labels to conduct word segmentation for the training samples in the training corpus, to obtain second word segmentation data containing word class labels; and in accordance with the second word segmentation data meeting one or more predetermined criteria, using the second word segmentation data containing word class labels to train the acoustic language model.
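
    The word-class replacement step can be sketched as a dictionary lookup that maps segmented words to class labels before the class-based language model is trained. The class dictionary, label names, and case-folding below are invented for illustration; the patent does not specify them.

```python
# Hypothetical sketch of class replacement: after segmentation, words that
# belong to a known word class are replaced by their class label, producing
# the "word segmentation data containing word class labels" that the
# class-based language model is trained on. The dictionary is an assumption.

WORD_CLASSES = {"monday": "DAY", "tuesday": "DAY", "beijing": "CITY"}

def replace_with_classes(segmented_words):
    """Map each segmented word to its class label when one is known."""
    return [WORD_CLASSES.get(w.lower(), w) for w in segmented_words]
```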

METHOD AND APPARATUS FOR PERFORMING SPEECH KEYWORD RETRIEVAL
    5. Invention Application, Pending (Published)

    Publication No.: WO2015024431A1

    Publication Date: 2015-02-26

    Application No.: PCT/CN2014/083531

    Filing Date: 2014-08-01

    CPC classification number: G10L15/18 G10L15/08 G10L15/28 G10L15/32 G10L2015/088

    Abstract: A method and an apparatus are provided for retrieving keywords. The apparatus configures at least two types of language models in a model file, where each type of language model includes a recognition model and a corresponding decoding model. The apparatus extracts a speech feature from the to-be-processed speech data; performs language matching on the extracted speech feature using the recognition models in the model file one by one, and determines a recognition model based on the language matching rate; determines the decoding model corresponding to that recognition model; decodes the extracted speech feature using the determined decoding model to obtain a word recognition result; and matches the word recognition result against a keyword dictionary, outputting any matched keywords.
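
    The pipeline above can be sketched as: score each language's recognition model, pick the best, decode with its paired decoder, then intersect the result with the keyword dictionary. All model objects, scores, and words below are stand-ins invented for the example.

```python
# Illustrative sketch of the retrieval pipeline: `models` maps a language
# to a (match_rate_fn, decode_fn) pair standing in for its recognition and
# decoding models. The structure and all values are assumptions.

def retrieve_keywords(feature, models, keyword_dict):
    # Pick the recognition model with the highest language-matching rate.
    best_lang = max(models, key=lambda lang: models[lang][0](feature))
    decode = models[best_lang][1]
    words = decode(feature)  # word recognition result after decoding
    # Output only recognized words that appear in the keyword dictionary.
    return [w for w in words if w in keyword_dict]
```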

METHOD AND DEVICE FOR PARALLEL PROCESSING IN MODEL TRAINING
    6. Invention Application, Pending (Published)

    Publication No.: WO2015003436A1

    Publication Date: 2015-01-15

    Application No.: PCT/CN2013/085568

    Filing Date: 2013-10-21

    CPC classification number: G06N3/08

    Abstract: A method and a device for training a DNN model include, at a device having one or more processors and memory: establishing an initial DNN model; dividing a training data corpus into a plurality of disjoint data subsets; for each of the plurality of disjoint data subsets, providing the data subset to a respective training processing unit of a plurality of training processing units operating in parallel, wherein the respective training processing unit applies a Stochastic Gradient Descent (SGD) process to update the initial DNN model to generate a respective DNN sub-model based on the data subset; and merging the respective DNN sub-models generated by the plurality of training processing units to obtain an intermediate DNN model, wherein the intermediate DNN model is established as either the initial DNN model for a next training iteration or a final DNN model, in accordance with a preset convergence condition.
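
    One iteration of the split/train/merge loop can be sketched with a single scalar parameter standing in for the DNN weights. Parameter averaging as the merge rule and the toy gradient below are assumptions; the abstract only says the sub-models are merged.

```python
# CPU-only sketch of one training iteration: the corpus is split into
# disjoint subsets, each "training processing unit" runs SGD on its subset
# starting from the same initial model, and the sub-models are merged by
# parameter averaging into the intermediate model. All details are stand-ins.

def sgd_update(model, data, lr=0.1):
    """Toy SGD: nudge the single parameter toward each sample in turn."""
    for x in data:
        model = model - lr * (model - x)
    return model

def train_iteration(initial_model, corpus, num_units=2):
    chunk = len(corpus) // num_units
    subsets = [corpus[i * chunk:(i + 1) * chunk] for i in range(num_units)]
    sub_models = [sgd_update(initial_model, s) for s in subsets]
    return sum(sub_models) / len(sub_models)  # merged intermediate model
```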

USER AUTHENTICATION METHOD AND APPARATUS BASED ON AUDIO AND VIDEO DATA
    7. Invention Application, Pending (Published)

    Publication No.: WO2014117583A1

    Publication Date: 2014-08-07

    Application No.: PCT/CN2013/087994

    Filing Date: 2013-11-28

    CPC classification number: G06F21/32 G06F2221/2117

    Abstract: A computer-implemented method is performed at a server having one or more processors and memory storing programs executed by the one or more processors for authenticating a user from video and audio data. The method includes: receiving a login request from a mobile device, the login request including video data and audio data; extracting a group of facial features from the video data; extracting a group of audio features from the audio data and recognizing a sequence of words in the audio data; and identifying a first user account whose respective facial features match the group of facial features and a second user account whose respective audio features match the group of audio features. If the first user account is the same as the second user account, the server retrieves the sequence of words associated with the user account and compares the sequences of words for authentication purposes.
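
    The two-modality decision can be sketched as: resolve the face features and the audio features to accounts independently, then accept only when both agree and the recognized words match that account's stored passphrase. Nearest-vector matching and all data below are invented for the example.

```python
# Hypothetical sketch of the decision logic: facial and audio features each
# resolve to a user account; login succeeds only when both resolve to the
# same account and the spoken word sequence matches the stored one.
# Matching by nearest stored feature vector is an assumption.

def nearest_account(features, enrolled):
    """Return the account whose stored features are closest to `features`."""
    return min(
        enrolled,
        key=lambda acc: sum((a - b) ** 2 for a, b in zip(enrolled[acc], features)),
    )

def authenticate(face, audio, words, face_db, audio_db, passphrases):
    first = nearest_account(face, face_db)
    second = nearest_account(audio, audio_db)
    if first != second:
        return None  # the two modalities disagree on the user
    return first if passphrases.get(first) == words else None
```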

METHOD AND DEVICE FOR KEYWORD DETECTION
    8. Invention Application, Pending (Published)

    Publication No.: WO2014117547A1

    Publication Date: 2014-08-07

    Application No.: PCT/CN2013/085905

    Filing Date: 2013-10-24

    CPC classification number: G10L15/063 G10L15/08 G10L2015/088

    Abstract: An electronic device with one or more processors and memory trains an acoustic model with an international phonetic alphabet (IPA) phoneme mapping collection and audio samples in different languages, where the acoustic model includes: a foreground model; and a background model. The device generates a phone decoder based on the trained acoustic model. The device collects keyword audio samples, decodes the keyword audio samples with the phone decoder to generate phoneme sequence candidates, and selects a keyword phoneme sequence from the phoneme sequence candidates. After obtaining the keyword phoneme sequence, the device detects one or more keywords in an input audio signal with the trained acoustic model, including: matching phonemic keyword portions of the input audio signal with phonemes in the keyword phoneme sequence with the foreground model; and filtering out phonemic non-keyword portions of the input audio signal with the background model.
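
    The foreground/background filtering can be sketched as: keep a decoded phoneme only when the foreground (keyword) model scores it above the background model, then test whether the keyword phoneme sequence survives in order. The score dictionaries below stand in for real acoustic likelihoods and are assumptions.

```python
# Minimal sketch of foreground/background filtering: `fg_scores` and
# `bg_scores` stand in for the foreground and background acoustic models.
# Phonemes scored higher by the background model are filtered out, then the
# keyword is detected if its phonemes appear in order in what remains.

def detect_keyword(phonemes, keyword_seq, fg_scores, bg_scores):
    """Return True when the keyword phoneme sequence survives filtering."""
    kept = [
        p for p in phonemes
        if fg_scores.get(p, float("-inf")) > bg_scores.get(p, 0.0)
    ]
    # Subsequence check: each keyword phoneme must appear, in order.
    it = iter(kept)
    return all(p in it for p in keyword_seq)
```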

KEYWORD DETECTION FOR SPEECH RECOGNITION
    9. Invention Application, Pending (Published)

    Publication No.: WO2015021844A1

    Publication Date: 2015-02-19

    Application No.: PCT/CN2014/082332

    Filing Date: 2014-07-16

    CPC classification number: G10L15/08 G10L15/083 G10L2015/088

    Abstract: Disclosed is a method of recognizing a keyword in a speech that includes a sequence of audio frames, the sequence further including a current frame and a subsequent frame. A candidate keyword is determined for the current frame using a decoding network that includes keywords and filler words of multiple languages, and is used to determine a confidence score for the audio frame sequence. A word option is also determined for the subsequent frame based on the decoding network, and when the candidate keyword and the word option are associated with two distinct types of languages, the confidence score of the audio frame sequence is updated at least based on a penalty factor associated with the two distinct types of languages. The audio frame sequence is then determined to include both the candidate keyword and the word option by evaluating the updated confidence score according to a keyword determination criterion.
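
    The cross-language penalty can be sketched as a multiplicative discount applied when the candidate keyword and the word option belong to different languages, followed by a threshold test as the determination criterion. The 0.5 penalty, the multiplicative update, and the 0.4 threshold are invented for the example.

```python
# Illustrative sketch of the cross-language penalty: when the current
# frame's candidate keyword and the subsequent frame's word option come
# from distinct languages, the sequence confidence is scaled down by a
# penalty factor. All numeric values below are assumptions.

def update_confidence(confidence, keyword_lang, option_lang, penalty=0.5):
    if keyword_lang != option_lang:
        confidence *= penalty
    return confidence

def accept_sequence(confidence, keyword_lang, option_lang, threshold=0.4):
    """Keyword determination criterion applied to the updated confidence."""
    return update_confidence(confidence, keyword_lang, option_lang) >= threshold
```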

METHOD, APPARATUS AND SYSTEM FOR PAYMENT VALIDATION
    10. Invention Application, Pending (Published)

    Publication No.: WO2014201780A1

    Publication Date: 2014-12-24

    Application No.: PCT/CN2013/084593

    Filing Date: 2013-09-29

    CPC classification number: G06Q20/40145 G07C9/00 G10L17/24 G10L21/06

    Abstract: A method, apparatus and system for payment validation are disclosed. The method includes: receiving a payment validation request from a terminal, wherein the payment validation request includes identification information and a current voice signal; detecting whether the identification information is identical to pre-stored identification information; if identical, extracting voice characteristics associated with identity information and a text password from the current voice signal; matching the current voice characteristics to a pre-stored speaker model; and, if successfully matched, sending a validation reply message to the terminal to indicate that the payment request has been authorized. The validation reply message is utilized by the terminal to proceed with a payment transaction. The identity information identifies an ow
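
    The validation flow can be sketched as: check the submitted identification against the stored one, then accept the payment only if the voice characteristics match the pre-stored speaker model and the spoken password matches the enrolled text password. The dot-product similarity, the 0.8 threshold, and the request layout below are invented for the example.

```python
# Hypothetical sketch of the validation flow. `speaker_model` stands in
# for the pre-stored speaker model; the similarity function and threshold
# are assumptions, not the patent's actual matching procedure.

def validate_payment(request, stored_id, speaker_model, text_password):
    if request["id"] != stored_id:
        return False  # identification information does not match
    features = request["voice_features"]
    spoken = request["spoken_password"]
    similarity = sum(a * b for a, b in zip(features, speaker_model))
    return spoken == text_password and similarity >= 0.8
```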
